{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Memoization, Code Optimization and Time Complexity\n",
"- based on the Latin word memorandum, meaning \"to be remembered\" \n",
"- similar to word memorization, its a technique used in coding to improve program runtime by memorizing intermediate solutions\n",
"- if an intermediate results (from some computations) are used over and again, the results can be remembered/stored to avoid unnecessary repeating calculations\n",
"- using dict type datastructure, one can memoize intermediate results from functions esp. in recurisive solutions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## naive recursive fib function"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"fib at 30th position = 1346269\n",
"fib function count = 2692537\n"
]
}
],
"source": [
"count = 0\n",
"def fib(n):\n",
" global count\n",
" count += 1\n",
" if n <= 1:\n",
" return 1\n",
" f = fib(n-1) + fib(n-2)\n",
" return f\n",
"\n",
"n=30 #40, 50? takes a while\n",
"print(\"fib at {}th position = {}\".format(n, fib(n)))\n",
"print(\"fib function count = {}\".format(count))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## theoritical computational complexity\n",
"- Worst case Big-Oh Time Complexity: O(n) = time to calculate Fib(n-1) + Fib(n-2) + time to add them: O(1)\n",
" - T(n) = T(n-1) + T(n-2) + O(1)\n",
" - T(n) = O(2n-1) + O(2n-2) + O(1)\n",
" - T(n) = O(2n)\n",
"- Space Complexity = O(n) due to call stack"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## finding actual time - timeit\n",
"- timeit - measures execution time of small code snippets\n",
"- timeit.timeit(stmt='pass', setup='pass', timer=[default timer], number=1000000, globals=None)\n",
"- returns time in seconds"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Help on module timeit:\n",
"\n",
"NAME\n",
" timeit - Tool for measuring execution time of small code snippets.\n",
"\n",
"MODULE REFERENCE\n",
" https://docs.python.org/3.6/library/timeit\n",
" \n",
" The following documentation is automatically generated from the Python\n",
" source files. It may be incomplete, incorrect or include features that\n",
" are considered implementation detail and may vary between Python\n",
" implementations. When in doubt, consult the module reference at the\n",
" location listed above.\n",
"\n",
"DESCRIPTION\n",
" This module avoids a number of common traps for measuring execution\n",
" times. See also Tim Peters' introduction to the Algorithms chapter in\n",
" the Python Cookbook, published by O'Reilly.\n",
" \n",
" Library usage: see the Timer class.\n",
" \n",
" Command line usage:\n",
" python timeit.py [-n N] [-r N] [-s S] [-t] [-c] [-p] [-h] [--] [statement]\n",
" \n",
" Options:\n",
" -n/--number N: how many times to execute 'statement' (default: see below)\n",
" -r/--repeat N: how many times to repeat the timer (default 3)\n",
" -s/--setup S: statement to be executed once initially (default 'pass').\n",
" Execution time of this setup statement is NOT timed.\n",
" -p/--process: use time.process_time() (default is time.perf_counter())\n",
" -t/--time: use time.time() (deprecated)\n",
" -c/--clock: use time.clock() (deprecated)\n",
" -v/--verbose: print raw timing results; repeat for more digits precision\n",
" -u/--unit: set the output time unit (usec, msec, or sec)\n",
" -h/--help: print this usage message and exit\n",
" --: separate options from statement, use when statement starts with -\n",
" statement: statement to be timed (default 'pass')\n",
" \n",
" A multi-line statement may be given by specifying each line as a\n",
" separate argument; indented lines are possible by enclosing an\n",
" argument in quotes and using leading spaces. Multiple -s options are\n",
" treated similarly.\n",
" \n",
" If -n is not given, a suitable number of loops is calculated by trying\n",
" successive powers of 10 until the total time is at least 0.2 seconds.\n",
" \n",
" Note: there is a certain baseline overhead associated with executing a\n",
" pass statement. It differs between versions. The code here doesn't try\n",
" to hide it, but you should be aware of it. The baseline overhead can be\n",
" measured by invoking the program without arguments.\n",
" \n",
" Classes:\n",
" \n",
" Timer\n",
" \n",
" Functions:\n",
" \n",
" timeit(string, string) -> float\n",
" repeat(string, string) -> list\n",
" default_timer() -> float\n",
"\n",
"CLASSES\n",
" builtins.object\n",
" Timer\n",
" \n",
" class Timer(builtins.object)\n",
" | Class for timing execution speed of small code snippets.\n",
" | \n",
" | The constructor takes a statement to be timed, an additional\n",
" | statement used for setup, and a timer function. Both statements\n",
" | default to 'pass'; the timer function is platform-dependent (see\n",
" | module doc string). If 'globals' is specified, the code will be\n",
" | executed within that namespace (as opposed to inside timeit's\n",
" | namespace).\n",
" | \n",
" | To measure the execution time of the first statement, use the\n",
" | timeit() method. The repeat() method is a convenience to call\n",
" | timeit() multiple times and return a list of results.\n",
" | \n",
" | The statements may contain newlines, as long as they don't contain\n",
" | multi-line string literals.\n",
" | \n",
" | Methods defined here:\n",
" | \n",
" | __init__(self, stmt='pass', setup='pass', timer=, globals=None)\n",
" | Constructor. See class doc string.\n",
" | \n",
" | autorange(self, callback=None)\n",
" | Return the number of loops and time taken so that total time >= 0.2.\n",
" | \n",
" | Calls the timeit method with *number* set to successive powers of\n",
" | ten (10, 100, 1000, ...) up to a maximum of one billion, until\n",
" | the time taken is at least 0.2 second, or the maximum is reached.\n",
" | Returns ``(number, time_taken)``.\n",
" | \n",
" | If *callback* is given and is not None, it will be called after\n",
" | each trial with two arguments: ``callback(number, time_taken)``.\n",
" | \n",
" | print_exc(self, file=None)\n",
" | Helper to print a traceback from the timed code.\n",
" | \n",
" | Typical use:\n",
" | \n",
" | t = Timer(...) # outside the try/except\n",
" | try:\n",
" | t.timeit(...) # or t.repeat(...)\n",
" | except:\n",
" | t.print_exc()\n",
" | \n",
" | The advantage over the standard traceback is that source lines\n",
" | in the compiled template will be displayed.\n",
" | \n",
" | The optional file argument directs where the traceback is\n",
" | sent; it defaults to sys.stderr.\n",
" | \n",
" | repeat(self, repeat=3, number=1000000)\n",
" | Call timeit() a few times.\n",
" | \n",
" | This is a convenience function that calls the timeit()\n",
" | repeatedly, returning a list of results. The first argument\n",
" | specifies how many times to call timeit(), defaulting to 3;\n",
" | the second argument specifies the timer argument, defaulting\n",
" | to one million.\n",
" | \n",
" | Note: it's tempting to calculate mean and standard deviation\n",
" | from the result vector and report these. However, this is not\n",
" | very useful. In a typical case, the lowest value gives a\n",
" | lower bound for how fast your machine can run the given code\n",
" | snippet; higher values in the result vector are typically not\n",
" | caused by variability in Python's speed, but by other\n",
" | processes interfering with your timing accuracy. So the min()\n",
" | of the result is probably the only number you should be\n",
" | interested in. After that, you should look at the entire\n",
" | vector and apply common sense rather than statistics.\n",
" | \n",
" | timeit(self, number=1000000)\n",
" | Time 'number' executions of the main statement.\n",
" | \n",
" | To be precise, this executes the setup statement once, and\n",
" | then returns the time it takes to execute the main statement\n",
" | a number of times, as a float measured in seconds. The\n",
" | argument is the number of times through the loop, defaulting\n",
" | to one million. The main statement, the setup statement and\n",
" | the timer function to be used are passed to the constructor.\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Data descriptors defined here:\n",
" | \n",
" | __dict__\n",
" | dictionary for instance variables (if defined)\n",
" | \n",
" | __weakref__\n",
" | list of weak references to the object (if defined)\n",
"\n",
"FUNCTIONS\n",
" default_timer = perf_counter(...)\n",
" perf_counter() -> float\n",
" \n",
" Performance counter for benchmarking.\n",
" \n",
" repeat(stmt='pass', setup='pass', timer=, repeat=3, number=1000000, globals=None)\n",
" Convenience function to create Timer object and call repeat method.\n",
" \n",
" timeit(stmt='pass', setup='pass', timer=, number=1000000, globals=None)\n",
" Convenience function to create Timer object and call timeit method.\n",
"\n",
"DATA\n",
" __all__ = ['Timer', 'timeit', 'repeat', 'default_timer']\n",
"\n",
"FILE\n",
" /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/timeit.py\n",
"\n",
"\n"
]
}
],
"source": [
"import timeit\n",
"help(timeit)"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.7455664590088418\n"
]
}
],
"source": [
"#print(globals())\n",
"import timeit\n",
"print(timeit.timeit('fib(30)', number=1, globals=globals()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## memoized recursive fib function"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"fib at 60th position = 2504730781961\n",
"fib function count = 119\n"
]
}
],
"source": [
"memo = {}\n",
"count = 0\n",
"def MemoizedFib(n):\n",
" global memo, count\n",
" count += 1\n",
" if n <= 1:\n",
" return 1\n",
" if n in memo:\n",
" return memo[n]\n",
" memo[n] = MemoizedFib(n-1) + MemoizedFib(n-2)\n",
" return memo[n]\n",
"\n",
"n=60 #try 40, 50, 60\n",
"print(\"fib at {}th position = {}\".format(n, MemoizedFib(n)))\n",
"print(\"fib function count = {}\".format(count)) "
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.002014205005252734\n"
]
}
],
"source": [
"import timeit\n",
"print(timeit.timeit('MemoizedFib(1000)', number=1, globals=globals()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## computational complexity of memoized fib\n",
"- Time Complexity - O(n)\n",
"- Space Complexity - O(n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## exercise\n",
"- Write a program that finds factorials of a bunch of positive integer numbers. Would memoization improve time complexity of the program? "
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]\n"
]
}
],
"source": [
"# find the factorials of the first 20 positive numbers\n",
"# optimize the program so it finds the factorials in the \n",
"# shortest time possible\n",
"nums = list(range(1, 21))\n",
"print(nums)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Python code optimization\n",
"- https://wiki.python.org/moin/PythonSpeed/PerformanceTips"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"- use functions and local variables instead of globals, eventhough function calls are relatively high\n",
"- avoid dot . member access specifier\n",
"- user built-in list.sort and sorted with key using itemgetter function if required\n",
"- avoid string concatenation; use \"\".join(somelist) instead\n",
"- use list comprehension and map() instead of loops\n",
"- while working with dict (esp. initializing), use try catch instead of key test\n",
"- use defaultdict class from collections\n",
"- doing stuff less often - change sys.setswitchinterval(sys.maxsize) to as high as possible if not running threads and catching signals (default is 100?)\n",
"- use addition and subtraction instead of multiplicaiton and divisions"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.005"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import sys\n",
"sys.getswitchinterval()"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"9223372036854775807"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sys.maxsize"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"import timeit"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.046036384999752045"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"code = '''\n",
"x = 47\n",
"x = x * 2\n",
"'''\n",
"timeit.timeit(stmt=code)"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.07047916000010446"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"code = '''\n",
"x = 47\n",
"x = x << 1\n",
"'''\n",
"timeit.timeit(stmt=code)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.04736856800445821"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"code = '''\n",
"x = 47\n",
"x = x + x\n",
"'''\n",
"timeit.timeit(stmt=code)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Time complexity\n",
"- Time complexity of various Python built-in data sturctures and functions \n",
"- https://wiki.python.org/moin/TimeComplexity"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}