{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

cs1001.py , Tel Aviv University, Spring 2019

\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Recitation 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We talked about binary numbers (with some theory) and base conversions and implemented an algorithm for base conversion (from decimal to 2<=b<=36).\n", "We examined Python's memory model and analyzed the efficiency of constructing a list using + or += operators.\n", "\n", "\n", "\n", "### Takeaways:\n", "1. Make sure you understand binary numbers and base conversions (including the algorithms for conversion to and from a base b to decimal). It is a very useful tool in computer science.\n", "2. Elements of a list can be changed from inside a function, if this list is given as a parameter. Note that one should use dedicated list functions or the [] operator for mutating the list.\n", "3. Use [Python tutor](http://www.pythontutor.com) in order to understand what's going on in terms of memory. It can be very helpful.\n", "4. Try to analyze the number of operations your function does to see how will its runtime scale as a function of the input (we will elabore on this soon).\n", " \n", "### Python tutor guidelines:\n", "Before you click \"Visualize Execution\" button, you may want to use the following settings (can be adjusted via the drop boxes next to the textbox):\n", "
    \n", "
  1. Python 3.6
  2. \n", "
  3. \n", " Show exited frames (Python)
  4. \n", "
  5. Render all objects on the heap
  6. \n", "
  7. Draw pointers as arrows [default]
  8. \n", "
\n", "\n", "\n", "## The binary system and base conversions\n", "We use decimal numbers, that is, numbers in base 10. These numbers are represented using 10 digits (0-9).\n", "Binary numbers (in base 2) use two possible digits: 0-1.\n", "In general, a number in base $b$ will be represented using $b$ possible digits.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Code for printing several outputs in one cell (not part of the recitation):" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from IPython.core.interactiveshell import InteractiveShell\n", "InteractiveShell.ast_node_interactivity = \"all\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Base conversions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using Python functions:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0b1010'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "str" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "'0xa1'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "345" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "13" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "161" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "ename": "ValueError", "evalue": "invalid literal for int() with base 2: 'a1'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0mint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"1101\"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[0mint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"a1\"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m16\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 7\u001b[1;33m \u001b[0mint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"a1\"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mValueError\u001b[0m: invalid literal for int() with base 2: 'a1'" ] } ], "source": [ "bin(10)\n", "type(bin(10))\n", "hex(161)\n", "int(\"345\")\n", "int(\"1101\", 2)\n", "int(\"a1\", 16)\n", "int(\"a1\", 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Converting from binary to decimal\n", " \n", "Let $x_{base 2} = a_{n-1} ... a_1 a_0$ be a binary number in base 2 with $n$ digits. \n", "The following formula returns its decimal representation:\n", " $$x_{base 10} = \\sum_{0 \\le k \\le n-1} a_k \\cdot 2^k$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Converting from decimal to binary \n", "Converting from decimal to binary is done by integer division and modulo operations. \n", "Here is an implementation of the algorithm for conversion from decimal to a general base $b$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Our version for convert_base:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def convert_base(n,b):\n", " '''convert_base(int, int)->string\n", " Returns the textual representation \n", " of n(decimal) in base 2<=b<=10\n", " '''\n", " result = \"\"\n", " while n != 0:\n", " digit = n % b\n", " n //= b\n", " result = str(digit) + result\n", " \n", " return result\n", " \n", " \n", " \n", " " ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'1010'" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "convert_base(10,2)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "''" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "convert_base(0, 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "convert_base above mishandles n = 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Improved version of convert_base:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def convert_base(n,b):\n", " '''convert_base(int, int)->string\n", " Returns the textual representation \n", " of n(decimal) in base 2<=b<=10\n", " '''\n", " if n == 0:\n", " return \"0\"\n", " \n", " result = \"\"\n", " while n != 0:\n", " digit = n % b\n", " n //= b\n", " result = str(digit) + result\n", " \n", " return result\n", " \n", " \n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "'110'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "convert_base(0,2)\n", "convert_base(12,3)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'101'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "convert_base(161,16)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The improved version above fails for 10string\n", " Returns the textual representation \n", " of n(decimal) in base 2<=b<=10\n", " '''\n", " assert 2 <= b <= 36\n", " \n", " if n == 0:\n", " return \"0\"\n", " \n", " result = \"\"\n", " alphabet = \"0123456789abcdefghijklmnopqrstuvwxyz\"\n", " while n != 0:\n", " digit = n % b\n", " n //= b\n", " result = alphabet[digit] + result\n", " \n", " return result\n", " \n", " \n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'a1'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "'1010'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "ename": "AssertionError", "evalue": "", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mAssertionError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[0mconvert_base\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m161\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m16\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[0mconvert_base\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m10\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[0mconvert_base\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m10\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m80\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;32m\u001b[0m in \u001b[0;36mconvert_base\u001b[1;34m(n, b)\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[0mof\u001b[0m \u001b[0mn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdecimal\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mbase\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m<=\u001b[0m\u001b[0mb\u001b[0m\u001b[1;33m<=\u001b[0m\u001b[1;36m10\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m '''\n\u001b[1;32m----> 6\u001b[1;33m \u001b[1;32massert\u001b[0m \u001b[1;36m2\u001b[0m \u001b[1;33m<=\u001b[0m \u001b[0mb\u001b[0m \u001b[1;33m<=\u001b[0m \u001b[1;36m36\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 7\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 8\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[1;33m==\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mAssertionError\u001b[0m: " ] } ], "source": [ "convert_base(161, 16)\n", "convert_base(10, 2)\n", "convert_base(10, 80)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Memory model\n", "By understanding Python's memory model we \n", "
    \n", "
  1. can distinguish between objects that can be mutated and those that cannot. We can also understand how to mutate objects.
  2. \n", "
  3. can write a more efficient code.
  4. \n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "#### Motivation examples:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "30" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def change_num(num):\n", " num = 999\n", "\n", " \n", "x = 30\n", "change_num(x)\n", "x\n", "\n", " \n", " " ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 999]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def add_to_list(lst):\n", " lst.append(999)\n", " \n", "mylst = [1,2,3]\n", "add_to_list(mylst)\n", "mylst" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def change_lst(lst):\n", " lst = []\n", "\n", "mylst = [1,2,3]\n", "change_lst(mylst)\n", "mylst" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Programs examined with [Python tutor](http:\\\\www.pythontutor.com):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#1 basics #####################################################\n", "a = 1000 #1\n", "b = \"hello\" #2\n", "a += len(b) #3\n", "c = 2*b[:2] #4\n", "if b!=c: #5\n", " c = b #6\n", "del c #7" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#2 lists #####################################################\n", "a = 1000 #1\n", "d = [a,2] #2\n", "d[1] = -1 #3\n", "a = 1003 #4\n", "for x in d: #5a\n", " x = 7 #5b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#3 funcs #####################################################\n", "a = 1000 #1\n", "b = \"hello\" #2\n", "def is_palindrom(a): #3a\n", " b = a[::-1] #3b\n", " return a==b #3c\n", "is_pal = is_palindrom #4\n", "x = is_pal(b) #5" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#4 lists+funcs #####################################################\n", "a = 1000 #1\n", "d = [a,2] #2\n", "def f(a,d): #3a\n", " a = 2000 #3b\n", " d[0] = a #3c\n", " d = [] #3d\n", " return d #3e\n", "x = f(a,d) #4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Extend(), append(), sort(), and sorted():" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1787938575240" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[3, 2, 1]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[3, 2, 1, 0]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[3, 2, 1, 0, 4]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[3, 5, 2, 1, 0, 4]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[3, 5, 2, 1, 0, 4, 6, 7]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[3, 5, 2, 1, 0, 4, 6, 7, 8]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#5 – lists operators and methods #####################################################\n", "lst = [3,2,1]\n", "id(lst)\n", "lst\n", "lst = lst+[0] \n", "id(lst)\n", "lst\n", "lst.append(4)\n", "id(lst)\n", "lst\n", "lst.insert(1,5)\n", "id(lst)\n", "lst\n", "lst.extend([6,7])\n", "id(lst)\n", "lst\n", "lst += [8] \n", "id(lst)\n", "lst" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[3, 5, 2, 1, 0, 4, 6, 7, 8]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[3, 5, 2, 1, 0, 4, 6, 7, 8]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "1787938858056" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lst\n", "id(lst)\n", "#sorted returns a sorted copy of the list, and does not change the original one\n", "lst2 = sorted(lst)\n", "id(lst)\n", "lst\n", "#list.sort() sorts in place, and does not return anything\n", "lst.sort()\n", "id(lst)\n", "lst" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## comparison between + and += " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use + or += operators in order to construct the same list.\n", "\n", "Using operator +: less efficient, since a new list is constructed every iteration\n", "the number of integers written to memory equals 1+2+ ... + n = (n+1)*n/2\n", "(= asumptotically speaking, grows quadratically with n)\n", "\n", "Using operator +=: more efficient, since the same list is extended again and again\n", "the number of integers written to memory equals n\n", "(= asumptotically speaking, grows linearly with n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### memory and running times" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "import time" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Extending a list n= 20000 times, using the + operator took 0.5090059999999994 seconds\n", "Extending a list n= 20000 times, using the += operator (extend()) took 0.0018683000000692118 seconds\n" ] } ], "source": [ "n= 20000\n", "\n", "lst = []\n", "t0 = time.perf_counter()\n", "for i in range(n):\n", "\tlst = lst + [i]\n", "t1 = time.perf_counter()\n", "print(\"Extending a list n=\", n, \"times, using the + operator took\",t1-t0, \"seconds\")\n", "\n", "\n", "\n", "lst = []\n", "t0 = time.perf_counter()\n", "for i in range(n):\n", "\tlst += [i]\n", "t1 = time.perf_counter()\n", "print(\"Extending a list n=\", n, \"times, using the += operator (extend()) took\",t1-t0, \"seconds\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We double the size of the input." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Extending a list n= 40000 times, using the + operator took 2.10560119999991 seconds\n", "Extending a list n= 40000 times, using the += operator (extend()) took 0.004132899999945039 seconds\n" ] } ], "source": [ "\n", "n= 40000\n", "\n", "lst = []\n", "t0 = time.perf_counter()\n", "for i in range(n):\n", "\tlst = lst + [i]\n", "t1 = time.perf_counter()\n", "print(\"Extending a list n=\", n, \"times, using the + operator took\",t1-t0, \"seconds\")\n", "\n", "\n", "\n", "lst = []\n", "t0 = time.perf_counter()\n", "for i in range(n):\n", "\tlst += [i]\n", "t1 = time.perf_counter()\n", "print(\"Extending a list n=\", n, \"times, using the += operator (extend()) took\",t1-t0, \"seconds\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "______________________________________________________________\n", "\n", "In accordance with the theory, the running time of the inefficient code increases by approximately 2**2 = 4 as n increases by 2, while the running time of the efficient code increases by approximately 2.\n", "Apparently n is large enough for the constants not to play a serious role, so that the measurements agree with the asymptotic analysis." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 1 }