{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

cs1001.py , Tel Aviv University, Fall 2019-2020

\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Recitation 9" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We discussed two data structures - Linked Lists and Binary Search Trees.\n", "\n", "#### Takeaways:\n", "- Important properties of Linked lists: \n", " - Not stored continuously in memory.\n", " - Allows for deletion and insertion after a __given__ element in $O(1)$ time.\n", " - Accessing the $i$'th element takes $O(i)$ time.\n", " - Search takes $O(n)$ time (no random access $\\implies$ no $O(\\log{n})$ search).\n", "- Important properties of Binary Search Trees:\n", " - Insert and find take $O(h)$ time where $h$ is the height of the tree.\n", " - When a tree containing $n$ nodes is balanced, $h = O(\\log{n})$.\n", " - Many methods in this class are implemented using recursion." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Code for printing several outputs in one cell (not part of the recitation):" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "from IPython.core.interactiveshell import InteractiveShell\n", "InteractiveShell.ast_node_interactivity = \"all\"" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "## This file contains functions for the representation of binary trees.\n", "## used in class Binary_search_tree's __repr__\n", "## Written by a former student in the course - thanks to Amitai Cohen\n", "## No need to fully understand this code\n", "\n", "def printree(t, bykey = True):\n", " \"\"\"Print a textual representation of t\n", " bykey=True: show keys instead of values\"\"\"\n", " #for row in trepr(t, bykey):\n", " # print(row)\n", " return trepr(t, bykey)\n", "\n", "def trepr(t, bykey = False):\n", " \"\"\"Return a list of textual representations of the levels in t\n", " bykey=True: show keys instead of values\"\"\"\n", " if t==None:\n", " return [\"#\"]\n", "\n", " thistr = str(t.key) if bykey else str(t.val)\n", "\n", " return conc(trepr(t.left,bykey), thistr, trepr(t.right,bykey))\n", "\n", "def conc(left,root,right):\n", " \"\"\"Return a concatenation of textual represantations of\n", " a root node, its left node, and its right node\n", " root is a string, and left and right are lists of strings\"\"\"\n", " \n", " lwid = len(left[-1])\n", " rwid = len(right[-1])\n", " rootwid = len(root)\n", " \n", " result = [(lwid+1)*\" \" + root + (rwid+1)*\" \"]\n", " \n", " ls = leftspace(left[0])\n", " rs = rightspace(right[0])\n", " result.append(ls*\" \" + (lwid-ls)*\"_\" + \"/\" + rootwid*\" \" + \"|\" + rs*\"_\" + (rwid-rs)*\" \")\n", " \n", " for i in range(max(len(left),len(right))):\n", " row = \"\"\n", " if ihttp://getpython3.com/diveintopython3/special-method-names.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linked Lists" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A linked list is a linear data structure (i.e. - nodes can be accessed one after the other). The list is composed of nodes, where each node contains a value and a \"pointer\" to the next node in the list.\n", "\n", "Linked lists support operations such as insertion, deletion, search and many more." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "class Node():\n", " def __init__(self, val):\n", " self.value = val\n", " self.next = None\n", " \n", " def __repr__(self):\n", " #return \"[\" + str(self.value) + \",\" + str(id(self.next))+ \"]\"\n", " \n", " #for today's recitation, we print the id of self instead of self.next\n", " return \"[\" + str(self.value) + \",\" + str(id(self))+ \"]\"\n", "\n", "\n", "\n", "class Linked_list():\n", "\n", " def __init__(self, seq=None):\n", " self.next = None\n", " self.len = 0\n", " if seq != None:\n", " for x in seq[::-1]:\n", " self.add_at_start(x)\n", " \n", "\n", " def __repr__(self):\n", " out = \"\"\n", " p = self.next\n", " while p != None:\n", " out += str(p) + \" \" #str envokes __repr__ of class Node\n", " p = p.next\n", " return out\n", "\n", " \n", " def __len__(self):\n", " ''' called when using Python's len() '''\n", " return self.len\n", " \n", " \n", " def add_at_start(self, val):\n", " ''' add node with value val at the list head '''\n", " p = self\n", " tmp = p.next\n", " p.next = Node(val)\n", " p.next.next = tmp\n", " self.len += 1\n", " \n", " def add_at_end(self, val):\n", " ''' add node with value val at the list tail '''\n", " p = self\n", " while (p.next != None):\n", " p = p.next\n", " p.next = Node(val)\n", " self.len += 1\n", " \n", " \n", " def insert(self, loc, val):\n", " ''' add node with value val after location 0<=loc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Adding elements one by one. Try using Python tutor. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[b,91843568] [a,91843664] " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "l = Linked_list()\n", "l.add_at_start(\"a\")\n", "l.add_at_start(\"b\")\n", "l" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### A short summary of methods complexity of Linked lists vs. lists:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Method List Linked list
init $O(1)$ $O(1)$
init with $k$ items $O(k)$ $O(k)$
for the rest of the methods we assume that the structure contains $n$ items
add at start $O(n)$ $O(1)$
add at end $O(1)$ $O(n)$
$lst[k]$ (get the $k$th item) $O(1)$ $O(k)$
delete $lst[k]$ $O(n-k)$ $O(k)$
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reversing a linked list in $O(1)$ additional memory. \n", "See the code and demo here" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[a,91941296] [b,91941232] [c,91941200] " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[c,91941200] [b,91941232] [a,91941296] " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class Node():\n", " def __init__(self, val):\n", " self.value = val\n", " self.next = None\n", " \n", " def __repr__(self):\n", " #return \"[\" + str(self.value) + \",\" + str(id(self.next))+ \"]\"\n", " \n", " #for today's recitation, we print the id of self instead of self.next\n", " return \"[\" + str(self.value) + \",\" + str(id(self))+ \"]\"\n", "\n", "\n", "class Linked_list():\n", "\n", " def __init__(self, seq=None):\n", " self.next = None\n", " self.len = 0\n", " if seq != None:\n", " for x in seq[::-1]:\n", " self.add_at_start(x)\n", " \n", "\n", " def __repr__(self):\n", " out = \"\"\n", " p = self.next\n", " while p != None:\n", " out += str(p) + \" \" #str envokes __repr__ of class Node\n", " p = p.next\n", " return out\n", "\n", " \n", " def __len__(self):\n", " ''' called when using Python's len() '''\n", " return self.len\n", " \n", " \n", " def add_at_start(self, val):\n", " ''' add node with value val at the list head '''\n", " p = self\n", " tmp = p.next\n", " p.next = Node(val)\n", " p.next.next = tmp\n", " self.len += 1\n", " \n", " def add_at_end(self, val):\n", " ''' add node with value val at the list tail '''\n", " p = self\n", " while (p.next != None):\n", " p = p.next\n", " p.next = Node(val)\n", " self.len += 1\n", " \n", " \n", " def insert(self, loc, val):\n", " ''' add node with value val after location 0<=loc node.key:\n", " if node.right == None:\n", " node.right = Tree_node(key, val)\n", " else:\n", " insert_rec(node.right, key, val)\n", " return\n", " \n", " if self.root == None: #empty tree\n", " self.root = Tree_node(key, val)\n", " else:\n", " insert_rec(self.root, key, val)\n", "\n", "\n", " def minimum(self):\n", " ''' return node with minimal key '''\n", " if self.root == None:\n", " return None\n", " node = self.root\n", " left = node.left\n", " while left != None:\n", " node = left\n", " left = node.left\n", " return node\n", "\n", "\n", " def depth(self):\n", " ''' return depth of tree, uses recursion'''\n", " def depth_rec(node):\n", " if node == None:\n", " return -1\n", " else:\n", " return 1 + max(depth_rec(node.left), depth_rec(node.right))\n", "\n", " return depth_rec(self.root)\n", "\n", "\n", " def size(self):\n", " ''' return number of nodes in tree, uses recursion '''\n", " def size_rec(node):\n", " if node == None:\n", " return 0\n", " else:\n", " return 1 + size_rec(node.left) + size_rec(node.right)\n", "\n", " return size_rec(self.root)\n", " \n", " def inorder(self):\n", " '''prints the keys of the tree in a sorted order'''\n", " def inorder_rec(node):\n", " if node == None:\n", " return\n", " inorder_rec(node.left)\n", " print(node.key)\n", " inorder_rec(node.right)\n", " \n", " inorder_rec(self.root)\n", "\n", "\n", "t = Binary_search_tree()\n", "t.insert(4,'a')\n", "t.insert(2,'a')\n", "t.insert(6,'a')\n", "t.insert(1,'a')\n", "t.insert(3,'a')\n", "t.insert(5,'a')\n", "t.insert(7,'a')\n", "t\n", "t.inorder()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Claim: An inorder traversal of a binary search tree prints the keys in ascending order.\n", "\n", "Proof, by complete induction on the size of tree:\n", "* Base - for $n = 1$ the claim is trivial\n", "* Assume the claim holds for all $i \\leq n$\n", "* For a tree of size $n+1$ with root $v$ consider the following:\n", " * Both the right and the left subtree of $v$ have size at most $n$\n", " * By induction, all keys in $v.left$ are printed in ascending order (and they are all smaller than $v.key$)\n", " * Next, $v.key$ is printed\n", " * Finally, by induction, all keys in $v.right$ are printed in ascending order (and they are all larger than $v.key$)\n", " * Thus, the keys of the tree (which has size $n+1$) are printed in ascending order\n", "\n", "\n", "More information on complete induction: https://en.wikipedia.org/wiki/Mathematical_induction#Complete_(strong)_induction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Question: Assume $N = 2^n - 1$ for some $n$, find a method of inserting the numbers $[1,...,N]$ to a BST such that it is completely balanced (i.e. - it is a complete tree).\n", "\n", "Solution: As we usually do with trees, we give a recursive solution. Our input will be a node and a first and last index to enter into the tree rooted in the node. We start with the root, $first = 1, last = 2^n - 1$\n", "\n", "- Base case: If $first = last$ then we simply need to create a root with no sons labeled $first$ \n", "- Otherwise, we find the mid point $mid = (first + last - 1 ) // 2$. We recursively insert $first,...,mid$ into the left son of the node, $mid + 1$ into the current node and $mid + 2, ..., last$ into the right son of the node.\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "[' 8 ',\n", " ' ______________/ |________________ ',\n", " ' 4 12 ',\n", " ' ______/ |______ _______/ |_______ ',\n", " ' 2 6 10 14 ',\n", " ' __/ |__ __/ |__ __/ |__ __/ |__ ',\n", " ' 1 3 5 7 9 11 13 15 ',\n", " ' / | / | / | / | / | / | / | / | ',\n", " '# # # # # # # # # # # # # # # #']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class Tree_node():\n", " def __init__(self, key, val):\n", " self.key = key\n", " self.val = val\n", " self.left = None\n", " self.right = None\n", "\n", " def __repr__(self):\n", " return \"(\" + str(self.key) + \":\" + str(self.val) + \")\"\n", " \n", " \n", " \n", "class Binary_search_tree():\n", "\n", " def __init__(self):\n", " self.root = None\n", "\n", "\n", " def __repr__(self): #no need to understand the implementation of this one\n", " out = \"\"\n", " for row in printree(self.root): #need printree.py file\n", " out = out + row + \"\\n\"\n", " return out\n", "\n", "\n", " def lookup(self, key):\n", " ''' return node with key, uses recursion '''\n", "\n", " def lookup_rec(node, key):\n", " if node == None:\n", " return None\n", " elif key == node.key:\n", " return node\n", " elif key < node.key:\n", " return lookup_rec(node.left, key)\n", " else:\n", " return lookup_rec(node.right, key)\n", "\n", " return lookup_rec(self.root, key)\n", "\n", "\n", "\n", " def insert(self, key, val):\n", " ''' insert node with key,val into tree, uses recursion '''\n", "\n", " def insert_rec(node, key, val):\n", " if key == node.key:\n", " node.val = val # update the val for this key\n", " elif key < node.key:\n", " if node.left == None:\n", " node.left = Tree_node(key, val)\n", " else:\n", " insert_rec(node.left, key, val)\n", " else: #key > node.key:\n", " if node.right == None:\n", " node.right = Tree_node(key, val)\n", " else:\n", " insert_rec(node.right, key, val)\n", " return\n", " \n", " if self.root == None: #empty tree\n", " self.root = Tree_node(key, val)\n", " else:\n", " insert_rec(self.root, key, val)\n", "\n", "\n", " def minimum(self):\n", " ''' return node with minimal key '''\n", " if self.root == None:\n", " return None\n", " node = self.root\n", " left = node.left\n", " while left != None:\n", " node = left\n", " left = node.left\n", " return node\n", "\n", "\n", " def depth(self):\n", " ''' return depth of tree, uses recursion'''\n", " def depth_rec(node):\n", " if node == None:\n", " return -1\n", " else:\n", " return 1 + max(depth_rec(node.left), depth_rec(node.right))\n", "\n", " return depth_rec(self.root)\n", "\n", "\n", " def size(self):\n", " ''' return number of nodes in tree, uses recursion '''\n", " def size_rec(node):\n", " if node == None:\n", " return 0\n", " else:\n", " return 1 + size_rec(node.left) + size_rec(node.right)\n", "\n", " return size_rec(self.root)\n", " \n", " def inorder(self):\n", " '''prints the keys of the tree in a sorted order'''\n", " def inorder_rec(node):\n", " if node == None:\n", " return\n", " inorder_rec(node.left)\n", " print(node.key)\n", " inorder_rec(node.right)\n", " \n", " inorder_rec(self.root)\n", " \n", " def insert_balanced(self, n):\n", " def insert_balanced_rec(first, last):\n", " if first == last:\n", " return Tree_node(first, first)\n", " mid = (first + last) // 2\n", " root = Tree_node(mid, mid)\n", " root.left = insert_balanced_rec(first, mid - 1)\n", " root.right = insert_balanced_rec(mid + 1, last)\n", " return root\n", " self.root = insert_balanced_rec(1, 2**n - 1) \n", " \n", "\n", "\n", "t = Binary_search_tree()\n", "t.insert(4,'a')\n", "t.insert(2,'a')\n", "t.insert(6,'a')\n", "t.insert(1,'a')\n", "t.insert(3,'a')\n", "t.insert(5,'a')\n", "t.insert(7,'a')\n", "\n", "#t.inorder()\n", "\n", "t = Binary_search_tree()\n", "t.insert_balanced(4)\n", "printree(t.root)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Exam question (2018, Spring Semester)\n", "\n", "Let $L,K$ be two linked lists of length $n,m$ respectively. We say that the **lists merge** if there exists a node that is on both lists.\n", "\n", "I.e. - there exists a node $q$ in $L$ and $p$ in $K$ such that $q ~is~ p == True$. In such a case we call $p$ a **joint node**.\n", "\n", "Write a function $are\\_merged$ that gets two linked lists as an input and returns True iff the lists merge.\n", "\n", "The main idea - two linked lists merge if and only if their last node (their tail) is the same node.\n" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[a,2582652271360] [b,2582652268728] [c,2582652271808] [d,2582652268840] " ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[e,2582652270800] [c,2582652271808] [d,2582652268840] " ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[d,2582652629344] " ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "True\n", "False\n" ] } ], "source": [ "def are_merged(L, K):\n", " node_l = L.next\n", " node_k = K.next\n", " while node_l.next != None:\n", " node_l = node_l.next\n", " while node_k.next != None:\n", " node_k = node_k.next\n", " return node_k is node_l\n", " \n", " #Here is an equivalent solution with a shorter code:\n", " #last_node_l = L[len(L)-1]\n", " #last_node_k = K[len(K)-1]\n", " #return last_node_l is last_node_k\n", "\n", "L = Linked_list(\"abcd\")\n", "L\n", "\n", "K = Linked_list()\n", "K.next = L.next.next.next\n", "# Manually define the length of K\n", "K.len = 2\n", "K.add_at_start(\"e\")\n", "K\n", "\n", "M = Linked_list()\n", "M.add_at_start(\"d\")\n", "M\n", "\n", "print(are_merged(L,K))\n", "print(are_merged(L,M))\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Time and space analysis\n", "\n", "Space: Saving a pointer to two nodes can be done in $O(1)$ memory\n", "Time: Each node in the lists is read exactly once and all operations run in time $O(1)$ so runtime is $O(n+m)$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, write a function $find\\_shared$ that gets two merged linked lists as an input and returns the first joint node of the lists.\n", "\n", "Analyze its running time and space complexity." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[a,91940784] [b,91940624] [c,91940592] [d,91940272] " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[e,91940432] [c,91940592] [d,91940272] " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[d,1901584] " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "True\n", "False\n", "advp\n", "advp\n", "[c,91940592] True\n", "[c,91940592] True\n", "[c,91940592] True\n" ] } ], "source": [ "def find_shared_1(L, K):\n", " id_set = set()\n", " p = L.next\n", " while p != None:\n", " id_set.add(id(p))\n", " p = p.next\n", " q = K.next\n", " while q != None:\n", " if id(q) in id_set:\n", " return q\n", " q = q.next\n", " return None # should not reach here since L, K have joint items\n", "\n", "\n", "def find_shared_2(L, K):\n", " n = len(L)\n", " m = len(K)\n", " p = L.next\n", " q = K.next\n", " for i in range(max(0, n-m)):\n", " print('advp')\n", " p = p.next\n", " for i in range(max(0, m-n)):\n", " print('advq')\n", " q = q.next\n", " while p != None:\n", " if p is q:\n", " return p\n", " p = p.next\n", " q = q.next\n", " return None # Should never get here\n", "\n", "def find_shared_3(L, K):\n", " p = L.next\n", " q = K.next\n", " while p != None and q != None:\n", " if p is q:\n", " return p\n", " p = p.next\n", " q = q.next\n", " if p == None:\n", " p = K.next\n", " if q == None:\n", " q = L.next\n", " return None # must end before, since L, K have joint items \n", "\n", "L = Linked_list(\"abcd\")\n", "L\n", "\n", "K = Linked_list()\n", "K.next = L.next.next.next\n", "# Save the first joint node\n", "x = K.next\n", "K.len = 2\n", "K.add_at_start(\"e\")\n", "K\n", "\n", "M = Linked_list()\n", "M.add_at_start(\"d\")\n", "M\n", "\n", "print(are_merged(L,K))\n", "print(are_merged(L,M))\n", "\n", "\n", "\n", "print(find_shared_1(L,K), find_shared_1(L,K) == x)\n", "print(find_shared_2(L,K), find_shared_2(L,K) == x)\n", "print(find_shared_3(L,K), find_shared_3(L,K) == x)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the analysis assume wlog that $n \\geq m$:\n", "* For find_shared_1:\n", " * Space: Saving all of the node ids from $L$ in a set: $O(n)$, saving one or two node pointers at a time is $O(1)$\n", " * Time: All lookups in the set run in $O(1)$ on average, apart from that each node in $K,L$ is read exactly once - $O(n)$ on average\n", "* For find_shared_2:\n", " * Space: Saving $n,m$ and an index for the for loop: $O(\\log n)$, saving one or two node pointers at a time is $O(1)$\n", " * Time: First two for loops and the while loop traverse each node in $K,L$ exactly once - $O(n)$\n", "* For find_shared_3:\n", " * Space: Saving one or two node pointers throughout the while loop is $O(1)$\n", " * Time: Pointer p will first traverse the $n$ nodes of L and then the $m$ nodes of K. Pointer $q$ will first traverse the $m$ nodes of K and then the $n$ nodes of L. Suppose that the joint part is of length $w$, then both pointers will reach it after $n+m-w$ steps. Since all operations in the while loop run in $O(1)$ time, the overall running time is $O(n)$\n", " " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 1 }