{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sebastian Raschka \n", "last updated: 2016-08-02 \n", "\n", "CPython 3.5.1\n", "IPython 5.0.0\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -a 'Sebastian Raschka' -u -d -v" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Singly Linked Lists" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***Singly Linked Lists*** are dynamic list or array structure that allows us to store an arbitrary number of elements. The overall concept of singly linked lists is very simple: It consists of an arbitrary number of nodes, where each nodes stores the content (or data) of that node and a pointer or reference to the next node. One major advantage of a singly linked lists is that we can grow it arbitrarily in size, which is very useful if we don't know the number of elements we want to store upfront. Furthermore, the insertion of new elements at the beginning of the list is very cheap, since it only requires us to change one reference or pointer as we will see later." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](../../images/data-structures/singly-linked-list/singly-linked-list.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Algorithm Complexity Overview:**\n", " \n", "- inserting an element at the head: O(1)\n", "- counting the number of elements: O(n)\n", "- searching for an element: O(n) [worst case]\n", "- deleting an element: O(n) [worst case]\n", "\n", "(We will discuss this in more detail in the following sections.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Code Overview" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class SLLNode(object):\n", " def __init__(self, data, next_node=None):\n", " self.data = data\n", " self.next_node = next_node" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class SinglyLinkedList(object):\n", " def __init__(self, head=None):\n", " self.head = head\n", " \n", " def __repr__(self):\n", " s = ''\n", " if self.head is not None:\n", " current_node = self.head\n", " else:\n", " current_node = None\n", " \n", " while current_node is not None:\n", " s += ' --> %s' % current_node.data\n", " current_node = current_node.next_node\n", " return 'Size: %s\\nData: ' % self.size() + s[5:]\n", " \n", " def insert(self, data):\n", " new_node = SLLNode(data=data, next_node=None)\n", " new_node.next_node = self.head\n", " self.head = new_node\n", " \n", " def size(self):\n", " current_node = self.head\n", " cnt = 0\n", " while current_node:\n", " cnt += 1\n", " current_node = current_node.next_node\n", " return cnt\n", " \n", " def search(self, data):\n", " current_node = self.head\n", " target_node = None\n", " while current_node is not None and target_node is None:\n", " if current_node.data == data:\n", " target_node = current_node\n", " else:\n", " current_node = current_node.next_node\n", " return target_node \n", " \n", " def delete(self, data):\n", " current_node = self.head\n", " previous_node = None\n", " found = False\n", "\n", " while current_node is not None and not found:\n", " if current_node.data == data:\n", " found = True\n", " if previous_node is None:\n", " self.head = current_node.next_node\n", " else:\n", " previous_node.next_node = current_node.next_node\n", " \n", " else:\n", " previous_node = current_node\n", " current_node = current_node.next_node" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Node Class" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you glanced over the code overview above, you noticed that we are going to implement the Singly Linked List using two classes, a Singly Linked List Node (`SLLNode`) and the Singly Linked List itself (`SinglyLinkedList`). First, let's have a look at the `SLLNode` object.\n", "\n", "```python\n", "class SLLNode(object):\n", " def __init__(self, data, next_node=None):\n", " self.data = data\n", " self.next_node = next_node\n", "```\n", "\n", "This simple class takes a `data` argument as input as well as an argument `next_node`, which defaults to `None`. When we instantiate a new node, we typically want to have some data associated with it -- why would we want to create a new node, otherwise? The reason why we don't want to require a `next_node` attribute is that the last node in the Singly Linked List does not have a `next_node`.\n", "\n", "To summarize, a `SLLNode` object is a simple object that consists of two attributes only, the data, which it is meant to store, and a \"pointer\" to the next node." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The SinglyLinkedList Class" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's go through the `SinglyLinkedList` implementation step by step. The rough skeleton of this class would look like this:\n", "\n", "```python\n", "\n", "class SinglyLinkedList(object):\n", " def __init__(self, head=None):\n", " self.head = head\n", " \n", " def __repr__(self):\n", " ...\n", " \n", " def insert(self, data):\n", " ...\n", " \n", " def size(self):\n", " ...\n", " \n", " def search(self, data):\n", " ... \n", " \n", " def delete(self, data):\n", " ...\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `__init__` and `__repr__`\n", "\n", "- `__init__` : The `SinglyLinkedList` objects only have one single attribute, `head`, which points to the first node in our data structure. \n", "\n", "- `__repr__`: This method is not really required; we implement this \"operator overloading\" for our own convenience, so that we can use the `print` function to display the SinglyLinked list. Unfortunately, this `__repr__` method may look a bit complex at first glance, but it will become much more intuitive later when we talk about the other `SinglyLinkedList` methods. If you are interested, let's take a quick look at it now, anyway; however, please feel free to skip this section since it is not really relevant for implementing a singly linked list -- it's just an additinional method for convenience:\n", "\n", "```python\n", "def __repr__(self):\n", " s = ''\n", " if self.head is not None:\n", " current_node = self.head\n", " else:\n", " current_node = None\n", "\n", " while current_node is not None:\n", " s += ' --> %s' % current_node.data\n", " current_node = current_node.next_node\n", " return 'Size: %s\\nData: ' % self.size() + s[5:]\n", "```\n", "\n", "First, we initialize an empty string variable `s`. If our `SinglyLinkedList` variable is empty, which we check via the `head` attribute, we set the `current_node` to None so that the `while` loop that follows will not be exectuted; otherwise, we sed the `current_node` variable to the linked list's `head`. Now, via the `while` loop, we now traverse through the linked list as long as the `current_node` points to a `next_node`. \n", "If the `current_node` is not `None`, we add its data to the string variable `s`, which we initialized at the beginning of this method; then, we set the `current_node` variable to the next node via its `next_node` pointer. The while loop stops if `current_node` is `None` in the next iteration, which happens if the `current_node` in the previous iterations does not point to a `next_node`. \n", "Finally, we return the string variable after we made some additional modifications to it. First of all, we compute the `size` of the linked list (i.e., the number of nodes in the list) using the `size` method, which we will discuss a little bit later. Then, we add the string that we created inside the while loop to it. Note that we \"slice\" the list (`s[5:]`) to get rid of the prepending \" `-->` \" at the beginning of the string." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let us initialize an empty linke list and exectute the `print` method:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 0\n", "Data: \n" ] } ], "source": [ "sll = SinglyLinkedList()\n", "print(sll)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see, the the `print` method displays a string consisting of 2 lines: The first line shows the size of the list, which is obviously 0 since we didn't add any nodes to it, yet. The second line prints the contents of the nodes. Since we don't have a node in the linked list, there's no data to be printed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ``insert``" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](../../images/data-structures/singly-linked-list/singly-linked-list-insert.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's go ahead and insert some data into our newly created Singly Linked List object using the insert method:\n", "\n", "```python\n", "def insert(self, data):\n", " new_node = SLLNode(data=data, next_node=None)\n", " new_node.next_node = self.head\n", " self.head = new_node\n", "```\n", "\n", "This implementation of `insert` creates a new node that points to the current head node of the list (if it exists). Then, the new node becomes the head node itself. In other words, we insert a new node at the beginning of the linked list as a head node, which points to the previous head node; this moves all exisiting nodes in the linked list back by one position. Since we only create a new node, i.e., change one single pointer, the `insert` method is very efficient with constant complexity O(1)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's give it a try by insert a node that holds a Python `list` as its data element:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 1\n", "Data: [1, 2, 3]\n" ] } ], "source": [ "sll.insert(data=[1, 2, 3])\n", "print(sll)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we discuss the `search` method in the next section, let's add two more nodes to our linked list: " ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 3\n", "Data: [7, 8, 9] --> [4, 5, 6] --> [1, 2, 3]\n" ] } ], "source": [ "sll.insert(data=[4, 5, 6])\n", "sll.insert(data=[7, 8, 9])\n", "\n", "print(sll)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `search`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](../../images/data-structures/singly-linked-list/singly-linked-list-search.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The complexity of searching a singly linked list is O(n) (where n is the number of nodes or the size of the linked list), since we have to traverse through the whole list in the worst case scenario, i.e., if the item we are looking for is the last item or if the search key isn't contained in the linked list at all.\n", "\n", "Searching a linked list is pretty straight forward:\n", "\n", "```python\n", "def search(self, data):\n", " current_node = self.head\n", " target_node = None\n", " while current_node is not None and target_node is None:\n", " if current_node.data == data:\n", " target_node = current_node\n", " else:\n", " current_node = current_node.next_node\n", " return target_node \n", "```\n", "\n", "We start with the head node, and in a while loop, we look for the `data` (search key) until we find it in the current node `current_node.data == data:` or until we are running out of nodes in the list `while current_node is not None`. Finally, we return the target note that contains our `data` search key, or return `None` if no node matched our query. Let's see it in action:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<__main__.SLLNode object at 0x1047c5208> [4, 5, 6]\n", "<__main__.SLLNode object at 0x1047c51d0> [7, 8, 9]\n", "<__main__.SLLNode object at 0x1047baf98> [1, 2, 3]\n" ] } ], "source": [ "target = sll.search(data=[4, 5, 6])\n", "print(target, target.data)\n", "\n", "target = sll.search(data=[7, 8, 9])\n", "print(target, target.data)\n", "\n", "target = sll.search(data=[1, 2, 3])\n", "print(target, target.data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we don't find a node in our linked list that contains the search key, we return `None`:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "target = sll.search(data=5)\n", "print(target)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `delete`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](../../images/data-structures/singly-linked-list/singly-linked-list-delete.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Deleting an element from a singly linked list is also a O(n) operation; it's actually very similar to `search`, which we discussed above.\n", "\n", "```python\n", "def delete(self, data):\n", " current_node = self.head\n", " previous_node = None\n", " found = False\n", "\n", " while current_node is not None and not found:\n", " if current_node.data == data:\n", " found = True\n", " if previous_node is None:\n", " self.head = current_node.next_node\n", " else:\n", " previous_node.next_node = current_node.next_node\n", "\n", " else:\n", " previous_node = current_node\n", " current_node = current_node.next_node\n", "```\n", "\n", "The difference between `search` and `delete` is that we also keep track of the previous node now. Again, we start with the head node as the current node, and we excecute a `while` loop that runs until we reach the last node in the linked list or deleted the respective node -- note that we are modifying the linked list in place in this implementation. \n", "The `deletion`, on a conceptual level, works like this: If we find a node that contains our search key, let's call it out `current_node`, we move the pointer of the `previous_node` to the node that comes after the `current_node` (i.e., the `current_node.next_node`. By doing so, we are essentially removing the reference to the deleted node from the linked list so that Python's garbage collector can get rid of it since it is no longer needed. If our search query matches the head node, that is, if there is no \"previous node,\" we simply re-assign the head node to the second element in the linked list.\n", "\n", "Let's give it a try by removing an element from the middle of the list:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initial linked list:\n", "Size: 3\n", "Data: [7, 8, 9] --> [4, 5, 6] --> [1, 2, 3]\n", "\n", "Linked list after deletion:\n", "Size: 2\n", "Data: [7, 8, 9] --> [1, 2, 3]\n" ] } ], "source": [ "print('Initial linked list:')\n", "print(sll)\n", "\n", "target = sll.delete(data=[4, 5, 6])\n", "print('\\nLinked list after deletion:', )\n", "print(sll)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's remove the head node:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 1\n", "Data: [1, 2, 3]\n" ] } ], "source": [ "target = sll.delete(data=[7, 8, 9])\n", "print(sll)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And finally, we remove the last remaining element from our linked list:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 0\n", "Data: \n" ] } ], "source": [ "target = sll.delete(data=[1, 2, 3])\n", "print(sll)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, that's basically \"Single Linked Lists\" in a nutshell!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Ordered Lists" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The singly-linked-list implementation above can is a so-called unordered data structure or list. The reason is that it collects items in an unsorted manner. On the other hand, ordered lists store items sorted by their values. For example, in our unordered singly-linked example, we may have a list of items in the following order:\n", "\n", "- [5, 1, 3]\n", "\n", "and and we can add items at an aribtrary position. For example, we can add a new number 4 as follows\n", "\n", "- [4, 5, 1, 3]\n", "- [5, 4, 1, 3]\n", "- [5, 1, 4, 3]\n", "- [5, 1, 3, 4]\n", "\n", "In an ordered list, the sorting order is always maintained. Assuming that we have an ordered list that keeps the items in ascending order, [1, 3, 5], the \"add\" method will add the number 4 only as follows:\n", "\n", "- [1, 3, 4, 5]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [], "source": [ "class SLLNode(object):\n", " def __init__(self, data, next_node=None):\n", " self.data = data\n", " self.next_node = next_node\n", "\n", " \n", "class OrderedList(object):\n", " def __init__(self, head=None, datatype=int):\n", " self.head = head\n", " self.datatype = datatype\n", " \n", " def __repr__(self):\n", " s = ''\n", " if self.head is not None:\n", " current_node = self.head\n", " else:\n", " current_node = None\n", " \n", " while current_node is not None:\n", " s += ' --> %s' % current_node.data\n", " current_node = current_node.next_node\n", " return 'Size: %s\\nData: ' % self.size() + s[5:]\n", " \n", " def insert(self, data):\n", " if not isinstance(data, self.datatype):\n", " raise AttributeError('New element must have type %s' % self.datatype)\n", " new_node = SLLNode(data=data, next_node=None)\n", " \n", " inserted = False\n", " \n", " if self.head is None:\n", " print('new node')\n", " self.head = new_node\n", " inserted = True\n", "\n", " current = self.head\n", " previous = None\n", " while current is not None and not inserted:\n", " if current.data >= new_node.data:\n", " new_node.next_node = current\n", " if previous is not None:\n", " previous.next_node = new_node\n", " else:\n", " self.head = new_node\n", " inserted = True\n", " previous = current\n", " current = current.next_node\n", " \n", " if not inserted:\n", " previous.next_node = new_node\n", " new_node.next_node = current\n", " \n", " def size(self):\n", " current_node = self.head\n", " cnt = 0\n", " while current_node:\n", " cnt += 1\n", " current_node = current_node.next_node\n", " return cnt\n", " \n", " def search(self, data):\n", " current_node = self.head\n", " target_node = None\n", " while current_node is not None and target_node is None:\n", " if current_node.data == data:\n", " target_node = current_node\n", " else:\n", " current_node = current_node.next_node\n", " return target_node \n", " \n", " def delete(self, data):\n", " current_node = self.head\n", " previous_node = None\n", " found = False\n", "\n", " while current_node is not None and not found:\n", " if current_node.data == data:\n", " found = True\n", " if previous_node is None:\n", " self.head = current_node.next_node\n", " else:\n", " previous_node.next_node = current_node.next_node\n", " \n", " else:\n", " previous_node = current_node\n", " current_node = current_node.next_node" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 0\n", "Data: \n", "\n", "new node\n", "Size: 1\n", "Data: 1\n", "None\n" ] } ], "source": [ "sll = OrderedList(datatype=int)\n", "print(sll)\n", "\n", "print()\n", "\n", "sll.insert(data=1)\n", "print(sll)\n", "print(sll.head.next_node)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 2\n", "Data: 1 --> 2\n" ] } ], "source": [ "sll.insert(data=2)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 3\n", "Data: 1 --> 2 --> 2\n" ] } ], "source": [ "sll.insert(data=2)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 4\n", "Data: 1 --> 2 --> 2 --> 3\n" ] } ], "source": [ "sll.insert(data=3)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 5\n", "Data: 1 --> 2 --> 2 --> 3 --> 10\n" ] } ], "source": [ "sll.insert(data=10)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 6\n", "Data: 1 --> 2 --> 2 --> 3 --> 10 --> 10\n" ] } ], "source": [ "sll.insert(data=10)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 7\n", "Data: 1 --> 2 --> 2 --> 2 --> 3 --> 10 --> 10\n" ] } ], "source": [ "sll.insert(data=2)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 8\n", "Data: 0 --> 1 --> 2 --> 2 --> 2 --> 3 --> 10 --> 10\n" ] } ], "source": [ "sll.insert(data=0)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size: 7\n", "Data: 0 --> 2 --> 2 --> 2 --> 3 --> 10 --> 10\n" ] } ], "source": [ "sll.delete(data=1)\n", "print(sll)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sll.search(data=3).data" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }