{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook was prepared by [Donne Martin](https://github.com/donnemartin). Source and license info is on [GitHub](https://github.com/donnemartin/interactive-coding-challenges)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Solution Notebook" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem: Implement a trie with find, insert, remove, and list_words methods.\n", "\n", "* [Constraints](#Constraints)\n", "* [Test Cases](#Test-Cases)\n", "* [Algorithm](#Algorithm)\n", "* [Code](#Code)\n", "* [Unit Test](#Unit-Test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Constraints\n", "\n", "* Can we assume we are working with strings?\n", " * Yes\n", "* Are the strings in ASCII?\n", " * Yes\n", "* Should `find` only match exact words with a terminating character?\n", " * Yes\n", "* Should `list_words` only return words with a terminating character?\n", " * Yes\n", "* Can we assume this fits memory?\n", " * Yes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test Cases\n", "\n", "
\n",
    "\n",
    "root node is denoted by ''\n",
    "\n",
    "         ''\n",
    "       /  |  \\\n",
    "      h   a*  m\n",
    "     / \\   \\   \\\n",
    "    a   e*  t*  e*\n",
    "   / \\         / \\\n",
    "  s*  t*      n*  t*\n",
    "             /\n",
    "            s*\n",
    "\n",
    "find\n",
    "\n",
    "* Find on an empty trie\n",
    "* Find non-matching\n",
    "* Find matching\n",
    "\n",
    "insert\n",
    "\n",
    "* Insert on empty trie\n",
    "* Insert to make a leaf terminator char\n",
    "* Insert to extend an existing terminator char\n",
    "\n",
    "remove\n",
    "\n",
    "* Remove me\n",
    "* Remove mens\n",
    "* Remove a\n",
    "* Remove has\n",
    "\n",
    "list_words\n",
    "\n",
    "* List empty\n",
    "* List general case\n",
    "
\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Algorithm\n", "\n", "### find\n", "\n", "* Set node to the root\n", "* For each char in the input word\n", " * Check the current node's children to see if it contains the char\n", " * If a child has the char, set node to the child\n", " * Else, return None\n", "* Return the last child node if it has a terminator, else None\n", "\n", "Complexity:\n", "* Time: O(m), where m is the length of the word\n", "* Space: O(h) for the recursion depth (tree height), or O(1) if using an iterative approach\n", "\n", "### insert\n", "\n", "* set node to the root\n", "* For each char in the input word\n", " * Check the current node's children to see if it contains the char\n", " * If a child has the char, set node to the child\n", " * Else, insert a new node with the char\n", " * Update children and parents\n", "* Set the last node as a terminating node\n", "\n", "Complexity:\n", "* Time: O(m), where m is the length of the word\n", "* Space: O(h) for the recursion depth (tree height), or O(1) if using an iterative approach\n", "\n", "### remove\n", "\n", "* Determine the matching terminating node by calling the find method\n", "* If the matching node has children, remove the terminator to prevent orphaning its children\n", "* Set the parent node to the matching node's parent\n", "* We'll be looping up the parent chain to propagate the delete\n", "* While the parent is valid\n", " * If the node has children\n", " * Return to prevent orphaning its remaining children\n", " * If the node is a terminating node and it isn't the original matching node from the find call\n", " * Return to prevent deleting this additional valid word\n", " * Remove the parent node's child entry matching the node\n", " * Set the node to the parent\n", " * Set the parent to the parent's parent\n", "\n", "Complexity:\n", "* Time: O(m+h), where where m is the length of the word and h is the tree height\n", "* Space: O(h) for the recursion depth (tree height), or O(1) if using an iterative approach\n", "\n", "### list_words\n", "\n", "* Do a pre-order traversal, passing down the current word\n", " * When you reach a terminating node, add it to the list of results\n", "\n", "Complexity:\n", "* Time: O(n)\n", "* Space: O(h) for the recursion depth (tree height), or O(1) if using an iterative approach" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Code" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting trie.py\n" ] } ], "source": [ "%%writefile trie.py\n", "from collections import OrderedDict\n", "\n", "\n", "class Node(object):\n", "\n", " def __init__(self, key, parent=None, terminates=False):\n", " self.key = key\n", " self.terminates = False\n", " self.parent = parent\n", " self.children = {}\n", "\n", "\n", "class Trie(object):\n", "\n", " def __init__(self):\n", " self.root = Node('')\n", "\n", " def find(self, word):\n", " if word is None:\n", " raise TypeError('word cannot be None')\n", " node = self.root\n", " for char in word:\n", " if char in node.children:\n", " node = node.children[char]\n", " else:\n", " return None\n", " return node if node.terminates else None\n", "\n", " def insert(self, word):\n", " if word is None:\n", " raise TypeError('word cannot be None')\n", " node = self.root\n", " parent = None\n", " for char in word:\n", " if char in node.children:\n", " node = node.children[char]\n", " else:\n", " node.children[char] = Node(char, parent=node)\n", " node = node.children[char]\n", " node.terminates = True\n", "\n", " def remove(self, word):\n", " if word is None:\n", " raise TypeError('word cannot be None')\n", " node = self.find(word)\n", " if node is None:\n", " raise KeyError('word does not exist')\n", " node.terminates = False\n", " parent = node.parent\n", " while parent is not None:\n", " # As we are propagating the delete up the \n", " # parents, if this node has children, stop\n", " # here to prevent orphaning its children.\n", " # Or\n", " # if this node is a terminating node that is\n", " # not the terminating node of the input word, \n", " # stop to prevent removing the associated word.\n", " if node.children or node.terminates:\n", " return\n", " del parent.children[node.key]\n", " node = parent\n", " parent = parent.parent\n", "\n", " def list_words(self):\n", " result = []\n", " curr_word = ''\n", " self._list_words(self.root, curr_word, result)\n", " return result\n", "\n", " def _list_words(self, node, curr_word, result):\n", " if node is None:\n", " return\n", " for key, child in node.children.items():\n", " if child.terminates:\n", " result.append(curr_word + key)\n", " self._list_words(child, curr_word + key, result)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%run trie.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Unit Test" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting test_trie.py\n" ] } ], "source": [ "%%writefile test_trie.py\n", "import unittest\n", "\n", "\n", "class TestTrie(unittest.TestCase): \n", "\n", " def test_trie(self):\n", " trie = Trie()\n", "\n", " print('Test: Insert')\n", " words = ['a', 'at', 'has', 'hat', 'he',\n", " 'me', 'men', 'mens', 'met']\n", " for word in words:\n", " trie.insert(word)\n", " for word in trie.list_words():\n", " self.assertTrue(trie.find(word) is not None)\n", " \n", " print('Test: Remove me')\n", " trie.remove('me')\n", " words_removed = ['me']\n", " words = ['a', 'at', 'has', 'hat', 'he',\n", " 'men', 'mens', 'met']\n", " for word in words:\n", " self.assertTrue(trie.find(word) is not None)\n", " for word in words_removed:\n", " self.assertTrue(trie.find(word) is None)\n", "\n", " print('Test: Remove mens')\n", " trie.remove('mens')\n", " words_removed = ['me', 'mens']\n", " words = ['a', 'at', 'has', 'hat', 'he',\n", " 'men', 'met']\n", " for word in words:\n", " self.assertTrue(trie.find(word) is not None)\n", " for word in words_removed:\n", " self.assertTrue(trie.find(word) is None)\n", "\n", " print('Test: Remove a')\n", " trie.remove('a')\n", " words_removed = ['a', 'me', 'mens']\n", " words = ['at', 'has', 'hat', 'he',\n", " 'men', 'met']\n", " for word in words:\n", " self.assertTrue(trie.find(word) is not None)\n", " for word in words_removed:\n", " self.assertTrue(trie.find(word) is None)\n", "\n", " print('Test: Remove has')\n", " trie.remove('has')\n", " words_removed = ['a', 'has', 'me', 'mens']\n", " words = ['at', 'hat', 'he',\n", " 'men', 'met']\n", " for word in words:\n", " self.assertTrue(trie.find(word) is not None)\n", " for word in words_removed:\n", " self.assertTrue(trie.find(word) is None)\n", "\n", " print('Success: test_trie')\n", "\n", " def test_trie_remove_invalid(self):\n", " print('Test: Remove from empty trie')\n", " trie = Trie()\n", " self.assertTrue(trie.remove('foo') is None) \n", "\n", "\n", "def main():\n", " test = TestTrie()\n", " test.test_trie()\n", " test.assertRaises(KeyError, test.test_trie_remove_invalid)\n", "\n", "\n", "if __name__ == '__main__':\n", " main()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test: Insert\n", "Test: Remove me\n", "Test: Remove mens\n", "Test: Remove a\n", "Test: Remove has\n", "Success: test_trie\n", "Test: Remove from empty trie\n" ] } ], "source": [ "%run -i test_trie.py" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 1 }