{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Longest palindromic substring\n", "\n", "### Problem statement and initial analysis (1)\n", "Q: *Longest palindromic substring* — Find the longest palindromic substring in a string.\n", "\n", "This is a classic interview problem. It's not something that occurs IRL very much, but it's probably the third most common dynamic programming question after Fibbonacci and the knapsack problem. Every palindrome of length $n$ contains palindromes of length $n/2$ and smaller inside. So to brute force this problem, search for all of the palindromes of length 2 or 3 and grow them outwards." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Output unit tests (2)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "......\n", "----------------------------------------------------------------------\n", "Ran 6 tests in 0.002s\n", "\n", "OK\n" ] } ], "source": [ "import unittest\n", "\n", "class TestP(unittest.TestCase):\n", " \"\"\"Example of how to use unittest in Jupyter.\"\"\"\n", " \n", " def testEmptyString(self):\n", " self.assertEqual(p(\"\"), \"\")\n", " \n", " def testUnitStringP(self):\n", " self.assertEqual(p(\"A\"), \"A\")\n", " \n", " def testStartP(self):\n", " self.assertEqual(p(\"AABB\"), \"AA\")\n", " \n", " def testEndP(self):\n", " self.assertEqual(p(\"ABCC\"), \"CC\")\n", " \n", " def testMidP(self):\n", " self.assertEqual(p(\"ACCB\"), \"CC\")\n", " \n", " def testOddP(self):\n", " self.assertEqual(p(\"ABCDQQQABCD\"), \"QQQ\")\n", "\n", " \n", "if __name__ == '__main__':\n", " unittest.main(argv=[''], exit=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Brute force psuedocode (3), brute force implementation (4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Omitted, did this for a coding exercise at RC." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Program properties (5)\n", "\n", "* Even palindromes of length $n$ contain two sub-palindromes of length $n - 2$ on the left and right side, and so on.\n", "* Odd palindromes of length $n$ contains one sub-palindrome of length $n - 2$ straddling the middle, and so on.\n", "* The recurrance relationship given indices $j$ and $k$ on string $s$ is: $p(j, k)\\:\\text{iff}\\:[p(j + 1, k - 1)\\:\\cap\\:s_j = s_k]$.\n", "\n", "If you draw out an $i-j$ table on paper on an example palindromic string, you will find that the lower triangular is null, whilst the upper triangular is a sequence of boolean values ascending 1,1 upwards and to the right which are always some number of trues followed by some number of falses. So if the $1,2$ entry is true, the $0,3$ entry might be true, but if $1,2$ is false $0,3$ will never be true.\n", "\n", "To learn, descend in a zigzag pattern down these array entries and climb outwards. To generate a result, repeat that journey while tallying true lengths. This reduces the problem to solving for these diagonal matrix entries." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optimized solution psuedocode (6)\n", "\n", "```\n", "function p(s):\n", " table = Array(i=0,...,n-1; j=1,...,n)\n", " len_longest = 1\n", " idx_start_longest = 0\n", " curr = <-1, 1>\n", " next_step = <1, 0>\n", " \n", " while(curr[0] < len(s) and curr[1] < len(s)):\n", " curr + curr + next_step\n", " next_step = next_step[::-1]\n", " offset = 0\n", " i, j = curr[0], curr[1]\n", " while s[i:j] = s[i:j:-1]:\n", " offset += 1\n", " i -= 1\n", " j += 1\n", " len = offset[0] + adjustment for even or odd start\n", " if len > len_longest:\n", " len_longest = len\n", " idx_start_longest = i\n", " \n", " return s[idx_start_longest:idx_start_longest + len_longest]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optimized solution\n", "\n", "Notice how my psuedocode missed an important base case. Note also that I inline retaining and returning the longest known palindromic sequence, instead of storing the information in a data store for later, as in some implementations (notably [this one](https://www.geeksforgeeks.org/longest-palindrome-substring-set-1/))." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def p(s):\n", " if s == \"\": return \"\"\n", " \n", " len_longest = 1\n", " idx_start_longest = 0\n", " curr = [-1, 1]\n", " next_step = [1, 0]\n", " \n", " while max(curr[0], curr[1]) <= len(s):\n", " curr[0], curr[1] = curr[0] + next_step[0], curr[1] + next_step[1]\n", " next_step = tuple(list(next_step)[::-1])\n", " offset = 0\n", " i, j = curr[0], curr[1]\n", " \n", " # If our starting point is not already a palindrome, skip this iteration.\n", " # Odd starting points are always palindromic, but even starting points may not be.\n", " if s[i:j] != s[i:j][::-1]:\n", " continue\n", " \n", " while s[i - 1:j + 1] == s[i - 1:j + 1][::-1] and i - 1 >= 0 and j + 1 <= len(s):\n", " offset += 1\n", " i -= 1\n", " j += 1\n", " \n", " len_current = j - i\n", " if len_current > len_longest:\n", " len_longest = len_current\n", " idx_start_longest = i\n", " \n", " return s[idx_start_longest:idx_start_longest + len_longest]" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }