{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# From last time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Picking up from part 1, we have the following two examples:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "A = 'ABCBDAB'\n",
    "B = 'BDCABA'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are two longest common subsequences:\n",
    "\n",
    "1. B, C, B, A\n",
    "\n",
    "2. B, D, A, B"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "occidental = \"occidental\"\n",
    "superdelicately = \"superdelicately\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One longest common subsequence (LCS) is: \"deal\"."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some code from part 1:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import lru_cache"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "def pretty_print(arr):\n",
    "    '''A hacky function that makes every string in an array equal length, then prints\n",
    "    it row by row.'''\n",
    "    longest_length = len(arr[:-1][:-1])\n",
    "    nicer = []\n",
    "    for i, row in enumerate(arr):\n",
    "        new_row = []\n",
    "        for j, cell in enumerate(row):\n",
    "            cell_length = len(cell)\n",
    "            padding_needed = longest_length - cell_length\n",
    "            new_cell = cell + (' ' * padding_needed)\n",
    "            new_row.append(new_cell)\n",
    "        nicer.append(new_row)\n",
    "    \n",
    "    [print(r) for r in nicer]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def lcs4(A, B, print_table=False):\n",
    "    # Create a lookup table. This is a 2D array with the letters of A as columns\n",
    "    # and the letters of B as rows. It includes a buffer row and column filled with empty strings.\n",
    "    columns = len(A) + 1\n",
    "    rows = len(B) + 1\n",
    "    lookup = [['' for n in range(columns)] for n in range(rows)]\n",
    "    \n",
    "    # Since the lookup table is initialized with '' \n",
    "    # the base case (len(A) == 0 or len(B) == 0) is \"baked-in\".\n",
    "    \n",
    "    # What's left is to go row by row, column by column and \n",
    "    # apply the iterative version of the recursive cases in lcs().\n",
    "    for row_id in range(1, rows):\n",
    "        for column_id in range(1, columns):\n",
    "            letter_in_A = A[column_id-1]\n",
    "            letter_in_B = B[row_id-1]\n",
    "            if letter_in_A == letter_in_B:\n",
    "                # \"Recursive case 1\"\n",
    "                lookup[row_id][column_id] = lookup[row_id - 1][column_id - 1] + letter_in_A\n",
    "            else:\n",
    "                # \"Recursive case 2\"\n",
    "                option1 = lookup[row_id][column_id-1]\n",
    "                option2 = lookup[row_id-1][column_id]\n",
    "                if len(option1) > len(option2):\n",
    "                    lookup[row_id][column_id] = option1\n",
    "                else:\n",
    "                    lookup[row_id][column_id] = option2\n",
    "    \n",
    "    \n",
    "    if print_table:\n",
    "        pretty_print(lookup)\n",
    "    \n",
    "    return lookup[rows-1][columns-1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# A connection between LCS and LIS?\n",
    "\n",
    "There is an interesting connection between longest common subsequence (LCS) and longest increasing subsequence (LIS). \n",
    "\n",
    ">The longest increasing subsequence of a list of numbers is equal to the longest common subsequence of that list of numbers and its sorted counterpart. \n",
    "\n",
    "Source: \"Course15.pdf\" by [Jean-Lou De Carufel](http://cglab.ca/~jdecaruf/CSI3105.html) (probably taken from Dasgupta et. al.)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[3, 2, 1, 4]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nums = [3, 2, 1, 4]\n",
    "nums"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Copy and sort nums\n",
    "sorted_nums = nums[:]\n",
    "sorted_nums.sort()\n",
    "sorted_nums"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert both to strings\n",
    "nums_str = [str(n) for n in nums]\n",
    "sorted_nums_str = [str(n) for n in sorted_nums]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'14'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Find the longest common subsequence of S and S_sorted.\n",
    "lcs4(nums_str, sorted_nums_str)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[1, 4] is a longest **increasing** subsequence of `nums`.\n",
    "\n",
    "That said, this doesn't seem to work for lists with duplicate elements."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "dupe_nums = [2, 2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'22'"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dupe_nums_str = ''.join([str(n) for n in dupe_nums])\n",
    "dupe_nums_str"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'22'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lcs4(dupe_nums_str, dupe_nums_str)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Whether $2 \\rightarrow 2$ is \"increasing\" depends on how you define it. It is not strictly increasing and will fail to pass certain tests in [this leetcode question](https://leetcode.com/problems/longest-increasing-subsequence/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Extending LCS to numbers\n",
    "\n",
    "Thinking about it, my current implementation of longest common subsequence only works with lists of numbers where no number is greater than 9 (only single digits). With stringified number lists, there is no way to tell \"10\" from \"1\", \"0\".\n",
    "\n",
    "The solution is simple: use lists instead of strings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "def lcs5(A, B):\n",
    "    # Create a lookup table. This is a 2D array with the letters of A as columns\n",
    "    # and the letters of B as rows. It includes a buffer row and column filled with empty strings.\n",
    "    columns = len(A) + 1\n",
    "    rows = len(B) + 1\n",
    "    lookup = [[[] for n in range(columns)] for n in range(rows)]\n",
    "    \n",
    "    # Since the lookup table is initialized with '' \n",
    "    # the base case (len(A) == 0 or len(B) == 0) is \"baked-in\".\n",
    "    \n",
    "    # What's left is to go row by row, column by column and \n",
    "    # apply the iterative version of the recursive cases in lcs().\n",
    "    for row_id in range(1, rows):\n",
    "        for column_id in range(1, columns):\n",
    "            letter_in_A = A[column_id-1]\n",
    "            letter_in_B = B[row_id-1]\n",
    "            if letter_in_A == letter_in_B:\n",
    "                # \"Recursive case 1\"\n",
    "                lookup[row_id][column_id] = lookup[row_id - 1][column_id - 1] + [letter_in_A]\n",
    "            else:\n",
    "                # \"Recursive case 2\"\n",
    "                option1 = lookup[row_id][column_id-1]\n",
    "                option2 = lookup[row_id-1][column_id]\n",
    "                if len(option1) > len(option2):\n",
    "                    lookup[row_id][column_id] = option1\n",
    "                else:\n",
    "                    lookup[row_id][column_id] = option2\n",
    "    \n",
    "    \n",
    "    return lookup[rows-1][columns-1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "numbers_a = ['10', '22', '5', '11', '10']\n",
    "numbers_b = ['100', '22', '3', '11', '14']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['22', '11']"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lcs5(numbers_a, numbers_b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Of course, letters still work:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['o', 'c', 'c', 'i', 'd', 'e', 'n', 't', 'a', 'l']"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "occidental_letters = list(occidental)\n",
    "occidental_letters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['s', 'u', 'p', 'e', 'r', 'd', 'e', 'l', 'i', 'c', 'a', 't', 'e', 'l', 'y']"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "superdelicate_letters = list(superdelicately)\n",
    "superdelicate_letters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['d', 'e', 'a', 'l']"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lcs5(occidental_letters, superdelicate_letters)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This combined with some input processing was enough to solve [this HackerRank question](https://www.hackerrank.com/challenges/dynamic-programming-classics-the-longest-common-subsequence/problem)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}