{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "%load_ext autoreload\n", "%autoreload 2\n", "%matplotlib inline\n", "# %cd .. \n", "import sys\n", "sys.path.append(\"..\")\n", "import math \n", "import statnlpbook.util as util\n", "import statnlpbook.parsing as parsing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "$$\n", "\\newcommand{\\Xs}{\\mathcal{X}}\n", "\\newcommand{\\Ys}{\\mathcal{Y}}\n", "\\newcommand{\\y}{\\mathbf{y}}\n", "\\newcommand{\\balpha}{\\boldsymbol{\\alpha}}\n", "\\newcommand{\\bbeta}{\\boldsymbol{\\beta}}\n", "\\newcommand{\\aligns}{\\mathbf{a}}\n", "\\newcommand{\\align}{a}\n", "\\newcommand{\\source}{\\mathbf{s}}\n", "\\newcommand{\\target}{\\mathbf{t}}\n", "\\newcommand{\\ssource}{s}\n", "\\newcommand{\\starget}{t}\n", "\\newcommand{\\repr}{\\mathbf{f}}\n", "\\newcommand{\\repry}{\\mathbf{g}}\n", "\\newcommand{\\x}{\\mathbf{x}}\n", "\\newcommand{\\prob}{p}\n", "\\newcommand{\\vocab}{V}\n", "\\newcommand{\\params}{\\boldsymbol{\\theta}}\n", "\\newcommand{\\param}{\\theta}\n", "\\DeclareMathOperator{\\perplexity}{PP}\n", "\\DeclareMathOperator{\\argmax}{argmax}\n", "\\DeclareMathOperator{\\argmin}{argmin}\n", "\\newcommand{\\train}{\\mathcal{D}}\n", "\\newcommand{\\counts}[2]{\\#_{#1}(#2) }\n", "\\newcommand{\\length}[1]{\\text{length}(#1) }\n", "\\newcommand{\\indi}{\\mathbb{I}}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Parsing \n", "\n", "Note that this chapter is heavily influenced by the structure and content of [Mike Collins' PCFG lecture](http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/pcfgs.pdf). \n", "\n", "In many NLP applications it is useful to understand the syntactic structure of a sentence: where are the verbs, what are the subject and object of the verbs, which phrases form coherent sub-structures of the sentence? Understanding this enables the machine to more effectively translate from Japanese to English, or to understand the query [\"who is the president of the united state\"](https://www.google.co.uk/search?q=who+is+the+president+of+the+united+state&oq=who+is+the+president+of+the+united+state&aqs=chrome..69i57j0l5.252j0j4&sourceid=chrome&es_sm=119&ie=UTF-8) and execute it against a database. \n", "\n", "In linguistics these questions are asked in the field of **syntax**, from the Greek syntaxis (arrangement). There are three core concepts:\n", "\n", "* **Constituency**: groups of words act as single units.\n", "* **Grammatical Relations**: object, subject, direct object etc. \n", "* **Subcategorization**: restrictions on the type of phrases that go with certain words.\n", "\n", "## Context Free Grammars\n", "A common approach to capture constituency, grammatical relations and subcategorization is based on [Context Free Grammars](https://www.cs.rochester.edu/~nelson/courses/csc_173/grammars/cfg.html) (CFGs). On a high level, these grammars assume that legal sentences can be derived by repeatedly and _independently_ expanding abstract symbols (such as \"NounPhrase\" or \"Adjective\") into more concrete sequences of symbols (such as \"Adjective Noun\" or \"green\") until each symbol is a concrete word. 
\n", "\n", "More formally, a CFG is a 4-tuple \\\\(G=(N,\\Sigma,R,S)\\\\) where\n", "\n", " * \\\\(N\\\\) is a set of _non-terminal symbols_.\n", " * \\\\(\\Sigma\\\\) is a set of _terminal symbols_.\n", " * \\\\(R\\\\) is a finite set of _rules_ \\\\(X \\rightarrow Y_1 Y_2\\ldots Y_n\\\\) where \\\\(X \\in N\\\\) and \\\\(Y_i \\in N \\cup \\Sigma\\\\). \n", " * \\\\(S \\in N\\\\) is a _start symbol_. \n", "\n", "Before we show examples, let us define a Python data structure for CFGs." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "class CFG:\n", " \"\"\"\n", " A Python representation of a context free grammar.\n", " \"\"\"\n", " def __init__(self, n, sigma, r, s):\n", " \"\"\"\n", " Creates a new CFG.\n", " Args:\n", " n: the set of non-terminals\n", " sigma: the set of terminals\n", " r: the list of rules as pairs `(lhs, rhs)` where rhs is a list of right-hand side symbols of the rule.\n", " s: the starting symbol\n", " \"\"\"\n", " self.n = n\n", " self.sigma = sigma\n", " self.r = [(lhs,tuple(rhs)) for lhs,rhs in r]\n", " self.s = s\n", " \n", " @classmethod\n", " def from_rules(cls, rules, s='S'):\n", " \"\"\"\n", " Creates a CFG just from given rules. Symbols in the rules that appear on the left-hand sides\n", " are non-terminals, all other symbols are non-terminals.\n", " Args:\n", " rules: the list of rules to build the CFG from\n", " s: the start symbol.\n", " Returns:\n", " a CFG with induces terminal and non-terminals.\n", " \"\"\"\n", " non_terminals = {rule[0] for rule in rules}\n", " left_hand_sides = {node for rule in rules for node in rule[1]}\n", " terminals = {n for n in left_hand_sides if n not in non_terminals}\n", " return cls(non_terminals, terminals, rules, s)\n", " \n", " def _repr_html_(self):\n", " \"\"\"\n", " Simple HTML representation of a CFG for easy rendering in notebook.\n", " Returns:\n", " an HTML string representing the CFG as table.\n", " \"\"\"\n", " rules_per_table = math.ceil(len(self.r) / 3)\n", " rules = [\"{}{}\".format(rule[0],\" \".join(rule[1])) for rule in self.r]\n", " tables = [\"\"\"{}
\"\"\".format(\"\".join(rules[i:i+rules_per_table])) \n", " for i in range(0,len(rules),rules_per_table)]\n", " return \" \".join(tables)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us now create an example CFG." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
S → NP_p VP_p
S → NP_s VP_s
NP_p → Matko raps
VP_p → are ADJ
NP_s → Matko
VP_s → raps in StatNLP
ADJ → silly
" ], "text/plain": [ "<__main__.CFG at 0x7f5e63cd47b8>" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cfg = CFG.from_rules([('S', ['NP_p','VP_p']),('S',['NP_s','VP_s']), \n", " ('NP_p', ['Matko', 'raps']),\n", " ('VP_p', ['are', 'ADJ']),\n", " ('NP_s', ['Matko']),\n", " ('VP_s', ['raps', 'in', 'StatNLP']),\n", " ('ADJ', ['silly'])\n", " ])\n", "cfg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## (Left-most) Derivation\n", "A left-most derivation given a CFG \\\\(G\\\\) is a sequence of strings \\\\(s_1 \\ldots s_n\\\\) such that \n", "\n", "* \\\\(s_1 = S\\\\), that is, the first string consists only of the start symbol.\n", "* \\\\(s_n \\in \\Sigma^*\\\\), that is, the last string consists of only terminals.\n", "* Each \\\\(s_i\\\\) for \\\\(i > 1\\\\) is generated by replacing the left-most non-terminal \\\\(\\alpha\\\\) with the right-hand side of any rule that has \\\\(\\alpha\\\\) as left-hand side. \n", "\n", "Let us write some code that puts this definition into action and generates random derivations based on a grammar. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import random\n", "def generate_deriv(cfg, sentence, results = None):\n", " \"\"\"\n", " Given a CFG and input sentence, generate a tuple of lists corresponding to a derivation as defined above.\n", " Args:\n", " cfg: a CFG\n", " sentence: list of tokens (strings)\n", " results: derivation built so far. If `None` is passed a singleton tuple with the \n", " sentence as element is used as initial derivation.\n", " Returns:\n", " a tuple representing the individual sentences in the derivation.\n", " \"\"\"\n", " actual_result = (sentence,) if results is None else results\n", " non_terminals = ((t,i) for i, t in enumerate(sentence) if t in cfg.n)\n", " first_non_terminal, first_index = next(non_terminals, (None, -1))\n", " if first_non_terminal is not None:\n", " relevant_rules = [rule for rule in cfg.r if rule[0] == first_non_terminal]\n", " sampled_rule = random.choice(relevant_rules)\n", " new_sentence = sentence[:first_index] + list(sampled_rule[1]) + sentence[first_index+1:]\n", " return generate_deriv(cfg, new_sentence, actual_result + (new_sentence,))\n", " else:\n", " return actual_result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us generate an example derivation." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(['S'],\n", " ['NP_p', 'VP_p'],\n", " ['Matko', 'raps', 'VP_p'],\n", " ['Matko', 'raps', 'are', 'ADJ'],\n", " ['Matko', 'raps', 'are', 'silly'])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generate_deriv(cfg, [cfg.s])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Parse Trees\n", "Derivations can be compactly present as trees where each non-leaf node corresponds to an expanded left-hand-side and its children to the rules' right hand side. We will represent trees simply by using string objects for terminal nodes, and `(label,children)` tuples for non-terminals where `label` is the string non-terminal label, and `children` is a list of child trees." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "Matko\n", "\n", "\n", "1\n", "\n", "raps\n", "\n", "\n", "2\n", "\n", "NP_p\n", "\n", "\n", "2->0\n", "\n", "\n", "\n", "\n", "2->1\n", "\n", "\n", "\n", "\n", "3\n", "\n", "are\n", "\n", "\n", "4\n", "\n", "silly\n", "\n", "\n", "5\n", "\n", "VP_p\n", "\n", "\n", "5->3\n", "\n", "\n", "\n", "\n", "5->4\n", "\n", "\n", "\n", "\n", "6\n", "\n", "S\n", "\n", "\n", "6->2\n", "\n", "\n", "\n", "\n", "6->5\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tree = ('S', [('NP_p',['Matko','raps']), ('VP_p',['are','silly'])])\n", "parsing.render_tree(tree)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the same way we could generate derivations before, we can now generate parse trees from a CFG." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def generate_tree(cfg, rhs):\n", " if rhs in cfg.sigma:\n", " return rhs\n", " else:\n", " relevant_rules = [rule for rule in cfg.r if rule[0] == rhs]\n", " sampled_rule = random.choice(relevant_rules)\n", " children = [generate_tree(cfg, child) for child in sampled_rule[1]]\n", " return (rhs, children)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us generate a tree, starting from a non-terminal in the CFG." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "Matko\n", "\n", "\n", "1\n", "\n", "raps\n", "\n", "\n", "2\n", "\n", "NP_p\n", "\n", "\n", "2->0\n", "\n", "\n", "\n", "\n", "2->1\n", "\n", "\n", "\n", "\n", "3\n", "\n", "are\n", "\n", "\n", "4\n", "\n", "silly\n", "\n", "\n", "5\n", "\n", "ADJ\n", "\n", "\n", "5->4\n", "\n", "\n", "\n", "\n", "6\n", "\n", "VP_p\n", "\n", "\n", "6->3\n", "\n", "\n", "\n", "\n", "6->5\n", "\n", "\n", "\n", "\n", "7\n", "\n", "S\n", "\n", "\n", "7->2\n", "\n", "\n", "\n", "\n", "7->6\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "parsing.render_tree(generate_tree(cfg,'S')) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Finding Parse Trees\n", "As you can see, parse trees uncover how a sentence is structured with respect to a grammar. In fact, they give us insights about both constituency and grammatical relations: In the above case the subtrees headed by \"NP\" nodes represent noun phrase constituents; likewise, the \"NP\" and \"VP\" subtrees under \"S\" represent the subject and object of a sentence. It is interesting to generate trees and hence sentences given a starting symbol. However, in practice it is often more useful to find a parse tree given a sentence, if existent. This process is generally referred to as parsing. 
\n", "\n", "There are a couple of approaches to find a legal parse tree given a sentence and grammar:\n", "\n", "* **Top-Down**: Start with the start symbol and generate trees; backtrack if they do not match observed sentence.\n", "* **Bottom-Up**: Start with the sentence, and find rules that generate parts of it; backtrack if no start symbol can be induced.\n", "* **Dynamic Programming**: Explore several trees in parallel and re-use computations.\n", "\n", "### Bottom-Up Parsing with Backtracking\n", "A bottom-up parser with backtracking maintains a state consisting of a stack of processed trees, and buffer of remaining words, and a history of previous states we can backtrack to if we cannot make more progress. Roughly speaking, the algorithm proceeds by iteratively reducing the stack via rules to new trees, moving forward in the sentence, and backtracking to previous states when we reach the end of the sentence but cannot create a single tree. \n", "\n", "More formally, the algorithm operates a machine with a state $S=(b,t,r)$ where $b$ is a buffer of remaining words to process, $t$ a stack of trees build so far, and $r$ a list of rules that remain to be checked for whether they can applied to the top elements of $t$. The algorithm also maintains a history $h$ of previously visited states. \n", "\n", "* **Input**: sentence $s$, Rules $R$.\n", "* **Initialisation**:\n", " * $b \\leftarrow s$\n", " * $r \\leftarrow R$\n", " * $t \\leftarrow ()$\n", " * $h \\leftarrow ()$\n", "* **while** $|b| > 0$ or $|r| > 0$ or both $|s| > 1$ and $|h| > 0$\n", " * **if** $|r| > 0$ reduce\n", " * **if** $|b| > 0$ shift\n", " * **else if** $|s| > 1$ back-track\n", " \n", "The core operations within the algorithm are:\n", "\n", "* **reduce**:\n", " * **while** $|r|>0$:\n", " * Let the first rule in $r$ be $r_1 = \\alpha \\leftarrow \\beta_1 \\ldots \\beta_n$\n", " * check whether $\\beta_1 \\ldots \\beta_n$ is equal to the labels of the top of the stack $t_1 \\ldots t_n$\n", " * If so, remember current state $S$ in history $h$, set $r \\leftarrow R$, pop $n$ elements from the tree stack $t$ and push the tree $(alpha, t_1 \\ldots t_n)$ onto $t$.\n", " * Otherwise: remove $r_1$ from the list of rules $r$.\n", "* **shift**:\n", " * append the first element $b_1$ of the buffer $b$ to $t$ and remove $b_1$ from $b$.\n", " * Initialise rules again by setting $r \\leftarrow R$\n", "* **back-track**:\n", " * set $G=h_\\text{last}$ where $h_\\text{last}$ is the last element from the history $h$ \n", " * Remove $h_\\text{last}$ from $h$. \n", " * Remove the first rule $r_1$ from the list of rules $r$.\n", "\n", "In Python code the parsing algorithm can be formulated as follows. Notice that because we are making in-place changes to the current state object, when recording states for later visualisation and for backtracking we need to perform deep-copy operations. In an actual parser this would soon become a major bottleneck, and in practice one would need to find more efficient ways to keep track of past states. 
" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "import copy\n", "\n", "class State:\n", " \"\"\"\n", " Representation of the parsing state in bottom-up parsing with backtracking.\n", " \"\"\"\n", " def __init__(self, stack, buffer, rules):\n", " \"\"\"\n", " Create a new state.\n", " Args:\n", " stack: list of trees.\n", " buffer: list of remaining words.\n", " rules: rules left to process.\n", " \"\"\"\n", " self.stack = stack\n", " self.buffer = buffer\n", " self.rules = rules\n", " \n", "def bottom_up_parse(cfg, sentence):\n", " \"\"\"\n", " Parse the given sentence using the given CFG.\n", " Args:\n", " cfg: a CFG.\n", " sentence: a list of words.\n", " \"\"\"\n", " history = []\n", " state = State([], sentence, cfg.r)\n", " transitions = [(copy.deepcopy(state),'Init')]\n", " \n", " def reduce():\n", " \"\"\"\n", " Go over remaining rules until one can be found that combines the top elements into the stack.\n", " Changes the current state.\n", " \"\"\"\n", " while len(state.rules) > 0:\n", " lhs, rhs = state.rules[0]\n", " top = state.stack[-len(rhs):]\n", " top_labels = [parsing.get_label(node) for node in top]\n", " if tuple(top_labels) == rhs:\n", " history.append(copy.deepcopy(state))\n", " state.stack = state.stack[:-len(rhs)] + [(lhs, top)]\n", " state.rules = cfg.r\n", " transitions.append((copy.deepcopy(state),'Reduce'))\n", " else:\n", " state.rules = state.rules[1:]\n", " \n", " def shift():\n", " \"\"\"\n", " Move first element in the buffer onto the stack, reinitialise the remaing rules. \n", " \"\"\"\n", " state.stack = state.stack + [state.buffer[0]] \n", " state.buffer = state.buffer[1:]\n", " state.rules = cfg.r\n", " transitions.append((copy.deepcopy(state),'Shift'))\n", " \n", " def backtrack():\n", " \"\"\"\n", " backtrack to the last decision point in the history, and skip the rule that was used there.\n", " \"\"\"\n", " nonlocal state\n", " state = history[-1]\n", " state.rules = state.rules[1:]\n", " transitions.append((copy.deepcopy(state),'Backtrack'))\n", " del history[-1]\n", "\n", " while len(state.buffer) > 0 or (len(state.stack) > 1 and len(history) > 0) or len(state.rules) > 0:\n", " if len(state.rules) > 0:\n", " reduce()\n", " if len(state.buffer) > 0:\n", " shift()\n", " elif len(state.stack) > 1:\n", " backtrack()\n", " return transitions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us parse a simple sentence using this algorithm." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "Matko\n", "\n", "\n", "1\n", "\n", "raps\n", "\n", "\n", "2\n", "\n", "NP_p\n", "\n", "\n", "2->0\n", "\n", "\n", "\n", "\n", "2->1\n", "\n", "\n", "\n", "\n", "3\n", "\n", "are\n", "\n", "\n", "4\n", "\n", "silly\n", "\n", "\n", "5\n", "\n", "ADJ\n", "\n", "\n", "5->4\n", "\n", "\n", "\n", "\n", "6\n", "\n", "VP_p\n", "\n", "\n", "6->3\n", "\n", "\n", "\n", "\n", "6->5\n", "\n", "\n", "\n", "\n", "7\n", "\n", "S\n", "\n", "\n", "7->2\n", "\n", "\n", "\n", "\n", "7->6\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sentence = ['Matko', 'raps', 'are', 'silly']\n", "transitions = bottom_up_parse(cfg, sentence)\n", "parsing.render_forest(transitions[-1][0].stack)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also inspect the individual transitions performed during parsing." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
[Output omitted: 18 parser state snapshots rendered as tree diagrams; the transition sequence is Init, Shift, Reduce, Shift, Shift, Shift, Reduce, Reduce, Backtrack, Backtrack, Backtrack, Shift, Reduce, Shift, Shift, Reduce, Reduce, Reduce.]
" ], "text/plain": [ ".Output at 0x7f5e63cd4eb8>" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "parsing.render_transitions(transitions)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dynamic Programming for Parsing\n", "A problem with the bottom-up parser is the fact that, after back-tracking, it may redo many steps it had already before (can you find examples of this behaviour in the transitions above?). This suggests to remember steps and re-use then when needed. This is the general idea behind _dynamic programming_: caching and reusing computation whenever possible. \n", "\n", "### Chomsky Normal Form\n", "In the case of CFG parsing there exists a very effective dynamic program, the so-called Cocke–Younger–Kasami (CYK) algorithm. However, before we can apply this algorithm we need to _normalize_ the grammar. In particular, we need to make sure that each rule has one of the following forms:\n", "\n", "* \\\\(\\alpha \\rightarrow \\beta \\gamma\\\\) where \\\\(\\beta,\\gamma \\in N \\setminus \\\\{ S \\\\} \\\\). \n", "* \\\\(\\alpha \\rightarrow t\\\\) where \\\\(t \\in \\Sigma\\\\).\n", "\n", "In words: each rule is either binary and expands into two non-terminal non-Start symbols, or unary and expands into a word. This form is called _Chomsky Normal Form_ (CNF). \n", "\n", "Fortunately we can convert every CFG into an equivalent CFG in CNF, in the sense that any derivation or parse of sentence in one grammar can be loss-lessly converted to a derivation in the other grammar. This can be achieved through a few simple transformations. Here the rules on the left-hand side of the transformation arrow $\\Rightarrow$ will be replaced by the ones on the left-hand side: \n", "\n", "* $\\alpha \\rightarrow \\beta \\gamma \\delta \\Rightarrow \\alpha \\rightarrow \\beta\\alpha', \\alpha' \\rightarrow \\gamma \\delta$\n", "* $\\alpha \\rightarrow \\beta t \\Rightarrow \\alpha \\rightarrow \\beta \\alpha', \\alpha' \\rightarrow t$ where $t \\in \\Sigma$\n", "* $\\alpha \\rightarrow \\beta, \\beta \\rightarrow \\gamma \\Rightarrow \\alpha \\rightarrow \\gamma, \\beta \\rightarrow \\gamma$ \n", "\n", "We present this conversion in Python below, but omitted the third transformation which is not relevant for the grammar we are using here (Exercise: add this case)." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "def to_cnf(cfg):\n", " \"\"\"\n", " Convert a CFG into CNF.\n", " Args:\n", " cfg: a CFG.\n", " Returns:\n", " A CFG in CNF.\n", " \"\"\"\n", " # go through all rules, check if rule 1) or 2) can be applied, create new rules\n", " new_non_terminals = []\n", " \n", " def fresh_non_terminal(prefix):\n", " new_non_terminal = prefix + \"_\" + str(len(new_non_terminals))\n", " new_non_terminals.append(new_non_terminal)\n", " return new_non_terminal\n", " \n", " def transform_rule(rule):\n", " lhs, rhs = rule\n", " if len(rhs) > 2:\n", " # lhs -> r1 r2 ... 
rn\n", " new_lhs = fresh_non_terminal(lhs)\n", " new_rule_1 = (lhs, rhs[:1] + (new_lhs,))\n", " new_rule_2 = (new_lhs, rhs[1:])\n", " return transform_rule(new_rule_1) + transform_rule(new_rule_2)\n", " elif len(rhs) == 2 and any((n in cfg.sigma for n in rhs)):\n", " new_rules = []\n", " new_rhs = []\n", " for n in rhs:\n", " if n in cfg.sigma:\n", " new_lhs = fresh_non_terminal(lhs)\n", " new_rule = (new_lhs,[n])\n", " new_rules.append(new_rule)\n", " new_rhs.append(new_lhs)\n", " else:\n", " new_rhs.append(n)\n", " new_rules.append((lhs,new_rhs))\n", " return sum([transform_rule(r) for r in new_rules], [])\n", " else:\n", " return [rule]\n", " \n", " result = []\n", " for rule in cfg.r:\n", " result += transform_rule(rule)\n", "\n", " return CFG.from_rules(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us appyl this transformation to our running example grammar." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
S → NP_p VP_p
S → NP_s VP_s
NP_p_0 → Matko
NP_p_1 → raps
NP_p → NP_p_0 NP_p_1
VP_p_2 → are
VP_p → VP_p_2 ADJ
NP_s → Matko
VP_s_4 → raps
VP_s → VP_s_4 VP_s_3
VP_s_3_5 → in
VP_s_3_6 → StatNLP
VP_s_3 → VP_s_3_5 VP_s_3_6
ADJ → silly
" ], "text/plain": [ "<__main__.CFG at 0x7f5e63cef438>" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cnf_cfg = to_cnf(cfg)\n", "cnf_cfg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### CYK algorithm\n", "The CYK algorithm caches, for each span in the sentence, all possible trees that can cover the span according to the CFG. The core abstraction that enables the CYK algorithm is a *chart*: a table that stores, for each span $(i,j)$ (where $i\\leq j$) of the sentence the set of possible non-terminal labels the grammar supports for this span together with a pointer to two chart cells that were used to produce the label. We provide the python class `parsing.Chart` that implements such a chart together with a way to visualise the triangular structure of the chart. " ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "
| | 0: Matko | 1: raps | 2: are | 3: silly |
| 0: Matko | NP_s, NP_p_0 | NP_p_2 | | |
| 1: raps | | VP_s_6, NP_p_1 | | |
| 2: are | | | | |
| 3: silly | | | | |
" ], "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chart = parsing.Chart(sentence)\n", "chart.append_label(0,0,'NP_s')\n", "chart.append_label(0,0,'NP_p_0')\n", "chart.append_label(1,1,'VP_s_6')\n", "chart.append_label(1,1,'NP_p_1')\n", "chart.append_label(0,1,'NP_p_2', [(0,0,'NP_p_0'),(1,1,'NP_p_1')]) \n", "chart.mark(0, 1, 'NP_p_2')\n", "chart.mark_target(0,1)\n", "chart" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The CYK algorithm operates this chart as follows:\n", "\n", "* **Input**: sentence $s=s_1 \\ldots s_n$, CFG $G=(N, \\Sigma, R, S)$ in Chomsky Normal Form.\n", "* **Initialisation**:\n", " * $C(i,j)=\\emptyset$ for all $i < j$, $i,j \\in \\{1,\\ldots, n\\}$\n", " * $P(i,j,l)=\\emptyset$ for all $i < j$, $i,j \\in \\{1,\\ldots, n\\}$ and $l \\in N$\n", " * **for** $i = 1 \\ldots n$ \n", " * **for** every terminal rule $\\alpha \\rightarrow s_i \\in R$\n", " * $C(i,i) \\leftarrow \\{ \\alpha \\} $\n", "* **for** $w \\in 1, \\ldots, n$ \n", " * **for** $b \\in 1, \\ldots, n - 1$\n", " * $e \\leftarrow b + w$\n", " * **for** $m \\in b, \\ldots, e$\n", " * **for** $\\beta \\in C(b,m)$ and $\\gamma \\in C(m + 1, e)$ **if** $\\alpha \\rightarrow \\beta \\gamma \\in R$\n", " * $C(b,e) \\leftarrow C(b,e) \\cup \\{ \\alpha \\} $\n", " * $P(i,j,l) \\leftarrow P(i,j,l) \\cup \\{((b,m,\\beta),(m+1,e,\\gamma)) \\}$ \n", " \n", "We can implement this algorithm in Python as follows. Notice that we keep track of all steps of the algorithm for later introspection. In real applications this is not needed." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def cyk(cfg, s):\n", " \"\"\"\n", " Performs the CYK algorithm on sentence `s` given CFG `cfg.\n", " Args:\n", " cfg: a CFG in CNF.\n", " s: a sentence as a list of tokens (strings).\n", " \"\"\"\n", " chart = parsing.Chart(s)\n", " trace = [copy.deepcopy(chart)]\n", " n = len(s)\n", " for i in range(0,n):\n", " chart.clear_cell_marks()\n", " chart.mark_target(i,i)\n", " trace.append(copy.deepcopy(chart))\n", " for non_terminal in [lhs for lhs, rhs in cfg.r if rhs == (s[i],)]:\n", " chart.append_label(i,i,non_terminal)\n", " chart.mark_cyk_terminal(i, non_terminal)\n", " trace.append(copy.deepcopy(chart))\n", " for width in range(1,n):\n", " for begin in range(0, n-width):\n", " end = begin + width\n", " for middle in range(begin, end):\n", " chart.mark_cyk_focus(begin, middle, end)\n", " trace.append(copy.deepcopy(chart)) \n", " for beta in chart.labels_at_cell(begin, middle):\n", " for gamma in chart.labels_at_cell(middle+1, end):\n", " chart.mark_cyk_source_focus(begin, middle, end, beta, gamma)\n", " trace.append(copy.deepcopy(chart))\n", " labels = [lhs for lhs, rhs in cfg.r if rhs == (beta, gamma)]\n", " for alpha in labels:\n", " chart.append_label(begin, end, alpha, [(begin,middle,beta),(middle+1,end,gamma)])\n", " chart.mark_cyk_rule(begin, end, alpha)\n", " trace.append(copy.deepcopy(chart))\n", " chart.clear_marks()\n", " trace.append(chart)\n", " return trace" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us apply the CYK algorithm to our running example sentence." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "
| | 0: Matko | 1: raps | 2: are | 3: silly |
| 0: Matko | NP_p_0, NP_s | NP_p | | S |
| 1: raps | | NP_p_1, VP_s_4 | | |
| 2: are | | | VP_p_2 | VP_p |
| 3: silly | | | | ADJ |
" ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trace = cyk(cnf_cfg, sentence)\n", "trace[-1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also visualise the final tree. Notice the additional non-terminal nodes necessary to present the grammar in CNF." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "Matko\n", "\n", "\n", "1\n", "\n", "NP_p_0\n", "\n", "\n", "1->0\n", "\n", "\n", "\n", "\n", "2\n", "\n", "raps\n", "\n", "\n", "3\n", "\n", "NP_p_1\n", "\n", "\n", "3->2\n", "\n", "\n", "\n", "\n", "4\n", "\n", "NP_p\n", "\n", "\n", "4->1\n", "\n", "\n", "\n", "\n", "4->3\n", "\n", "\n", "\n", "\n", "5\n", "\n", "are\n", "\n", "\n", "6\n", "\n", "VP_p_2\n", "\n", "\n", "6->5\n", "\n", "\n", "\n", "\n", "7\n", "\n", "silly\n", "\n", "\n", "8\n", "\n", "ADJ\n", "\n", "\n", "8->7\n", "\n", "\n", "\n", "\n", "9\n", "\n", "VP_p\n", "\n", "\n", "9->6\n", "\n", "\n", "\n", "\n", "9->8\n", "\n", "\n", "\n", "\n", "10\n", "\n", "S\n", "\n", "\n", "10->4\n", "\n", "\n", "\n", "\n", "10->9\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "parse_result = trace[-1].derive_trees()[0]\n", "parsing.render_tree(parse_result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can remove all non-terminals not in the original grammar to get a representation of the parse in terms of that original grammar." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "Matko\n", "\n", "\n", "1\n", "\n", "raps\n", "\n", "\n", "2\n", "\n", "NP_p\n", "\n", "\n", "2->0\n", "\n", "\n", "\n", "\n", "2->1\n", "\n", "\n", "\n", "\n", "3\n", "\n", "are\n", "\n", "\n", "4\n", "\n", "silly\n", "\n", "\n", "5\n", "\n", "ADJ\n", "\n", "\n", "5->4\n", "\n", "\n", "\n", "\n", "6\n", "\n", "VP_p\n", "\n", "\n", "6->3\n", "\n", "\n", "\n", "\n", "6->5\n", "\n", "\n", "\n", "\n", "7\n", "\n", "S\n", "\n", "\n", "7->2\n", "\n", "\n", "\n", "\n", "7->6\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "parsing.render_tree(parsing.filter_non_terminals(parse_result, cfg.n))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can observe the individual steps of the CYK algorithm like so:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " Previous\n", "  \n", " Next\n", "
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: Matko
1: raps
2: are
3: silly
1 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: Matko
1: raps
2: are
3: silly
2 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0
1: raps
2: are
3: silly
3 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: raps
2: are
3: silly
4 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: raps
2: are
3: silly
5 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1
2: are
3: silly
6 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1, VP_s_4
2: are
3: silly
7 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1, VP_s_4
2: are
3: silly
8 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: silly
9 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: silly
10 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
11 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
12 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_s
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
13 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
14 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
15 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
16 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
17 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
18 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
19 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
20 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
21 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2
3: sillyADJ
22 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
23 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
24 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
25 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
26 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
27 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
28 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
29 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
30 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
31 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
32 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_p
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
33 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_pS
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
34 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_pS
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
35 / 36
\n", "
\n", "\n", "\n", "\n", "
0: Matko1: raps2: are3: silly
0: MatkoNP_p_0, NP_sNP_pS
1: rapsNP_p_1, VP_s_4
2: areVP_p_2VP_p
3: sillyADJ
36 / 36
\n", "
\n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "util.Carousel(trace)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Ambiguity \n", "For real world grammars many phrases have several legal parse trees. Which one is the correct one depends on the intent of the speaker. That said, in many cases it is quite obvious that one parse tree should be more likely, or have a higher _probability_. This suggest a probabilistic treatment of grammars and parsing. Before we introduce probabilistic CFGs, let us inspect a typical case of syntactic natural language ambiguity.\n", "\n", "Consider the following grammar and sentence." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "amb_cfg = CFG.from_rules([\n", " ('S', ['Subj','VP']),\n", " ('Subj', ['He']),\n", " ('Verb', ['shot']),\n", " ('VP', ['Verb', 'Obj']), ('VP', ['Verb', 'Obj', 'PP']),\n", " ('PP', ['in','his','pyjamas']),\n", " ('Obj', ['the','elephant']), ('Obj', ['the','elephant','PP'])\n", " ])\n", "amb_cnf_cfg = to_cnf(amb_cfg)\n", "amb_sentence = [\"He\", \"shot\", \"the\", \"elephant\", \"in\", \"his\", \"pyjamas\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can parse this sentence as before, using the CYK algorithm. This time there will be two parses though:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "He\n", "\n", "\n", "1\n", "\n", "Subj\n", "\n", "\n", "1->0\n", "\n", "\n", "\n", "\n", "2\n", "\n", "shot\n", "\n", "\n", "3\n", "\n", "Verb\n", "\n", "\n", "3->2\n", "\n", "\n", "\n", "\n", "4\n", "\n", "the\n", "\n", "\n", "5\n", "\n", "elephant\n", "\n", "\n", "6\n", "\n", "in\n", "\n", "\n", "7\n", "\n", "his\n", "\n", "\n", "8\n", "\n", "pyjamas\n", "\n", "\n", "9\n", "\n", "PP\n", "\n", "\n", "9->6\n", "\n", "\n", "\n", "\n", "9->7\n", "\n", "\n", "\n", "\n", "9->8\n", "\n", "\n", "\n", "\n", "10\n", "\n", "Obj\n", "\n", "\n", "10->4\n", "\n", "\n", "\n", "\n", "10->5\n", "\n", "\n", "\n", "\n", "10->9\n", "\n", "\n", "\n", "\n", "11\n", "\n", "VP\n", "\n", "\n", "11->3\n", "\n", "\n", "\n", "\n", "11->10\n", "\n", "\n", "\n", "\n", "12\n", "\n", "S\n", "\n", "\n", "12->1\n", "\n", "\n", "\n", "\n", "12->11\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "amb_trace = cyk(amb_cnf_cfg, amb_sentence)\n", "amb_parse_results = amb_trace[-1].derive_trees()\n", "parsing.render_tree(parsing.filter_non_terminals(amb_parse_results[0],amb_cfg.n))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The second parse is:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "He\n", "\n", "\n", "1\n", "\n", "Subj\n", "\n", "\n", "1->0\n", "\n", "\n", "\n", "\n", "2\n", "\n", "shot\n", "\n", "\n", "3\n", "\n", "Verb\n", "\n", "\n", "3->2\n", "\n", "\n", "\n", "\n", "4\n", "\n", "the\n", "\n", "\n", "5\n", "\n", "elephant\n", "\n", "\n", "6\n", "\n", "Obj\n", "\n", "\n", "6->4\n", "\n", "\n", "\n", "\n", "6->5\n", "\n", "\n", "\n", "\n", "7\n", "\n", "in\n", "\n", "\n", "8\n", "\n", "his\n", "\n", "\n", "9\n", "\n", "pyjamas\n", "\n", "\n", "10\n", "\n", "PP\n", "\n", "\n", "10->7\n", "\n", "\n", "\n", "\n", "10->8\n", "\n", 
"\n", "\n", "\n", "10->9\n", "\n", "\n", "\n", "\n", "11\n", "\n", "VP\n", "\n", "\n", "11->3\n", "\n", "\n", "\n", "\n", "11->6\n", "\n", "\n", "\n", "\n", "11->10\n", "\n", "\n", "\n", "\n", "12\n", "\n", "S\n", "\n", "\n", "12->1\n", "\n", "\n", "\n", "\n", "12->11\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "parsing.render_tree(parsing.filter_non_terminals(amb_parse_results[1],amb_cfg.n))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The example is an instance of _prepositional phrase attachment ambiguity_: the \"in his pyjamas\" phrase could both be part of the verb phrase, meaning that the shooting happened in pyjamas, or part of the noun phrase \"the elephant\", meaning that the elephant was in pyjamas. Both readings make syntactic sense, so there is nothing wrong with the grammar. However, we would like the machine to return the preferred reading when such sentence is parsed. For this we need to find a way to assign different readings different probabilities.\n", "\n", "## Probabilistic Context Free Grammars\n", "[Probabilistic Context Free Grammars](http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/pcfgs.pdf) (PFCGs) are Context Free Grammars in which rules have probabilities. More formally, a PCFG consists of \n", "\n", "* A Context Free Grammar \\\\(G(N,\\Sigma,R,S)\\\\).\n", "* A parameter \\\\(\\param(\\alpha \\rightarrow \\beta)\\\\) for each rule \\\\(\\alpha \\rightarrow \\beta \\in R\\\\). For each possible left hand side \\\\(\\alpha \\in N\\\\) we require \\\\(\\sum_\\beta \\param(\\alpha \\rightarrow \\beta) = 1\\\\).\n", "\n", "A PCFG defines a probability distribution over parse trees as follows. Given a parse tree \\\\(\\mathbf{t}\\\\) that contains the rules \\\\(\\alpha_1 \\rightarrow \\beta_1, \\ldots, \\alpha_n \\rightarrow \\beta_n\\\\), the probability of this tree under the PCFG is:\n", "$$\n", " \\newcommand{parse}{\\mathbf{t}}\n", " p_{\\param}(\\parse) = \\prod_i^n \\param(\\alpha_i \\rightarrow \\beta_i) \n", "$$\n", "\n", "Notice that we can develop and operate parsers with the structured prediction recipe. We have model \\\\(p\\\\), some parameters \\\\(\\params\\\\) that need to be estimated on a training set, and the prediction/search problem of finding the most likely parse tree given a sentence. The next sections will cover these aspects.\n", "\n", "Before we show examples, let us define a Python data structure for PCFGs. " ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "from collections import defaultdict\n", "class PCFG:\n", " def __init__(self, cfg, probs):\n", " \"\"\"\n", " Create a new PCFG.\n", " Args:\n", " cfg: a CFG object.\n", " probs: a dict from rules to probabilities. 
\n", " \"\"\"\n", " self.cfg = cfg\n", " self.probs = defaultdict(lambda: 1.0)\n", " self.probs.update(probs)\n", "\n", " def prob(self, rule):\n", " return self.probs[rule]\n", " \n", " def prob_rules(self):\n", " return ((lhs,self.prob((lhs,rhs)),rhs) for lhs,rhs in self.cfg.r)\n", " \n", " @classmethod\n", " def from_rules(cls, rules, s='S'):\n", " \"\"\"\n", " Creates a PCFG from a list of weighted rules.\n", " Args:\n", " rules: a list of rules of shape `(lhs,prob,rhs)` where `lhs` is a non-terminal, `prob` a real number\n", " `rhs` a list of right-hand side symbols the rule expands to.\n", " s: start symbol.\n", " Returns:\n", " PCFG with rules as specified, with terminals and non-terminals determined by which terminals are on the\n", " left hand side of the rules.\n", " \"\"\"\n", " non_terminals = {lhs for lhs,_,_ in rules}\n", " left_hand_sides = {node \n", " for _,_,rhs in rules \n", " for node in rhs}\n", " terminals = {n for n in left_hand_sides if n not in non_terminals} \n", " det_rules = [(lhs,rhs) for lhs,_,rhs in rules]\n", " cfg = CFG(non_terminals, terminals, det_rules, s)\n", " probs = [((lhs,tuple(rhs)),prob) for lhs,prob,rhs in rules]\n", " return cls(cfg,probs)\n", " \n", " def _repr_html_(self):\n", " \"\"\"\n", " Simple HTML representation of a PCFG for easy rendering in notebook.\n", " Returns:\n", " an HTML string representing the PCFG as table.\n", " \"\"\"\n", " rules_per_table = math.ceil(len(self.cfg.r) / 3)\n", " rules = [\"{}{}{}\".format(rule[0],self.prob(tuple(rule)),\" \".join(rule[1])) \n", " for rule in self.cfg.r]\n", " tables = [\"\"\"{}
\"\"\".format(\"\".join(rules[i:i+rules_per_table])) \n", " for i in range(0,len(rules),rules_per_table)]\n", " return \" \".join(tables)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create a probabilistic version of the ambiguous CFG shown above." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
S → Subj VP [1.0]
Subj → He [1.0]
Verb → shot [1.0]
VP → Verb Obj [0.3]
VP → Verb Obj PP [0.7]
PP → in his pyjamas [1.0]
Obj → the elephant [0.5]
Obj → the elephant PP [0.5]
" ], "text/plain": [ "<__main__.PCFG at 0x7f5e63cadbe0>" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pcfg = PCFG.from_rules([\n", " ('S', 1.0, ['Subj','VP']),\n", " ('Subj', 1.0, ['He']),\n", " ('Verb', 1.0, ['shot']),\n", " ('VP', 0.3, ['Verb', 'Obj']), ('VP', 0.7, ['Verb', 'Obj', 'PP']),\n", " ('PP', 1.0, ['in','his','pyjamas']),\n", " ('Obj', 0.5, ['the','elephant']), ('Obj', 0.5, ['the','elephant','PP'])\n", " ])\n", "pcfg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prediction / Parsing\n", "\n", "Let us first focus on the prediction task: given a sentence, find the highest scoring parse tree consistent with this sentence. A tree is consistent with a sentence if its leaf terminal nodes are equal to the sentence. In a way, we have already solved a variant of this problem. We can consider a CFG as a deterministic distribution over trees, and finding _a_ highest scoring parse is equivalent to finding any legal tree—our bottom-up and dynamic program based algorithms hence performed a structured prediction.\n", "\n", "More formally, we can understand the PCFG distribution $\\prob_{\\params}(\\parse)$ as joint distribution $\\prob_{\\params}(\\x,\\y)$ over a non-terminal upper part of the tree $\\y$ (which includes for each of its own leaf nodes the number of terminal children it should have) and a terminal lower part $\\x$ corresponding to the input sentence. Both $\\x$ and $\\y$ form the parse tree $\\parse$. Notice that for a given sentence $\\x$ a *legal* upper tree $\\y$ needs to have a total number of children identical to the length of $\\x$. If we assume that the PCFG rules involving non-terminal nodes are in normal form ($\\alpha \\rightarrow t$ with $t \\in \\Sigma$) a legal $\\y$ needs to have as many children as there are tokens in $\\x$. Let us denote the set of legal (non-terminal) parse trees for a given sentence $\\x$ and grammar $G$ with $\\Ys(\\x,G)$.\n", "\n", "With the above definitions we can define the parsing or prediction problems as follows. \n", "\n", "$$\n", "\\argmax_{\\y \\in \\Ys(\\x,G)} \\prob_\\params(\\x | \\y) = \\argmax_{\\y \\in \\Ys(\\x,G)} \\prob_\\params(\\x , \\y)\n", "$$\n", "\n", "where the equality follows from $\\prob_\\params(\\x | \\y) \\propto \\prob_\\params(\\x , \\y)$.\n", "\n", "### CYK for PCFGs\n", "\n", "As it turns out, we can use a variant of the CYK algorithm to solve the prediction problem above if we again assume (without loss of generality) that the grammar is CNF. As first step we hence need to convert the PCFG into a normalised PCFG: " ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
S → Subj VP [1.0]
Subj → He [1.0]
Verb → shot [1.0]
VP → Verb Obj [0.3]
VP → Verb VP_0 [0.7]
VP_0 → Obj PP [1.0]
PP_2 → in [1.0]
PP → PP_2 PP_1 [1.0]
PP_1_3 → his [1.0]
PP_1_4 → pyjamas [1.0]
PP_1 → PP_1_3 PP_1_4 [1.0]
Obj_5 → the [1.0]
Obj_6 → elephant [1.0]
Obj → Obj_5 Obj_6 [0.5]
Obj_8 → the [1.0]
Obj → Obj_8 Obj_7 [0.5]
Obj_7_9 → elephant [1.0]
Obj_7 → Obj_7_9 PP [1.0]
" ], "text/plain": [ "<__main__.PCFG at 0x7f5e63b9d898>" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def to_pcnf(pcfg):\n", " # go through all rules, check if rule 1) or 2) can be applied, create new rules\n", " new_non_terminals = []\n", "\n", " def fresh_non_terminal(prefix):\n", " new_non_terminal = prefix + \"_\" + str(len(new_non_terminals))\n", " new_non_terminals.append(new_non_terminal)\n", " return new_non_terminal\n", "\n", " def transform_rule(rule):\n", " lhs, prob, rhs = rule\n", " if len(rhs) > 2:\n", " # lhs -> r1 r2 ... rn\n", " new_lhs = fresh_non_terminal(lhs)\n", " new_rule_1 = (lhs, prob, rhs[:1] + (new_lhs,))\n", " new_rule_2 = (new_lhs, 1.0, rhs[1:])\n", " return transform_rule(new_rule_1) + transform_rule(new_rule_2)\n", " elif len(rhs) == 2 and any((n in pcfg.cfg.sigma for n in rhs)):\n", " new_rules = []\n", " new_rhs = []\n", " for n in rhs:\n", " if n in pcfg.cfg.sigma:\n", " new_lhs = fresh_non_terminal(lhs)\n", " new_rule = (new_lhs, 1.0, [n])\n", " new_rules.append(new_rule)\n", " new_rhs.append(new_lhs)\n", " else:\n", " new_rhs.append(n)\n", " new_rules.append((lhs, prob, new_rhs))\n", " return sum([transform_rule(r) for r in new_rules], [])\n", " else:\n", " return [(lhs,prob,rhs)]\n", "\n", " result = []\n", " for rule in pcfg.prob_rules():\n", " result += transform_rule(rule)\n", "\n", " return PCFG.from_rules(result)\n", " \n", "cnf_pcfg = to_pcnf(pcfg)\n", "cnf_pcfg\n", "# pcfg.cfg.r" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The core insight that enables this CYK variant to find the *globally optimal* parse $\\parse=(\\x,\\y)$ in polynomial time is the following. When composing the trees $t_1$ and $t_2$ of two consecutive spans $(i,m)$ and $(m,j)$, with labels $l_1$ and $l_2$, into a new tree $t$ for $(i,j)$ with label $l$, the probability of that new tree only depends on the probability of the child trees $p_1$ and $p_2$, and the probability of the rule $l\\rightarrow l_1 l_2$: $p_1 p_2 \\alpha(l\\rightarrow l_1 l_2)$. In particular, the precise structure of $t_1$ and $t_2$ does not matter, what matters is only their labels and probabilities. \n", "\n", "Now, assuming that $t_1$ and $t_2$ are the highest scoring trees with labels $l_1$ and $l_2$ for their respective spans, we know that $t$ has to be the highest scoring tree for $(i,j)$ for all trees that break up at $m$, and that have $l_1$ and $l_2$ as child labels. Consequently, when you search over all trees over $(i,j)$ with label $l$ you can search over all break points $m$, and all pairs of labels $l_1, l_2$, you will find the highest probability tree over $(i,j)$ with label $l$. \n", "\n", "Hence, the probabilistic version of the CYK algorithm proceeds almost exactly as the non-probabilistic version. Now we maintain for each span $(i,j)$ and each non-terminal label $l$ a (log) score $C(i,j,l)$. This score is equal to the highest score of any tree that spans $(i,j)$ and has the root label $l$. Notice that for $(i,j)$ and $l$ we remember the pair $P(i,j,l)$ of child cells and labels that produced the highest score." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The CYK algorithm for PCFGs operates as follows:\n", "\n", "* **Input**: sentence $s=s_1 \\ldots s_n$, CFG $G=(N, \\Sigma, R, S)$ in Chomsky Normal Form with probabilities $\\theta$.\n", "* **Initialisation**:\n", " * $C(i,j,l)=-\\infty$ for all $i < j$, $i,j \\in \\{1,\\ldots, n\\}$\n", " * $P(i,j,l)=\\emptyset$ for all $i < j$, $i,j \\in \\{1,\\ldots, n\\}$ and $l \\in N$\n", " * **for** $i = 1 \\ldots n$ \n", " * **for** every terminal rule $\\alpha \\rightarrow s_i \\in R$\n", " * $C(i,i,\\alpha) \\leftarrow \\log \\theta(\\alpha \\rightarrow s_i) $\n", "* **for** $w \\in 1, \\ldots, n$ \n", " * **for** $b \\in 1, \\ldots, n - 1$\n", " * $e \\leftarrow b + w$\n", " * **for** $m \\in b, \\ldots, e$\n", " * **for** $\\beta$ if $C(b,m,\\beta)>-\\infty$ and $\\gamma$ if $C(m+1,e,\\gamma)>-\\infty$ **if** $\\alpha \\rightarrow \\beta \\gamma \\in R$\n", " * $s \\leftarrow C(b,m,\\beta) + C(m+1,e,\\gamma) + \\log \\theta(\\alpha \\rightarrow \\beta \\alpha) $\n", " * **if** $s > C(b,e,\\alpha)$\n", " * $C(b,e,\\alpha) \\leftarrow s$\n", " * $P(i,j,l) \\leftarrow \\{((b,m,\\beta),(m+1,e,\\gamma)) \\}$ " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "import math\n", "\n", "def pcyk(pcfg, s):\n", " \"\"\"\n", " Performs the CYK algorithm on sentence `s` given a probabilistic CFG `pcfg`.\n", " Args:\n", " cfg: a PCFG in CNF.\n", " s: a sentence as a list of tokens (strings).\n", " \"\"\"\n", " chart = parsing.Chart(s)\n", " trace = [copy.deepcopy(chart)]\n", " n = len(s)\n", " for i in range(0,n):\n", " chart.clear_cell_marks()\n", " chart.mark_target(i,i)\n", " trace.append(copy.deepcopy(chart))\n", " for lhs,rhs in [(lhs,rhs) for lhs, rhs in pcfg.cfg.r if rhs == (s[i],)]:\n", " chart.update_label(i,i,lhs, math.log(pcfg.prob((lhs,rhs))))\n", " chart.mark_cyk_terminal(i, lhs)\n", " trace.append(copy.deepcopy(chart))\n", " for width in range(1,n):\n", " for begin in range(0, n-width):\n", " end = begin + width\n", " for middle in range(begin, end):\n", " chart.mark_cyk_focus(begin, middle, end)\n", " trace.append(copy.deepcopy(chart)) \n", " for beta in chart.labels_at_cell(begin, middle):\n", " for gamma in chart.labels_at_cell(middle+1, end):\n", " chart.mark_cyk_source_focus(begin, middle, end, beta, gamma)\n", " trace.append(copy.deepcopy(chart))\n", " labels = [lhs for lhs, rhs in pcfg.cfg.r if rhs == (beta, gamma)]\n", " for alpha in labels:\n", " prob = pcfg.prob((alpha,(beta,gamma)))\n", " new_score = math.log(prob) + chart.score(begin,middle,beta) + chart.score(middle+1,end,gamma)\n", " old_score = chart.score(begin,end,alpha)\n", " if new_score > old_score:\n", " chart.update_label(begin, end, alpha, new_score, [(begin,middle,beta),(middle+1,end,gamma)])\n", " chart.mark_cyk_rule(begin, end, alpha)\n", " trace.append(copy.deepcopy(chart))\n", " chart.clear_marks()\n", " trace.append(chart)\n", " return trace" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us run this algorithm on the previous sentence." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "%3\n", "\n", "\n", "0\n", "\n", "He\n", "\n", "\n", "1\n", "\n", "Subj\n", "\n", "\n", "1->0\n", "\n", "\n", "\n", "\n", "2\n", "\n", "shot\n", "\n", "\n", "3\n", "\n", "Verb\n", "\n", "\n", "3->2\n", "\n", "\n", "\n", "\n", "4\n", "\n", "the\n", "\n", "\n", "5\n", "\n", "elephant\n", "\n", "\n", "6\n", "\n", "Obj\n", "\n", "\n", "6->4\n", "\n", "\n", "\n", "\n", "6->5\n", "\n", "\n", "\n", "\n", "7\n", "\n", "in\n", "\n", "\n", "8\n", "\n", "his\n", "\n", "\n", "9\n", "\n", "pyjamas\n", "\n", "\n", "10\n", "\n", "PP\n", "\n", "\n", "10->7\n", "\n", "\n", "\n", "\n", "10->8\n", "\n", "\n", "\n", "\n", "10->9\n", "\n", "\n", "\n", "\n", "11\n", "\n", "VP\n", "\n", "\n", "11->3\n", "\n", "\n", "\n", "\n", "11->6\n", "\n", "\n", "\n", "\n", "11->10\n", "\n", "\n", "\n", "\n", "12\n", "\n", "S\n", "\n", "\n", "12->1\n", "\n", "\n", "\n", "\n", "12->11\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pcyk_trace = pcyk(cnf_pcfg, amb_sentence)\n", "parsing.render_tree(parsing.filter_non_terminals(pcyk_trace[-1].derive_trees()[0],pcfg.cfg.n))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can observe the progression of the algorithm as before." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " Previous\n", "  \n", " Next\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: He
1: shot
2: the
3: elephant
4: in
5: his
6: pyjamas
1 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: He
1: shot
2: the
3: elephant
4: in
5: his
6: pyjamas
2 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shot
2: the
3: elephant
4: in
5: his
6: pyjamas
3 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shot
2: the
3: elephant
4: in
5: his
6: pyjamas
4 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: the
3: elephant
4: in
5: his
6: pyjamas
5 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: the
3: elephant
4: in
5: his
6: pyjamas
6 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00
3: elephant
4: in
5: his
6: pyjamas
7 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephant
4: in
5: his
6: pyjamas
8 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephant
4: in
5: his
6: pyjamas
9 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00
4: in
5: his
6: pyjamas
10 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: in
5: his
6: pyjamas
11 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: in
5: his
6: pyjamas
12 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: his
6: pyjamas
13 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: his
6: pyjamas
14 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamas
15 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamas
16 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
17 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
18 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
19 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
20 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
21 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
22 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
23 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
24 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
25 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
26 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
27 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
28 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
29 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
30 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
31 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
32 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
33 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
34 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00
6: pyjamasPP_1_4: 0.00
35 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
36 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
37 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
38 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
39 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
40 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
41 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
42 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
43 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
44 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
45 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
46 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
47 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
48 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
49 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
50 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
51 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
52 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
53 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
54 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
55 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
56 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
57 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
58 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
59 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
60 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
61 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
62 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
63 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
64 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
65 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
66 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
67 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
68 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
69 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
70 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
71 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
72 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
73 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
74 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
75 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
76 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
77 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
78 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
79 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
80 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
81 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
82 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
83 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
84 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
85 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
86 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
87 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
88 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
89 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
90 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
91 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
92 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
93 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
94 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
95 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.90
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
96 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
97 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
98 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
99 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
100 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
101 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
102 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
103 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
104 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
105 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
106 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
107 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
108 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
109 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
110 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
111 / 112
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
0: He1: shot2: the3: elephant4: in5: his6: pyjamas
0: HeSubj: 0.00S: -1.90S: -1.05
1: shotVerb: 0.00VP: -1.90VP: -1.05
2: theObj_5: 0.00, Obj_8: 0.00Obj: -0.69Obj: -0.69, VP_0: -0.69
3: elephantObj_6: 0.00, Obj_7_9: 0.00Obj_7: 0.00
4: inPP_2: 0.00PP: 0.00
5: hisPP_1_3: 0.00PP_1: 0.00
6: pyjamasPP_1_4: 0.00
112 / 112
\n", "
\n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "util.Carousel(pcyk_trace)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Learning\n", "\n", "Learning in the context of PCFGs usually amounts to asking two questions:\n", "\n", "1. What should the rules in the grammar be?\n", "2. What should the probabilities associated with these rules be?\n", "\n", "Usually both questions are answered with the use of a training corpus of parse trees $\\train=(\\parse_1, \\ldots, \\parse_n)$. Such corpora exists for various domains and languages. Most famously, the [Penn Treebank Project](https://www.cis.upenn.edu/~treebank/) consists of a large number of parse trees for sentences in the 1989 Wall Street Journal (among other sources). The exist similar (but usually smaller) corpora for other languages (e.g. [Chinese](https://catalog.ldc.upenn.edu/LDC2013T21)) and other domains (e.g. [Biomedical Papers](www.nactem.ac.uk/aNT/genia.html)). The corpora are usually created by linguistic experts that produce parse trees for a list of sentences given to them. This is generally an expensive process as it need experts, and traditionally formed some bottleneck in parsing research. \n", "\n", "With a corpus $\\train$ we can answer the question of which rules to use by assessing what rules (sub-trees of parent $\\alpha$ and list of children $\\beta_1 \\ldots \\beta_m$) exists in the corpus. One could simply use all rules that appear in the data, or maybe filter using some threshold (TODO references?). \n", "\n", "To learn the parameters $\\params$ of the model we can again use the maximum-likelihood criterium:\n", "\n", "$$\n", "\\params^* = \\argmax_\\params \\sum_{\\parse \\in \\train} \\log \\prob_\\params(\\parse)\n", "$$\n", "\n", "As before, maximising this objective simply amounts to counting. This time we divide the number of times a rule has been observed in the data by the number of times the non-terminal has been seen in total.\n", "\n", "$$\n", " \\param(\\alpha \\rightarrow \\beta) = \\frac{\\counts{\\train}{\\alpha \\rightarrow \\beta}}{\\counts{\\train}{\\alpha}}\n", "$$\n", "\n", "We omit an implementation of this training regime as we have seen similar implementations in previous chapters (check for example the [language model](language-models.ipynb) chapter)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Advanced Topics\n", "* Discriminative Models for parsing\n", "* Unsupervised Parsing, grammar induction etc. \n", "* Semi-supervised Parsing\n", "* Dependency Parsing (see later)\n", "\n", "## Background Material\n", "* [Mike Collins' PCFG lecture](http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/pcfgs.pdf). \n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 1 }