{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Constituent Parsing Exercises\n",
"\n",
"\n",
"\n",
"In the lecture we took a look at a simple tokenizer and sentence segmenter. In this exercise we will expand our understanding of the problem by asking a few important questions, and looking at the problem from a different perspectives."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup 1: Load Libraries"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%capture\n",
"%load_ext autoreload\n",
"%autoreload 2\n",
"%matplotlib inline\n",
"# %cd .. \n",
"import sys\n",
"sys.path.append(\"..\")\n",
"import math \n",
"import statnlpbook.util as util\n",
"import statnlpbook.parsing as parsing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Task 1: Understanding parsing\n",
"\n",
"Be sure you understand [grammatical categories and structures](http://webdelprofesor.ula.ve/humanidades/azapata/materias/english_4/grammatical_categories_structures_and_syntactical_functions.pdf) and brush up on your [grammar skils](http://www.ucl.ac.uk/internet-grammar/intro/intro.htm).\n",
"\n",
"Then re-visit the [Enju online parser](http://www.nactem.ac.uk/enju/demo.html), and parse the following sentences...\n",
"\n",
"What is wrong with the parses of the following sentences? Are they correct?\n",
"- Fat people eat accumulates.\n",
"- The fat that people eat accumulates in their bodies.\n",
"- The fat that people eat is accumulating in their bodies.\n",
"\n",
"What about these, is the problem in the parser or in the sentence?\n",
" - The old man the boat.\n",
" - The old people man the boat. \n",
"\n",
"These were examples of garden path sentences, find out what that means.\n",
"\n",
"What about these sentences? Are their parses correct?\n",
" - Time flies like an arrow; fruit flies like a banana.\n",
" - We saw her duck."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Task 2: Parent Annotation\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reminisce the lecture notes in parsing, and the mentioned parent annotation. (grand)*parents, matter - knowing who the parent is in a tree gives a bit of context information which can later help us with smoothing probabilities, and approaching context-dependent parsing.\n",
"\n",
"in that case, each non-terminal node should know it's parent. We'll do this exercise on a single tree, just to play around a bit with trees and their labeling.\n",
"\n",
"\n",
"Given the following tree:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = ('S', [('Subj', ['He']), ('VP', [('Verb', ['shot']), ('Obj', ['the', 'elephant']), ('PP', ['in', 'his', 'pyjamas'])])])\n",
"parsing.render_tree(x)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Construct an `annotate_parents` function which will take that tree, and annotate its parents. The final annotation result should look like this:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = ('S^?', [('Subj^S', ['He']), ('VP^S', [('Verb^VP', ['shot']), ('Obj^VP', ['the', 'elephant']), ('PP^VP', ['in', 'his', 'pyjamas'])])])\n",
"parsing.render_tree(y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solutions\n",
"\n",
"You can find the solutions to this exercises [here](parsing_solutions.ipynb)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}