# PCFG Parsing using NLTK

**(C) 2018-2024 by [Damir Cavar](http://damir.cavar.me/)**

**Download:** This and various other Jupyter notebooks are available from my [GitHub repo](https://github.com/dcavar/python-tutorial-notebooks).

**License:** [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/))

**Prerequisites:**

In [None]:
!pip install -U nltk

Install all the NLTK data:

In [None]:
!python -m nltk.downloader all

or install just the Penn treebank portion:

In [None]:
!python -m nltk.downloader treebank

This notebook is based on [my](http://damir.cavar.me/) own NLTK notebooks and [Michael Elhadad](https://www.cs.bgu.ac.il/~elhadad/)'s notebook *[Constituent-based Syntactic Parsing with NLTK](https://www.cs.bgu.ac.il/~elhadad/nlp16/nltk-pcfg.html)*. For related material, see:

1. [Python Parsing with NLTK](https://github.com/dcavar/python-tutorial-for-ipython/blob/master/notebooks/Python%20Parsing%20with%20NLTK.ipynb)
1. [Python Parsing with NLTK and Foma](https://github.com/dcavar/python-tutorial-for-ipython/blob/master/notebooks/Python%20Parsing%20with%20NLTK%20and%20Foma.ipynb)

This is a tutorial related to the discussion of grammar engineering and parsing in the class Alternative Syntactic Theories and [Advanced Natural Language Processing](http://damir.cavar.me/l645/) taught at [Indiana University at Bloomington](https://bloomington.iu.edu/) in Spring 2017 and Fall 2018, 2019, 2020, 2023.

The examples presuppose an installed Python 3.x [NLTK](https://www.nltk.org/) module with all the dependent modules and packages, as well as the [data set for NLTK](https://www.nltk.org/data.html).

## Using the Penn-Treebank

You can use the [NLTK](https://www.nltk.org/) *corpus* module to read and use the different corpora in the NLTK [data set](https://www.nltk.org/data.html).

In [1]:
import nltk.corpus

We can load the sample of the Penn Treebank that is distributed with the NLTK data:

In [2]:
ptb = nltk.corpus.treebank

The corpus consists of a collection of files. Each file has an ID (file name). You can list the IDs using the *fileids* method; here we print only the first 20:

In [3]:
print(ptb.fileids()[:20])

['wsj_0001.mrg', 'wsj_0002.mrg', 'wsj_0003.mrg', 'wsj_0004.mrg', 'wsj_0005.mrg', 'wsj_0006.mrg', 'wsj_0007.mrg', 'wsj_0008.mrg', 'wsj_0009.mrg', 'wsj_0010.mrg', 'wsj_0011.mrg', 'wsj_0012.mrg', 'wsj_0013.mrg', 'wsj_0014.mrg', 'wsj_0015.mrg', 'wsj_0016.mrg', 'wsj_0017.mrg', 'wsj_0018.mrg', 'wsj_0019.mrg', 'wsj_0020.mrg']


We can print out the content of a particular file:

In [4]:
print(ptb.raw('wsj_0001.mrg'))


( (S 
 (NP-SBJ 
 (NP (NNP Pierre) (NNP Vinken) )
 (, ,) 
 (ADJP 
 (NP (CD 61) (NNS years) )
 (JJ old) )
 (, ,) )
 (VP (MD will) 
 (VP (VB join) 
 (NP (DT the) (NN board) )
 (PP-CLR (IN as) 
 (NP (DT a) (JJ nonexecutive) (NN director) ))
 (NP-TMP (NNP Nov.) (CD 29) )))
 (. .) ))
( (S 
 (NP-SBJ (NNP Mr.) (NNP Vinken) )
 (VP (VBZ is) 
 (NP-PRD 
 (NP (NN chairman) )
 (PP (IN of) 
 (NP 
 (NP (NNP Elsevier) (NNP N.V.) )
 (, ,) 
 (NP (DT the) (NNP Dutch) (VBG publishing) (NN group) )))))
 (. .) ))



As you can see, the file `wsj_0001.mrg` contains two sentence trees. You can access each sentence tree individually. Here we print out the first sentence:

In [5]:
print(ptb.parsed_sents('wsj_0001.mrg')[0])

(S
 (NP-SBJ
 (NP (NNP Pierre) (NNP Vinken))
 (, ,)
 (ADJP (NP (CD 61) (NNS years)) (JJ old))
 (, ,))
 (VP
 (MD will)
 (VP
 (VB join)
 (NP (DT the) (NN board))
 (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
 (NP-TMP (NNP Nov.) (CD 29))))
 (. .))


And in the following code we print the second tree:

In [6]:
print(ptb.parsed_sents('wsj_0001.mrg')[1])

(S
 (NP-SBJ (NNP Mr.) (NNP Vinken))
 (VP
 (VBZ is)
 (NP-PRD
 (NP (NN chairman))
 (PP
 (IN of)
 (NP
 (NP (NNP Elsevier) (NNP N.V.))
 (, ,)
 (NP (DT the) (NNP Dutch) (VBG publishing) (NN group))))))
 (. .))


To see the tree structure as a graphic, you can apply the *draw()* method to the particular tree. The following code opens a window with a drawing of the first tree in the file *wsj_0001.mrg*. Note that this window might be hidden behind your notebook view in the browser!

In [7]:
ptb.parsed_sents('wsj_0001.mrg')[0].draw()

Notice that the menu of the window lets you export the drawing of the tree as a PostScript file.

[NLTK](https://www.nltk.org/) allows you to define your own tree using the bracketed notation and the *[Tree](http://www.nltk.org/_modules/nltk/tree.html)* class.

In [8]:
from nltk import Tree

We can define a string variable with a syntactic tree and create a Tree object by applying the *fromstring* method from the *Tree* class:

In [9]:
trees = "(S (NP (NN cats ) ) (VP (V chase ) (NP (NN mice ) ) ))"
myTree = Tree.fromstring(trees)

We can print the tree in bracketed notation:

In [10]:
print(myTree)

(S (NP (NN cats)) (VP (V chase) (NP (NN mice))))


The internal representation of the tree looks as follows:

In [11]:
print(repr(myTree))

Tree('S', [Tree('NP', [Tree('NN', ['cats'])]), Tree('VP', [Tree('V', ['chase']), Tree('NP', [Tree('NN', ['mice'])])])])


The tree can be written in LaTeX format for the *qtree* package:

In [12]:
print(myTree.pformat_latex_qtree())

\Tree [.S [.NP [.NN cats ] ] [.VP [.V chase ] [.NP [.NN mice ] ] ] ]


We can draw the tree by applying the *draw()* method to the tree object. The following code will open up a window with the drawing of the corresponding tree:

In [13]:
myTree.draw()

We can print the label of a tree or subtrees using the *label* method:

In [14]:
print(myTree.label())

S


A daughter of the root node of the tree can be accessed using the list index notation:

In [15]:
print(myTree[0])
print(myTree[1])

(NP (NN cats))
(VP (V chase) (NP (NN mice)))


We can print out the label of these subtrees by applying the *label* method to them:

In [16]:
print(myTree[0].label())
print(myTree[1].label())

NP
VP


We can traverse the tree using these bracketed index operators:

In [17]:
print(myTree[0])
print(myTree[0,0])
print(myTree[0,0,0])

(NP (NN cats))
(NN cats)
cats


Sections of the tree can be modified or replaced using the index notation. We replace the subject *(NN cats)* with *(NN dogs)* in the following code:

In [18]:
myTree[0,0] = Tree.fromstring('(NN dogs)')
print(myTree)

(S (NP (NN dogs)) (VP (V chase) (NP (NN mice))))


Various details about the tree can be accessed. We can extract the leaves of the tree using the *leaves* method:

In [19]:
print(myTree.leaves())

['dogs', 'chase', 'mice']


We can query the height of a tree:

In [20]:
print(myTree.height())

5


For specific purposes and some specific algorithms one might want to convert the tree to remove all unary relations between symbols. For example, we can collapse *(NP (NP (NN cats )))* to *(NP+NP (NN cats ))*:

In [18]:
trees2 = "(S (NP (NP (NN cats ) ) ) (VP (V chase ) (NP (NN mice ) ) ))"
myTree2 = Tree.fromstring(trees2)
myTree2.collapse_unary()
print(myTree2)

(S (NP+NP (NN cats)) (VP (V chase) (NP (NN mice))))


We can convert a tree to Chomsky Normal Form as well. In the following tree we have a subject NP branching into three daughter nodes. This is converted to a binary branching structure:

In [19]:
trees2 = "(S (NP (NP (DET the ) (JJ big ) (NN cats ) ) ) (VP (V chase ) (NP (NN mice ) ) ))"
myTree2 = Tree.fromstring(trees2)
myTree2.chomsky_normal_form()
print(myTree2)

(S
 (NP (NP (DET the) (NP|<JJ-NN> (JJ big) (NN cats))))
 (VP (V chase) (NP (NN mice))))
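The binarization step can be illustrated without NLTK. The following is a minimal sketch, not NLTK's implementation: the helper `binarize` is hypothetical, it works on a single rule instead of a tree, and the label scheme `parent|<remaining children>` for the new intermediate node reflects my understanding of how NLTK names such nodes, so treat it as an assumption.

```python
def binarize(lhs, rhs):
    """Right-factor one n-ary rule into a chain of binary rules.

    Assumption: intermediate nodes are named 'lhs|<remaining-children>'.
    """
    rules = []
    current = lhs
    remaining = list(rhs)
    while len(remaining) > 2:
        first, rest = remaining[0], remaining[1:]
        # The new node covers everything to the right of the first child.
        new_label = f"{lhs}|<{'-'.join(rest)}>"
        rules.append((current, (first, new_label)))
        current, remaining = new_label, rest
    # At most two symbols are left, so this rule is emitted unchanged.
    rules.append((current, tuple(remaining)))
    return rules


np_rules = binarize("NP", ["DET", "JJ", "NN"])
for left, right in np_rules:
    print(left, "->", " ".join(right))
```

A rule with three daughters becomes two binary rules. Rules that already have one or two symbols on the right-hand side are returned as they are, which is why unary branches such as *NP -> NP* survive this step and need *collapse_unary* if they should be removed.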


We can generate all production rules for a tree:

In [20]:
myTree.productions()

[S -> NP VP,
 NP -> NN,
 NN -> 'cats',
 VP -> V NP,
 V -> 'chase',
 NP -> NN,
 NN -> 'mice']
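To make clear where such a list of rules comes from, here is a minimal, library-independent sketch of reading productions off a tree. It represents a tree as a nested tuple `(label, child, ...)` with plain strings as leaves. That representation and the helper name `tree_productions` are assumptions of this sketch, not part of NLTK.

```python
def tree_productions(tree):
    """Return (lhs, rhs) pairs in pre-order for a nested-tuple tree."""
    label, *children = tree
    # Terminals are quoted so they cannot be confused with category labels.
    rhs = tuple(repr(c) if isinstance(c, str) else c[0] for c in children)
    rules = [(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            rules.extend(tree_productions(child))
    return rules


toy_tree = ("S",
            ("NP", ("NN", "cats")),
            ("VP", ("V", "chase"), ("NP", ("NN", "mice"))))

toy_rules = tree_productions(toy_tree)
for lhs, rhs in toy_rules:
    print(lhs, "->", " ".join(rhs))
```

Every node contributes exactly one rule, so a rule such as *NP -> NN* shows up once for each NP that it expands. These repetitions are what the frequency counts for a PCFG are based on.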

Trees can contain nodes that are complex objects. Node labels do not have to be strings. In the following code we replace the label of the root node of our tree with a tuple of a string and an integer:

In [21]:
myTree.set_label( ('S', 3) )
print(myTree)

(('S', 3) (NP (NN cats)) (VP (V chase) (NP (NN mice))))


Probabilistic rules or trees can be defined in the following way:

In [22]:
pt = nltk.tree.ProbabilisticTree('NP', ['DET', 'N'], prob=0.5)
print(pt)

(NP DET N) (p=0.5)


Draw the tree:

In [23]:
pt.draw()

In the previous sections (see links to the other notebooks above), we have seen how CFGs can be defined, read from a file, or used in a parser.

PCFG rules are defined in the same way. In addition a probability is assigned to each right-hand side of a rule. All probabilities for one particular left-hand side have to sum up to 1. The following code imports the PCFG class from NLTK:

In [24]:
from nltk import PCFG

We can define a grammar:

In [26]:
pcfg1 = PCFG.fromstring("""
 S -> NP VP [1.0]
 NP -> Det N [0.5]
 NP -> NP PP [0.25]
 NP -> N [0.25]
 PP -> P NP [1.0]
 VP -> VP PP [0.1] | V NP [0.7] | V [0.2]
 N -> 'woman' [0.3] | 'man' [0.3] | 'telescope' [0.3] | 'mixer' [0.1]
 Det -> 'the' [0.6] | 'a' [0.2] | 'my' [0.2]
 V -> 'killed' [0.35] | 'saw' [0.65]
 P -> 'with' [0.61] | 'under' [0.39]
""")

We can print out the productions:

In [27]:
print(pcfg1)

Grammar with 19 productions (start state = S)
 S -> NP VP [1.0]
 NP -> Det N [0.5]
 NP -> NP PP [0.25]
 NP -> N [0.25]
 PP -> P NP [1.0]
 VP -> VP PP [0.1]
 VP -> V NP [0.7]
 VP -> V [0.2]
 N -> 'woman' [0.3]
 N -> 'man' [0.3]
 N -> 'telescope' [0.3]
 N -> 'mixer' [0.1]
 Det -> 'the' [0.6]
 Det -> 'a' [0.2]
 Det -> 'my' [0.2]
 V -> 'killed' [0.35]
 V -> 'saw' [0.65]
 P -> 'with' [0.61]
 P -> 'under' [0.39]
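The requirement that the probabilities for each left-hand side sum to 1 can be checked with a few lines of plain Python. The following is a minimal sketch that copies some of the rules above into `(lhs, rhs, probability)` triples. The triple format and the helper `probability_mass` are assumptions for illustration. As far as I know, NLTK performs a comparable check itself when a PCFG is created and raises a `ValueError` if the sums are off.

```python
import math
from collections import defaultdict

# (lhs, rhs, probability) triples for a subset of the rules above.
rules = [
    ("S", ("NP", "VP"), 1.0),
    ("NP", ("Det", "N"), 0.5),
    ("NP", ("NP", "PP"), 0.25),
    ("NP", ("N",), 0.25),
    ("VP", ("VP", "PP"), 0.1),
    ("VP", ("V", "NP"), 0.7),
    ("VP", ("V",), 0.2),
]


def probability_mass(rules):
    """Sum the rule probabilities separately for every left-hand side."""
    totals = defaultdict(float)
    for lhs, _rhs, prob in rules:
        totals[lhs] += prob
    return dict(totals)


totals = probability_mass(rules)
for lhs, total in totals.items():
    # Compare with a tolerance, since floating point sums are rarely exact.
    status = "ok" if math.isclose(total, 1.0) else "does not sum to 1"
    print(f"{lhs}: {total:.2f} ({status})")
```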


We can import the portion of the Penn Treebank as *treebank*:

In [28]:
from nltk.corpus import treebank

We can now loop over all files, read the sentences in, extract all production rules from them, and aggregate the production rules in a list.

In [29]:
productions = []
for t in treebank.fileids():
    for x in treebank.parsed_sents(t):
        productions += x.productions()

We print the first 10 productions in the following:

In [30]:
#print(productions[-20:])
for i in range(10):
    print(productions[i])

S -> NP-SBJ VP .
NP-SBJ -> NP , ADJP ,
NP -> NNP NNP
NNP -> 'Pierre'
NNP -> 'Vinken'
, -> ','
ADJP -> NP JJ
NP -> CD NNS
CD -> '61'
NNS -> 'years'


At this point one could remove some productions from the collection or change them according to some goal. We could also import *Nonterminal* and *Production* from *NLTK* to create new production rules and tweak the grammar.

In [31]:
from nltk import Nonterminal, Production

To create a new *Production* object, it is necessary to use the *Nonterminal* object for symbols. The right-hand-side of production rules has to be a list or tuple. Terminals are simply strings, as in the following example.

In [32]:
p = Production(Nonterminal('NNP'), ["Seung"])

Verify the correct declaration of the production rule:

In [33]:
print(p)

NNP -> 'Seung'


We can now append the new production rule to the list of productions.

In [34]:
productions.append(p)

The appended production is now last in the list. We are printing out the last 20 productions:

In [35]:
print(productions[-20:])

[VP -> TO VP, TO -> 'to', VP -> VB NP PP-TMP, VB -> 'begin', NP -> NN, NN -> 'delivery', PP-TMP -> IN NP, IN -> 'in', NP -> NP PP, NP -> DT JJ NN, DT -> 'the', JJ -> 'first', NN -> 'quarter', PP -> IN NP, IN -> 'of', NP -> JJ NN, JJ -> 'next', NN -> 'year', . -> '.', NNP -> 'Seung']


To manipulate the probabilities, one could tweak the probability mass for the left-hand-side symbol, or simply append more such production rules before computing the frequencies and maximum likelihood estimates.

In [37]:
myTree = Tree.fromstring("(S (NP (NP (DET the ) (JJ big ) (NN cats ) ) ) (VP (V chase ) (NP (NN mice ) ) ))")
p1 = myTree.productions()

The production rules are:

In [38]:
print(p1)

[S -> NP VP, NP -> NP, NP -> DET JJ NN, DET -> 'the', JJ -> 'big', NN -> 'cats', VP -> V NP, V -> 'chase', NP -> NN, NN -> 'mice']


You can concatenate these productions with the existing productions from the Penn Treebank by applying the *+* operator to the two production lists:

In [39]:
productions = productions + p1
print(productions[-30:])

[VP -> TO VP, TO -> 'to', VP -> VB NP PP-TMP, VB -> 'begin', NP -> NN, NN -> 'delivery', PP-TMP -> IN NP, IN -> 'in', NP -> NP PP, NP -> DT JJ NN, DT -> 'the', JJ -> 'first', NN -> 'quarter', PP -> IN NP, IN -> 'of', NP -> JJ NN, JJ -> 'next', NN -> 'year', . -> '.', NNP -> 'Seung', S -> NP VP, NP -> NP, NP -> DET JJ NN, DET -> 'the', JJ -> 'big', NN -> 'cats', VP -> V NP, V -> 'chase', NP -> NN, NN -> 'mice']


To be able to induce the PCFG, we need to define a *Nonterminal* object with the start symbol as label:

In [40]:
from nltk import Nonterminal
S = Nonterminal('S')

We can now induce the PCFG using the *Nonterminal* start symbol and the list of production rules:

In [41]:
from nltk import induce_pcfg
grammar = induce_pcfg(S, productions)
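The induced probabilities are relative frequencies: the probability of a rule is its count divided by the count of all rules with the same left-hand side. The following minimal sketch reproduces that estimate with the standard library only. Productions are simplified to `(lhs, rhs)` tuples and the helper `relative_frequencies` is a hypothetical name, so this illustrates the estimate and is not NLTK's code.

```python
from collections import Counter

# Observed productions as (lhs, rhs) pairs; repetitions carry the counts.
observed = [
    ("NP", ("DT", "NN")),
    ("NP", ("DT", "NN")),
    ("NP", ("NNP",)),
    ("NP", ("DT", "JJ", "NN")),
    ("VP", ("VB", "NP")),
]


def relative_frequencies(productions):
    """Estimate P(lhs -> rhs) as count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(productions)
    lhs_counts = Counter(lhs for lhs, _rhs in productions)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


probs = relative_frequencies(observed)
for (lhs, rhs), p in probs.items():
    print(f"{lhs} -> {' '.join(rhs)} [{p}]")
```

This also shows why appending a production to the list more than once, as suggested above, raises its probability: each extra copy increases the rule count and shifts probability mass away from the other rules with the same left-hand side.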

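We can normalize the trees by collapsing unary branches and converting them to Chomsky Normal Form before we extract the production rules and induce the grammar. The two normalization calls are commented out in the following cell; uncomment them to apply the normalization. The cell also adds the productions of two hand-built trees for the sentence parsed below, one with the PP attached inside the object NP and one with the PP attached to the VP:

In [57]:
productions = []
for t in treebank.fileids():
    for x in treebank.parsed_sents(t):
        #x.collapse_unary(collapsePOS = False)
        #x.chomsky_normal_form(horzMarkov = 2)
        productions += x.productions()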
#productions.append("N -> tie")
#productions.append("V -> killed")
myTree = Tree.fromstring("(S (NP (DET she ) ) (VP (V killed ) (NP (DET the ) (NN man ) (PP (P with ) (NP (DET the ) (N tie ) ) ) ) ) )")
productions += myTree.productions()
myTree = Tree.fromstring("(S (NP (DET she ) ) (VP (V killed ) (NP (DET the ) (NN man ) ) (PP (P with ) (NP (DET the ) (N tie ) ) ) ) )")
productions += myTree.productions()
grammar = induce_pcfg(S, productions)
print(grammar)

Grammar with 21776 productions (start state = S)
 S -> NP-SBJ VP . [0.16239]
 NP-SBJ -> NP , ADJP , [0.000392567]
 NP -> NNP NNP [0.0309313]
 NNP -> 'Pierre' [0.00010627]
 NNP -> 'Vinken' [0.00021254]
 , -> ',' [0.999795]
 ADJP -> NP JJ [0.014556]
 NP -> CD NNS [0.0109987]
 CD -> '61' [0.00141004]
 NNS -> 'years' [0.0190177]
 JJ -> 'old' [0.00411382]
 VP -> MD VP [0.0523015]
 MD -> 'will' [0.30205]
 VP -> VB NP PP-CLR NP-TMP [0.000137817]
 VB -> 'join' [0.00156617]
 NP -> DT NN [0.0851243]
 DT -> 'the' [0.49455]
 NN -> 'board' [0.00227825]
 PP-CLR -> IN NP [0.679445]
 IN -> 'as' [0.0337831]
 NP -> DT JJ NN [0.0311842]
 DT -> 'a' [0.229516]
 JJ -> 'nonexecutive' [0.000857045]
 NN -> 'director' [0.00243013]
 NP-TMP -> NNP CD [0.0437158]
 NNP -> 'Nov.' [0.00244421]
 CD -> '29' [0.00141004]
 . -> '.' [0.988126]
 NP-SBJ -> NNP NNP [0.0467155]
 NNP -> 'Mr.' [0.0398512]
 VP -> VBZ NP-PRD [0.0112321]
 VBZ -> 'is' [0.315765]
 NP-PRD -> NP PP [0.267857]
 NP -> NN [0.0467762]
 NN -> 'chairman' [0

## Parsing with PCFGs

NLTK provides various parser implementations. The [parse.viterbi](http://www.nltk.org/_modules/nltk/parse/viterbi.html) module provides *ViterbiParser*, a bottom-up PCFG parser that uses dynamic programming in the style of CKY to find the single most likely parse. (See [Michael Elhadad](https://www.cs.bgu.ac.il/~elhadad/)'s notebook *[Constituent-based Syntactic Parsing with NLTK](https://www.cs.bgu.ac.il/~elhadad/nlp16/nltk-pcfg.html)* for more details.)

In [69]:
print("Parse sentence using induced grammar:")

#parser.trace(3)

#sent = treebank.parsed_sents('wsj_0001.mrg')[0].leaves()
sent = "She killed the man with the tie".split()
print(sent)

from nltk.parse import ViterbiParser, InsideChartParser
#parser = nltk.pchart.InsideChartParser(grammar)
# parser = ViterbiParser(grammar)
parser = InsideChartParser(grammar)

for parse in parser.parse_all(sent):
    print(parse)

Parse sentence using induced grammar:
['She', 'killed', 'the', 'man', 'with', 'the', 'tie']


KeyboardInterrupt: 
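The idea behind Viterbi-style CKY parsing can be shown on a scale that finishes instantly. The following is a minimal sketch in plain Python, not NLTK's *ViterbiParser*. It assumes a grammar in Chomsky Normal Form without unary rules, and the toy grammar, its probabilities, and the function name `viterbi_cky` are invented for illustration.

```python
# Binary rules (parent, left, right) and lexical rules (tag, word) with probabilities.
binary = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "Det", "N"): 0.6,
    ("NP", "NP", "PP"): 0.4,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("PP", "P", "NP"): 1.0,
}
lexical = {
    ("Det", "the"): 1.0,
    ("N", "man"): 0.5,
    ("N", "telescope"): 0.5,
    ("V", "saw"): 1.0,
    ("P", "with"): 1.0,
}


def viterbi_cky(tokens, lexical, binary, start="S"):
    """Return (probability, tree) of the most likely parse, or None."""
    n = len(tokens)
    # best[i, j] maps a label to (probability, tree) for tokens[i:j].
    best = {}
    for i, word in enumerate(tokens):
        best[i, i + 1] = {tag: (p, (tag, word))
                          for (tag, w), p in lexical.items() if w == word}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):  # every split point of the span
                left, right = best[i, k], best[k, j]
                for (parent, b, c), p in binary.items():
                    if b in left and c in right:
                        prob = p * left[b][0] * right[c][0]
                        # Keep only the best analysis per label and span.
                        if prob > cell.get(parent, (0.0, None))[0]:
                            cell[parent] = (prob, (parent, left[b][1], right[c][1]))
            best[i, j] = cell
    return best.get((0, n), {}).get(start)


tokens = "the man saw the man with the telescope".split()
result = viterbi_cky(tokens, lexical, binary)
print(result)
```

With these invented probabilities the analysis that attaches the PP to the object NP wins over the VP attachment, because *NP -> NP PP* was given more probability mass than *VP -> VP PP*. Changing those two values flips the preference.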

The other parsers are defined in the *nltk.parse* module.

In [45]:
import sys, time
from nltk import tokenize
#from nltk.grammar import Nonterminal, toy_pcfg1
from nltk.parse import pchart
from nltk.parse import ViterbiParser

In [46]:
pcfg_rules = """
S -> NP VP
S -> NP VP
S -> NP VP
NP -> ART N
NP -> ART N
NP -> ART N
NP -> ART ADJ N
NP -> ART ADJ N
NP -> ADJ N
NP -> ART N PP
NP -> ART N PP
NP -> ART N PP
NP -> ART N PP
NP -> N PP
NP -> ART N S_REL
S_REL -> RELP S
NP -> PRON
PP -> P NP
VP -> V NP
VP -> V
VP -> V NP
VP -> V NP
VP -> V NP NP
VP -> V NP PP
""".split('\n')
pcfg_lexicon = """
RELP -> that
ART -> the
ART -> a
ART -> my
N -> John
N -> tie
N -> man
N -> woman
N -> telescope
V -> saw
V -> killed
P -> with
P -> in
PRON -> I
PRON -> she
PRON -> he
ADJ -> big
ADJ -> red
""".split('\n')
rules = [ (z[0].strip(), z[1].strip()) for z in [ y.split('->') for y in [ x for x in pcfg_rules if x.strip() ] ] ]
productions = [ Production(Nonterminal(p[0]), [ Nonterminal(x) for x in p[1].split() ]) for p in rules ]
lexicon = [ (z[0].strip(), z[1].strip()) for z in [ y.split('->') for y in [ x for x in pcfg_lexicon if x.strip() ] ] ]
productions += [ Production(Nonterminal(p[0]), [ p[1] ]) for p in lexicon ]

In [47]:
toy_pcfg1 = nltk.grammar.induce_pcfg(Nonterminal("S"), productions)

demos = [('I saw the man with the telescope', toy_pcfg1)]
sent, grammar = demos[0]

# Tokenize the sentence.
tokens = sent.split()

# Define a list of parsers. We'll use all parsers.
parsers = [
ViterbiParser(grammar),
pchart.InsideChartParser(grammar),
pchart.RandomChartParser(grammar),
pchart.UnsortedChartParser(grammar),
pchart.LongestChartParser(grammar),
pchart.InsideChartParser(grammar, beam_size = len(tokens)+1)
]

We can now loop over the different parsers and compare the output and performance:

In [48]:
# Run the parsers on the tokenized sentence.
from functools import reduce
times = []
average_p = []
num_parses = []
all_parses = {}
for parser in parsers:
    print('\ns: %s\nparser: %s\ngrammar: %s' % (sent, parser, grammar))
    parser.trace(3)
    t = time.time()
    parses = parser.parse_all(tokens)
    times.append(time.time() - t)
    if parses:
        # Average probability of the parses found by this parser.
        p = reduce(lambda a, b: a + b.prob(), parses, 0.0) / len(parses)
    else:
        p = 0
    average_p.append(p)
    num_parses.append(len(parses))
    for p in parses:
        all_parses[p.freeze()] = 1

# Print summary statistics
print()
print('-------------------------+------------------------------------------')
print('%19s %4s |%11s%11s%19s' % ('Parser', 'Beam', 'Time (secs)', '# Parses', 'Average P(parse)'))
print('-------------------------+------------------------------------------')
for i in range(len(parsers)):
    # ViterbiParser has no beam_size attribute, hence the default of 0.
    print('%19s %4d |%11.4f%11d%19.14f' % (parsers[i].__class__.__name__,
                                          getattr(parsers[i], "beam_size", 0),
                                          times[i],
                                          num_parses[i],
                                          average_p[i]))
parses = all_parses.keys()
if parses:
    p = reduce(lambda a, b: a + b.prob(), parses, 0) / len(parses)
else:
    p = 0
print('-------------------------+------------------------------------------')
print('%19s      |%11s%11d%19.14f' % ('(All Parses)', 'n/a', len(parses), p))
print()

for parse in parses:
    print(parse)



s: I saw the man with the telescope
parser: <ViterbiParser for <Grammar with 32 productions>>
grammar: Grammar with 32 productions (start state = S)
 S -> NP VP [1.0]
 NP -> ART N [0.230769]
 NP -> ART ADJ N [0.153846]
 NP -> ADJ N [0.0769231]
 NP -> ART N PP [0.307692]
 NP -> N PP [0.0769231]
 NP -> ART N S_REL [0.0769231]
 S_REL -> RELP S [1.0]
 NP -> PRON [0.0769231]
 PP -> P NP [1.0]
 VP -> V NP [0.5]
 VP -> V [0.166667]
 VP -> V NP NP [0.166667]
 VP -> V NP PP [0.166667]
 RELP -> 'that' [1.0]
 ART -> 'the' [0.333333]
 ART -> 'a' [0.333333]
 ART -> 'my' [0.333333]
 N -> 'John' [0.2]
 N -> 'tie' [0.2]
 N -> 'man' [0.2]
 N -> 'woman' [0.2]
 N -> 'telescope' [0.2]
 V -> 'saw' [0.5]
 V -> 'killed' [0.5]
 P -> 'with' [0.5]
 P -> 'in' [0.5]
 PRON -> 'I' [0.333333]
 PRON -> 'she' [0.333333]
 PRON -> 'he' [0.333333]
 ADJ -> 'big' [0.5]
 ADJ -> 'red' [0.5]
Inserting tokens into the most likely constituents table...
 Insert: |=......| I
 Insert: |.=.....| saw
 Insert: |..=....| the
 Insert: |...=...| man
 Insert: |....=..| with
 Insert: 

(C) 2018-2024 by [Damir Cavar](http://damir.cavar.me/) - [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)). Parts of the code are taken from [Michael Elhadad](https://www.cs.bgu.ac.il/~elhadad/)'s notebook *[Constituent-based Syntactic Parsing with NLTK](https://www.cs.bgu.ac.il/~elhadad/nlp16/nltk-pcfg.html)*. Please consult him for details about the copyright.