Coverage for nltk.parse.util : 75%
# Natural Language Toolkit: Parser Utility Functions
#
# Author: Ewan Klein <ewan@inf.ed.ac.uk>
#
# Copyright (C) 2001-2012 NLTK Project
# URL: <http://www.nltk.org/>
# For license information, see LICENSE.TXT
"""
Utility functions for parsers.
"""
        parser=None, chart_class=None, beam_size=0, **load_args):
    """
    Load a grammar from a file, and build a parser based on that grammar.
    The parser depends on the grammar format, and might also depend
    on properties of the grammar itself.
    The following grammar formats are currently supported:
      - ``'cfg'``  (CFGs: ``ContextFreeGrammar``)
      - ``'pcfg'`` (probabilistic CFGs: ``WeightedGrammar``)
      - ``'fcfg'`` (feature-based CFGs: ``ContextFreeGrammar``)
    :type grammar_url: str
    :param grammar_url: A URL specifying where the grammar is located.
        The default protocol is ``"nltk:"``, which searches for the file
        in the NLTK data package.
    :type trace: int
    :param trace: The level of tracing that should be used when
        parsing a text. ``0`` will generate no tracing output;
        higher numbers will produce more verbose tracing output.
    :param parser: The class used for parsing; should be ``ChartParser``
        or a subclass. If None, the class depends on the grammar format.
    :param chart_class: The class used for storing the chart;
        should be ``Chart`` or a subclass. Only used for CFGs and
        feature CFGs. If None, the chart class depends on the grammar format.
    :type beam_size: int
    :param beam_size: The maximum length for the parser's edge queue.
        Only used for probabilistic CFGs.
    :param load_args: Keyword parameters used when loading the grammar.
        See ``data.load`` for more information.
    """
        raise ValueError("The grammar must be a ContextFreeGrammar, "
                         "or a subclass thereof.")
        if parser is None:
            parser = InsideChartParser
        return parser(grammar, trace=trace, beam_size=beam_size)
    else:
        # Plain ContextFreeGrammar.
        if parser is None:
            parser = ChartParser
        if chart_class is None:
            chart_class = Chart
        return parser(grammar, trace=trace, chart_class=chart_class)
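The default selection described in the docstring can be sketched as below. This is an illustrative sketch, not the module's code: the three grammar classes are empty local stand-ins, `default_parser_name` is a hypothetical helper that returns a class name as a string, and the `FeatureGrammar` / `FeatureChartParser` branch is an assumption based on the docstring's mention of feature CFGs.

```python
# Local stand-ins for the grammar classes (assumption: the weighted and
# feature grammar classes both subclass ContextFreeGrammar).
class ContextFreeGrammar:
    pass


class WeightedGrammar(ContextFreeGrammar):
    pass


class FeatureGrammar(ContextFreeGrammar):
    pass


def default_parser_name(grammar, parser=None):
    """Hypothetical helper: name the parser class that would be chosen."""
    if not isinstance(grammar, ContextFreeGrammar):
        raise ValueError("The grammar must be a ContextFreeGrammar, "
                         "or a subclass thereof.")
    if parser is not None:
        # An explicitly supplied parser always wins over the default.
        return parser
    if isinstance(grammar, WeightedGrammar):
        return "InsideChartParser"
    if isinstance(grammar, FeatureGrammar):
        return "FeatureChartParser"
    # Plain ContextFreeGrammar.
    return "ChartParser"
```

The subclass checks come before the plain-CFG fallback because, in this sketch, every weighted or feature grammar is also a `ContextFreeGrammar`.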
######################################################################
#{ Test Suites
######################################################################
""" Unit tests for CFG. """
"""
Sentences in the test suite are divided into two classes:
  - grammatical (``accept``) and
  - ungrammatical (``reject``).
If a sentence should parse according to the grammar, the value of
``trees`` will be a non-empty list. If a sentence should be rejected
according to the grammar, then the value of ``trees`` will be None.
"""
print()
print(sent)
for tree in trees:
    print(tree)
raise ValueError("Sentence '%s' failed to parse" % sent)
else:
else:
raise ValueError("Sentence '%s' received a parse" % sent)
else:
""" Parses a string with one test sentence per line. Lines can optionally begin with: - a bool, saying if the sentence is grammatical or not, or - an int, giving the number of parse trees is should have, The result information is followed by a colon, and then the sentence. Empty lines and lines beginning with a comment char are ignored.
:return: a list of tuple of sentences and expected results, where a sentence is a list of str, and a result is None, or bool, or int
:param comment_chars: ``str`` of possible comment characters. """ result = split_info[0] in ['True','true'] sentence = split_info[1] else: continue
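The line format described in the docstring can be illustrated with a small sketch. It is not the module's code: `parse_test_sentences` is a hypothetical name, the default comment characters `"#%;"` are an assumption, and treating a non-numeric prefix before a colon as part of the sentence is a choice made here for robustness.

```python
def parse_test_sentences(text, comment_chars="#%;"):
    """Return a list of (tokens, expected) pairs, one per test sentence."""
    results = []
    for line in text.split("\n"):
        # Skip empty lines and comment lines.
        if line == "" or line[0] in comment_chars:
            continue
        expected = None
        head, sep, tail = line.partition(":")
        if sep:
            label = head.strip()
            if label in ("True", "true", "False", "false"):
                expected = label in ("True", "true")
                line = tail
            else:
                try:
                    expected = int(label)
                    line = tail
                except ValueError:
                    # Not a result prefix: keep the whole line as the sentence.
                    pass
        tokens = line.split()
        if not tokens:
            continue
        results.append((tokens, expected))
    return results


sample = (
    "# a comment\n"
    "True: the dog barks\n"
    "false: dog the barks\n"
    "2: I saw her duck\n"
    "\n"
    "plain sentence here\n"
)
parsed = parse_test_sentences(sample)
```

Here `parsed` holds four entries: the first two carry a bool, the third carries the int `2`, and the unlabelled sentence carries None.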
# nose thinks it is a test