Coverage for nltk.stem.regexp : 82%
# Natural Language Toolkit: Stemmers
#
# Copyright (C) 2001-2012 NLTK Project
# Author: Trevor Cohn <tacohn@cs.mu.oz.au>
#         Edward Loper <edloper@gradient.cis.upenn.edu>
#         Steven Bird <sb@csse.unimelb.edu.au>
# URL: <http://www.nltk.org/>
# For license information, see LICENSE.TXT
"""
A stemmer that uses regular expressions to identify morphological
affixes.  Any substrings that match the regular expressions will
be removed.

    >>> from nltk.stem import RegexpStemmer
    >>> st = RegexpStemmer('ing$|s$|e$', min=4)
    >>> st.stem('cars')
    'car'
    >>> st.stem('mass')
    'mas'
    >>> st.stem('was')
    'was'
    >>> st.stem('bee')
    'bee'
    >>> st.stem('compute')
    'comput'

:type regexp: str or regexp
:param regexp: The regular expression that should be used to
    identify morphological affixes.
:type min: int
:param min: The minimum length of string to stem
"""
else:
return '<RegexpStemmer: %r>' % self._regexp.pattern
import doctest
doctest.testmod(optionflags=doctest.NORMALIZE_WHITESPACE)
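
The behaviour described in the docstring (strip every substring that matches the pattern, but leave words shorter than `min` untouched) can be illustrated with a small standalone sketch. This is not the NLTK implementation, most of which is not shown in this listing. The class name `SketchRegexpStemmer` is hypothetical, and the use of `re.sub` for the removal step is an assumption based on the docstring. Only the attribute name `_regexp` and the `repr` format come from the visible lines.

```python
import re


class SketchRegexpStemmer:
    """Hypothetical stand-in for illustration; not the NLTK class."""

    def __init__(self, regexp, min=0):
        # Accept either a pattern string or an already compiled pattern,
        # matching the documented type "str or regexp".
        if isinstance(regexp, str):
            regexp = re.compile(regexp)
        self._regexp = regexp
        self._min = min

    def stem(self, word):
        # Words shorter than the minimum length are returned unchanged.
        if len(word) < self._min:
            return word
        # Assumption: affixes are removed by deleting every match.
        return self._regexp.sub('', word)

    def __repr__(self):
        return '<SketchRegexpStemmer: %r>' % self._regexp.pattern


st = SketchRegexpStemmer('ing$|s$|e$', min=4)
results = [st.stem(w) for w in ['cars', 'mass', 'was', 'bee', 'compute']]
print(results)  # ['car', 'mas', 'was', 'bee', 'comput']
```

With `min=4`, the three-letter words `was` and `bee` fall below the threshold and pass through unchanged, which reproduces the results in the doctest.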