Hot-keys on this page

r m x p   toggle line displays

j k   next/prev highlighted chunk

0   (zero) top of page

1   (one) first highlighted chunk

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

# Natural Language Toolkit: Stemmers 

# 

# Copyright (C) 2001-2012 NLTK Project 

# Author: Trevor Cohn <tacohn@cs.mu.oz.au> 

#         Edward Loper <edloper@gradient.cis.upenn.edu> 

#         Steven Bird <sb@csse.unimelb.edu.au> 

# URL: <http://www.nltk.org/> 

# For license information, see LICENSE.TXT 

 

""" 

NLTK Stemmers 

 

Interfaces used to remove morphological affixes from words, leaving 

only the word stem.  Stemming algorithms aim to remove those affixes 

required for eg. grammatical role, tense, derivational morphology 

leaving only the stem of the word.  This is a difficult problem due to 

irregular words (eg. common verbs in English), complicated 

morphological rules, and part-of-speech and sense ambiguities 

(eg. ``ceil-`` is not the stem of ``ceiling``). 

 

StemmerI defines a standard interface for stemmers. 

""" 

 

from nltk.stem.api import StemmerI 

from nltk.stem.regexp import RegexpStemmer 

from nltk.stem.lancaster import LancasterStemmer 

from nltk.stem.isri import ISRIStemmer 

from nltk.stem.porter import PorterStemmer 

from nltk.stem.snowball import SnowballStemmer 

from nltk.stem.wordnet import WordNetLemmatizer 

from nltk.stem.rslp import RSLPStemmer 

 

 

if __name__ == "__main__": 

    import doctest 

    doctest.testmod(optionflags=doctest.NORMALIZE_WHITESPACE)