# Natural Language Toolkit: Naive Bayes Classifiers
#
# Copyright (C) 2001-2012 NLTK Project
# Author: Edward Loper <edloper@gradient.cis.upenn.edu>
# URL: <http://www.nltk.org/>
# For license information, see LICENSE.TXT
"""
A classifier based on the Naive Bayes algorithm.  In order to find the
probability for a label, this algorithm first uses the Bayes rule to
express P(label|features) in terms of P(label) and P(features|label):

|                       P(label) * P(features|label)
|  P(label|features) = ------------------------------
|                              P(features)

The algorithm then makes the 'naive' assumption that all features are
independent, given the label:

|                       P(label) * P(f1|label) * ... * P(fn|label)
|  P(label|features) = --------------------------------------------
|                                     P(features)

Rather than computing P(features) explicitly, the algorithm just
calculates the numerator for each label, and normalizes them so they
sum to one:

|                       P(label) * P(f1|label) * ... * P(fn|label)
|  P(label|features) = --------------------------------------------
|                        SUM[l]( P(l) * P(f1|l) * ... * P(fn|l) )
"""

from collections import defaultdict

from nltk.probability import FreqDist, DictionaryProbDist, ELEProbDist, sum_logs
from nltk.classify.api import ClassifierI
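# The normalization in the last equation above can be made concrete with
# a small worked example.  This sketch is illustrative only (it is not
# part of the NLTK module), and the feature probabilities in it are
# made-up numbers.
def _demo_posterior_normalization():
    import math
    # Unnormalized log numerators: log P(label) + log P(f1|label) + ...
    log_numerator = {
        'male':   math.log(0.5) + math.log(0.01),  # assumed values
        'female': math.log(0.5) + math.log(0.04),  # assumed values
    }
    # Exponentiate and divide by the sum over labels, so that the
    # posterior probabilities sum to one.
    total = sum(math.exp(lp) for lp in log_numerator.values())
    posterior = dict((label, math.exp(lp) / total)
                     for label, lp in log_numerator.items())
    print(posterior)  # approximately {'male': 0.2, 'female': 0.8}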
##//////////////////////////////////////////////////////
##  Naive Bayes Classifier
##//////////////////////////////////////////////////////
""" A Naive Bayes classifier. Naive Bayes classifiers are paramaterized by two probability distributions:
- P(label) gives the probability that an input will receive each label, given no information about the input's features.
- P(fname=fval|label) gives the probability that a given feature (fname) will receive a given value (fval), given that the label (label).
If the classifier encounters an input with a feature that has never been seen with any label, then rather than assigning a probability of 0 to all labels, it will ignore that feature.
The feature value 'None' is reserved for unseen feature values; you generally should not use 'None' as a feature value for one of your own features. """ """ :param label_probdist: P(label), the probability distribution over labels. It is expressed as a ``ProbDistI`` whose samples are labels. I.e., P(label) = ``label_probdist.prob(label)``.
:param feature_probdist: P(fname=fval|label), the probability distribution for feature values, given labels. It is expressed as a dictionary whose keys are ``(label, fname)`` pairs and whose values are ``ProbDistI`` objects over feature values. I.e., P(fname=fval|label) = ``feature_probdist[label,fname].prob(fval)``. If a given ``(label,fname)`` is not a key in ``feature_probdist``, then it is assumed that the corresponding P(fname=fval|label) is 0 for all values of ``fval``. """
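    # As a hand-built illustration (not part of the original module; the
    # feature names and counts below are invented), a classifier can be
    # constructed directly from NLTK probability distributions:
    #
    #     from nltk.probability import FreqDist, ELEProbDist
    #
    #     label_probdist = ELEProbDist(FreqDist(['male', 'female', 'female']))
    #     feature_probdist = {
    #         ('male', 'last_letter'):   ELEProbDist(FreqDist(['k', 'o'])),
    #         ('female', 'last_letter'): ELEProbDist(FreqDist(['a', 'a'])),
    #     }
    #     classifier = NaiveBayesClassifier(label_probdist, feature_probdist)
    #     classifier.classify({'last_letter': 'a'})   # -> 'female'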
    def labels(self):
        return self._labels

    def classify(self, featureset):
        return self.prob_classify(featureset).max()

    def prob_classify(self, featureset):
        # Discard any feature names that we've never seen before.
        # Otherwise, we'll just assign a probability of 0 to
        # everything.
        featureset = featureset.copy()
        for fname in list(featureset.keys()):
            for label in self._labels:
                if (label, fname) in self._feature_probdist:
                    break
            else:
                # print('Ignoring unseen feature %s' % fname)
                del featureset[fname]

        # Find the log probability of each label, given the features.
        # Start with the log probability of the label itself.
        logprob = {}
        for label in self._labels:
            logprob[label] = self._label_probdist.logprob(label)

        # Then add in the log probability of features given labels.
        for label in self._labels:
            for (fname, fval) in featureset.items():
                if (label, fname) in self._feature_probdist:
                    feature_probs = self._feature_probdist[label, fname]
                    logprob[label] += feature_probs.logprob(fval)
                else:
                    # nb: This case will never come up if the
                    # classifier was created by
                    # NaiveBayesClassifier.train().
                    logprob[label] += sum_logs([])  # = -INF.

        return DictionaryProbDist(logprob, normalize=True, log=True)
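    # Illustrative usage (assumes an already-trained classifier; not
    # part of the original module):
    #
    #     dist = classifier.prob_classify({'last_letter': 'a'})
    #     dist.max()            # the most likely label
    #     dist.prob('female')   # posterior probability for one label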
    def show_most_informative_features(self, n=10):
        # Determine the most relevant features, and display them.
        cpdist = self._feature_probdist
        print('Most Informative Features')
        for (fname, fval) in self.most_informative_features(n):
            def labelprob(l):
                return cpdist[l, fname].prob(fval)
            labels = sorted([l for l in self._labels
                             if fval in cpdist[l, fname].samples()],
                            key=labelprob)
            if len(labels) == 1:
                continue
            l0 = labels[0]
            l1 = labels[-1]
            if cpdist[l0, fname].prob(fval) == 0:
                ratio = 'INF'
            else:
                ratio = '%8.1f' % (cpdist[l1, fname].prob(fval) /
                                   cpdist[l0, fname].prob(fval))
            print('%24s = %-14r %6s : %-6s = %s : 1.0' %
                  (fname, fval, str(l1)[:6], str(l0)[:6], ratio))
""" Return a list of the 'most informative' features used by this classifier. For the purpose of this function, the informativeness of a feature ``(fname,fval)`` is equal to the highest value of P(fname=fval|label), for any label, divided by the lowest value of P(fname=fval|label), for any label:
| max[ P(fname=fval|label1) / P(fname=fval|label2) ] """ # The set of (fname, fval) pairs used by this classifier. # The max & min probability associated w/ each (fname, fval) # pair. Maps (fname,fval) -> float.
features.discard(feature)
# Convert features to a list, & sort it by how informative # features are. key=lambda feature: minprob[feature]/maxprob[feature])
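    # For example (made-up numbers, not from the original module): if
    # P(last_letter='a'|'female') = 0.33 and P(last_letter='a'|'male')
    # = 0.01, then ('last_letter', 'a') has an informativeness ratio of
    # 33:1 and sorts near the front of the returned list:
    #
    #     classifier.most_informative_features(5)
    #     # e.g. [('last_letter', 'a'), ('last_letter', 'k'), ...]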
""" :param labeled_featuresets: A list of classified featuresets, i.e., a list of tuples ``(featureset, label)``. """
# Count up how many times each feature value occurred, given # the label and featurename. # Increment freq(fval|label, fname) # Record that fname can take the value fval. # Keep a list of all feature names.
# If a feature didn't have a value given for an instance, then # we assume that it gets the implicit value 'None.' This loop # counts up the number of 'missing' feature values for each # (label,fname) pair, and increments the count of the fval # 'None' by that amount.
# Create the P(label) distribution
# Create the P(fval|label, fname) distribution
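    # Illustrative usage of train() (invented toy data, not part of the
    # original module):
    #
    #     train_set = [({'last_letter': 'a'}, 'female'),
    #                  ({'last_letter': 'k'}, 'male'),
    #                  ({'last_letter': 'a'}, 'female')]
    #     classifier = NaiveBayesClassifier.train(train_set)
    #     classifier.classify({'last_letter': 'a'})   # -> 'female'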
##//////////////////////////////////////////////////////
##  Demo
##//////////////////////////////////////////////////////
def demo():
    from nltk.classify.util import names_demo
    classifier = names_demo(NaiveBayesClassifier.train)
    classifier.show_most_informative_features()

if __name__ == '__main__':
    demo()