Negation Tagging

To preserve the potential value of negation information while still using dead-simple features, we appended the suffix ``_NOT'' to every word between a word expressing negation and the next punctuation mark. This distinguishes sentences like ``That movie was very good'' from ``That movie was not very good'': in the second sentence, ``very'' and ``good'' become ``very_NOT'' and ``good_NOT''. Diverging from Pang, we also added negation tags to bigrams.
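The tagging rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the experiments: the list of negation words, the tokenizer, and the names `tag_negation` and `bigrams` are all assumptions made for the example, and building bigrams across punctuation boundaries is a simplification.

```python
import re

# Assumed negation cues; the exact word list used in the experiments is not given.
NEGATION_WORDS = {"not", "no", "never", "cannot"}
PUNCTUATION = re.compile(r"^[.,;:!?]+$")
# Hypothetical tokenizer: runs of letters/apostrophes, or single punctuation marks.
TOKEN = re.compile(r"[A-Za-z']+|[.,;:!?]")


def tag_negation(text):
    """Append _NOT to each word between a negation word and the next punctuation mark."""
    tagged = []
    negating = False
    for token in TOKEN.findall(text.lower()):
        if PUNCTUATION.match(token):
            # Punctuation ends the negated span and is kept untagged.
            negating = False
            tagged.append(token)
        elif negating:
            tagged.append(token + "_NOT")
        else:
            tagged.append(token)
            # Contractions such as "isn't" are also treated as negation cues.
            if token in NEGATION_WORDS or token.endswith("n't"):
                negating = True
    return tagged


def bigrams(tokens):
    """Build bigrams from already-tagged tokens, skipping punctuation."""
    words = [t for t in tokens if not PUNCTUATION.match(t)]
    return [a + " " + b for a, b in zip(words, words[1:])]


print(tag_negation("That movie was not very good."))
# ['that', 'movie', 'was', 'not', 'very_NOT', 'good_NOT', '.']
print(bigrams(tag_negation("That movie was not very good.")))
# ['that movie', 'movie was', 'was not', 'not very_NOT', 'very_NOT good_NOT']
```

Because the bigrams are built from the tagged tokens, a bigram such as ``very_NOT good_NOT'' is a different feature from ``very good'', which is what lets a bag-of-features classifier tell the two example sentences apart.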

Negation tagging did not appear to have a significant effect on the results. For every classifier, the results on negation-tagged data were almost the same as those on the raw data. Nevertheless, we used negation tagging for the remainder of the tests, as it did not seem to hurt performance or accuracy.

The ineffectiveness of negation tagging probably comes from a few sources. First, it increases the number of uncommon features, which, as discussed previously, harms effectiveness and cancels out the increase in semantic awareness. Second, the presence of a ``not'' does not always indicate negation. Rather, it is often used idiomatically, as in the example fragment ``with his distinctive, more often than not ingenious dialogue''. Finally, the method of tagging all words up to the next punctuation mark is suspect. Only a few of the words after the ``not'' are actually semantically negated, and the words that are negated often occur after a comma or other punctuation mark, where tagging has already stopped.

Pranjal Vachaspati 2012-02-05