Position Tagging

Reviews are split into a beginning, middle, and end, so to see if one section carries more sentiment than another, we split the reviews into a first quarter, a middle half, and a last quarter and tagged the words in each section.

Position tagging was not helpful. For bigrams, it harmed performance by around 5% in most cases, and for unigrams, it was not helpful. If reviews end up not actually following the model specified or if the model has no bearing on where the relevant data is, position tagging will be harmful because it increases the dimensionality of the input without increasing the information content. We suspect that is the case here.



Pranjal Vachaspati 2012-02-05