Visualizing Shakespeare’s Sonic Signatures Alexander Eric Carlson Carleton College, United States of America ealexander@carleton.edu Nichols Elizabeth Carleton College, United States of America nicholsl@carleton.edu Bayer Estelle Carleton College, United States of America bayere@carleton.edu 2019-05-06T02:22:00Z Name, Institution
Street City Country Name

Converted from a Word document

Paper Poster sonic signatures visualization corpus and text analysis literary studies natural language processing linguistics English

Good authors imbue their characters with distinctive voices that are often discernible devoid of explicit dialog labels, both by their word choice as well as sometimes by the actual sound of the words. For instance, in Shakespeare's Othello, the speech of the titular character is said to be characterized by longer, rounder vowel sounds than the quick speech of his counterpart, Iago. This phenomenon provokes a wide variety of questions: Can we detect these differences in speech computationally? If so, what would it tell us about these characters? What would it tell us about the author?

We set out to see if we could detect a difference between Shakespearean characters based on the sounds they made: what we call their “sonic signatures” (though “phonetic signatures” might be more accurate). To be clear, we were interested in the sounds associated with the words themselves, not the voice of a particular actor. We used a library called NLTK (Loper and Bird, 2002) to convert plain text versions of every character's lines into strings of phonemes, or perceptually distinct units of sounds. For instance, the word “cheese” is made up of three phonemes, which can be written “CH IY Z.” Having converted each character into 39 counts of these different phonemes, we used a machine learning technique called Naive Bayes to create a classifier for differentiating them.

We initially chose to try and tell the difference not between individual characters but between different common character roles in Shakespearean plays: protagonists, antagonists, and fools. We achieved partial success, finding that we were able to predict a character's role based on the sounds they made significantly better than chance, though with far from perfect accuracy (87%). It may be possible to improve our accuracy, and perhaps train a classifier to detect more specific differences between individual characters, but this was not the immediate next step. Our results serve as a proof of concept, showing that we can detect differences in sonic signatures between characters. However, this had been reliant on modern pronunciation of the source text, which is sometimes quite different from how Shakespeare's actors might have pronounced his words. As such, we then converted our data to original pronunciations derived from a Shakespearean pronunciation dictionary (Crystal, 2016), and found that our accuracy changed very little.

Aside from problems with early modern pronunciation, a perfect classifier is not precisely the end goal. Rather, the motivating research question is what insight can differences in sonic signatures provide scholars regarding the differences between characters and authors. For instance, do some authors write such that their characters’ signatures are more distinguishable? For such insight, it is important to be able to present scholars with analysis of the important phonemes in the context of the passages themselves, such that they are able to apply practices of close reading to investigate differences.

To afford such investigation as it relates to a potential interpretation of individual phonemes, we have created a prototype of a visual tool for comparing the “lightness” and “darkness” of the speech of individual characters. Our tool, called Ophelia’s OH, has grown into a system for comparing the prevalence within characters' speech of vowel sounds that have found to be associated with “light” and “dark” connotations (Newman, 1931). Its colored visualizations highlight characters that use dramatically more or less of a particular vowel sound than the rest of the characters in the play (see Figure 1). Upon identifying a character of interest, a researcher can click to drill down into a view of that character’s lines, annotated with colored tags showing relative proportions of different phonemes within the passages themselves. Such “tagged text” representations help place vowel differences in more specific context within the plays.

Figure : Screenshot of Ophelia’s OH, a prototype for detecting differences in the “lightness” and “darkness” of characters' characteristic speech patterns.

Going forward, we plan to work with readers to refine Ophelia’s OH to better highlight the most important characters and phonemes, and extend it to be able to discern differences amongst authors. Finally, we mean to extend our scope to determine whether the ability to distinguish characters by sonic signature is done by different authors with differing levels of success.

Bibliography Crystal, D. (2016).  The Oxford dictionary of original Shakespearean pronunciation . Oxford University Press. Loper, E. and Bird, S. (2002). NLTK: the natural language toolkit.  arXiv preprint cs/0205028 . Newman, S.N. (1931). Further experiments in phonetic symbolism.  American Journal of Psychology 45 , pp.53-75.