"Sonic Materialization of Linguistic Data" (working title) Paraskevoudi Nadia Sonic Linguistics, Greece nadiaparask@gmail.com Alexandropoulos Timos Sonic Linguistics, Greece timosalexandropoulos@gmail.com 2014-12-19T13:50:00Z Paul Arthur, University of Western Sidney


Keywords: sonification, social media, Twitter, prosodic features, stress, corpora and corpus activities, audio/video/multimedia, prosodic studies, software design and development, text analysis, interdisciplinary collaboration, linguistics, programming, creative and performing arts (including writing), data mining / text mining

The Problem of Sonification

Gregory Kramer (1994), in his book Auditory Display: Sonification, Audification, and Auditory Interfaces, defines sonification as the 'use of non-speech audio to convey information or perceptualize data'.

In our digital age we can store, edit, and examine almost any quality or quantity as data. Sound itself can be considered a pure stream of information that can be modulated, transformed, and analyzed in many different ways.

Sonification succeeds when the sound reveals one or more qualities of the data, or the data reveal one or more qualities of the sound. This kind of materialization of data is therefore an interdisciplinary act that requires both proper analysis of the data and a proper understanding of the structure of sound.

While technology provides us with a wide variety of tools, the core problem remains. Because this kind of interdisciplinary knowledge is hard to combine, there are not enough tools available to help artists escape an arbitrary mapping of data to sound qualities. This leads to arbitrary results for both the artist and the listener, as the sonification process exploits neither the properties of auditory perception nor sound's advantages in temporal, amplitude, and frequency resolution. As a result, in most cases sonification fails its purpose, which 'is to encode and convey information about an entire data set or relevant aspects of the data set' (Hermann et al., 2011).

Sound and Linguistics

Sonic Materialization of Linguistic Data is a series of works and a research project that aims to provide sound artists with tools for the proper linguistic analysis of mined data.

In our age of constant connectivity, social media, and especially the text-based Twitter platform, can be considered a monitor corpus that evolves perpetually. In order to create new structures and transform this chaotic stream of data into new material (in our case, sound), it needs to be organized according to its various properties; here, its linguistic aspects. With our Sonic Materialization of Linguistic Data work, we provide different software modules that can perform real-time linguistic analysis of data and output the results for sonification purposes.
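
As an illustration only (the paper does not name its Twitter client), the real-time hashtag aggregation could look like the following minimal Python sketch, assuming the tweepy library and the Twitter streaming API as they existed at the time of writing; the credentials are placeholders and analyze_stress is a hypothetical hook, not the project's actual code:

    import tweepy

    def analyze_stress(text):
        # Hypothetical hook into the Stress Module (see the sketch below).
        print(text)

    # Placeholder credentials, obtained from Twitter's developer site.
    auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
    auth.set_access_token('ACCESS_TOKEN', 'ACCESS_SECRET')

    class HashtagListener(tweepy.StreamListener):
        """Receives tweets matching the tracked hashtags as they are posted."""
        def on_status(self, status):
            analyze_stress(status.text)

    # Track one or more hashtag feeds in real time.
    stream = tweepy.Stream(auth, HashtagListener())
    stream.filter(track=['#sonification', '#linguistics'])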

Our software consists of several kinds of modules, from which the user can choose one or combine several. Here we present the Stress Module. The program enables the user to aggregate data from different hashtag (#) feeds on Twitter in real time. The incoming data are processed according to their linguistic features, in particular stress: the algorithm performs a series of tasks and extracts the stressed syllables of the aggregated data. The output is a phonetic transcription code that represents each phoneme of the input Twitter feed. The encoded output also includes suggestions for a sonic mapping derived from the data's linguistic features and the nature of sound. For instance, strong syllables are given a numerical output that represents a longer sound event (time envelope), whereas weak syllables are given a numerical representation of a briefer sound event. Similar optional mappings can also affect other sound features, such as pitch, timbre, ADSR envelopes, and modulation.
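
The paper leaves the underlying pronunciation resource unspecified; purely as a sketch of the extraction-and-mapping step, the following assumes the CMU Pronouncing Dictionary via NLTK, whose ARPABET vowels carry stress digits (0 = unstressed, 1 = primary, 2 = secondary). The function names and the duration values are illustrative assumptions, not the module's actual API:

    import re
    from nltk.corpus import cmudict  # requires a one-off nltk.download('cmudict')

    PRONUNCIATIONS = cmudict.dict()

    def stress_pattern(word):
        """Return the stress digits of the first listed pronunciation,
        or None for out-of-vocabulary tokens (hashtags, slang, typos)."""
        prons = PRONUNCIATIONS.get(word.lower())
        if not prons:
            return None
        return [int(ph[-1]) for ph in prons[0] if ph[-1].isdigit()]

    def syllable_durations(tweet, strong=0.50, weak=0.15):
        """Map each syllable to an event duration in seconds: stressed
        syllables get the longer time envelope, weak ones the briefer one."""
        durations = []
        for word in re.findall(r"[a-zA-Z']+", tweet):
            for stress in stress_pattern(word) or []:
                durations.append(strong if stress in (1, 2) else weak)
        return durations

    print(syllable_durations("the natural stress of language"))  # one value per syllable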

Stress, a prosodic feature, manifests itself in the speech stream in several ways. Stress patterns are highly language dependent, given the dichotomy between stress-timed and syllable-timed languages. In stress-timed languages, primary stress occurs at regular intervals, regardless of the number of unstressed syllables in between, whereas in syllable-timed languages syllables tend to be equal in duration and therefore follow each other at regular intervals. According to Halliday, 'Salient syllables occur in stress-timed languages at regular intervals' (1985, 272). Strong syllables bear primary or secondary stress and contain full vowels, whereas weak syllables are unstressed and contain short, central vowels.

English in particular is a stress-timed language: its speech rhythm has a characteristic pattern expressed in the opposition of strong versus weak syllables. Stressed syllables in English are louder, but they also tend to be longer and to have a higher pitch. Although stress can also be influenced by pragmatic factors such as emphasis, our project aims to capture the natural stress pattern of English so that meaning can be extracted from the sound patterns as well, since these will be delineated by the phonetic structure of natural language.
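
Building on the stress_pattern sketch above, and following the correlates just named (stressed syllables are louder, longer, and higher-pitched), one possible parameter mapping is the following; the specific duration, amplitude, and frequency values are illustrative assumptions rather than the project's calibrated choices:

    def syllable_to_event(stress):
        """Map a stress digit to synthesis parameters: stressed syllables
        become louder, longer, higher-pitched events than weak ones."""
        if stress == 1:                                    # primary stress
            return {'dur': 0.50, 'amp': 0.9, 'freq': 440.0}
        if stress == 2:                                    # secondary stress
            return {'dur': 0.35, 'amp': 0.7, 'freq': 392.0}
        return {'dur': 0.15, 'amp': 0.4, 'freq': 330.0}    # weak syllable

    events = [syllable_to_event(s) for s in stress_pattern('language')]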

Presentation

For the presentation of the project we propose a poster describing exactly how the software works and what its aim is. We would also like to include a pair of headphones and a small screen (or projector) so that the audience can experience the data analysis and the sonification process in real time.

Bibliography

Halliday, M. A. K. (1985). An Introduction to Functional Grammar. Arnold, London.

Hermann, T., Hunt, A. and Neuhoff, J. G. (eds) (2011). The Sonification Handbook. Logos Verlag, Berlin.

Kramer, G. (1994). Auditory Display: Sonification, Audification, and Auditory Interfaces. Santa Fe Institute Studies in the Sciences of Complexity, Proceedings Vol. XVIII. Addison-Wesley, Reading, MA.