The Holy and the Godless - Cultural Stereotypes Featured in the Language of the Polish Medieval Hagiography. A Corpus-based Study. Ledzińska Anna Institute of Polish Language, Polish Academy of Sciences ledzinsk.a@gmail.com 2016-03-05T19:31:00Z Maciej Eder, Pedagogical University in Krakow Jan Rybicki, Jagiellonian University
Institute of Polish Studies Pedagogical University ul. Podchorazych 2 30-084 Krakow, Poland maciej.eder@ijp-pan.krakow.pl


Paper Poster Middle Ages Hagiography Lexical studies Corpus studies corpora and corpus activities encoding - theory and practice lexicography literary studies medieval studies scholarly editing semantic analysis stylistics and stylometry text analysis philology xml authorship attribution / authority bibliographic methods / textual studies linguistics genre-specific studies: prose, poetry, drama English

My poster presents a work in progress focused on the language of Polish medieval Latin hagiography. The main subject of my research is the set of words and expressions denoting the holy and the godless, extracted from the corpus. The poster therefore concentrates on four main topics:

1. The Corpus of the Polish Medieval Hagiography itself

2. Delimitation of the field of study within the linguistic material in relation to historical circumstances and cultural landscape of the Polish Middle Ages

3. Methodology of the research

4. Questions arising and first results.

1. At present the Corpus consists of three main types of texts: Vitae, Miracula and Translationes, that is, Lives, Miracles and Translations of the bodies of the Saints. It comprises about thirty texts, amounting to nearly 500,000 words. Firstly, I would like to explain the reasons for this selection of components, which is expected to yield a homogeneous body of texts (from both the linguistic and the cultural point of view): a. the established time frame and the limitations concerning authorship and the place where a given text was written (Starnawski, 1993), b. the phenomenon of the mixing of literary genres within particular groups of texts, c. last but not least, the state of preservation of the texts and the existence of critical editions of the majority of them. Secondly, I describe the technical parameters of the Corpus (Piotrowski, 2012): XML format, microstructure and typography of the texts encoded according to the TEI Guidelines (P5), and morphosyntactic mark-up with TreeTagger (Schmid, 1994). The Corpus is being analyzed using the multi-tool platform TXM (Heiden, Magué and Pincemin, 2010).
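As an illustration of how the morphosyntactic mark-up can feed a word list, the sketch below counts lemmas in TreeTagger-style output. It assumes the tagger was run so that each line holds token, tag and lemma separated by tabs; the tag labels, the sample sentence and the function name `lemma_counts` are hypothetical and are not taken from the Corpus.

```python
from collections import Counter


def lemma_counts(tagged_text):
    """Count lemmas in tab-separated tagger output (token, tag, lemma per line)."""
    counts = Counter()
    for line in tagged_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            # Skip blank lines and markup lines passed through by the tagger.
            continue
        token, _tag, lemma = fields
        if lemma == "<unknown>":
            # Fall back to the lower-cased token when no lemma was assigned.
            lemma = token.lower()
        counts[lemma] += 1
    return counts


# Hypothetical sample; the tag names (ADJ, NPR, N) are placeholders only.
sample = (
    "<s>\n"
    "Sanctus\tADJ\tsanctus\n"
    "Stanislaus\tNPR\t<unknown>\n"
    "vir\tN\tvir\n"
    "sanctus\tADJ\tsanctus\n"
    "</s>\n"
)
counts = lemma_counts(sample)
print(counts)
```

Counting lemmas rather than surface forms matters for an inflected language such as Latin, where a single word appears in many case and number forms.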

2. The starting point of the investigation is the scheme of the seven virtues and seven vices with their subdivisions, popular in the European Middle Ages and sometimes represented in manuscripts in the form of two seven-branched, multi-leaved trees, each part of which is meticulously labelled with the respective Latin expression (Marchese, 2013). This traditional set of vices and virtues is being compared with the word list of the Corpus. At the same time, apart from the historical and literary data mentioned above, other elements of character description, such as epithets, comparisons and attributes connected strictly with each of the saints and their opponents, are searched for and analyzed (Sinclair, 2003; Stubbs, 2001). As a result I expect to obtain a large lexical resource containing the most prominent vocabulary of the subject, which will open the prospect of further research. It should be noted that the antithetic tension between the good and the bad character was particularly emphasized in the tradition about Bishop Stanislaw, murdered by King Boleslaw II the Generous. Because of the great importance of the Saint himself and of the widespread narrative about the episcopal/royal conflict, it is very interesting to observe how it is echoed (or ignored) in the histories of other saints.
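The comparison of the traditional scheme with the Corpus word list can be sketched as a simple lookup. The example below is only a minimal illustration: the two short lemma sets stand in for the full scheme with its subdivisions, and the frequency figures are made-up toy numbers, not results from the Corpus.

```python
from collections import Counter

# Illustrative fragments of the scheme; the real lists are much longer.
VIRTUES = {"humilitas", "caritas", "patientia"}
VICES = {"superbia", "ira", "avaritia"}


def match_scheme(corpus_counts, scheme):
    """Split a scheme into lemmas attested in the corpus and lemmas that are absent."""
    attested = {
        lemma: corpus_counts[lemma]
        for lemma in sorted(scheme)
        if corpus_counts.get(lemma, 0) > 0
    }
    missing = sorted(lemma for lemma in scheme if lemma not in attested)
    return attested, missing


# Toy lemma frequencies (invented for the example).
corpus_counts = Counter({"humilitas": 12, "superbia": 7, "ira": 3, "episcopus": 40})

virtues_found, virtues_missing = match_scheme(corpus_counts, VIRTUES)
vices_found, vices_missing = match_scheme(corpus_counts, VICES)
print(virtues_found, virtues_missing)
print(vices_found, vices_missing)
```

Keeping the unattested lemmas is deliberate: the absence of an expected virtue or vice from the hagiographic vocabulary may be as informative as its presence.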

3. Since the texts considered were written over a very long period, between the end of the 10th and the end of the 15th century, the linguistic material is studied mainly in a diachronic perspective, although there are other possible applications, which require dividing the Corpus into parallel sub-corpora. The possible research questions in this case include: a. comparing and contrasting the image of male and female Saints, b. comparing Dominican, Franciscan and other traditions, c. showing the dialogue between the hagiographers of the two Patrons of the Polish Kingdom: St. Stanislaw and St. Adalbert, d. shedding some more light on Polish saints and sinners in the context of the European Middle Ages (in the future).
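A diachronic comparison of this kind can be sketched by grouping texts into period sub-corpora and normalizing counts by sub-corpus size. The example below assumes each text is available as a century label plus a list of lemmas; the data, the function name `relative_freq_by_period` and the normalization base of 10,000 words are hypothetical choices made for illustration.

```python
from collections import defaultdict


def relative_freq_by_period(texts, lemma, per=10_000):
    """Return the frequency of `lemma` per `per` words for each period."""
    totals = defaultdict(int)  # number of lemmas in each period sub-corpus
    hits = defaultdict(int)    # occurrences of the target lemma in each period
    for period, lemmas in texts:
        totals[period] += len(lemmas)
        hits[period] += lemmas.count(lemma)
    return {p: hits[p] * per / totals[p] for p in sorted(totals) if totals[p]}


# Toy data: (century, lemmatized text); invented for the example.
texts = [
    (11, ["sanctus", "vir", "sanctus", "deus"]),
    (15, ["sanctus", "vir", "bonus", "et", "pius"]),
]
freqs = relative_freq_by_period(texts, "sanctus")
print(freqs)
```

Normalizing by sub-corpus size is needed because the periods are unlikely to be represented by equal amounts of text. The same grouping key could be replaced by the saint's gender or the religious order of the tradition to build the other parallel sub-corpora.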

4. The main question is two-fold. First, what are the most frequent features and virtues ascribed to Polish medieval saintly men and women and, conversely, what features and vices are used to describe sinners and bad people? Second, do these change over time, and is this change congruent with what we know about the transformation of the models of sanctity and the image of "the Other"? All these problems, as well as the other issues mentioned above (point 3), can only be solved by a complex analysis of textual data combining the methods of corpus linguistics with those of literary and historical studies. The Corpus I present is meant to be a starting point for such complex studies.

Bibliography

Gaşpar, C., Miladinov, M. and Wood, I. (2013). Saints of the Christianization Age of Central Europe: Tenth to Eleventh Centuries. Central European University Press.

Heiden, S., Magué, J.-P. and Pincemin, B. (2010). TXM : Une plateforme logicielle open-source pour la textométrie - conception et développement. The 10th International Conference on the Statistical Analysis of Textual Data - JADT 2010. Edizioni Universitarie di Lettere Economia Diritto, pp. 1021–32.

Marchese, F. T. (2013). Virtues and Vices: Examples of Medieval Knowledge Visualization. Proceedings of the 17th International Conference on Information Visualization: IV'13. (London, UK, July 16-18, 2013), IEEE Computer Society. Los Alamitos, CA, pp. 359-65.

McEnery, T. and Wilson, A. (2001). Corpus Linguistics: An Introduction. Edinburgh University Press.

McEnery, T., Xiao, R. and Tono, Y. (2006). Corpus-based Language Studies: An Advanced Resource Book. Taylor & Francis.

Piotrowski, M. (2012). Natural Language Processing for Historical Texts. Morgan & Claypool Publishers.

Plezia, M. (1987). Średniowieczne żywoty i cuda patronów Polski. Instytut Wydawniczy Pax.

Schmid, H. (1994). Probabilistic Part-of-Speech Tagging Using Decision Trees. In Proceedings of International Conference on New Methods in Language Processing. Manchester, pp. 44–49.

Sinclair, J. (1991). Corpus, Concordance, Collocation. Oxford University Press.

Sinclair, J. (2003). Reading concordances: an introduction. New York: Pearson/Longman.

Starnawski, J. (1993). Drogi rozwojowe hagiografii polskiej i łacińskiej w wiekach średnich. Kraków: Polskie Towarzystwo Teologiczne.

Stubbs, M. (2001). Words and Phrases: Corpus Studies of Lexical Semantics. Wiley-Blackwell.

Witkowska, A. (1999). Nasi święci: polski słownik hagiograficzny. Księgarnia Św. Wojciecha.