Toponyms as Entry Points into a Digital Edition: Mapping Die Fackel (1899-1936)

Adrien Barbaresi
adrien.barbaresi@oeaw.ac.at
Austrian Academy of Sciences, Austria

Introduction

The significance of place names exceeds the usually accepted frame of deictic and indexical functions, as they enfold more than a mere reference in space. In the Western tradition, a current of reflection which seems to date back to the 1960s has provided the theoretical foundations of the “spatial turn”, whose epitome is the concept of space as emergent rather than existing a priori, and composed of relations rather than structures (Warf, 2009). The emergence of currents named “GeoHumanities” (Dear et al., 2011) or “Spatial Humanities” (Bodenhamer et al., 2010) has prompted a transfer of research objects between disciplines as well as an implementation of the spatial turn in practice through specific methods of analysis. The common denominator consists in opening up new spaces and experimenting in a transdisciplinary perspective (Dominguez, 2011) in a field which has been evolving at an exponential pace over the last decade (Caquard and Cartwright, 2014).

In this paper, I introduce a visualization of collocations of toponyms in the satirical literary magazine Die Fackel (“The Torch”), originally published and almost entirely written by the satirist and language critic Karl Kraus in Vienna from 1899 to 1936. This work carries heterogeneity at its core and contains a considerable variety of toponyms (Biber, 2001), which are highly significant because of the multinational nature of the Austro-Hungarian Empire and the later formation of a territorially diminished state. In order to provide an additional, synthetic access to a digital edition of the work which is already available online (AAC-Fackel corpus), I set out on a distant reading experiment leading to maps meant to uncover patterns and specificities which are not easily retraceable during close reading.
I focus on the concept of visualization, that is, on the processes and not on the products (Crampton, 2001), and present them together with a critical apparatus, by giving a theoretical perspective on what is being shown and seen. In fact, digital methods in the humanities ought to be criticized (Wulfman, 2014), and the cartographic enterprise bears both a thrill and a risk: “adding more to the world through abstraction”, and “adding to the riskiness of cartographic politics by proliferating yet more renders of the world” (Gerlach, 2014).

Extraction of toponyms

The task of finding place names in texts is commonly called place name extraction, toponym resolution, or geocoding. A first stage involves the identification of potential geographic references, while a second stage consists of a disambiguation process (Leetaru, 2012). Toponym resolution often relies on named-entity recognition and artificial intelligence (Leidner and Lieberman, 2011). However, knowledge-based methods using fine-grained data, for example from Wikipedia, have already been used with encouraging results (Hu et al., 2014). The present endeavor is grounded in a specially curated gazetteer: during the 20th century there have been significant political changes in Central Europe that have severely affected toponyms, so that existing geographical databases lack coverage and detail. Consequently, the database developed at the Austrian Academy of Sciences (Academy Corpora) in cooperation with the Berlin-Brandenburg Academy of Sciences (Language Center) focuses on Europe and follows from a combination of approaches: gazetteers are curated in a semi-supervised way to account for historical differences, and current geographical information is used as a fallback. The Wikidata API and the GeoNames database are used to build the database semi-automatically.
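The combination of a curated historical gazetteer with current geographical information as a fallback can be illustrated by a minimal sketch. The entries, field names, and the `lookup` helper below are hypothetical and serve illustration only; the coordinates are approximate, and the actual database and its construction from Wikidata and GeoNames are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    name: str
    lat: float
    lon: float
    source: str  # "historical" (curated) or "current" (fallback)


# Hypothetical toy entries with approximate coordinates (illustration only).
HISTORICAL = {
    "lemberg": Place("Lemberg", 49.84, 24.03, "historical"),
    "pressburg": Place("Pressburg", 48.15, 17.11, "historical"),
}
CURRENT = {
    "wien": Place("Wien", 48.21, 16.37, "current"),
    "berlin": Place("Berlin", 52.52, 13.40, "current"),
}


def lookup(token: str) -> Optional[Place]:
    """Return the curated historical entry if there is one,
    otherwise fall back to current geographical information."""
    key = token.casefold()
    if key in HISTORICAL:
        return HISTORICAL[key]
    return CURRENT.get(key)  # None if the token is unknown to both
```

In this sketch, the order of the two lookups encodes the priority given to curated historical data: current information is only consulted when no historical entry exists.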
The tokenized files of works to be analyzed are filtered and matched with the database by finite-state automata (Barbaresi and Biber, 2016): toponyms (single or multi-word expressions) are extracted using a sliding window. A cascade of filters is used: current and historical states; regions, important subparts of states, and regional landscapes; populated places; and geographical features. Since disambiguation is a critical component (Leetaru, 2012), an algorithm similar to that of Pouliquen et al. (2006), who demonstrated that an acceptable precision can be reached that way, guesses the most probable entry based on distance to Vienna (Sinnott, 1984), contextual information (closest country, last names resolved), and importance (place type, population count). The results are projected on a map of Europe using TileMill.

From collocations to lines of thought

In a further analysis, I visualize co-occurrences of extracted toponyms, which can be considered a subset of GeoCollocations (Bubenhofer, 2014), in order to draw sequences, airborne lines following their order of appearance. The word “network” is to be used with circumspection, as Latour (1999) suggests. Although it is ubiquitous in the terminology of the spatial turn, the now predominant interpretation in the sense of the World Wide Web suggests an immediacy which is contrary to the meanings it had before, so that the concept of “meshwork” is more appropriate (Ingold, 2007). I thus interpret Figure 1 as a general meshwork which makes it possible to visualize paths depicting chains of thought (Gedankengänge) as well as their intensity (well-trodden or seldom used). Although they may reveal spatial patterns that would otherwise remain hidden in texts (Bodenhamer et al., 2010), these linkages are also “mappings and tracing imposed on the data” (Wulfman, 2014) which are not meant to be interpreted without further filtering.

Figure 1.
Unfiltered map of toponymic co-occurrences

A rhizome as entry to Die Fackel

That is why I refine the map by selecting a subset of the co-occurrences (the maximal distance between two extracted place names is fixed at twenty tokens) and by color-coding qualitative features (the typology of place names and the axis of time). The most frequent place names are printed out. Surfaces (regions for instance) cannot be represented as such because of historical evolutions and because of the difficulties of linking surfaces without tampering with map readability. Coastlines are depicted in gray to give a sense of orientation; no boundaries are drawn, as they are of a changing nature and may superimpose an artificial reading of the map (Smith, 2005).

Figure 2. Refined analysis (size proportional to corpus frequency; yellow: sovereign territories; orange: regions; green: populated places; blue: geographical features; time axis represented by a gradient from light green to dark blue)

The notion of rhizome has been used in corpus linguistics by Scharloth et al. (2013) to qualify discourses captured by collocation graphs; it was originally coined by Deleuze and Guattari (1987 [1980]). This concept is particularly adequate for Kraus, as the Austrian satirist has always been concerned with the multiple aspects of discourse and reality. In addition, his work in Die Fackel evades distant reading processes because of the number of citations used and its ever-present and extensive use of parody. It would be vain to design an authoritative cartography of Die Fackel: following from the principles of heterogeneity and “asignifying rupture” (ibid.), the lines are frequently interrupted. Phenomena in the low-frequency range are filtered out by the human eye, but clusters and interpretation cues may emerge which provide a different access to the work.
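The selection of co-occurrences used for the refined map (place names at most twenty tokens apart, linked in their order of appearance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation behind Figure 2: the function name, the input format, and the choice to link only consecutive place names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

MAX_DISTANCE = 20  # maximal token distance between two place names (from the text)


def cooccurrences(hits: List[Tuple[int, str]],
                  max_distance: int = MAX_DISTANCE) -> Counter:
    """Count directed links between consecutive place names.

    `hits` is a list of (token_position, place_name) pairs sorted by
    position. Assumption: only consecutive place names are linked, and
    a link is kept if the two are at most `max_distance` tokens apart.
    """
    links: Counter = Counter()
    for (pos_a, name_a), (pos_b, name_b) in zip(hits, hits[1:]):
        if name_a != name_b and pos_b - pos_a <= max_distance:
            links[(name_a, name_b)] += 1
    return links


# Hypothetical extraction result: token offsets and resolved place names.
sample_hits = [(3, "Wien"), (10, "Berlin"), (45, "Paris"),
               (50, "Wien"), (58, "Berlin")]
sample_links = cooccurrences(sample_hits)
```

The resulting counts can serve as a measure of intensity: a pair that recurs often corresponds to a well-trodden path, while the pair Berlin and Paris in the sample is dropped because the two names lie more than twenty tokens apart.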
In this regard, Figure 2 depicts a rhizome connecting heterogeneous information, just as we are all “traversed by lines, geodesics, tropics, and zones marching to different beats and differing in nature” (ibid.).

Conclusion

I have presented a distant reading experiment which consists of connecting toponyms extracted from the text and projected on maps. The latter are meant to be released as an additional feature of the existing digital edition. The software and gazetteer will be made available under open-source licenses; for development files, please refer to the GitHub repository. The first example displays unfiltered lines of thought, whereas the second one is grounded in a refined analysis and lets the practical image of a rhizome emerge. While Die Fackel criticizes mechanical, instrumental language (Hirt, 2002), the “well-informed” linguistic instruments can help materialize dots or sequences, but not without “human” intervention. The filtering steps on the projection echo the hermeneutic circle of the philological tradition; they also make visible and apprehensible computed information which could otherwise remain “blind” (Barbaresi, 2012).

This is not an authoritative cartography of Die Fackel but rather an indirect depiction of the viewpoint of Kraus and his contemporaries. Drawing on Kraus’ vitriolic recording of political life, toponyms in Die Fackel tell a story about the ongoing reconfiguration of Europe. As the map conveys the uncanny sensation that the continent is a field on which points east and west are projected, the lines of force entangle European countries and capitals. Their spatial patterns document an inclination for major cultural centers, whereas the chronological dimension captures a major shift towards the end of publication: the force field intensifies as its range narrows, showing both the interplay of major European powers of the time and the emergence of transatlantic (westwards) and trans-European (eastwards) relationships.
This evolution can be read as an intensification of tensions and a prefiguration of other schemes, this time of a military nature.