Toponyms as Entry Points into a Digital Edition: Mapping Die Fackel (1899-1936)

Adrien Barbaresi

adrien.barbaresi@oeaw.ac.at Austrian Academy of Sciences, Austria

Introduction

The significance of place names exceeds the usually admitted frame of deictic
and indexical functions, as they enfold more than a mere reference in space. In
the western tradition, a current of reflexion which seems to date back to the
1960s has provided the theoretical foundations of the “spatial turn”, whose
epitome is the concept of space as emergent rather than existing a priori, and
composed of relations rather than structures (Warf, 2009). The emergence of
current named “GeoHumanities” (Dear et al., 2011) or “Spatial Humanities”
(Bodenhammer et al., 2010), has prompted for a transfer of research objects
between disciplines as well as an enforcement of the spatial turn in practice
through specific methods of analysis. The common denominator consists in
opening up new spaces and experimenting in a transdisciplinary perspective
(Dominguez, 2011) in a field which has been evolving at an exponential pace
since the last decade (Caquard and Cartwright, 2014).

In this paper, I introduce a visualization of collocations of toponyms in the
satirical literary magazine Die Fackel (“The Torch”), originally published and
almost entirely written by the satirist and language critic Karl Kraus in
Vienna from 1899 to 1936. This work carries heterogeneity at its core
and contains a considerable variety of toponyms (Biber, 2001) which are highly
significant because of the multinational nature of the Austro-Hungarian
empire and the later formation of a territorially diminished state.

In order to provide an additional, synthetic access to a digital edition of the
work which is already available online (AAC-Fackel corpus), I set out on
a distant reading experiment leading to maps meant to uncover patterns and
specificities which are not easily retraceable during close reading. I focus on
the concept of visualization, that is on the processes and not on the products
(Crampton, 2001), and present them together with a critical apparatus, by
giving a theoretical perspective on what is being shown and seen. In fact,
digital methods in humanities ought to be criticized (Wulfman, 2014) and the
cartographic enterprise bears both a thrill and a risk: “adding more to the
world through abstraction”, and “adding to the riskiness of cartographic
politics by proliferating yet more renders of the world” (Gerlach, 2014). 
Extraction of toponyms

The particular task of finding place names in texts is commonly named place
names extraction, toponym resolution, or geocoding. A first stage involves
the identification of potential geographic references, while a second stage
resides in a disambiguation process (Leetaru, 2012). Toponym resolution
often relies on named-entity recognition and artificial intelligence (Leidner
and Lieberman, 2011). However, knowledge-based methods using fine-grained data
-for example from Wikipedia - have already been used with encouraging results
(Hu et al., 2014).

The present endeavor grounds on a specially curated gazetteer: during the 20th
century there have been significant political changes in Central Europe that
have severely affected toponyms, so that geographical databases lack coverage
and detail. Consequently, the database developed at the Austrian Academy of
Sciences (Academy Corpora) in cooperation with the Berlin-Brandenburg Academy
of Sciences (Language Center) focuses on Europe and follows from a combination
of approaches: gazetteers are curated in a semi-supervised way to account
for historical differences, and current geographical information is used as a
fallback. Wikidata API and the Geonames database are used to build the
database semi-automatically.

The tokenized files of works to be analyzed are filtered and matched with the
database by finite-state automatons (Barbaresi and Biber, 2016): toponyms 
(single or multi-word expressions) are extracted using a sliding window. A
cascade of filters is used: current and historical states; regions, important
subparts of states, and regional landscapes; populated places; and geographical
features. Disambiguation being a critical component (Leetaru, 2012), an
algorithm similar to Pouliquen et al. (2006), who demonstrated that
an acceptable precision can be reached that way, guesses the most probable
entry based on distance to Vienna (Sinnott, 1984), contextual information
(closest-country, last names resolved), and importance (place type, population
count). The results are projected on a map of Europe using TileMill.

From collocations to lines of thought

In a further analysis, I visualize co-occurrences of extracted toponyms, which
can be considered to be a subset of GeoCollocations (Bubenhofer, 2014),
in order to draw sequences, airborne lines following their order of appearance.
The word “network” is to be used with circumspection as Latour (1999)
suggests. Although it is ubiquitous in the terminology of the spatial turn, the
now predominant interpretation in the sense of the World Wide Web suggests
an immediacy which is contrary to the acceptions it had before, so that the
concept of “meshwork” is more appropriate (Ingold, 2007). I thus interpret
Figure 1 as a general meshwork which makes it possible to visualize paths
depicting chains of thought (Gedankengänge) as well as their intensity
(well-trodden or seldom). If they may reveal spatial patterns that would
otherwise remain hidden in texts (Bodenhammer et al., 2010), these linkages are
also “mappings and tracing imposed on the data” (Wulfman, 2014) which are not
meant to be interpreted without further filtering.

[209-1]

Figure 1. Unfiltered map of toponymic co-occurrences

A rhizome as entry to Die Fackel

That is why I refine the map by selecting a subset of the co-occurrences - the
maximal distance between two extracted place names is fixed to twenty tokens
-and by color-coding qualitative features - the typology of place names and the
axis of time. The most frequent place names are printed out. Surfaces (regions
for instance) cannot be represented as such because of historical evolutions
and because of the difficulties of linking surfaces without tampering with
map readability. Coastlines are depicted in gray to give a sense of
orientation, no boundaries are drawn, as they are of a changing nature and may
superimpose an artificial reading of the map (Smith 2005).

[209-2]

Figure 2. Refined analysis (size proportional to corpus frequency; yellow:
sovereign territories; orange: regions; green: populated places; blue:
geographical features; time axis rep

resented by a gradient from light green to dark blue)

The notion of rhizome has been used in corpus linguistics by Scharloth et al.
(2013) to qualify discourses captured by collocation graphs, it has originally
been coined by Deleuze and Guattari (1987 [1980]). This concept is particularly
adequate for Kraus, as the Austrian satirist has always been concerned by the
multiple aspects of discourse and reality. In addition, his work in Die Fackel
evades distant reading processes because of the number of citations used and
its ever present and extensive usage of parody. It would be vain to design an
authoritative cartography of Die Fackel: following from the principles of
heterogeneity and “asignifying rupture” (ibid.), the lines are frequently
interrupted. Phenomena in the low-frequency range are filtered out by the human
eye, but clusters and interpretation cues may emerge which provide a different
access to the work. In this regard, Figure 2 depicts a rhizome connecting
heterogeneous information, just as we are all “traversed by lines, geodesics,
tropics, and zones marching to different beats and differing in nature” 
(ibid.).

Conclusion

I have presented a distant reading experiment which consists of connecting
toponyms extracted and projected on maps. The latter are meant to be
released as an additional feature to the existing digital edition. The Software
and gazetteer will be made available under open-source licenses, for
development files (please refer to the Github repository). The first example
displays unfiltered lines of thought, whereas the second one grounds on a
refined analysis and lets the practical image of a rhizome emerge. While
Die Fackel criticizes mechanical, instrumental language (Hirt, 2002), the
“well-informed” linguistic instruments can help materializing dots or
sequences, but not without “human” intervention. The filtering steps on the
projection echo the hermeneutic circle of the philological tradition; they also
make computed information visible and apprehensible which could remain “blind”
otherwise (Barbaresi, 2012).

This is not an authoritative cartography of Die

Fackel but rather an indirect depiction of the viewpoint of Kraus and his
contemporaries. Drawing on Kraus’ vitriolic recording of political life,
toponyms in Die Fackel tell a story about the ongoing reconfiguration of
Europe. As the map conveys the uncanny sensation that the continent is a field
on which points east and west are projected, the lines of force entangle
European countries and capitals. Their spatial patterns document an inclination
for major cultural centers, whereas the chronological dimension captures a
major shift towards the end of publication: the force field intensifies as its
range narrows, showing both the interplay of major European powers of the time
and the emergence of transatlantic (westwards) and transeuropean (eastwards)
 relationships. This evolution can be read as an intensification of tensions
and a prefiguration of other schemes, this time of military nature.