Research projects often create domain-specific annotation vocabularies. This poster presents approaches to the modelling and migration of encoded charter data that arose while bringing the Charters Encoding Initiative (CEI: www.cei.lmu.de) into compliance with the current version of the Text Encoding Initiative (TEI P5: www.tei-c.org/). The work is part of a project to migrate and enhance encoded charter descriptions from the virtual charter platform monasterium.net in order to provide a well-documented, reusable environment that prolongs the data life cycle (cf. Buddenbohm et al. 2016; Wissik, Ďurčo 2015; Büttner et al. 2011; Flanders, Muñoz 2015). The migration follows established principles of data sustainability and interoperability (cf. Cremer et al. 2015). It relies on ingesting data from monasterium.net into GAMS (Geisteswissenschaftliches Asset Management System), a Fedora-Commons-based, certified trusted digital repository at the University of Graz that handles preservation and publication. GAMS also provides data visibility, unique handle references (handle.net), and interoperable data via OAI-PMH (Stigler / Steiner 2018).
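To make the OAI-PMH provision mentioned above more concrete, the following minimal Python sketch builds a standard ListRecords request and reads record identifiers from a response. The verb, the metadataPrefix parameter, and the response namespace come from the OAI-PMH 2.0 protocol; the base URL, the sample identifier, and the function names are hypothetical and do not describe the actual GAMS endpoint.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def record_identifiers(xml_text):
    """Return the header identifiers found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}identifier" % OAI_NS)]


# Invented sample response, for illustration only (not real repository output).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:charter-1</identifier></header>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical base URL; a real harvester would fetch this URL over HTTP.
    print(list_records_url("https://repository.example.org/oai"))
    print(record_identifiers(SAMPLE_RESPONSE))
```

A harvester would fetch the generated URL and follow resumption tokens for large result sets; both steps are omitted here to keep the sketch self-contained.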
To achieve this, a new data model extension had to be developed in order to support both scholarly needs and the careful curation of data. The project evaluated which concepts from the charter domain are of wider importance. The new TEI P5 extension for charter-specific data, based upon the existing CEI, has to structure the data in a context-neutral manner that supports the encoding of charters from diverse historical periods and regions using diplomatic TEI markup (Vogeler 2018), including Ethiopian royal acts (Wion 2018), Nepalese charters from Mustang (Ramble 2018), and early modern grants of arms, e.g. from Marburg (1581-12-14_Marburg). This breadth is what justifies the extension. It includes elements new to the TEI to model:
Domain-specific annotation can additionally be achieved through structured ontologies, e.g. describing methods of authentication, types of manuscript additions, and heraldic blazonry, which support the typing and subtyping of data. This improves the potential for semantic reuse in line with the principles of Linked Open Data (LOD).
Faced with heterogeneous data from a variety of sources (direct entry by archives, web scraping, digitization of catalogues, and carefully hand-crafted born-digital description), the project involves a series of transformations in which charter encoding is re-imagined and the CEI is re-modelled and transformed to TEI P5 in a context-sensitive manner (see Ambrosio et al. 2014). This entails:
The new data model will be tested through the development of facet-based search, predefined queries, and visualizations based on the scholarly needs of the target audience: scholars of diplomatics, legal history, and art history. All of this is part of a long-term project to develop tools that enable end users such as archives and individual scholars, as well as the repository monasterium.net, to describe historical legal data in a structured manner that is semantically interoperable with other historical data.
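As a rough illustration of the CEI-to-TEI P5 re-modelling described above, the following Python sketch moves elements into the TEI namespace when their names are assumed to carry over unchanged. It reports every other CEI element for manual modelling. This is not the project's actual transformation: the CEI namespace URI, the set of shared element names, and the sample fragment are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

CEI_NS = "http://www.monasterium.net/NS/cei"  # assumed CEI namespace URI
TEI_NS = "http://www.tei-c.org/ns/1.0"        # TEI P5 namespace

# Illustrative assumption: local names taken to mean the same in CEI and TEI.
SHARED_NAMES = {"p", "persName", "placeName", "date"}


def migrate(root):
    """Rename shared CEI elements to TEI in place.

    Returns the sorted local names of CEI elements that were left untouched
    because they need a modelling decision rather than a simple rename.
    """
    prefix = "{" + CEI_NS + "}"
    unmapped = set()
    for el in root.iter():
        if isinstance(el.tag, str) and el.tag.startswith(prefix):
            local = el.tag[len(prefix):]
            if local in SHARED_NAMES:
                el.tag = "{" + TEI_NS + "}" + local
            else:
                unmapped.add(local)
    return sorted(unmapped)


# Invented sample fragment, for illustration only.
SAMPLE = (
    '<cei:abstract xmlns:cei="http://www.monasterium.net/NS/cei">'
    "<cei:persName>N. N.</cei:persName> receives a grant of arms in "
    "<cei:placeName>Marburg</cei:placeName>."
    "</cei:abstract>"
)

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print("left for manual modelling:", migrate(root))
    print(ET.tostring(root, encoding="unicode"))
```

Leaving unmapped elements in their original namespace, rather than guessing a TEI equivalent, keeps the charter-specific concepts visible so that each can be assigned deliberately to an existing TEI element or to the new extension.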