The main goal of the correspSearch web service (https://www.correspsearch.net; for a general overview see Dumont, 2016)
Gathering a large quantity of suitable metadata about published letters is both the precondition for and one of the basic functions of the correspSearch web service. The format used in a data aggregation project such as ours has to fulfill a wide array of requirements, ranging from interoperability with existing standards to ease of use and a straightforward creation process. The TEI Correspondence SIG (https://www.tei-c.org/activities/sig/correspondence/; on the CMI format see https://correspsearch.net/index.xql?id=participate_cmi-format&l=de)
Since the aggregated data is stored in a decentralized manner with each edition, participating editions and data providers retain full ownership of and control over their data. Based on the TEI-XML standard, the Correspondence Metadata Interchange Format (CMIF) follows strict guidelines and uses the strengths of XML to prevent faulty or ambiguous data while taking the heterogeneity of metadata into account. Furthermore, CMIF requires the use of authority files for names and places in order to account for their ambiguity and linguistic heterogeneity (see Stadler, 2012). All in all, CMIF provides a model for describing correspondence in an easy and rather flat hierarchical way, while relying on a strict ruleset in favor of standardized and machine-processable data.
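The flat structure described above can be illustrated with a minimal sketch that builds a single TEI correspDesc entry in Python. The element and attribute names (correspDesc, correspAction, persName, placeName, date) are standard TEI; the helper functions, the dictionary keys, and all names, URLs and authority identifiers in the example letter are hypothetical placeholders rather than real records, and a complete CMIF file needs further header information that is omitted here.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_corresp_desc(letter):
    """Build one <correspDesc> element from a plain dict describing a letter."""
    desc = ET.Element(tei("correspDesc"), {"ref": letter["url"]})

    # Sending action: who wrote the letter, where, and when.
    sent = ET.SubElement(desc, tei("correspAction"), {"type": "sent"})
    sender = ET.SubElement(sent, tei("persName"), {"ref": letter["sender_uri"]})
    sender.text = letter["sender"]
    place = ET.SubElement(sent, tei("placeName"), {"ref": letter["place_uri"]})
    place.text = letter["place"]
    ET.SubElement(sent, tei("date"), {"when": letter["date"]})

    # Receiving action: the addressee, again identified by an authority URI.
    received = ET.SubElement(desc, tei("correspAction"), {"type": "received"})
    addressee = ET.SubElement(received, tei("persName"), {"ref": letter["addressee_uri"]})
    addressee.text = letter["addressee"]
    return desc


# Hypothetical letter: every value below is a placeholder. The example.org
# URIs stand in for real authority file URIs (e.g. GND or GeoNames).
example_letter = {
    "url": "https://example.org/edition/letter/1",
    "sender": "Example Sender",
    "sender_uri": "https://example.org/authority/person/1",
    "place": "Example Town",
    "place_uri": "https://example.org/authority/place/1",
    "date": "1800-01-01",
    "addressee": "Example Addressee",
    "addressee_uri": "https://example.org/authority/person/2",
}

corresp_desc = build_corresp_desc(example_letter)
print(ET.tostring(corresp_desc, encoding="unicode"))
```

Because persons and places are referenced by URI rather than by name alone, two editions that spell a name differently still point to the same entity.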
As the metadata aggregated by our service is licensed under CC BY, it remains open and thus stands for an Open Access-based approach to correspondence metadata aggregation. Open data provides easier ground for research: a much larger data pool becomes available, and licensing does not have to be clarified beforehand. The analysis of a specific network of letters, as for example in csLink, can only benefit from this approach. In addition, and in accordance with the licensing of the aggregated data, all software developed by correspSearch is published as Free Software.
With the CMIF Creator (https://correspsearch.net/creator/index.xql, https://github.com/correspSearch/CMIF-Creator) we offer a clean graphical user interface for entering available metadata and transforming the entered data into valid XML. Lowering the barrier to generating and contributing data has been a key factor in the lively external data contributions of recent years. As the CMIF Creator is implemented as a browser-based application, all entered and processed data is saved locally and thus remains under the user's control at all times. As CMIF relies heavily on authority control data, authority file data for names and places can be acquired directly from GND (https://www.dnb.de/, https://lobid.org/gnd/api) and GeoNames (https://www.geonames.org/). The CMIF Creator also offers a validation service as well as the option to save drafts locally in JSON format so that work can be continued at any time. The final CMIF files can then be provided for aggregation into the correspSearch database. The benefits of using the CMIF Creator are evident from our practical experience. Besides the low barrier to data entry (no prior knowledge of or experience with TEI-XML is required), the average time a student assistant needs to enter the metadata of a single letter from a printed letter edition is approximately 30 to 60 seconds, depending on how much disambiguation of authority control data is needed. Thus, large quantities of letters can be processed in a reasonable amount of time. The output format is standardized and does not deviate from TEI specifications, which reduces the incidence of errors in the final XML.
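To make the reported entry time more tangible, the following short calculation converts the stated 30 to 60 seconds per letter into working hours. The edition size of 1,000 letters is a hypothetical figure chosen purely for illustration.

```python
def processing_hours(letter_count, seconds_per_letter):
    """Convert a per-letter entry time into total working hours."""
    return letter_count * seconds_per_letter / 3600


letters = 1000  # hypothetical edition size, not a figure from the project
low = processing_hours(letters, 30)   # fast case: 30 seconds per letter
high = processing_hours(letters, 60)  # slow case: 60 seconds per letter
print(f"{letters} letters: {low:.1f} to {high:.1f} hours")
```

Under these assumptions, the metadata of a 1,000-letter printed edition could be entered in roughly 8 to 17 working hours.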
The open data approach also has analytical potential, which we pursue in the development of the application csLink (https://github.com/correspSearch/csLink). csLink is a widget that can be included in the websites of existing digital editions of letters. It establishes a “network of letters”, displaying the correspondence network of a given person and reaching beyond the scope of a single edition of letters. Customized through optional parameters, csLink provides a list of other letters from the same network of letters as well as a list of persons who are part of the wider correspondence network. Because it relies on CC BY-licensed metadata, the widget is available to anyone interested in such a visual representation of a person's correspondence network. Being able to acquire a visual impression of different letter networks and the corresponding persons offers immediate epistemological gains in the study of complex network relationships. Utilizing the aggregated metadata enables csLink to situate the letters of a single edition within a wider context of correspondence and communication. Examples of csLink in use are the digital editions humboldt digital (https://edition-humboldt.de) and Weber Gesamtausgabe (https://weber-gesamtausgabe.de/de/Index).
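The idea of a network of letters that reaches across editions can be sketched in a few lines of Python. This is not the csLink implementation; it only illustrates, under simplifying assumptions, how aggregated records that share authority identifiers can be combined. The record layout, the function names and all identifiers below are hypothetical.

```python
from collections import Counter


def letters_of(person_uri, letters):
    """Return all letters that the given person sent or received."""
    return [
        letter for letter in letters
        if person_uri in (letter["sender"], letter["addressee"])
    ]


def correspondents_of(person_uri, letters):
    """Count how many letters each other person exchanged with the given person."""
    counts = Counter()
    for letter in letters_of(person_uri, letters):
        if letter["sender"] == person_uri:
            counts[letter["addressee"]] += 1
        else:
            counts[letter["sender"]] += 1
    return counts


# Hypothetical aggregated records from two editions; "person:A" etc. stand in
# for authority file URIs shared by both editions.
aggregated = [
    {"edition": "edition-1", "sender": "person:A", "addressee": "person:B", "date": "1800-01-01"},
    {"edition": "edition-2", "sender": "person:C", "addressee": "person:A", "date": "1800-02-01"},
    {"edition": "edition-2", "sender": "person:C", "addressee": "person:D", "date": "1800-03-01"},
    {"edition": "edition-1", "sender": "person:B", "addressee": "person:A", "date": "1800-04-01"},
]

network_letters = letters_of("person:A", aggregated)
network_persons = correspondents_of("person:A", aggregated)
print(len(network_letters), dict(network_persons))
```

Because both editions identify the person by the same authority identifier, the letter from edition-2 appears in the network even when a reader starts from edition-1.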
In contrast to closed data services, the open data approach of CMIF, which enables any edition (printed or digital) to provide its own correspondence metadata, is highly beneficial for mapping wider networks of communication. Any letter network to be explored (for example in csLink) shows only the information already entered into correspSearch, drawn from letters that have been edited and published in some form. A larger base of contributors committing reliable data therefore substantially improves the database and the reliability of the epistemological gains drawn from it (see Grüntgens/Schrade, 2016). The development and use of the CMIF Creator has proven very valuable for increasing the amount of aggregated metadata. Further additions of data and growing connections between the letters in our data form a kind of crowd-based, validated open metadata that does not rely on a single contributor or institution. This is beneficial not only for network analysis in a strict analytical sense; it also contributes to an agenda of further establishing open data principles in the digital humanities.
As a leading provider for the decentralized aggregation of metadata from letter editions, with the purpose of facilitating research on letter editions and correspondence networks on the basis of a standardised and open XML format, correspSearch, together with CMIF and correspDesc, was awarded the Rahtz Prize for TEI Ingenuity 2018.