{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Unit 10\n", "\n", "## Ongoing research related to automatic collation\n", "\n", "In alphabetical order:\n", "\n", "- Andrews: Stemmaweb\n", "- Birnbaum: Multilingual collation\n", "- Carbé: Collating born digital files of the Italian writer Francesco Pecoraro\n", "- Olsson: eXist DB plugin for collation\n", "- Palladino: Intra-language text alignment, iAligner\n", "- Parina: Translations as language contact phenomena\n", "- Rockenberger: The *Ethica Complementoria* – Reconstructing Textual Transmission & Revision in Early Modern Print\n", "- Scacchi: From *Branchie* to *Woobinda*, collations for the twentieth century\n", "- Smith: Collation editor\n", "- Thomas: Humboldt's Kosmos lectures\n", "\n", "\n", "### Stemmaweb\n", "#### Tara Andrews\n", "\n", "---\n", "\n", "### Multilingual collation\n", "#### David J. Birnbaum\n", "\n", "(Collaborative project with Hanne Martine Eckhoff, University of Tromsø)\n", "\n", "##### About the project\n", "\n", "* **Source:** _Codex Suprasliensis_, xi c. Now divided between Warsaw, Ljubljana, and St. 
Petersburg.\n", "* **Language:** Old Church Slavonic (OCS)\n", "* **Contents:** 24 vitae of Christian saints for the month of March and 23 homilies for the triodion cycle of the church year\n", "* **Genre:** Lectionary menaeum (or panegyric), combined with homilies from the movable Easter cycle, most of which were written by or are attributed to John Chrysostom.\n", "* **Edition:** http://suprasliensis.obdurodon.org/\n", "\n", "##### Linguistic annotation\n", "Linguistic annotation tool and repository: TOROT (https://nestor.uit.no/), which is a continuation of PROIEL (http://proiel.github.io/)\n", "\n", "##### Morphological analysis\n", "\n", "\n", "##### Syntactic analysis\n", "\n", "\n", "##### XML export\n", "<token id=\"1686779\"\n", "**form=\"Блаженꙑи\"**\n", "lemma=\"блаженъ\"\n", "**part-of-speech=\"A-\"**\n", "presentation-before=\"\"\n", "morphology=\"-s---mnpwi\"\n", "head-id=\"1686780\"\n", "relation=\"atr\"\n", "presentation-after=\" \"\n", "citation-part=\"8\"\n", "part=\"1\"\n", "folio=\"061\"\n", "side=\"r\"\n", "line=\"16\"\n", "linebreak=\"false\"/>\n", "\n", "##### CollateX input\n", "`{\"t\":\"Блаженꙑи\", \"n\":\"A-\"}`\n", "\n", "##### Collation\n", "\n", "Note the correct alignment of the last four words and the misalignment at the beginning of the line (Greek has an article; OCS doesn't).\n", "\n", "\n", "---\n", "\n", "### Collating born digital files of the Italian writer Francesco Pecoraro\n", "#### Emmanuela Carbé\n", "PAD - Pavia Archivi Digitali is a project of the University of Pavia to collect and preserve born digital materials provided by Italian authors, journalists and cultural personalities. The most difficult acquisition for PAD has been that of Francesco Pecoraro’s archive. 
He debuted with the short-story collection *Dove credi di andare* (Pecoraro 2007), followed by a collection of writings from his blog, *Questa e altre preistorie* (Pecoraro 2009), the poems in *Primordio Vertebrale* (Pecoraro 2011), and the novel *La vita in tempo di pace* (Pecoraro 2013), which became a notable literary case in 2014 and is soon to be published in English, French and Dutch. At the workshop I would like to attempt the collation of some of Pecoraro's born digital drafts of *La vita in tempo di pace*.\n", "\n", "---\n", "\n", "### eXist DB plugin for collation\n", "#### Leif-Jöran Olsson\n", "\n", "---\n", "### Intra-language text alignment: iAligner\n", "#### Chiara Palladino\n", "The presentation introduces an in-development tool for intra-language text alignment,\n", "iAligner (http://i-alignment.com/). The tool uses syntax-based dynamic programming\n", "methods to compute the optimal alignment of two or more parallel texts in the same language.\n", "We will introduce the subject of intra-language text alignment, report briefly on the chosen\n", "algorithm and its modifications, then focus on some use cases from the field of historical\n", "languages, with particular regard to Ancient Greek and Latin.\n", "\n", "---\n", "\n", "### Translations as language contact phenomena: studies in lexical, grammatical and stylistic interference in Middle Welsh religious texts\n", "#### Elena Parina\n", "In our project at the Philipps-Universität Marburg we analyse linguistic properties of religious texts translated into Welsh in the 14th century. These texts are often found in more than one manuscript, so we need to collate them in order to understand the relationship between different text witnesses. Some of the texts have been translated by previous scholars, so a collation with translations into modern languages is also useful for us. 
At the core of our research are tables collating Latin sources and their Middle Welsh translations; this comparison, however, comes with several caveats, which I would like to discuss briefly.\n", "\n", "--- \n", "\n", "### The *Ethica Complementoria* – Reconstructing Textual Transmission & Revision in Early Modern Print\n", "#### Annika Rockenberger\n", "This project is part of a larger editorial endeavor which has a (German) website, http://greflinger.hypotheses.org/\n", "\n", "---\n", "\n", "### From *Branchie* to *Woobinda*: collations for the twentieth century\n", "#### Alessia Scacchi\n", "Participating in the Dixit workshop is somewhat unusual for a scholar of twentieth-century literature, yet I am curious to test this tool, CollateX, on textual material that is too rarely studied from a philological perspective. I arrived in Amsterdam by way of collaborations with Giuseppe Gigliozzi on the automatic study of texts in the nineties and with Domenico Fiormonte for Digitalvariants, and through the comparison and study of textual frequencies for my dissertation and doctorate.\n", "\n", "Here I propose a study that reuses and partly rethinks the textual corpus of a research project begun a few years ago. The goal is to assess whether, to what extent, in what ways, and why twentieth-century printed editions change when authors move from small to mainstream publishers.\n", "\n", "To this end, I analyze two novels born in the postmodern climate: Niccolò Ammaniti's *Branchie* and Aldo Nove's *Woobinda*, two key novels for a generation, whose narrative skill opened a road never travelled before: metropolitan cannibalism, which came to involve a large part of the narrative production of the 1990s.\n", "I want to automate the traditional method I used, but what is the added value, and what are the difficulties in viewing and interpreting textual material processed with CollateX? 
Can this kind of analysis alter the paradigm of literary criticism or not? These are the questions I will try to answer once I have reworked and absorbed the content of the course we are about to follow.\n", "\n", "---\n", "\n", "### Collation editor\n", "#### Catherine Smith\n", "The collation editor is a wrapper around CollateX which provides a graphical user interface allowing editors to work interactively with the output of the collation. It supports drag and drop regularisation/normalisation, allows misalignments to be corrected, allows variant unit length to be set and the readings to be sorted into the order required by the editor. It was originally designed for the creation of the Editio Critica Maior of the Greek New Testament but has also been used by other projects at the University of Birmingham. The code is available on GitHub at www.github.com/itsee-birmingham/collation_editor.\n", "\n", "---\n", "\n", "### Collating witnesses of Humboldt's Kosmos lectures\n", "#### Christian Thomas\n", "I am working with lecture notebooks by attendees of Alexander von Humboldt's so-called 'Kosmos-Lectures', held in two distinct courses in Berlin in 1827/28. They have been published as TEI-XML editions by the Hidden Kosmos project, http://www.culture.hu-berlin.de/hidden-kosmos (Humboldt-University) in cooperation with the Deutsches Textarchiv, http://www.deutschestextarchiv.de/ (Berlin-Brandenburg Academy of Sciences and Humanities). Taken together, the notebooks vary greatly and as such are not suitable material for automated collation. However, certain 'iconic' passages are more similar, as the example below shows, and can therefore be aligned with tools like CollateX. Another set of comparable, in the sense of collatable, witnesses are two notebooks of which one is the copy of the other. Between other notebooks, several individual sessions have been copied, obviously when one student missed a lesson and asked his neighbour for his notes. 
To find these sessions, I have learned to use Copyfind, http://plagiarism.bloomfieldmedia.com/wordpress/software/wcopyfind/ (thanks again, @mhbeals, for this valuable suggestion!). I have worked intensely on the preparation of the TEI-XML-encoded witnesses in order to allow for the best collation results. In the workshop I figured out how to (in theory) work directly with these XML documents. But since this has proven to be beyond my capacities (and comparing 180+ pages of transcribed text is beyond what CollateX can process in good time...), I will instead use an XSLT to produce plain text from my XML and then: collate(my_collation)!\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 1 }