{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Collation outputs\n", "\n", "\n", "- Introduction\n", "- In practice\n", " - Table: plain text\n", " - Table: HTML\n", " - Table: JSON\n", " - Table: XML and XML/TEI\n", " - Graph: SVG\n", "- Exercise\n", "- What's next\n", "\n", "---\n", "\n", "\n", "## Introduction\n", "\n", "In this tutorial we will try different outputs for our collation, meaning different graphical representations, formats and visualizations of the result.\n", "\n", "The visualization of the collation result is an open discussion: several possibilities have been suggested and used, and new ones are always being proposed. When the output of the collation is a printed format, such as a book, it is rare to see anything different from the traditional critical apparatus. Now that output formats are more frequently digital (or at least have a digital component), collation tools tend to offer more than one visualization option. This is the case for both Juxta and CollateX. The different visualizations are not incompatible; on the contrary, they can be complementary, highlighting different aspects of the result and suitable for different users or different stages of the workflow.\n", "\n", "In the previous tutorials we used the alignment table and the graph. The alignment table, in use since the 1960s, is the equivalent of the matrix used in bioinformatics for sequence alignment (for example, of strings of DNA). In contrast, the graph is meant to represent the fluidity of the text and its variation. The idea of a graph-oriented model for expressing textual variance was originally developed by Desmond Schmidt [(2008)](http://multiversiondocs.blogspot.it/2008/03/whats-multi-version-document.html). You can refer to [this video](https://vimeo.com/114242362) for a presentation on *Apparatus vs. 
Graph – an Interface as Scholarly Argument* by Tara Andrews and Joris van Zundert.\n", "Other outputs, such as the histogram and the side-by-side visualization offered by Juxta, allow users to visualize the result of the comparison between two witnesses only. This reflects the way the algorithm is built and shows that the graphical representation is connected with the approach to collation that informs the software.\n", "\n", "CollateX has two main ways to conceive of the collation result: as a **table** (with many different formatting options) and as a **graph**:\n", "- table formats\n", " - plain text table (no need to specify the output)\n", " - HTML table (output='**html**')\n", " - HTML vertical table with colors (output='**html2**')\n", " - JSON (output='**json**')\n", " - XML (output='**xml**')\n", " - XML/TEI (output='**tei**')\n", "- graph format\n", " - SVG (output='**svg**')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## In practice\n", "\n", "Even though we have already encountered some of these outputs, it is worth going through them one more time, focusing on the part of the code that needs to change to produce the different formats.\n", "\n", "### Table: plain text\n", "\n", "In this tutorial we will use some simple texts already used in the previous tutorial: the *fox and dog* example.\n", "\n", "Let's start with the simplest output, for which we don't need to specify any output format (note that you can name the variable containing the output anything you like, but in this tutorial we call it *alignment_table*, *table* or *graph*).\n", "\n", "In the code cell below, the lines starting with a hash (#) are comments and are not executed. They are there in this instance to help you remember what the different parts of the code do. You do not need to use them in your notebook (although sometimes it is helpful to add comments to your code so you remember what things do)." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+---+-----+-------+-------+---------------------+------+------+\n", "| A | The | quick | brown | fox jumped over the | lazy | dog. |\n", "| B | The | - | brown | fox jumped over the | - | dog. |\n", "| C | The | bad | - | fox jumped over the | lazy | dog. |\n", "+---+-----+-------+-------+---------------------+------+------+\n" ] } ], "source": [ "#import the collatex library\n", "from collatex import *\n", "#create an instance of the collateX engine\n", "collation = Collation()\n", "#add witnesses to the collateX instance\n", "collation.add_plain_witness( \"A\", \"The quick brown fox jumped over the lazy dog.\")\n", "collation.add_plain_witness( \"B\", \"The brown fox jumped over the dog.\" )\n", "collation.add_plain_witness( \"C\", \"The bad fox jumped over the lazy dog.\" )\n", "#collate the witnesses and store the result in a variable called 'table'\n", "#as we have not specified an output this will be stored in plain text\n", "table = collate(collation)\n", "#print the collation result\n", "print(table)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Table: HTML\n", "\n", "Now let's try a different output. This time we still want a table format, but instead of plain text we would like it exported in HTML (the markup language used for web pages), and we would like it to be displayed vertically with nice colors to highlight the comparison. To achieve this, all you need to do is add the keyword *output* to the *collate* command and give it the value *html2*." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
A | \n", "B | \n", "C | \n", "
---|---|---|
The | \n", "The | \n", "The | \n", "
quick | \n", "- | \n", "bad | \n", "
brown | \n", "brown | \n", "- | \n", "
fox jumped over the | \n", "fox jumped over the | \n", "fox jumped over the | \n", "
lazy | \n", "- | \n", "lazy | \n", "
dog. | \n", "dog. | \n", "dog. | \n", "
The