The open-source, GPL-licensed TXM software provides a classical toolbox for text analysis and mining: a versatile and efficient full-text search engine, text reading and browsing, video playing, sub-corpus and partition building, co-occurrence analysis, factorial analysis, and clustering. It is available as a desktop application for Windows, Mac, or Linux, as well as web portal software that runs on a server and is accessed through a web browser (Heiden, 2010). The TXM platform can be downloaded for free at http://sf.net/projects/txm.
A distinctive feature of TXM is its ability to apply analytic tools to a wide range of encoding formats, from XML-TEI encoded sources to basic raw text, through a high-level graphical user interface, either in the desktop application or in a web portal.
This poster will introduce the recent integration of processing capabilities for two very different modalities of textual data into TXM:
• Written texts associated with their facsimile-scanned images or media files through pagination.
• Speech transcriptions associated with their recordings—video or audio media files—through synchronization.
Each textual modality is managed through the unique pivot XML-TEI TXM source format designed for TXM.
The poster will be accompanied by a live demonstration of:
• Navigation between KWIC concordances and the synoptic display of text editions with facsimile and critical edition containing the occurrence of pivots in the online TXM portal version (see Figure 1).
• Navigation between KWIC concordances and playing the video excerpts corresponding to the occurrence of pivots in the desktop version (see Figure 2).
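The link between a concordance hit and its page or media excerpt can be illustrated with a minimal sketch. This is not TXM's implementation: the `Token` fields (`pos`, `page`, `start`) and the `kwic` function are hypothetical names chosen for illustration, and the sample tokens, tags, and time offsets are invented. The sketch only shows how a 'word followed by a verb' query can return each hit together with the anchor needed to open the matching facsimile page or to play the matching recording excerpt.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    word: str
    pos: str                        # part-of-speech tag (hypothetical tagset)
    page: Optional[str] = None      # hypothetical facsimile page identifier
    start: Optional[float] = None   # hypothetical media offset, in seconds


def kwic(tokens: List[Token], pivot: str, next_pos: str, width: int = 3) -> List[Dict]:
    """Return KWIC lines for `pivot` immediately followed by a token tagged `next_pos`."""
    hits = []
    for i in range(len(tokens) - 1):
        if tokens[i].word == pivot and tokens[i + 1].pos == next_pos:
            hits.append({
                # up to `width` words of context on each side of the match
                "left": " ".join(t.word for t in tokens[max(0, i - width):i]),
                "pivot": f"{tokens[i].word} {tokens[i + 1].word}",
                "right": " ".join(t.word for t in tokens[i + 2:i + 2 + width]),
                # anchors used to jump to the page image or the media excerpt
                "page": tokens[i].page,
                "start": tokens[i].start,
            })
    return hits


# Invented sample: a short time-aligned transcription.
tokens = [
    Token("la", "DET", start=0.0),
    Token("lumière", "NOUN", start=0.4),
    Token("traverse", "VERB", start=0.9),
    Token("le", "DET", start=1.5),
    Token("prisme", "NOUN", start=1.7),
    Token("et", "CCONJ", start=2.3),
    Token("la", "DET", start=2.5),
    Token("lumière", "NOUN", start=2.7),
    Token("blanche", "ADJ", start=3.2),
]

hits = kwic(tokens, "lumière", "VERB")
for hit in hits:
    print(f'{hit["left"]:>15} | {hit["pivot"]} | {hit["right"]}  (t={hit["start"]}s)')
```

Only the first "lumière" matches, because the second is followed by an adjective. For a written text, the same hit record would carry a `page` value instead of a `start` offset.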
Figure 1. Browsing the ‘Queste del saint Graal’ manuscript edition online in a TXM portal. Lower part: A KWIC concordance of the ‘“Lancelot” word followed by a verb’ pattern. Upper part: A synoptic view of the edition of the ‘Queste del saint Graal’ manuscript, composed from left to right of the facsimile image and of three different levels of diplomatic editions, with the sixth concordance hit highlighted in pink in each diplomatic level. The browser used in this screenshot is Firefox. The ‘Queste del saint Graal’ edition can be accessed in a TXM portal at http://txm.bfm-corpus.org/?command=documentation&path=/GRAAL.
Figure 2. Browsing the video and the transcription of dialogs of a college physics course in the TXM desktop version. Lower part: A KWIC concordance of the ‘“lumière” word followed by a verb’ pattern. Upper part: A synoptic view of the edition of the transcription of dialogs of a college physics course (Tiberghien et al., 2012) on the right and the video recording of the course on the left, with the third hit of the concordance highlighted in pink. The desktop TXM used in this scenario is the Linux version.
The demonstration will use both the desktop version and the portal version of TXM.