{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"right\" src=\"images/tf-small.png\" width=\"128\"/>\n",
    "<img align=\"right\" src=\"images/etcbc.png\"/>\n",
    "<img align=\"right\" src=\"images/dans-small.png\"/>\n",
    "\n",
    "You might want to consider the [start](search.ipynb) of this tutorial.\n",
    "\n",
    "Short introductions to other TF datasets:\n",
    "\n",
    "* [Dead Sea Scrolls](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/dss.ipynb),\n",
    "* [Old Babylonian Letters](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/oldbabylonian.ipynb),\n",
    "or the\n",
    "* [Quran](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/quran.ipynb)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Annotation outside TF\n",
    "\n",
    "Task:\n",
    "\n",
    "* prepare a text file based on TF data.\n",
    "* annotate the text file by assigning values to pieces of text\n",
    "* generate TF features based on these annotations\n",
    "\n",
    "We use a device in Text-Fabric that has been developed for this kind of round-trip:\n",
    "the [Recorder](https://annotation.github.io/text-fabric/tf/convert/recorder.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Incantation\n",
    "\n",
    "The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are\n",
    "explained in the [start tutorial](start.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-05-24T10:06:39.818664Z",
     "start_time": "2018-05-24T10:06:39.796588Z"
    }
   },
   "outputs": [],
   "source": [
    "from tf.app import use\n",
    "from tf.convert.recorder import Recorder"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "**Locating corpus resources ...**"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">app:</b> <span title=\"rv1.8=#gd905e3fb6e80d0fa537600337614adc2af157309 offline under /Users/me/text-fabric-data/github\">~/text-fabric-data/github/ETCBC/bhsa/app</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">data:</b> <span title=\"rv1.8=#gd905e3fb6e80d0fa537600337614adc2af157309 offline under /Users/me/text-fabric-data/github\">~/text-fabric-data/github/ETCBC/bhsa/tf/2021</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">data:</b> <span title=\"rv2.1=#gaba4367b49750089e4e4122415a77cac43bd97bc offline under /Users/me/text-fabric-data/github\">~/text-fabric-data/github/ETCBC/phono/tf/2021</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">data:</b> <span title=\"rv2.1=#gf45f6cc3c4f933dba6e649f49cdb14a40dcf333f offline under /Users/me/text-fabric-data/github\">~/text-fabric-data/github/ETCBC/parallels/tf/2021</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "            <b>Text-Fabric:</b> <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/tf/cheatsheet.html\" title=\"text-fabric-api\">Text-Fabric API 12.0.4</a>, <a target=\"_blank\" href=\"https://github.com/ETCBC/bhsa/blob/master/app\" title=\"ETCBC/bhsa app\">ETCBC/bhsa/app  v3</a>, <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/tf/about/searchusage.html\" title=\"Search Templates Introduction and Reference\">Search Reference</a><br>\n",
       "            <b>Data:</b> <a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/\" title=\"provenance of BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\">ETCBC - bhsa 2021</a>, <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/tf/writing/hebrew.html\" title=\"How TF features represent text\">Character table</a>, <a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/0_home\" title=\"ETCBC - bhsa feature documentation\">Feature docs</a><br>\n",
       "            <details class=\"nodeinfo\"><summary><b>Node types</b></summary>\n",
       "<table class=\"nodeinfo\">\n",
       "    <tr>\n",
       "        <th>Name</th>\n",
       "        <th># of nodes</th>\n",
       "        <th># slots/node</th>\n",
       "        <th>% coverage</th>\n",
       "    </tr>\n",
       "\n",
       "<tr>\n",
       "    <th>book</th>\n",
       "    <td>39</td>\n",
       "    <td>10938.21</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>chapter</th>\n",
       "    <td>929</td>\n",
       "    <td>459.19</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>lex</th>\n",
       "    <td>9230</td>\n",
       "    <td>46.22</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>verse</th>\n",
       "    <td>23213</td>\n",
       "    <td>18.38</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>half_verse</th>\n",
       "    <td>45179</td>\n",
       "    <td>9.44</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>sentence</th>\n",
       "    <td>63717</td>\n",
       "    <td>6.70</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>sentence_atom</th>\n",
       "    <td>64514</td>\n",
       "    <td>6.61</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>clause</th>\n",
       "    <td>88131</td>\n",
       "    <td>4.84</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>clause_atom</th>\n",
       "    <td>90704</td>\n",
       "    <td>4.70</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>phrase</th>\n",
       "    <td>253203</td>\n",
       "    <td>1.68</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>phrase_atom</th>\n",
       "    <td>267532</td>\n",
       "    <td>1.59</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th>subphrase</th>\n",
       "    <td>113850</td>\n",
       "    <td>1.42</td>\n",
       "    <td>38</td>\n",
       "</tr>\n",
       "\n",
       "<tr>\n",
       "    <th><i>word</i></th>\n",
       "    <td>426590</td>\n",
       "    <td>1.00</td>\n",
       "    <td><b>100</b></td>\n",
       "</tr>\n",
       "</table></details>\n",
       "            <b>Sets:</b> no custom sets<br>\n",
       "            <b>Features:</b><br>\n",
       "<details><summary><b>Parallel Passages</b></summary>\n",
       "    <div class=\"fcorpus\">\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat edge\">\n",
       "<a target=\"_blank\" href=\"https://nbviewer.jupyter.org/github/ETCBC/parallels/blob/master/programs/parallels.ipynb\" title=\"~/text-fabric-data/github/ETCBC/parallels/tf/2021/crossref.tf\">crossref</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> 🆗 links between similar passages</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "    </div>\n",
       "</details>\n",
       "\n",
       "<details><summary><b>BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis</b></summary>\n",
       "    <div class=\"fcorpus\">\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/book\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/book.tf\">book</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ book name in Latin (Genesis; Numeri; Reges1; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/book@ll\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/book@am.tf\">book@ll</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ book name in amharic (ኣማርኛ)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/chapter\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/chapter.tf\">chapter</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> ✅ chapter number (1; 2; 3; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/code\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/code.tf\">code</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> ✅ identifier of a clause atom relationship (0; 74; 367; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/det\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/det.tf\">det</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ determinedness of phrase(atom) (det; und; NA.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/domain\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/domain.tf\">domain</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/freq_lex\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/freq_lex.tf\">freq_lex</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> ✅ frequency of lexemes</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/function\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/function.tf\">function</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/g_cons\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/g_cons.tf\">g_cons</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/g_cons_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/g_cons_utf8.tf\">g_cons_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/g_lex\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/g_lex.tf\">g_lex</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/g_lex_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/g_lex_utf8.tf\">g_lex_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/g_word\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/g_word.tf\">g_word</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/g_word_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/g_word_utf8.tf\">g_word_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/gloss\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/gloss.tf\">gloss</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> 🆗 english translation of lexeme (beginning create god(s))</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/gn\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/gn.tf\">gn</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ grammatical gender (m; f; NA; unknown.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/label\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/label.tf\">label</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ (half-)verse label (half verses: A; B; C; verses:  GEN 01,02)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/language\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/language.tf\">language</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ of word or lexeme (Hebrew; Aramaic.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/lex\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/lex.tf\">lex</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/lex_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/lex_utf8.tf\">lex_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/ls\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/ls.tf\">ls</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ lexical set, subclassification of part-of-speech (card; ques; mult)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/nametype\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/nametype.tf\">nametype</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ⚠️ named entity type (pers; mens; gens; topo; ppde.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/nme\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/nme.tf\">nme</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/nu\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/nu.tf\">nu</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ grammatical number (sg; du; pl; NA; unknown.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/number\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/number.tf\">number</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> ✅ sequence number of an object within its context</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/otype\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/otype.tf\">otype</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> </span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/pargr\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/pargr.tf\">pargr</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/pdp\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/pdp.tf\">pdp</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/pfm\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/pfm.tf\">pfm</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ preformative consonantal-transliterated (absent; n/a; J, ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/prs\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/prs.tf\">prs</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/prs_gn\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/prs_gn.tf\">prs_gn</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ pronominal suffix gender (m; f; NA; unknown.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/prs_nu\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/prs_nu.tf\">prs_nu</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ pronominal suffix number (sg; du; pl; NA; unknown.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/prs_ps\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/prs_ps.tf\">prs_ps</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ pronominal suffix person (p1; p2; p3; NA; unknown.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/ps\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/ps.tf\">ps</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ grammatical person (p1; p2; p3; NA; unknown.)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/qere\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/qere.tf\">qere</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ word pointed-transliterated masoretic reading correction</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/qere_trailer\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/qere_trailer.tf\">qere_trailer</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ interword material -pointed-transliterated (Masoretic correction)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/qere_trailer_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/qere_trailer_utf8.tf\">qere_trailer_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ interword material -pointed-transliterated (Masoretic correction)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/qere_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/qere_utf8.tf\">qere_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ word pointed-Hebrew masoretic reading correction</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/rank_lex\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/rank_lex.tf\">rank_lex</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> ✅ ranking of lexemes based on freqnuecy</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/rela\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/rela.tf\">rela</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/sp\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/sp.tf\">sp</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ part-of-speech (art; verb; subs; nmpr, ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/st\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/st.tf\">st</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ state of a noun (a (absolute); c (construct); e (emphatic).)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/tab\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/tab.tf\">tab</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> ✅ clause atom: its level in the linguistic embedding</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/trailer\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/trailer.tf\">trailer</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ interword material pointed-transliterated (& 00 05 00_P ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/trailer_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/trailer_utf8.tf\">trailer_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ interword material pointed-Hebrew (־ ׃)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/txt\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/txt.tf\">txt</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/typ\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/typ.tf\">typ</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/uvf\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/uvf.tf\">uvf</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/vbe\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/vbe.tf\">vbe</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ verbal ending consonantal-transliterated (n/a; W; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/vbs\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/vbs.tf\">vbs</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ root formation consonantal-transliterated (absent; n/a; H; ...)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/verse\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/verse.tf\">verse</a>\n",
       "</div>\n",
       "<div class=\"fmono\">int</div>\n",
       "\n",
       "<span> ✅ verse number</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/voc_lex\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/voc_lex.tf\">voc_lex</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/voc_lex_utf8\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/voc_lex_utf8.tf\">voc_lex_utf8</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/vs\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/vs.tf\">vs</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ verbal stem (qal; piel; hif; apel; pael)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/vt\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/vt.tf\">vt</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> ✅ verbal tense (perf; impv; wayq; infc)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat edge\">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/mother\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/mother.tf\">mother</a>\n",
       "</div>\n",
       "<div class=\"fmono\">none</div>\n",
       "\n",
       "<span> ✅ linguistic dependency between textual objects</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat edge\">\n",
       "<a target=\"_blank\" href=\"https://ETCBC.github.io/bhsa/features/oslots\" title=\"~/text-fabric-data/github/ETCBC/bhsa/tf/2021/oslots.tf\">oslots</a>\n",
       "</div>\n",
       "<div class=\"fmono\">none</div>\n",
       "\n",
       "<span> </span>\n",
       "\n",
       "</div>\n",
       "\n",
       "    </div>\n",
       "</details>\n",
       "\n",
       "<details><summary><b>Phonetic Transcriptions</b></summary>\n",
       "    <div class=\"fcorpus\">\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb\" title=\"~/text-fabric-data/github/ETCBC/phono/tf/2021/phono.tf\">phono</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "<div class=\"frow\">\n",
       "    <div class=\"fnamecat \">\n",
       "<a target=\"_blank\" href=\"https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb\" title=\"~/text-fabric-data/github/ETCBC/phono/tf/2021/phono_trailer.tf\">phono_trailer</a>\n",
       "</div>\n",
       "<div class=\"fmono\">str</div>\n",
       "\n",
       "<span> 🆗 interword material in phonological transcription</span>\n",
       "\n",
       "</div>\n",
       "\n",
       "    </div>\n",
       "</details>\n",
       "\n",
       "            <b>Settings:</b><br><details ><summary><b>specified</b></summary><ol><li><b>apiVersion</b>: <code>3</code></li><li><b>appName</b>: <code>ETCBC/bhsa</code></li><li><b>appPath</b>: <code>/Users/me/text-fabric-data/github/ETCBC/bhsa/app</code></li><li><b>commit</b>: <code>gd905e3fb6e80d0fa537600337614adc2af157309</code></li><li><b>css</b>: <code>''</code></li><li><details><summary><b>dataDisplay</b>:</summary><ul><li><details><summary><b>exampleSectionHtml</b>:</summary><code>&lt;code>Genesis 1:1&lt;/code> (use &lt;a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names&lt;/a>)</code></details></li><li><details><summary><b>excludedFeatures</b>:</summary><ul><li><code>g_uvf_utf8</code></li><li><code>g_vbs</code></li><li><code>kq_hybrid</code></li><li><code>languageISO</code></li><li><code>g_nme</code></li><li><code>lex0</code></li><li><code>is_root</code></li><li><code>g_vbs_utf8</code></li><li><code>g_uvf</code></li><li><code>dist</code></li><li><code>root</code></li><li><code>suffix_person</code></li><li><code>g_vbe</code></li><li><code>dist_unit</code></li><li><code>suffix_number</code></li><li><code>distributional_parent</code></li><li><code>kq_hybrid_utf8</code></li><li><code>crossrefSET</code></li><li><code>instruction</code></li><li><code>g_prs</code></li><li><code>lexeme_count</code></li><li><code>rank_occ</code></li><li><code>g_pfm_utf8</code></li><li><code>freq_occ</code></li><li><code>crossrefLCS</code></li><li><code>functional_parent</code></li><li><code>g_pfm</code></li><li><code>g_nme_utf8</code></li><li><code>g_vbe_utf8</code></li><li><code>kind</code></li><li><code>g_prs_utf8</code></li><li><code>suffix_gender</code></li><li><code>mother_object_type</code></li></ul></details></li><li><details><summary><b>noneValues</b>:</summary><ul><li><code>none</code></li><li><code>unknown</code></li><li><i>no value</i></li><li><code>NA</code></li></ul></details></li></ul></details></li><li><details><summary><b>docs</b>:</summary><ul><li><b>docBase</b>: <code>{docRoot}/{repo}</code></li><li><b>docExt</b>: <code>''</code></li><li><b>docPage</b>: <code>''</code></li><li><b>docRoot</b>: <code>https://{org}.github.io</code></li><li><b>featurePage</b>: <code>0_home</code></li></ul></details></li><li><b>interfaceDefaults</b>: <code>{}</code></li><li><b>isCompatible</b>: <code>True</code></li><li><b>local</b>: <code>local</code></li><li><b>localDir</b>: <code>/Users/me/text-fabric-data/github/ETCBC/bhsa/_temp</code></li><li><details><summary><b>provenanceSpec</b>:</summary><ul><li><b>corpus</b>: <code>BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis</code></li><li><b>doi</b>: <code>10.5281/zenodo.1007624</code></li><li><details><summary><b>moduleSpecs</b>:</summary><ul><li><details><summary>:</summary><ul><li><b>backend</b>: <i>no value</i></li><li><b>corpus</b>: <code>Phonetic Transcriptions</code></li><li><details><summary><b>docUrl</b>:</summary><code>https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb</code></details></li><li><b>doi</b>: <code>10.5281/zenodo.1007636</code></li><li><b>org</b>: <code>ETCBC</code></li><li><b>relative</b>: <code>/tf</code></li><li><b>repo</b>: <code>phono</code></li></ul></details></li><li><details><summary>:</summary><ul><li><b>backend</b>: <i>no value</i></li><li><b>corpus</b>: <code>Parallel Passages</code></li><li><details><summary><b>docUrl</b>:</summary><code>https://nbviewer.jupyter.org/github/ETCBC/parallels/blob/master/programs/parallels.ipynb</code></details></li><li><b>doi</b>: <code>10.5281/zenodo.1007642</code></li><li><b>org</b>: <code>ETCBC</code></li><li><b>relative</b>: <code>/tf</code></li><li><b>repo</b>: <code>parallels</code></li></ul></details></li></ul></details></li><li><b>org</b>: <code>ETCBC</code></li><li><b>relative</b>: <code>/tf</code></li><li><b>repo</b>: <code>bhsa</code></li><li><b>version</b>: <code>2021</code></li><li><b>webBase</b>: <code>https://shebanq.ancient-data.org/hebrew</code></li><li><b>webHint</b>: <code>Show this on SHEBANQ</code></li><li><b>webLang</b>: <code>la</code></li><li><b>webLexId</b>: <code>True</code></li><li><details><summary><b>webUrl</b>:</summary><code>{webBase}/text?book=&lt;1>&amp;chapter=&lt;2>&amp;verse=&lt;3>&amp;version={version}&amp;mr=m&amp;qw=q&amp;tp=txt_p&amp;tr=hb&amp;wget=v&amp;qget=v&amp;nget=vt</code></details></li><li><b>webUrlLex</b>: <code>{webBase}/word?version={version}&amp;id=&lt;lid></code></li></ul></details></li><li><b>release</b>: <code>v1.8</code></li><li><details><summary><b>typeDisplay</b>:</summary><ul><li><details><summary><b>clause</b>:</summary><ul><li><b>label</b>: <code>{typ} {rela}</code></li><li><b>style</b>: <code>''</code></li></ul></details></li><li><details><summary><b>clause_atom</b>:</summary><ul><li><b>hidden</b>: <code>True</code></li><li><b>label</b>: <code>{code}</code></li><li><b>level</b>: <code>1</code></li><li><b>style</b>: <code>''</code></li></ul></details></li><li><details><summary><b>half_verse</b>:</summary><ul><li><b>hidden</b>: <code>True</code></li><li><b>label</b>: <code>{label}</code></li><li><b>style</b>: <code>''</code></li><li><b>verselike</b>: <code>True</code></li></ul></details></li><li><details><summary><b>lex</b>:</summary><ul><li><b>featuresBare</b>: <code>gloss</code></li><li><b>label</b>: <code>{voc_lex_utf8}</code></li><li><b>lexOcc</b>: <code>word</code></li><li><b>style</b>: <code>orig</code></li><li><b>template</b>: <code>{voc_lex_utf8}</code></li></ul></details></li><li><details><summary><b>phrase</b>:</summary><ul><li><b>label</b>: <code>{typ} {function}</code></li><li><b>style</b>: <code>''</code></li></ul></details></li><li><details><summary><b>phrase_atom</b>:</summary><ul><li><b>hidden</b>: <code>True</code></li><li><b>label</b>: <code>{typ} {rela}</code></li><li><b>level</b>: <code>1</code></li><li><b>style</b>: <code>''</code></li></ul></details></li><li><details><summary><b>sentence</b>:</summary><ul><li><b>label</b>: <code>{number}</code></li><li><b>style</b>: <code>''</code></li></ul></details></li><li><details><summary><b>sentence_atom</b>:</summary><ul><li><b>hidden</b>: <code>True</code></li><li><b>label</b>: <code>{number}</code></li><li><b>level</b>: <code>1</code></li><li><b>style</b>: <code>''</code></li></ul></details></li><li><details><summary><b>subphrase</b>:</summary><ul><li><b>hidden</b>: <code>True</code></li><li><b>label</b>: <code>{number}</code></li><li><b>style</b>: <code>''</code></li></ul></details></li><li><details><summary><b>word</b>:</summary><ul><li><b>features</b>: <code>pdp vs vt</code></li><li><b>featuresBare</b>: <code>lex:gloss</code></li></ul></details></li></ul></details></li><li><b>writing</b>: <code>hbo</code></li></ol></details>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<style>tr.tf.ltr, td.tf.ltr, th.tf.ltr { text-align: left ! important;}\n",
       "tr.tf.rtl, td.tf.rtl, th.tf.rtl { text-align: right ! important;}\n",
       "@font-face {\n",
       "  font-family: \"Gentium Plus\";\n",
       "  src: local('Gentium Plus'), local('GentiumPlus'),\n",
       "    url('/browser/static/fonts/GentiumPlus-R.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/GentiumPlus-R.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"Ezra SIL\";\n",
       "  src: local('Ezra SIL'), local('EzraSIL'),\n",
       "    url('/browser/static/fonts/SILEOT.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/SILEOT.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"SBL Hebrew\";\n",
       "  src: local('SBL Hebrew'), local('SBLHebrew'),\n",
       "    url('/browser/static/fonts/SBL_Hbrw.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/SBL_Hbrw.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"Estrangelo Edessa\";\n",
       "  src: local('Estrangelo Edessa'), local('EstrangeloEdessa');\n",
       "    url('/browser/static/fonts/SyrCOMEdessa.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/SyrCOMEdessa.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: AmiriQuran;\n",
       "  font-style: normal;\n",
       "  font-weight: 400;\n",
       "  src: local('Amiri Quran'), local('AmiriQuran'),\n",
       "    url('/browser/static/fonts/AmiriQuran.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/AmiriQuran.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: AmiriQuranColored;\n",
       "  font-style: normal;\n",
       "  font-weight: 400;\n",
       "  src: local('Amiri Quran Colored'), local('AmiriQuranColored'),\n",
       "    url('/browser/static/fonts/AmiriQuranColored.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/AmiriQuranColored.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"Santakku\";\n",
       "  src: local('Santakku'),\n",
       "    url('/browser/static/fonts/Santakku.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/Santakku.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"SantakkuM\";\n",
       "  src: local('SantakkuM'),\n",
       "    url('/browser/static/fonts/SantakkuM.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/browser/static/fonts/SantakkuM.woff?raw=true') format('woff');\n",
       "}\n",
       "/* bypassing some classical notebook settings */\n",
       "div#notebook {\n",
       "  line-height: unset;\n",
       "}\n",
       "/* neutral text */\n",
       ".txtn,.txtn a:visited,.txtn a:link {\n",
       "    font-family: sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    unicode-bidi: embed;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* transcription text */\n",
       ".txtt,.txtt a:visited,.txtt a:link {\n",
       "    font-family: monospace;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    unicode-bidi: embed;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* source text */\n",
       ".txto,.txto a:visited,.txto a:link {\n",
       "    font-family: serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    unicode-bidi: embed;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* phonetic text */\n",
       ".txtp,.txtp a:visited,.txtp a:link {\n",
       "    font-family: Gentium, sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    unicode-bidi: embed;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* original script text */\n",
       ".txtu,.txtu a:visited,.txtu a:link {\n",
       "    font-family: Gentium, sans-serif;\n",
       "    font-size: medium;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* hebrew */\n",
       ".txtu.hbo,.lex.hbo {\n",
       "    font-family: \"Ezra SIL\", \"SBL Hebrew\", sans-serif;\n",
       "    font-size: large;\n",
       "    direction: rtl ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* syriac */\n",
       ".txtu.syc,.lex.syc {\n",
       "    font-family: \"Estrangelo Edessa\", sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: rtl ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* neo aramaic */\n",
       ".txtu.cld,.lex.cld {\n",
       "    font-family: \"CharisSIL-R\", sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* standard arabic */\n",
       ".txtu.ara,.lex.ara {\n",
       "    font-family: \"AmiriQuran\", sans-serif;\n",
       "    font-size: large;\n",
       "    direction: rtl ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* cuneiform */\n",
       ".txtu.akk,.lex.akk {\n",
       "    font-family: Santakku, sans-serif;\n",
       "    font-size: large;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* greek */\n",
       ".txtu.grc,.lex.grc a:link {\n",
       "    font-family: Gentium, sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "a:hover {\n",
       "    text-decoration: underline | important;\n",
       "    color: #0000ff | important;\n",
       "}\n",
       ".ltr {\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       ".rtl {\n",
       "    direction: rtl ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       ".ubd {\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       ".col {\n",
       "   display: inline-block;\n",
       "}\n",
       ".features {\n",
       "    font-family: monospace;\n",
       "    font-size: medium;\n",
       "    font-weight: bold;\n",
       "    color: var(--features);\n",
       "    display: flex;\n",
       "    flex-flow: column nowrap;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "    padding: 2px;\n",
       "    margin: 2px;\n",
       "    direction: ltr;\n",
       "    unicode-bidi: embed;\n",
       "    border: var(--meta-width) solid var(--meta-color);\n",
       "    border-radius: var(--meta-width);\n",
       "}\n",
       ".features div,.features span {\n",
       "    padding: 0;\n",
       "    margin: -2px 0;\n",
       "}\n",
       ".features .f {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    font-weight: normal;\n",
       "    color: #5555bb;\n",
       "}\n",
       ".features .xft {\n",
       "  color: #000000;\n",
       "  background-color: #eeeeee;\n",
       "  font-size: medium;\n",
       "  margin: 2px 0px;\n",
       "}\n",
       ".features .xft .f {\n",
       "  color: #000000;\n",
       "  background-color: #eeeeee;\n",
       "  font-size: small;\n",
       "  font-weight: normal;\n",
       "}\n",
       ".tfsechead {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    font-weight: bold;\n",
       "    color: var(--tfsechead);\n",
       "    unicode-bidi: embed;\n",
       "    text-align: start;\n",
       "}\n",
       ".structure {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    font-weight: bold;\n",
       "    color: var(--structure);\n",
       "    unicode-bidi: embed;\n",
       "    text-align: start;\n",
       "}\n",
       ".comments {\n",
       "    display: flex;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "    flex-flow: column nowrap;\n",
       "}\n",
       ".nd, a:link.nd {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    color: var(--node);\n",
       "    vertical-align: super;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       ".nde, a:link.nde {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    color: var(--node);\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       ".etf {\n",
       "    font-size: normal;\n",
       "    border-radius: 0.2rem;\n",
       "    border: 1pt solid white;\n",
       "    padding: 0 0.2rem ! important;\n",
       "    margin: 0 0.2rem ! important;\n",
       "}\n",
       ".etfx {\n",
       "    font-size: x-large;\n",
       "}\n",
       ".lex {\n",
       "  color: var(--lex-color);;\n",
       "}\n",
       "#colormapplus, #colormapmin, .ecolormapmin {\n",
       "  font-weight: bold;\n",
       "  border-radius: 0.1rem;\n",
       "  background-color: #eeeeff;\n",
       "  padding: 0 1rem;\n",
       "  margin: 0 1rem;\n",
       "}\n",
       ".clr {\n",
       "  font-style: italic;\n",
       "  font-size: small;\n",
       "}\n",
       ".clmap,.eclmap {\n",
       "  padding: 0;\n",
       "}\n",
       ".children,.children.ltr {\n",
       "    display: flex;\n",
       "    border: 0;\n",
       "    background-color: #ffffff;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "}\n",
       ".children.stretch {\n",
       "    align-items: stretch;\n",
       "}\n",
       ".children.hor {\n",
       "    flex-flow: row nowrap;\n",
       "}\n",
       ".children.hor.wrap {\n",
       "    flex-flow: row wrap;\n",
       "}\n",
       ".children.ver {\n",
       "    flex-flow: column nowrap;\n",
       "}\n",
       ".children.ver.wrap {\n",
       "    flex-flow: column wrap;\n",
       "}\n",
       ".contnr {\n",
       "    width: fit-content;\n",
       "    display: flex;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "    flex-flow: column nowrap;\n",
       "    background: #ffffff none repeat scroll 0 0;\n",
       "    padding:  10px 2px 2px 2px;\n",
       "    margin: 16px 2px 2px 2px;\n",
       "    border-style: solid;\n",
       "    font-size: small;\n",
       "}\n",
       ".contnr.trm {\n",
       "    background-attachment: local;\n",
       "}\n",
       ".contnr.cnul {\n",
       "    padding:  0;\n",
       "    margin: 0;\n",
       "    border-style: solid;\n",
       "    font-size: xx-small;\n",
       "}\n",
       ".contnr.cnul,.lbl.cnul {\n",
       "    border-color: var(--border-color-nul);\n",
       "    border-width: var(--border-width-nul);\n",
       "    border-radius: var(--border-width-nul);\n",
       "}\n",
       ".contnr.c0,.lbl.c0 {\n",
       "    border-color: var(--border-color0);\n",
       "    border-width: var(--border-width0);\n",
       "    border-radius: var(--border-width0);\n",
       "}\n",
       ".contnr.c1,.lbl.c1 {\n",
       "    border-color: var(--border-color1);\n",
       "    border-width: var(--border-width1);\n",
       "    border-radius: var(--border-width1);\n",
       "}\n",
       ".contnr.c2,.lbl.c2 {\n",
       "    border-color: var(--border-color2);\n",
       "    border-width: var(--border-width2);\n",
       "    border-radius: var(--border-width2);\n",
       "}\n",
       ".contnr.c3,.lbl.c3 {\n",
       "    border-color: var(--border-color3);\n",
       "    border-width: var(--border-width3);\n",
       "    border-radius: var(--border-width3);\n",
       "}\n",
       ".contnr.c4,.lbl.c4 {\n",
       "    border-color: var(--border-color4);\n",
       "    border-width: var(--border-width4);\n",
       "    border-radius: var(--border-width4);\n",
       "}\n",
       "span.plain {\n",
       "    /*display: inline-block;*/\n",
       "    display: inline-flex;\n",
       "    flex-flow: row wrap;\n",
       "    white-space: pre-wrap;\n",
       "}\n",
       "span.break {\n",
       "  flex-basis: 100%;\n",
       "  height: 0;\n",
       "}\n",
       ".plain {\n",
       "    background-color: #ffffff;\n",
       "}\n",
       ".plain.l,.contnr.l,.contnr.l>.lbl {\n",
       "    border-left-style: dotted\n",
       "}\n",
       ".plain.r,.contnr.r,.contnr.r>.lbl {\n",
       "    border-right-style: dotted\n",
       "}\n",
       ".plain.lno,.contnr.lno,.contnr.lno>.lbl {\n",
       "    border-left-style: none\n",
       "}\n",
       ".plain.rno,.contnr.rno,.contnr.rno>.lbl {\n",
       "    border-right-style: none\n",
       "}\n",
       ".plain.l {\n",
       "    padding-left: 4px;\n",
       "    margin-left: 2px;\n",
       "    border-width: var(--border-width-plain);\n",
       "}\n",
       ".plain.r {\n",
       "    padding-right: 4px;\n",
       "    margin-right: 2px;\n",
       "    border-width: var(--border-width-plain);\n",
       "}\n",
       ".lbl {\n",
       "    font-family: monospace;\n",
       "    margin-top: -24px;\n",
       "    margin-left: 20px;\n",
       "    background: #ffffff none repeat scroll 0 0;\n",
       "    padding: 0 6px;\n",
       "    border-style: solid;\n",
       "    display: block;\n",
       "    color: var(--label)\n",
       "}\n",
       ".lbl.trm {\n",
       "    background-attachment: local;\n",
       "    margin-top: 2px;\n",
       "    margin-left: 2px;\n",
       "    padding: 2px 2px;\n",
       "    border-style: none;\n",
       "}\n",
       ".lbl.cnul {\n",
       "    font-size: xx-small;\n",
       "}\n",
       ".lbl.c0 {\n",
       "    font-size: small;\n",
       "}\n",
       ".lbl.c1 {\n",
       "    font-size: small;\n",
       "}\n",
       ".lbl.c2 {\n",
       "    font-size: medium;\n",
       "}\n",
       ".lbl.c3 {\n",
       "    font-size: medium;\n",
       "}\n",
       ".lbl.c4 {\n",
       "    font-size: large;\n",
       "}\n",
       ".occs, a:link.occs {\n",
       "    font-size: small;\n",
       "}\n",
       "\n",
       "/* PROVENANCE */\n",
       "\n",
       "div.prov {\n",
       "\tmargin: 40px;\n",
       "\tpadding: 20px;\n",
       "\tborder: 2px solid var(--fog-rim);\n",
       "}\n",
       "div.pline {\n",
       "\tdisplay: flex;\n",
       "\tflex-flow: row nowrap;\n",
       "\tjustify-content: stretch;\n",
       "\talign-items: baseline;\n",
       "}\n",
       "div.p2line {\n",
       "\tmargin-left: 2em;\n",
       "\tdisplay: flex;\n",
       "\tflex-flow: row nowrap;\n",
       "\tjustify-content: stretch;\n",
       "\talign-items: baseline;\n",
       "}\n",
       "div.psline {\n",
       "\tdisplay: flex;\n",
       "\tflex-flow: row nowrap;\n",
       "\tjustify-content: stretch;\n",
       "\talign-items: baseline;\n",
       "\tbackground-color: var(--gold-mist-back);\n",
       "}\n",
       "div.pname {\n",
       "\tflex: 0 0 5rem;\n",
       "\tfont-weight: bold;\n",
       "}\n",
       "div.pval {\n",
       "    flex: 1 1 auto;\n",
       "}\n",
       "\n",
       "/* KEYBOARD */\n",
       ".ccoff {\n",
       "  background-color: inherit;\n",
       "}\n",
       ".ccon {\n",
       "  background-color: yellow ! important;\n",
       "}\n",
       ".ccon,.ccoff {\n",
       "  padding: 0.2rem;\n",
       "  margin: 0.2rem;\n",
       "  border: 0.1rem solid var(--letter-box-border);\n",
       "  border-radius: 0.1rem;\n",
       "}\n",
       ".ccline {\n",
       "  font-size: xx-large ! important;\n",
       "  font-weight: bold;\n",
       "  line-height: 2em ! important;\n",
       "}\n",
       "/* TF header */\n",
       "\n",
       "summary {\n",
       "  /* needed to override the normalize.less\n",
       "   * in the classical jupyter notebook\n",
       "   */\n",
       "  display: list-item ! important;\n",
       "}\n",
       "\n",
       ".fcorpus {\n",
       "  display: flex;\n",
       "  flex-flow: column nowrap;\n",
       "  justify-content: flex-start;\n",
       "  align-items: flex-start;\n",
       "  align-content: flex-start;\n",
       "  overflow: auto;\n",
       "}\n",
       ".frow {\n",
       "  display: flex;\n",
       "  flex-flow: row nowrap;\n",
       "  justify-content: flex-start;\n",
       "  align-items: flex-start;\n",
       "  align-content: flex-start;\n",
       "}\n",
       ".fmeta {\n",
       "  display: flex;\n",
       "  flex-flow: column nowrap;\n",
       "  justify-content: flex-start;\n",
       "  align-items: flex-start;\n",
       "  align-content: flex-start;\n",
       "}\n",
       ".fmetarow {\n",
       "  display: flex;\n",
       "  flex-flow: row nowrap;\n",
       "  justify-content: flex-start;\n",
       "  align-items: flex-start;\n",
       "  align-content: flex-start;\n",
       "}\n",
       ".fmetakey {\n",
       "  min-width: 8em;\n",
       "  font-family: monospace;\n",
       "}\n",
       ".fnamecat {\n",
       "  min-width: 8em;\n",
       "}\n",
       ".fnamecat.edge {\n",
       "  font-weight: bold;\n",
       "  font-style: italic;\n",
       "}\n",
       ".fmono {\n",
       "    font-family: monospace;\n",
       "}\n",
       "\n",
       ":root {\n",
       "\t--node:               hsla(120, 100%,  20%, 1.0  );\n",
       "\t--label:              hsla(  0, 100%,  20%, 1.0  );\n",
       "\t--tfsechead:          hsla(  0, 100%,  25%, 1.0  );\n",
       "\t--structure:          hsla(120, 100%,  25%, 1.0  );\n",
       "\t--features:           hsla(  0,   0%,  30%, 1.0  );\n",
       "  --text-color:         hsla( 60,  80%,  10%, 1.0  );\n",
       "  --lex-color:          hsla(220,  90%,  60%, 1.0  );\n",
       "  --meta-color:         hsla(  0,   0%,  90%, 0.7  );\n",
       "  --meta-width:         3px;\n",
       "  --border-color-nul:   hsla(  0,   0%,  90%, 0.5  );\n",
       "  --border-color0:      hsla(  0,   0%,  90%, 0.9  );\n",
       "  --border-color1:      hsla(  0,   0%,  80%, 0.9  );\n",
       "  --border-color2:      hsla(  0,   0%,  70%, 0.9  );\n",
       "  --border-color3:      hsla(  0,   0%,  80%, 0.8  );\n",
       "  --border-color4:      hsla(  0,   0%,  60%, 0.9  );\n",
       "\t--letter-box-border:  hsla(  0,   0%,  80%, 0.5  );\n",
       "  --border-width-nul:   2px;\n",
       "  --border-width0:      2px;\n",
       "  --border-width1:      3px;\n",
       "  --border-width2:      4px;\n",
       "  --border-width3:      6px;\n",
       "  --border-width4:      5px;\n",
       "  --border-width-plain: 2px;\n",
       "}\n",
       ".hl {\n",
       "  background-color: var(--hl-strong);\n",
       "}\n",
       "span.hl {\n",
       "\tbackground-color: var(--hl-strong);\n",
       "\tborder-width: 0;\n",
       "\tborder-radius: 2px;\n",
       "\tborder-style: solid;\n",
       "}\n",
       "div.contnr.hl,div.lbl.hl {\n",
       "  background-color: var(--hl-strong);\n",
       "}\n",
       "div.contnr.hl {\n",
       "  border-color: var(--hl-rim) ! important;\n",
       "\tborder-width: 4px ! important;\n",
       "}\n",
       "\n",
       "span.hlbx {\n",
       "\tborder-color: var(--hl-rim);\n",
       "\tborder-width: 4px ! important;\n",
       "\tborder-style: solid;\n",
       "\tborder-radius: 6px;\n",
       "  padding: 4px;\n",
       "  margin: 4px;\n",
       "}\n",
       ".ehl {\n",
       "  background-color: var(--ehl-strong);\n",
       "}\n",
       "\n",
       ":root {\n",
       "\t--hl-strong:        hsla( 60, 100%,  70%, 0.9  );\n",
       "\t--hl-rim:           hsla( 55,  80%,  50%, 1.0  );\n",
       "\t--ehl-strong:       hsla(240, 100%,  70%, 0.9  );\n",
       "}\n",
       "</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<script>\n",
       "globalThis.copyChar = (el, c) => {\n",
       "    for (const el of document.getElementsByClassName('ccon')) {\n",
       "        el.className = 'ccoff'\n",
       "    }\n",
       "    el.className = 'ccon'\n",
       "    navigator.clipboard.writeText(String.fromCharCode(c))\n",
       "}\n",
       "</script>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div><b>Text-Fabric API:</b> names <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/tf/cheatsheet.html\" title=\"doc\">N F E L T S C TF Fs Fall Es Eall Cs Call</a> directly usable</div><hr>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "A = use(\"ETCBC/bhsa\", hoist=globals())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We work with Genesis 1 (in fact, only the first 10 clauses)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "gen1 = T.nodeFromSection((\"Genesis\", 1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We prepare our portion of text for annotation outside TF.\n",
    "\n",
    "What needs to happen is, that we produce a text file and that we remember the positions of the relevant\n",
    "nodes in that text file."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Recorder is a new thing in TF (in development) that lets you create a string from nodes,\n",
    "where the positions of the nodes in that string are remembered.\n",
    "You may add all kinds of material in between the texts of the nodes.\n",
    "And it is up to you how you represent the nodes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We start a recorder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "rec = Recorder()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can add strings to the recorder, and we can tell nodes to start and to stop.\n",
    "\n",
    "We add clause atoms and phrase atoms to the recorder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "LIMIT = 10\n",
    "\n",
    "for (i, cla) in enumerate(L.d(gen1, otype=\"clause_atom\")):\n",
    "    if i >= LIMIT:  # only first ten clause atoms\n",
    "        break\n",
    "\n",
    "    # we want a label in front of each clause atom\n",
    "    label = \"{} {}:{}\".format(*T.sectionFromNode(cla))\n",
    "    rec.add(f\"{label}@{i} \")\n",
    "\n",
    "    # we start a clause atom node:\n",
    "    #   until we end this node, all text that we add counts as material for this clause atom\n",
    "    rec.start(cla)\n",
    "\n",
    "    for pa in L.d(cla, otype=\"phrase_atom\"):\n",
    "        # we start a phrase node\n",
    "        #   until we end this node, all text that we add also counts as material for this phrase atom\n",
    "        rec.start(pa)\n",
    "\n",
    "        # we add text, it belongs to the current clause atom and to the current phrase atom\n",
    "        rec.add(T.text(pa, fmt=\"text-trans-plain\"))\n",
    "\n",
    "        # we end the phrase atom\n",
    "        rec.end(pa)\n",
    "\n",
    "    # we end the clause atom\n",
    "    rec.end(cla)\n",
    "\n",
    "    # very clause atom on its own line\n",
    "    #  this return character does not belong to any node\n",
    "    rec.add(\"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can print the recorded text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Genesis 1:1@0 BR>CJT BR> >LHJM >T HCMJM W>T H>RY00 \n",
      "Genesis 1:2@1 WH>RY HJTH THW WBHW \n",
      "Genesis 1:2@2 WXCK <L&PNJ THWM \n",
      "Genesis 1:2@3 WRWX >LHJM MRXPT <L&PNJ HMJM00 \n",
      "Genesis 1:3@4 WJ>MR >LHJM \n",
      "Genesis 1:3@5 JHJ >WR \n",
      "Genesis 1:3@6 WJHJ&>WR00 \n",
      "Genesis 1:4@7 WJR> >LHJM >T&H>WR \n",
      "Genesis 1:4@8 KJ&VWB \n",
      "Genesis 1:4@9 WJBDL >LHJM BJN H>WR WBJN HXCK00 \n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(rec.text())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can print the recorded node positions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pos 14: frozenset({904776, 515690})\n",
      "pos 15: frozenset({904776, 515690})\n",
      "pos 16: frozenset({904776, 515690})\n",
      "pos 17: frozenset({904776, 515690})\n",
      "pos 18: frozenset({904776, 515690})\n",
      "pos 19: frozenset({904776, 515690})\n",
      "pos 20: frozenset({904776, 515690})\n",
      "pos 21: frozenset({904777, 515690})\n",
      "pos 22: frozenset({904777, 515690})\n",
      "pos 23: frozenset({904777, 515690})\n",
      "pos 24: frozenset({904777, 515690})\n",
      "pos 25: frozenset({515690, 904778})\n",
      "pos 26: frozenset({515690, 904778})\n",
      "pos 27: frozenset({515690, 904778})\n",
      "pos 28: frozenset({515690, 904778})\n",
      "pos 29: frozenset({515690, 904778})\n",
      "pos 30: frozenset({515690, 904778})\n",
      "pos 31: frozenset({515690, 904779})\n",
      "pos 32: frozenset({515690, 904779})\n",
      "pos 33: frozenset({515690, 904779})\n",
      "pos 34: frozenset({515690, 904779})\n",
      "pos 35: frozenset({515690, 904779})\n",
      "pos 36: frozenset({515690, 904779})\n",
      "pos 37: frozenset({515690, 904779})\n",
      "pos 38: frozenset({515690, 904779})\n",
      "pos 39: frozenset({515690, 904779})\n",
      "pos 40: frozenset({515690, 904779})\n",
      "pos 41: frozenset({515690, 904779})\n",
      "pos 42: frozenset({515690, 904779})\n",
      "pos 43: frozenset({515690, 904779})\n",
      "pos 44: frozenset({515690, 904779})\n",
      "pos 45: frozenset({515690, 904779})\n",
      "pos 46: frozenset({515690, 904779})\n",
      "pos 47: frozenset({515690, 904779})\n",
      "pos 48: frozenset({515690, 904779})\n",
      "pos 49: frozenset({515690, 904779})\n",
      "pos 50: frozenset({515690, 904779})\n",
      "pos 66: frozenset({515691, 904780})\n",
      "pos 67: frozenset({515691, 904781})\n",
      "pos 68: frozenset({515691, 904781})\n",
      "pos 69: frozenset({515691, 904781})\n",
      "pos 70: frozenset({515691, 904781})\n",
      "pos 71: frozenset({515691, 904781})\n",
      "pos 72: frozenset({515691, 904782})\n",
      "pos 73: frozenset({515691, 904782})\n",
      "pos 74: frozenset({515691, 904782})\n",
      "pos 75: frozenset({515691, 904782})\n",
      "pos 76: frozenset({515691, 904782})\n",
      "pos 77: frozenset({515691, 904783})\n",
      "pos 78: frozenset({515691, 904783})\n",
      "pos 79: frozenset({515691, 904783})\n",
      "pos 80: frozenset({515691, 904783})\n",
      "pos 81: frozenset({515691, 904783})\n",
      "pos 82: frozenset({515691, 904783})\n",
      "pos 83: frozenset({515691, 904783})\n",
      "pos 84: frozenset({515691, 904783})\n",
      "pos 85: frozenset({515691, 904783})\n",
      "pos 101: frozenset({904784, 515692})\n",
      "pos 102: frozenset({904785, 515692})\n",
      "pos 103: frozenset({904785, 515692})\n",
      "pos 104: frozenset({904785, 515692})\n",
      "pos 105: frozenset({904785, 515692})\n",
      "pos 106: frozenset({904786, 515692})\n",
      "pos 107: frozenset({904786, 515692})\n",
      "pos 108: frozenset({904786, 515692})\n",
      "pos 109: frozenset({904786, 515692})\n",
      "pos 110: frozenset({904786, 515692})\n",
      "pos 111: frozenset({904786, 515692})\n",
      "pos 112: frozenset({904786, 515692})\n",
      "pos 113: frozenset({904786, 515692})\n",
      "pos 114: frozenset({904786, 515692})\n",
      "pos 115: frozenset({904786, 515692})\n",
      "pos 116: frozenset({904786, 515692})\n",
      "pos 117: frozenset({904786, 515692})\n",
      "pos 133: frozenset({904787, 515693})\n",
      "pos 134: frozenset({904788, 515693})\n",
      "pos 135: frozenset({904788, 515693})\n",
      "pos 136: frozenset({904788, 515693})\n",
      "pos 137: frozenset({904788, 515693})\n",
      "pos 138: frozenset({904788, 515693})\n",
      "pos 139: frozenset({904788, 515693})\n",
      "pos 140: frozenset({904788, 515693})\n",
      "pos 141: frozenset({904788, 515693})\n",
      "pos 142: frozenset({904788, 515693})\n",
      "pos 143: frozenset({904788, 515693})\n",
      "pos 144: frozenset({515693, 904789})\n",
      "pos 145: frozenset({515693, 904789})\n",
      "pos 146: frozenset({515693, 904789})\n",
      "pos 147: frozenset({515693, 904789})\n",
      "pos 148: frozenset({515693, 904789})\n",
      "pos 149: frozenset({515693, 904789})\n",
      "pos 150: frozenset({515693, 904790})\n",
      "pos 151: frozenset({515693, 904790})\n",
      "pos 152: frozenset({515693, 904790})\n",
      "pos 153: frozenset({515693, 904790})\n",
      "pos 154: frozenset({515693, 904790})\n",
      "pos 155: frozenset({515693, 904790})\n",
      "pos 156: frozenset({515693, 904790})\n",
      "pos 157: frozenset({515693, 904790})\n",
      "pos 158: frozenset({515693, 904790})\n",
      "pos 159: frozenset({515693, 904790})\n",
      "pos 160: frozenset({515693, 904790})\n",
      "pos 161: frozenset({515693, 904790})\n",
      "pos 162: frozenset({515693, 904790})\n",
      "pos 163: frozenset({515693, 904790})\n",
      "pos 179: frozenset({515694, 904791})\n",
      "pos 180: frozenset({904792, 515694})\n",
      "pos 181: frozenset({904792, 515694})\n",
      "pos 182: frozenset({904792, 515694})\n",
      "pos 183: frozenset({904792, 515694})\n",
      "pos 184: frozenset({904792, 515694})\n",
      "pos 185: frozenset({904793, 515694})\n",
      "pos 186: frozenset({904793, 515694})\n",
      "pos 187: frozenset({904793, 515694})\n",
      "pos 188: frozenset({904793, 515694})\n",
      "pos 189: frozenset({904793, 515694})\n",
      "pos 190: frozenset({904793, 515694})\n",
      "pos 206: frozenset({904794, 515695})\n",
      "pos 207: frozenset({904794, 515695})\n",
      "pos 208: frozenset({904794, 515695})\n",
      "pos 209: frozenset({904794, 515695})\n",
      "pos 210: frozenset({904795, 515695})\n",
      "pos 211: frozenset({904795, 515695})\n",
      "pos 212: frozenset({904795, 515695})\n",
      "pos 213: frozenset({904795, 515695})\n",
      "pos 229: frozenset({515696, 904796})\n",
      "pos 230: frozenset({515696, 904797})\n",
      "pos 231: frozenset({515696, 904797})\n",
      "pos 232: frozenset({515696, 904797})\n",
      "pos 233: frozenset({515696, 904797})\n",
      "pos 234: frozenset({515696, 904798})\n",
      "pos 235: frozenset({515696, 904798})\n",
      "pos 236: frozenset({515696, 904798})\n",
      "pos 237: frozenset({515696, 904798})\n",
      "pos 238: frozenset({515696, 904798})\n",
      "pos 239: frozenset({515696, 904798})\n",
      "pos 255: frozenset({515697, 904799})\n",
      "pos 256: frozenset({904800, 515697})\n",
      "pos 257: frozenset({904800, 515697})\n",
      "pos 258: frozenset({904800, 515697})\n",
      "pos 259: frozenset({904800, 515697})\n",
      "pos 260: frozenset({515697, 904801})\n",
      "pos 261: frozenset({515697, 904801})\n",
      "pos 262: frozenset({515697, 904801})\n",
      "pos 263: frozenset({515697, 904801})\n",
      "pos 264: frozenset({515697, 904801})\n",
      "pos 265: frozenset({515697, 904801})\n",
      "pos 266: frozenset({515697, 904802})\n",
      "pos 267: frozenset({515697, 904802})\n",
      "pos 268: frozenset({515697, 904802})\n",
      "pos 269: frozenset({515697, 904802})\n",
      "pos 270: frozenset({515697, 904802})\n",
      "pos 271: frozenset({515697, 904802})\n",
      "pos 272: frozenset({515697, 904802})\n",
      "pos 273: frozenset({515697, 904802})\n",
      "pos 289: frozenset({515698, 904803})\n",
      "pos 290: frozenset({515698, 904803})\n",
      "pos 291: frozenset({515698, 904803})\n",
      "pos 292: frozenset({515698, 904804})\n",
      "pos 293: frozenset({515698, 904804})\n",
      "pos 294: frozenset({515698, 904804})\n",
      "pos 295: frozenset({515698, 904804})\n",
      "pos 311: frozenset({515699, 904805})\n",
      "pos 312: frozenset({515699, 904806})\n",
      "pos 313: frozenset({515699, 904806})\n",
      "pos 314: frozenset({515699, 904806})\n",
      "pos 315: frozenset({515699, 904806})\n",
      "pos 316: frozenset({515699, 904806})\n",
      "pos 317: frozenset({515699, 904807})\n",
      "pos 318: frozenset({515699, 904807})\n",
      "pos 319: frozenset({515699, 904807})\n",
      "pos 320: frozenset({515699, 904807})\n",
      "pos 321: frozenset({515699, 904807})\n",
      "pos 322: frozenset({515699, 904807})\n",
      "pos 323: frozenset({904808, 515699})\n",
      "pos 324: frozenset({904808, 515699})\n",
      "pos 325: frozenset({904808, 515699})\n",
      "pos 326: frozenset({904808, 515699})\n",
      "pos 327: frozenset({904808, 515699})\n",
      "pos 328: frozenset({904808, 515699})\n",
      "pos 329: frozenset({904808, 515699})\n",
      "pos 330: frozenset({904808, 515699})\n",
      "pos 331: frozenset({904808, 515699})\n",
      "pos 332: frozenset({904808, 515699})\n",
      "pos 333: frozenset({904808, 515699})\n",
      "pos 334: frozenset({904808, 515699})\n",
      "pos 335: frozenset({904808, 515699})\n",
      "pos 336: frozenset({904808, 515699})\n",
      "pos 337: frozenset({904808, 515699})\n",
      "pos 338: frozenset({904808, 515699})\n",
      "pos 339: frozenset({904808, 515699})\n",
      "pos 340: frozenset({904808, 515699})\n",
      "pos 341: frozenset({904808, 515699})\n",
      "pos 342: frozenset({904808, 515699})\n",
      "pos 343: frozenset({904808, 515699})\n"
     ]
    }
   ],
   "source": [
    "print(\"\\n\".join(f\"pos {i}: {p}\" for (i, p) in enumerate(rec.positions()) if p))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can write the recorded text and the positions to two files:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "lines_to_next_cell": 2
   },
   "outputs": [],
   "source": [
    "rec.write(\"data/gen1.txt\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Genesis 1:1@0 BR>CJT BR> >LHJM >T HCMJM W>T H>RY00 \n",
      "Genesis 1:2@1 WH>RY HJTH THW WBHW \n",
      "Genesis 1:2@2 WXCK <L&PNJ THWM \n",
      "Genesis 1:2@3 WRWX >LHJM MRXPT <L&PNJ HMJM00 \n",
      "Genesis 1:3@4 WJ>MR >LHJM \n",
      "Genesis 1:3@5 JHJ >WR \n",
      "Genesis 1:3@6 WJHJ&>WR00 \n",
      "Genesis 1:4@7 WJR> >LHJM >T&H>WR \n",
      "Genesis 1:4@8 KJ&VWB \n",
      "Genesis 1:4@9 WJBDL >LHJM BJN H>WR WBJN HXCK00 \n"
     ]
    }
   ],
   "source": [
    "!head -n 10 data/gen1.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "904776\t515690\n",
      "904776\t515690\n",
      "904776\t515690\n",
      "904776\t515690\n",
      "904776\t515690\n",
      "904776\t515690\n",
      "904776\t515690\n",
      "904777\t515690\n",
      "904777\t515690\n",
      "904777\t515690\n",
      "904777\t515690\n",
      "515690\t904778\n",
      "515690\t904778\n",
      "515690\t904778\n",
      "515690\t904778\n",
      "515690\t904778\n"
     ]
    }
   ],
   "source": [
    "!head -n 30 data/gen1.txt.pos"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we produce a (fake) annotation file, based on the text.\n",
    "\n",
    "The file is tab delimited, the columns are:\n",
    "\n",
    "* start character position\n",
    "* end character position\n",
    "* feature 1 value\n",
    "* feature 2 value\n",
    "* etc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "source": [
    "We annotate as follows:\n",
    "\n",
    "* every word that starts with a `B` gets `bword=1`\n",
    "* every word that ends with a `T` gets `tword=1`\n",
    "\n",
    "Then we want every phrase with a b-word to get `bword=1` and likewise\n",
    "every clause with a b-word to get `bword=1`,\n",
    "and the same for `tword`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "def annotate(fileName):\n",
    "    annotations = {}\n",
    "\n",
    "    with open(fileName) as fh:\n",
    "        pos = 0\n",
    "        for line in fh:\n",
    "            words = line.split(\" \")\n",
    "\n",
    "            for word in words[0:2]:\n",
    "                lWord = len(word)\n",
    "                pos += lWord + 1\n",
    "            for word in words[2:]:\n",
    "                word = word.rstrip(\"\\n\")\n",
    "                lWord = len(word)\n",
    "                start = pos\n",
    "                end = pos + lWord - 1\n",
    "                pos += lWord + 1\n",
    "                if lWord:\n",
    "                    if word[0] == \"B\":\n",
    "                        annotations.setdefault((start, end), {})[\"bword\"] = 1\n",
    "                    if word[-1] == \"T\":\n",
    "                        annotations.setdefault((start, end), {})[\"tword\"] = 1\n",
    "\n",
    "    with open(f\"{fileName}.ann\", \"w\") as fh:\n",
    "        fh.write(\"start\\tend\\tbword\\ttword\\n\")\n",
    "        for ((start, end), features) in annotations.items():\n",
    "            row = \"\\t\".join(\n",
    "                str(a)\n",
    "                for a in (\n",
    "                    start,\n",
    "                    end,\n",
    "                    features.get(\"bword\", \"\"),\n",
    "                    features.get(\"tword\", \"\"),\n",
    "                )\n",
    "            )\n",
    "            fh.write(f\"{row}\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "annotate(\"data/gen1.txt\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the annotation file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "start\tend\tbword\ttword\n",
      "14\t19\t1\t1\n",
      "21\t23\t1\t\n",
      "31\t32\t\t1\n",
      "40\t42\t\t1\n",
      "144\t148\t\t1\n",
      "323\t325\t1\t\n"
     ]
    }
   ],
   "source": [
    "!cat data/gen1.txt.ann"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we want to feed back these annotations as TF features on `phrase_atom` and `clause_atom` nodes.\n",
    "\n",
    "Our recorder knows how to do that."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "features = rec.makeFeatures(\"data/gen1.txt.ann\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{904776: '1', 515690: '1', 904777: '1', 904808: '1', 515699: '1'}"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "features[\"bword\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{904776: '1', 515690: '1', 904779: '1', 904789: '1', 515693: '1'}"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "features[\"tword\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bword phrase_atom 904776: BR>CJT \n",
      "bword clause_atom 515690: BR>CJT BR> >LHJM >T HCMJM W>T H>RY00 \n",
      "bword phrase_atom 904777: BR> \n",
      "bword phrase_atom 904808: BJN H>WR WBJN HXCK00 \n",
      "bword clause_atom 515699: WJBDL >LHJM BJN H>WR WBJN HXCK00 \n",
      "tword phrase_atom 904776: BR>CJT \n",
      "tword clause_atom 515690: BR>CJT BR> >LHJM >T HCMJM W>T H>RY00 \n",
      "tword phrase_atom 904779: >T HCMJM W>T H>RY00 \n",
      "tword phrase_atom 904789: MRXPT \n",
      "tword clause_atom 515693: WRWX >LHJM MRXPT <L&PNJ HMJM00 \n"
     ]
    }
   ],
   "source": [
    "for feat in (\"bword\", \"tword\"):\n",
    "    for n in features[feat]:\n",
    "        print(f'{feat} {F.otype.v(n)} {n}: {T.text(n, fmt=\"text-trans-plain\")}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What if we want to transform the annotations to word features instead to features on phrase and clause atoms?\n",
    "\n",
    "Then we should record the text differently.\n",
    "\n",
    "We only add slots to the mix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "rec = Recorder()\n",
    "LIMIT = 10\n",
    "\n",
    "for (i, cla) in enumerate(L.d(gen1, otype=\"clause_atom\")):\n",
    "    if i >= LIMIT:\n",
    "        break\n",
    "    label = \"{} {}:{}\".format(*T.sectionFromNode(cla))\n",
    "    rec.add(f\"{label}@{i} \")\n",
    "\n",
    "    for w in L.d(cla, otype=\"word\"):\n",
    "        rec.start(w)\n",
    "        rec.add(T.text(w, fmt=\"text-trans-plain\"))\n",
    "        rec.end(w)\n",
    "\n",
    "    rec.add(\"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It gives the same text:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Genesis 1:1@0 BR>CJT BR> >LHJM >T HCMJM W>T H>RY00 \n",
      "Genesis 1:2@1 WH>RY HJTH THW WBHW \n",
      "Genesis 1:2@2 WXCK <L&PNJ THWM \n",
      "Genesis 1:2@3 WRWX >LHJM MRXPT <L&PNJ HMJM00 \n",
      "Genesis 1:3@4 WJ>MR >LHJM \n",
      "Genesis 1:3@5 JHJ >WR \n",
      "Genesis 1:3@6 WJHJ&>WR00 \n",
      "Genesis 1:4@7 WJR> >LHJM >T&H>WR \n",
      "Genesis 1:4@8 KJ&VWB \n",
      "Genesis 1:4@9 WJBDL >LHJM BJN H>WR WBJN HXCK00 \n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(rec.text())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "but the node positions are different:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pos 14: frozenset({1})\n",
      "pos 15: frozenset({2})\n",
      "pos 16: frozenset({2})\n",
      "pos 17: frozenset({2})\n",
      "pos 18: frozenset({2})\n",
      "pos 19: frozenset({2})\n",
      "pos 20: frozenset({2})\n",
      "pos 21: frozenset({3})\n",
      "pos 22: frozenset({3})\n",
      "pos 23: frozenset({3})\n",
      "pos 24: frozenset({3})\n",
      "pos 25: frozenset({4})\n",
      "pos 26: frozenset({4})\n",
      "pos 27: frozenset({4})\n",
      "pos 28: frozenset({4})\n",
      "pos 29: frozenset({4})\n",
      "pos 30: frozenset({4})\n",
      "pos 31: frozenset({5})\n",
      "pos 32: frozenset({5})\n",
      "pos 33: frozenset({5})\n",
      "pos 34: frozenset({6})\n",
      "pos 35: frozenset({7})\n",
      "pos 36: frozenset({7})\n",
      "pos 37: frozenset({7})\n",
      "pos 38: frozenset({7})\n",
      "pos 39: frozenset({7})\n",
      "pos 40: frozenset({8})\n",
      "pos 41: frozenset({9})\n",
      "pos 42: frozenset({9})\n",
      "pos 43: frozenset({9})\n",
      "pos 44: frozenset({10})\n",
      "pos 45: frozenset({11})\n",
      "pos 46: frozenset({11})\n",
      "pos 47: frozenset({11})\n",
      "pos 48: frozenset({11})\n",
      "pos 49: frozenset({11})\n",
      "pos 50: frozenset({11})\n",
      "pos 66: frozenset({12})\n",
      "pos 67: frozenset({13})\n",
      "pos 68: frozenset({14})\n",
      "pos 69: frozenset({14})\n",
      "pos 70: frozenset({14})\n",
      "pos 71: frozenset({14})\n",
      "pos 72: frozenset({15})\n",
      "pos 73: frozenset({15})\n",
      "pos 74: frozenset({15})\n",
      "pos 75: frozenset({15})\n",
      "pos 76: frozenset({15})\n",
      "pos 77: frozenset({16})\n",
      "pos 78: frozenset({16})\n",
      "pos 79: frozenset({16})\n",
      "pos 80: frozenset({16})\n",
      "pos 81: frozenset({17})\n",
      "pos 82: frozenset({18})\n",
      "pos 83: frozenset({18})\n",
      "pos 84: frozenset({18})\n",
      "pos 85: frozenset({18})\n",
      "pos 101: frozenset({19})\n",
      "pos 102: frozenset({20})\n",
      "pos 103: frozenset({20})\n",
      "pos 104: frozenset({20})\n",
      "pos 105: frozenset({20})\n",
      "pos 106: frozenset({21})\n",
      "pos 107: frozenset({21})\n",
      "pos 108: frozenset({21})\n",
      "pos 109: frozenset({22})\n",
      "pos 110: frozenset({22})\n",
      "pos 111: frozenset({22})\n",
      "pos 112: frozenset({22})\n",
      "pos 113: frozenset({23})\n",
      "pos 114: frozenset({23})\n",
      "pos 115: frozenset({23})\n",
      "pos 116: frozenset({23})\n",
      "pos 117: frozenset({23})\n",
      "pos 133: frozenset({24})\n",
      "pos 134: frozenset({25})\n",
      "pos 135: frozenset({25})\n",
      "pos 136: frozenset({25})\n",
      "pos 137: frozenset({25})\n",
      "pos 138: frozenset({26})\n",
      "pos 139: frozenset({26})\n",
      "pos 140: frozenset({26})\n",
      "pos 141: frozenset({26})\n",
      "pos 142: frozenset({26})\n",
      "pos 143: frozenset({26})\n",
      "pos 144: frozenset({27})\n",
      "pos 145: frozenset({27})\n",
      "pos 146: frozenset({27})\n",
      "pos 147: frozenset({27})\n",
      "pos 148: frozenset({27})\n",
      "pos 149: frozenset({27})\n",
      "pos 150: frozenset({28})\n",
      "pos 151: frozenset({28})\n",
      "pos 152: frozenset({28})\n",
      "pos 153: frozenset({29})\n",
      "pos 154: frozenset({29})\n",
      "pos 155: frozenset({29})\n",
      "pos 156: frozenset({29})\n",
      "pos 157: frozenset({30})\n",
      "pos 158: frozenset({31})\n",
      "pos 159: frozenset({31})\n",
      "pos 160: frozenset({31})\n",
      "pos 161: frozenset({31})\n",
      "pos 162: frozenset({31})\n",
      "pos 163: frozenset({31})\n",
      "pos 179: frozenset({32})\n",
      "pos 180: frozenset({33})\n",
      "pos 181: frozenset({33})\n",
      "pos 182: frozenset({33})\n",
      "pos 183: frozenset({33})\n",
      "pos 184: frozenset({33})\n",
      "pos 185: frozenset({34})\n",
      "pos 186: frozenset({34})\n",
      "pos 187: frozenset({34})\n",
      "pos 188: frozenset({34})\n",
      "pos 189: frozenset({34})\n",
      "pos 190: frozenset({34})\n",
      "pos 206: frozenset({35})\n",
      "pos 207: frozenset({35})\n",
      "pos 208: frozenset({35})\n",
      "pos 209: frozenset({35})\n",
      "pos 210: frozenset({36})\n",
      "pos 211: frozenset({36})\n",
      "pos 212: frozenset({36})\n",
      "pos 213: frozenset({36})\n",
      "pos 229: frozenset({37})\n",
      "pos 230: frozenset({38})\n",
      "pos 231: frozenset({38})\n",
      "pos 232: frozenset({38})\n",
      "pos 233: frozenset({38})\n",
      "pos 234: frozenset({39})\n",
      "pos 235: frozenset({39})\n",
      "pos 236: frozenset({39})\n",
      "pos 237: frozenset({39})\n",
      "pos 238: frozenset({39})\n",
      "pos 239: frozenset({39})\n",
      "pos 255: frozenset({40})\n",
      "pos 256: frozenset({41})\n",
      "pos 257: frozenset({41})\n",
      "pos 258: frozenset({41})\n",
      "pos 259: frozenset({41})\n",
      "pos 260: frozenset({42})\n",
      "pos 261: frozenset({42})\n",
      "pos 262: frozenset({42})\n",
      "pos 263: frozenset({42})\n",
      "pos 264: frozenset({42})\n",
      "pos 265: frozenset({42})\n",
      "pos 266: frozenset({43})\n",
      "pos 267: frozenset({43})\n",
      "pos 268: frozenset({43})\n",
      "pos 269: frozenset({44})\n",
      "pos 270: frozenset({45})\n",
      "pos 271: frozenset({45})\n",
      "pos 272: frozenset({45})\n",
      "pos 273: frozenset({45})\n",
      "pos 289: frozenset({46})\n",
      "pos 290: frozenset({46})\n",
      "pos 291: frozenset({46})\n",
      "pos 292: frozenset({47})\n",
      "pos 293: frozenset({47})\n",
      "pos 294: frozenset({47})\n",
      "pos 295: frozenset({47})\n",
      "pos 311: frozenset({48})\n",
      "pos 312: frozenset({49})\n",
      "pos 313: frozenset({49})\n",
      "pos 314: frozenset({49})\n",
      "pos 315: frozenset({49})\n",
      "pos 316: frozenset({49})\n",
      "pos 317: frozenset({50})\n",
      "pos 318: frozenset({50})\n",
      "pos 319: frozenset({50})\n",
      "pos 320: frozenset({50})\n",
      "pos 321: frozenset({50})\n",
      "pos 322: frozenset({50})\n",
      "pos 323: frozenset({51})\n",
      "pos 324: frozenset({51})\n",
      "pos 325: frozenset({51})\n",
      "pos 326: frozenset({51})\n",
      "pos 327: frozenset({52})\n",
      "pos 328: frozenset({53})\n",
      "pos 329: frozenset({53})\n",
      "pos 330: frozenset({53})\n",
      "pos 331: frozenset({53})\n",
      "pos 332: frozenset({54})\n",
      "pos 333: frozenset({55})\n",
      "pos 334: frozenset({55})\n",
      "pos 335: frozenset({55})\n",
      "pos 336: frozenset({55})\n",
      "pos 337: frozenset({56})\n",
      "pos 338: frozenset({57})\n",
      "pos 339: frozenset({57})\n",
      "pos 340: frozenset({57})\n",
      "pos 341: frozenset({57})\n",
      "pos 342: frozenset({57})\n",
      "pos 343: frozenset({57})\n"
     ]
    }
   ],
   "source": [
    "print(\"\\n\".join(f\"pos {i}: {p}\" for (i, p) in enumerate(rec.positions()) if p))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have produced the same text,\n",
    "so we can use the earlier annotation file to create word features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "features = rec.makeFeatures(\"data/gen1.txt.ann\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1: '1', 2: '1', 3: '1', 51: '1'}"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "features[\"bword\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1: '1', 2: '1', 5: '1', 8: '1', 9: '1', 27: '1'}"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "features[\"tword\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bword word 1: B\n",
      "bword word 2: R>CJT \n",
      "bword word 3: BR> \n",
      "bword word 51: BJN \n",
      "tword word 1: B\n",
      "tword word 2: R>CJT \n",
      "tword word 5: >T \n",
      "tword word 8: W\n",
      "tword word 9: >T \n",
      "tword word 27: MRXPT \n"
     ]
    }
   ],
   "source": [
    "for feat in (\"bword\", \"tword\"):\n",
    "    for n in features[feat]:\n",
    "        print(f'{feat} {F.otype.v(n)} {n}: {T.text(n, fmt=\"text-trans-plain\")}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Explanation:\n",
    "\n",
    "The annotator just looked at the string `BR>CJT` without knowing that it is two words."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "start\tend\tbword\ttword\n",
      "14\t19\t1\t1\n",
      "21\t23\t1\t\n",
      "31\t32\t\t1\n",
      "40\t42\t\t1\n",
      "144\t148\t\t1\n",
      "323\t325\t1\t\n"
     ]
    }
   ],
   "source": [
    "!cat data/gen1.txt.ann"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So it has annotated pos 14-19 as a `bword` and as a `tword`.\n",
    "\n",
    "But TF knows that 14-19 are slots 1 and 2, so when the annotations are applied,\n",
    "slots 1 and 2 are both set to `bwords` and `twords`.\n",
    "\n",
    "We can remedy the situation by producing an other text to the annotator, one where\n",
    "slots are always separated by a space.\n",
    "\n",
    "Lets do that by always adding a space, so real words are separated by two spaces."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "rec = Recorder()\n",
    "LIMIT = 10\n",
    "\n",
    "for (i, cla) in enumerate(L.d(gen1, otype=\"clause_atom\")):\n",
    "    if i >= LIMIT:\n",
    "        break\n",
    "    label = \"{} {}:{}\".format(*T.sectionFromNode(cla))\n",
    "    rec.add(f\"{label}@{i} \")\n",
    "\n",
    "    for w in L.d(cla, otype=\"word\"):\n",
    "        rec.start(w)\n",
    "        rec.add(T.text(w, fmt=\"text-trans-plain\") + \" \")\n",
    "        rec.end(w)\n",
    "\n",
    "    rec.add(\"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Genesis 1:1@0 B R>CJT  BR>  >LHJM  >T  H CMJM  W >T  H >RY00  \n",
      "Genesis 1:2@1 W H >RY  HJTH  THW  W BHW  \n",
      "Genesis 1:2@2 W XCK  <L& PNJ  THWM  \n",
      "Genesis 1:2@3 W RWX  >LHJM  MRXPT  <L& PNJ  H MJM00  \n",
      "Genesis 1:3@4 W J>MR  >LHJM  \n",
      "Genesis 1:3@5 JHJ  >WR  \n",
      "Genesis 1:3@6 W JHJ& >WR00  \n",
      "Genesis 1:4@7 W JR>  >LHJM  >T& H >WR  \n",
      "Genesis 1:4@8 KJ& VWB  \n",
      "Genesis 1:4@9 W JBDL  >LHJM  BJN  H >WR  W BJN  H XCK00  \n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(rec.text())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We write the text to file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "rec.write(\"data/gen1wx.txt\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We run our annotator again, because we have a different text:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "annotate(\"data/gen1wx.txt\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the new annotation file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "start\tend\tbword\ttword\n",
      "14\t14\t1\t\n",
      "16\t20\t\t1\n",
      "23\t25\t1\t\n",
      "35\t36\t\t1\n",
      "49\t50\t\t1\n",
      "99\t101\t1\t\n",
      "170\t174\t\t1\n",
      "373\t375\t1\t\n",
      "387\t389\t1\t\n"
     ]
    }
   ],
   "source": [
    "!cat data/gen1wx.txt.ann"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The features are no surprise:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "features = rec.makeFeatures(\"data/gen1wx.txt.ann\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1: '1', 3: '1', 18: '1', 51: '1', 55: '1'}"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "features[\"bword\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{2: '1', 5: '1', 9: '1', 27: '1'}"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "features[\"tword\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bword word 1: B\n",
      "bword word 3: BR> \n",
      "bword word 18: BHW \n",
      "bword word 51: BJN \n",
      "bword word 55: BJN \n",
      "tword word 2: R>CJT \n",
      "tword word 5: >T \n",
      "tword word 9: >T \n",
      "tword word 27: MRXPT \n"
     ]
    }
   ],
   "source": [
    "for feat in (\"bword\", \"tword\"):\n",
    "    for n in features[feat]:\n",
    "        print(f'{feat} {F.otype.v(n)} {n}: {T.text(n, fmt=\"text-trans-plain\")}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# All steps\n",
    "\n",
    "* **[start](start.ipynb)** your first step in mastering the bible computationally\n",
    "* **[display](display.ipynb)** become an expert in creating pretty displays of your text structures\n",
    "* **[search](search.ipynb)** turbo charge your hand-coding with search templates\n",
    "* **[export Excel](exportExcel.ipynb)** make tailor-made spreadsheets out of your results\n",
    "* **[share](share.ipynb)** draw in other people's data and let them use yours\n",
    "* **[export](export.ipynb)** export your dataset as an Emdros database\n",
    "* **annotate** annotate plain text by means of other tools and import the annotations as TF features\n",
    "* **[volumes](volumes.ipynb)** work with selected books only\n",
    "* **[trees](trees.ipynb)** work with the BHSA data as syntax trees\n",
    "\n",
    "CC-BY Dirk Roorda"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.1"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}