{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "You might want to begin at the [start](search.ipynb) of this tutorial.\n", "\n", "Short introductions to other TF datasets:\n", "\n", "* [Dead Sea Scrolls](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/dss.ipynb)\n", "* [Old Babylonian Letters](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/oldbabylonian.ipynb)\n", "* [Quran](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/quran.ipynb)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Sets and queries\n", "\n", "You can pass custom sets to the search function, as we have seen in [advanced](searchAdvanced.ipynb).\n", "Here we work through a real-world example of that, and also show how to prepare sets for use\n", "in the TF browser.\n", "\n", "## Chapters with only \"frequent\" words\n", "\n", "The following task comes from the department of education:\n", "\n", "*Find the chapters that contain no more than 20 rare words, where a rare word is one whose lexeme has a frequency below 70.*\n", "\n", "A question posed by Oliver Glanz." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import os" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "from tf.app import use\n", "from tf.lib import writeSets, readSets" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "TF-app: ~/text-fabric-data/etcbc/bhsa/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/etcbc/bhsa/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/etcbc/phono/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/etcbc/parallels/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "This is Text-Fabric 9.2.3\n", "Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html\n", "\n", "122 features found and 0 ignored\n" ] }, { "data": { "text/html": [ "Text-Fabric: Text-Fabric API 9.2.3, etcbc/bhsa/app v3, Search Reference
Data: BHSA, Character table, Feature docs
Features:
\n", "
Parallel Passages\n", "
\n", "\n", "
\n", "
\n", "crossref\n", "
\n", "
int
\n", "
\n", " 🆗 links between similar passages\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Parallels Notebook: Dirk Roorda, Martijn Naaijer
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:40:46Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
Parallels notebook, see https://github.com/ETCBC/parallels
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "\n", "
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "
\n", " ✅ book name in Latin (Genesis; Numeri; Reges1; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:55Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "book@ll\n", "
\n", "
str
\n", "
\n", " ✅ book name in amharic (ኣማርኛ)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:20:27Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
language:
\n", "
ኣማርኛ
\n", "
\n", "\n", "
\n", "
languageCode:
\n", "
am
\n", "
\n", "\n", "
\n", "
languageEnglish:
\n", "
amharic
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
book names from wikipedia and other sources
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "
\n", " ✅ chapter number (1; 2; 3; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:55Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "code\n", "
\n", "
int
\n", "
\n", " ✅ identifier of a clause atom relationship (0; 74; 367; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:56Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "det\n", "
\n", "
str
\n", "
\n", " ✅ determinedness of phrase(atom) (det; und; NA.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:56Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "domain\n", "
\n", "
str
\n", "
\n", " ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:57Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "freq_lex\n", "
\n", "
int
\n", "
\n", " ✅ frequency of lexemes\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:24:45Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed on the basis of the ETCBC core set of features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "function\n", "
\n", "
str
\n", "
\n", " ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:57Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "g_cons\n", "
\n", "
str
\n", "
\n", " ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:57Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "g_cons_utf8\n", "
\n", "
str
\n", "
\n", " ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:58Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "g_lex\n", "
\n", "
str
\n", "
\n", " ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:58Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "g_lex_utf8\n", "
\n", "
str
\n", "
\n", " ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:59Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "g_word\n", "
\n", "
str
\n", "
\n", " ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:04Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "g_word_utf8\n", "
\n", "
str
\n", "
\n", " ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:04Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "
\n", " 🆗 english translation of lexeme (beginning create god(s))\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:13Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "gn\n", "
\n", "
str
\n", "
\n", " ✅ grammatical gender (m; f; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:05Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "label\n", "
\n", "
str
\n", "
\n", " ✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:06Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "language\n", "
\n", "
str
\n", "
\n", " ✅ of word or lexeme (Hebrew; Aramaic.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:13Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "lex\n", "
\n", "
str
\n", "
\n", " ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:14Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "lex_utf8\n", "
\n", "
str
\n", "
\n", " ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "ls\n", "
\n", "
str
\n", "
\n", " ✅ lexical set, subclassification of part-of-speech (card; ques; mult)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "nametype\n", "
\n", "
str
\n", "
\n", " ⚠️ named entity type (pers; mens; gens; topo; ppde.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "nme\n", "
\n", "
str
\n", "
\n", " ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:08Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "nu\n", "
\n", "
str
\n", "
\n", " ✅ grammatical number (sg; du; pl; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:08Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
int
\n", "
\n", " ✅ sequence number of an object within its context\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:09Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "
\n", " \n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "pargr\n", "
\n", "
str
\n", "
\n", " 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:22:50Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional paragraph file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "pdp\n", "
\n", "
str
\n", "
\n", " ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:10Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "pfm\n", "
\n", "
str
\n", "
\n", " ✅ preformative consonantal-transliterated (absent; n/a; J, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:11Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "prs\n", "
\n", "
str
\n", "
\n", " ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:11Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "prs_gn\n", "
\n", "
str
\n", "
\n", " ✅ pronominal suffix gender (m; f; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:11Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "prs_nu\n", "
\n", "
str
\n", "
\n", " ✅ pronominal suffix number (sg; du; pl; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:12Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "prs_ps\n", "
\n", "
str
\n", "
\n", " ✅ pronominal suffix person (p1; p2; p3; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:12Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "ps\n", "
\n", "
str
\n", "
\n", " ✅ grammatical person (p1; p2; p3; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:12Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "qere\n", "
\n", "
str
\n", "
\n", " ✅ word pointed-transliterated masoretic reading correction\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "qere_trailer\n", "
\n", "
str
\n", "
\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "qere_trailer_utf8\n", "
\n", "
str
\n", "
\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "qere_utf8\n", "
\n", "
str
\n", "
\n", " ✅ word pointed-Hebrew masoretic reading correction\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "rank_lex\n", "
\n", "
int
\n", "
\n", " ✅ ranking of lexemes based on frequency\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:24:46Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed on the basis of the ETCBC core set of features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "
\n", " ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:13Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "
\n", " ✅ part-of-speech (art; verb; subs; nmpr, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "st\n", "
\n", "
str
\n", "
\n", " ✅ state of a noun (a (absolute); c (construct); e (emphatic).)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:14Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "tab\n", "
\n", "
int
\n", "
\n", " ✅ clause atom: its level in the linguistic embedding\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "
\n", " ✅ interword material pointed-transliterated (& 00 05 00_P ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:01Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "trailer_utf8\n", "
\n", "
str
\n", "
\n", " ✅ interword material pointed-Hebrew (־ ׃)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:01Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "txt\n", "
\n", "
str
\n", "
\n", " ✅ text type of clause and surrounding (repetition of ? N D Q as in feature domain)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "
\n", " ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "uvf\n", "
\n", "
str
\n", "
\n", " ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vbe\n", "
\n", "
str
\n", "
\n", " ✅ verbal ending consonantal-transliterated (n/a; W; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vbs\n", "
\n", "
str
\n", "
\n", " ✅ root formation consonantal-transliterated (absent; n/a; H; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "
\n", " ✅ verse number\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "voc_lex\n", "
\n", "
str
\n", "
\n", " ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "voc_lex_utf8\n", "
\n", "
str
\n", "
\n", " ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vs\n", "
\n", "
str
\n", "
\n", " ✅ verbal stem (qal; piel; hif; apel; pael)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vt\n", "
\n", "
str
\n", "
\n", " ✅ verbal tense (perf; impv; wayq; infc)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "mother\n", "
\n", "
none
\n", "
\n", " ✅ linguistic dependency between textual objects\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:22Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "
\n", " \n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "\n", "
Phonetic Transcriptions\n", "
\n", "\n", "
\n", "
\n", "phono\n", "
\n", "
str
\n", "
\n", " 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:25:55Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the phono notebook, see https://github.com/ETCBC/phono
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "phono_trailer\n", "
\n", "
str
\n", "
\n", " 🆗 interword material in phonological transcription\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:25:55Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the phono notebook, see https://github.com/ETCBC/phono
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Text-Fabric API: names N F E L T S C TF directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A = use(\"ETCBC/bhsa\", hoist=globals())" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "FREQ = 70\n", "AMOUNT = 20" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Query\n", "\n", "A straightforward query is:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "query = f\"\"\"\n", "chapter\n", "/without/\n", " word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", " < word freq_lex<{FREQ}\n", "/-/\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This query has several problems:\n", "\n", "* it is very inelegant;\n", "* it does not perform: it takes so long that you cannot wait for it to finish;\n", "* the logic is wasteful: the `/without/` part, which expresses what should be left out,\n", " matches every possible combination of 20 infrequent words, an astronomical number.\n", "\n", "So it is better not to search with this one."
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# A.indent(reset=True)\n", "# A.info('start query')\n", "# results = S.search(query, limit=1)\n", "# A.info('end query')\n", "# len(results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# By hand\n", "\n", "On the other hand, with a bit of hand coding it is very easy, and almost instantaneous:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "60 chapters out of 929\n" ] } ], "source": [ "results = []\n", "allChapters = F.otype.s(\"chapter\")\n", "\n", "for chapter in allChapters:\n", " if (\n", " len([word for word in L.d(chapter, otype=\"word\") if F.freq_lex.v(word) < FREQ])\n", " < AMOUNT\n", " ):\n", " results.append(chapter)\n", "\n", "print(f\"{len(results)} chapters out of {len(allChapters)}\")" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Exodus 11, 24\n", "Leviticus 17\n", "Deuteronomy 30\n", "Joshua 23\n", "Isaiah 12, 39\n", "Jeremiah 45\n", "Ezekiel 15\n", "Hosea 3\n", "Joel 3\n", "Psalms 1, 3, 4, 13, 14, 15, 20, 23, 24, 26, 43, 47, 53, 54, 61, 67, 70, 82, 86, 87, 93, 97, 99, 100, 101, 110, 113, 114, 115, 117, 120, 121, 122, 123, 124, 125, 126, 127, 128, 130, 131, 133, 134, 136, 138, 150\n", "Job 25\n", "Esther 10\n", "2_Chronicles 27\n" ] } ], "source": [ "resultsByBook = dict()\n", "\n", "for chapter in results:\n", " (bk, ch) = T.sectionFromNode(chapter)\n", " resultsByBook.setdefault(bk, []).append(ch)\n", "\n", "for (bk, chps) in resultsByBook.items():\n", " print(\"{} {}\".format(bk, \", \".join(str(c) for c in chps)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Custom sets\n", "\n", "Once you have these chapters, you can put them in a set and use them in queries.\n", "\n", "We show how to query results as far as they occur in an \"ordinary\" 
chapter.\n", "\n", "First we search for a phenomenon in all chapters. The phenomenon is a clause with a subject consisting of a single noun in\n", "the plural and a verb in the singular." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "sets = dict(ochapter=set(results))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "query1 = \"\"\"\n", "verse\n", " clause\n", " phrase function=Pred\n", " word pdp=verb nu=sg\n", " phrase function=Subj\n", " =: word pdp=subs nu=pl\n", " :=\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1.58s 262 results\n" ] } ], "source": [ "results1 = A.search(query1)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "
npclausephrasewordphraseword
1Genesis 1:1בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃ בָּרָ֣א בָּרָ֣א אֱלֹהִ֑ים אֱלֹהִ֑ים
2Genesis 1:3וַיֹּ֥אמֶר אֱלֹהִ֖ים יֹּ֥אמֶר יֹּ֥אמֶר אֱלֹהִ֖ים אֱלֹהִ֖ים
3Genesis 1:4וַיַּ֧רְא אֱלֹהִ֛ים אֶת־הָאֹ֖ור יַּ֧רְא יַּ֧רְא אֱלֹהִ֛ים אֱלֹהִ֛ים
4Genesis 1:4וַיַּבְדֵּ֣ל אֱלֹהִ֔ים בֵּ֥ין הָאֹ֖ור וּבֵ֥ין הַחֹֽשֶׁךְ׃ יַּבְדֵּ֣ל יַּבְדֵּ֣ל אֱלֹהִ֔ים אֱלֹהִ֔ים
5Genesis 1:5וַיִּקְרָ֨א אֱלֹהִ֤ים׀ לָאֹור֙ יֹ֔ום יִּקְרָ֨א יִּקְרָ֨א אֱלֹהִ֤ים׀ אֱלֹהִ֤ים׀
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.table(results1, start=1, end=5, skipCols=\"1\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we want to restrict results to ordinary chapters:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "query2 = \"\"\"\n", "ochapter\n", " verse\n", " clause\n", " phrase function=Pred\n", " word pdp=verb nu=sg\n", " phrase function=Subj\n", " =: word pdp=subs nu=pl\n", " :=\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that we use the name of a set here: `ochapter`.\n", "It is not a known node type in the BHSA, so we have to tell Text-Fabric what it means.\n", "We do that by passing a dictionary of custom sets.\n", "The keys are the names of the sets; the values are the sets themselves, i.e. sets of nodes.\n", "\n", "Then we may use those keys in queries, wherever a node type is expected." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1.55s 6 results\n" ] } ], "source": [ "results2 = A.search(query2, sets=sets)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "
npchapterverseclausephrasewordphraseword
1Psalms 47:6Psalms 47עָלָ֣ה אֱ֭לֹהִים בִּתְרוּעָ֑ה עָלָ֣ה עָלָ֣ה אֱ֭לֹהִים אֱ֭לֹהִים
2Psalms 47:9Psalms 47מָלַ֣ךְ אֱ֭לֹהִים עַל־גֹּויִ֑ם מָלַ֣ךְ מָלַ֣ךְ אֱ֭לֹהִים אֱ֭לֹהִים
3Psalms 47:9Psalms 47אֱ֝לֹהִ֗ים יָשַׁ֤ב׀ עַל־כִּסֵּ֬א קָדְשֹֽׁו׃ יָשַׁ֤ב׀ יָשַׁ֤ב׀ אֱ֝לֹהִ֗ים אֱ֝לֹהִ֗ים
4Psalms 53:3Psalms 53אֱֽלֹהִ֗ים מִשָּׁמַיִם֮ הִשְׁקִ֢יף עַֽל־בְּנֵ֫י אָדָ֥ם הִשְׁקִ֢יף הִשְׁקִ֢יף אֱֽלֹהִ֗ים אֱֽלֹהִ֗ים
5Psalms 53:6Psalms 53כִּֽי־אֱלֹהִ֗ים פִּ֭זַּר עַצְמֹ֣ות חֹנָ֑ךְ פִּ֭זַּר פִּ֭זַּר אֱלֹהִ֗ים אֱלֹהִ֗ים
6Psalms 70:5Psalms 70יִגְדַּ֣ל אֱלֹהִ֑ים יִגְדַּ֣ל יִגְדַּ֣ל אֱלֹהִ֑ים אֱלֹהִ֑ים
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.table(results2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Custom sets in the browser\n", "\n", "We save the sets in a file.\n", "But before we do so, we also want to save all ordinary verses in a set, and all ordinary words." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.52s 2751 results\n" ] } ], "source": [ "queryV = f\"\"\"\n", "verse\n", "/without/\n", " word freq_lex<{FREQ}\n", "/-/\n", "\"\"\"\n", "resultsV = A.search(queryV, shallow=True)\n", "sets[\"overse\"] = resultsV" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "sets[\"oword\"] = {w for w in F.otype.s(\"word\") if F.freq_lex.v(w) >= FREQ}" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SETS_FILE = os.path.expanduser(\"~/Downloads/ordinary.set\")\n", "writeSets(sets, SETS_FILE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a test, we read back the sets from disk and compare the number of\n", "elements with those in the original sets, which we still have in memory." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ochapter with 60 nb 0\n", "overse with 2751 nb 0\n", "oword with 361411 nb 0\n" ] } ], "source": [ "testSets = readSets(SETS_FILE)\n", "for s in sorted(testSets):\n", " elems = len(testSets[s])\n", " oelems = len(sets[s])\n", " print(f\"{s} with {elems} nb {elems - oelems}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you can start your TF browser as follows:\n", "\n", "```sh\n", "text-fabric bhsa --sets=~/Downloads/ordinary.set\n", "```\n", "\n", "and then you can run the same queries over there!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Appendix: investigation\n", "\n", "Let's investigate the number of ordinary chapters with shifting definitions of ordinary" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "There are 929 chapters, the longest is 1603 words\n" ] } ], "source": [ "allChapters = F.otype.s(\"chapter\")\n", "longestChapter = max(len(L.d(chapter, otype=\"word\")) for chapter in allChapters)\n", "\n", "print(f\"There are {len(allChapters)} chapters, the longest is {longestChapter} words\")" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "def getOrdinary(freq, amount):\n", " results = []\n", "\n", " for chapter in allChapters:\n", " if (\n", " len(\n", " [\n", " word\n", " for word in L.d(chapter, otype=\"word\")\n", " if F.freq_lex.v(word) < freq\n", " ]\n", " )\n", " < amount\n", " ):\n", " results.append(chapter)\n", " return results" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "def overview(freq):\n", " for amount in range(20, 1700, 50):\n", " results = getOrdinary(freq, amount)\n", " print(\n", " f\"for freq={freq:>3} and amount={amount:>4}: {len(results):>4} ordinary chapters\"\n", " )\n", " 
if len(results) >= len(allChapters):\n", " break" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "for freq= 40 and amount= 20: 139 ordinary chapters\n", "for freq= 40 and amount= 70: 757 ordinary chapters\n", "for freq= 40 and amount= 120: 885 ordinary chapters\n", "for freq= 40 and amount= 170: 908 ordinary chapters\n", "for freq= 40 and amount= 220: 919 ordinary chapters\n", "for freq= 40 and amount= 270: 923 ordinary chapters\n", "for freq= 40 and amount= 320: 924 ordinary chapters\n", "for freq= 40 and amount= 370: 925 ordinary chapters\n", "for freq= 40 and amount= 420: 926 ordinary chapters\n", "for freq= 40 and amount= 470: 928 ordinary chapters\n", "for freq= 40 and amount= 520: 929 ordinary chapters\n", "for freq= 70 and amount= 20: 60 ordinary chapters\n", "for freq= 70 and amount= 70: 550 ordinary chapters\n", "for freq= 70 and amount= 120: 842 ordinary chapters\n", "for freq= 70 and amount= 170: 889 ordinary chapters\n", "for freq= 70 and amount= 220: 915 ordinary chapters\n", "for freq= 70 and amount= 270: 922 ordinary chapters\n", "for freq= 70 and amount= 320: 923 ordinary chapters\n", "for freq= 70 and amount= 370: 923 ordinary chapters\n", "for freq= 70 and amount= 420: 926 ordinary chapters\n", "for freq= 70 and amount= 470: 927 ordinary chapters\n", "for freq= 70 and amount= 520: 928 ordinary chapters\n", "for freq= 70 and amount= 570: 928 ordinary chapters\n", "for freq= 70 and amount= 620: 929 ordinary chapters\n", "for freq=100 and amount= 20: 38 ordinary chapters\n", "for freq=100 and amount= 70: 432 ordinary chapters\n", "for freq=100 and amount= 120: 782 ordinary chapters\n", "for freq=100 and amount= 170: 874 ordinary chapters\n", "for freq=100 and amount= 220: 905 ordinary chapters\n", "for freq=100 and amount= 270: 918 ordinary chapters\n", "for freq=100 and amount= 320: 921 ordinary chapters\n", "for freq=100 and amount= 370: 923 ordinary 
chapters\n", "for freq=100 and amount= 420: 923 ordinary chapters\n", "for freq=100 and amount= 470: 926 ordinary chapters\n", "for freq=100 and amount= 520: 927 ordinary chapters\n", "for freq=100 and amount= 570: 928 ordinary chapters\n", "for freq=100 and amount= 620: 928 ordinary chapters\n", "for freq=100 and amount= 670: 929 ordinary chapters\n" ] } ], "source": [ "for freq in (40, 70, 100):\n", " overview(freq)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# All steps\n", "\n", "* **[start](start.ipynb)** your first step in mastering the bible computationally\n", "* **[display](display.ipynb)** become an expert in creating pretty displays of your text structures\n", "* **[search](search.ipynb)** turbo charge your hand-coding with search templates\n", "\n", "---\n", "\n", "[advanced](searchAdvanced.ipynb)\n", "sets\n", "\n", "You have seen how to mingle sets with queries.\n", "\n", "Time to enter the race for space:\n", "\n", "[relations](searchRelations.ipynb)\n", "[quantifiers](searchQuantifiers.ipynb)\n", "[from MQL](searchFromMQL.ipynb)\n", "[rough](searchRough.ipynb)\n", "[gaps](searchGaps.ipynb)\n", "\n", "---\n", "\n", "* **[export Excel](exportExcel.ipynb)** make tailor-made spreadsheets out of your results\n", "* **[share](share.ipynb)** draw in other people's data and let them use yours\n", "* **[export](export.ipynb)** export your dataset as an Emdros database\n", "* **[annotate](annotate.ipynb)** annotate plain text by means of other tools and import the annotations as TF features\n", "* **[map](map.ipynb)** map somebody else's annotations to a new version of the corpus\n", "* **[volumes](volumes.ipynb)** work with selected books only\n", "* **[trees](trees.ipynb)** work with the BHSA data as syntax trees\n", "\n", "CC-BY Dirk Roorda" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, 
"file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.1" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }