{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "929221ee-e866-45ac-8bdc-c0a7d7e77dbc", "metadata": {}, "outputs": [], "source": [ "#%pip install text-fabric\n", "#%pip install textanalysis" ] }, { "cell_type": "code", "execution_count": 1, "id": "161f8b21-73ea-4a26-8356-d4f46310b290", "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "id": "e6b32e84-a252-47d0-a1d9-10c287141e00", "metadata": {}, "outputs": [ { "ename": "ModuleNotFoundError", "evalue": "No module named 'tfa'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn [2], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mtf\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mapp\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m use\n\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mtfa\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mdiff\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m collectDiffs\n", "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'tfa'" ] } ], "source": [ "from tf.app import use\n", "from tfa.diff import collectDiffs" ] }, { "cell_type": "markdown", "id": "a43f764f-62ad-43fa-afed-a0ea24fc86c4", "metadata": { "tags": [] }, "source": [ "# Systematic differences\n", "\n", "We look for systematic differences between words,\n", "working with their phonetic representation.\n", "\n", "The idea is that if the same difference recurs across many word pairs, it points to a morphological\n", "relationship between the two words of each pair.\n", "\n", "First, we load the corpus." 
] }, { "cell_type": "code", "execution_count": 4, "id": "2ee1befa-ce1a-4d74-901f-b58a9d91eed7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "TF-app: ~/text-fabric-data/etcbc/bhsa/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/etcbc/bhsa/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/etcbc/phono/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/etcbc/parallels/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "This is Text-Fabric 9.2.2\n", "Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html\n", "\n", "122 features found and 0 ignored\n" ] }, { "data": { "text/html": [ "Text-Fabric: Text-Fabric API 9.2.2, etcbc/bhsa/app v3, Search Reference
Data: BHSA, Character table, Feature docs
Features:
\n", "
Parallel Passages\n", "
crossref (int): 🆗 links between similar passages | author: BHSA Data: Constantijn Sikkel; Parallels Notebook: Dirk Roorda, Martijn Naaijer | coreData: BHSA | dateWritten: 2021-12-09T14:40:46Z | provenance: Parallels notebook, see https://github.com/ETCBC/parallels | version: 2021 | writtenBy: Text-Fabric\n", "
\n", "
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\n", "
Metadata shared by the BHSA features listed here, unless a feature states otherwise: author: Eep Talstra Centre for Bible and Computer | dataset: BHSA | datasetName: Biblia Hebraica Stuttgartensia Amstelodamensis | email: shebanq@ancient-data.org | encoders: Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF) | version: 2021 | website: https://shebanq.ancient-data.org | writtenBy: Text-Fabric\n", "
book (str): ✅ book name in Latin (Genesis; Numeri; Reges1; ...) | dateWritten: 2021-12-09T14:17:55Z\n", "
book@ll (str): ✅ book name in amharic (ኣማርኛ) | dateWritten: 2021-12-09T14:20:27Z | encoders: Dirk Roorda (TF) | language: ኣማርኛ | languageCode: am | languageEnglish: amharic | provenance: book names from wikipedia and other sources\n", "
chapter (int): ✅ chapter number (1; 2; 3; ...) | dateWritten: 2021-12-09T14:17:55Z\n", "
code (int): ✅ identifier of a clause atom relationship (0; 74; 367; ...) | dateWritten: 2021-12-09T14:17:56Z\n", "
det (str): ✅ determinedness of phrase(atom) (det; und; NA.) | dateWritten: 2021-12-09T14:17:56Z\n", "
domain (str): ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).) | dateWritten: 2021-12-09T14:17:57Z\n", "
freq_lex (int): ✅ frequency of lexemes | dateWritten: 2021-12-09T14:24:45Z | provenance: computed on the basis of the ETCBC core set of features\n", "
function (str): ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...) | dateWritten: 2021-12-09T14:17:57Z\n", "
g_cons (str): ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...) | dateWritten: 2021-12-09T14:17:57Z\n", "
g_cons_utf8 (str): ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים) | dateWritten: 2021-12-09T14:17:58Z\n", "
g_lex (str): ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...) | dateWritten: 2021-12-09T14:17:58Z\n", "
g_lex_utf8 (str): ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה) | dateWritten: 2021-12-09T14:17:59Z\n", "
g_word (str): ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM) | dateWritten: 2021-12-09T14:18:04Z\n", "
g_word_utf8 (str): ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים) | dateWritten: 2021-12-09T14:18:04Z\n", "
gloss (str): 🆗 english translation of lexeme (beginning create god(s)) | dateWritten: 2021-12-09T14:21:13Z | provenance: from additional lexicon file provided by the ETCBC\n", "
gn (str): ✅ grammatical gender (m; f; NA; unknown.) | dateWritten: 2021-12-09T14:18:05Z\n", "
label (str): ✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02) | dateWritten: 2021-12-09T14:18:06Z\n", "
language (str): ✅ of word or lexeme (Hebrew; Aramaic.) | dateWritten: 2021-12-09T14:21:13Z | provenance: from additional lexicon file provided by the ETCBC\n", "
lex (str): ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/) | dateWritten: 2021-12-09T14:21:14Z | provenance: from additional lexicon file provided by the ETCBC\n", "
lex_utf8 (str): ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜) | dateWritten: 2021-12-09T14:21:15Z | provenance: from additional lexicon file provided by the ETCBC\n", "
ls (str): ✅ lexical set, subclassification of part-of-speech (card; ques; mult) | dateWritten: 2021-12-09T14:21:15Z | provenance: from additional lexicon file provided by the ETCBC\n", "
nametype (str): ⚠️ named entity type (pers; mens; gens; topo; ppde.) | dateWritten: 2021-12-09T14:21:15Z | provenance: from additional lexicon file provided by the ETCBC\n", "
nme (str): ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...) | dateWritten: 2021-12-09T14:18:08Z\n", "
nu (str): ✅ grammatical number (sg; du; pl; NA; unknown.) | dateWritten: 2021-12-09T14:18:08Z\n", "
number (int): ✅ sequence number of an object within its context | dateWritten: 2021-12-09T14:18:09Z\n", "
otype (str) | dateWritten: 2021-12-09T14:21:15Z\n", "
pargr (str): 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...) | dateWritten: 2021-12-09T14:22:50Z | provenance: from additional paragraph file provided by the ETCBC\n", "
pdp (str): ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...) | dateWritten: 2021-12-09T14:18:10Z\n", "
pfm (str): ✅ preformative consonantal-transliterated (absent; n/a; J, ...) | dateWritten: 2021-12-09T14:18:11Z\n", "
prs (str): ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...) | dateWritten: 2021-12-09T14:18:11Z\n", "
prs_gn (str): ✅ pronominal suffix gender (m; f; NA; unknown.) | dateWritten: 2021-12-09T14:18:11Z\n", "
prs_nu (str): ✅ pronominal suffix number (sg; du; pl; NA; unknown.) | dateWritten: 2021-12-09T14:18:12Z\n", "
prs_ps (str): ✅ pronominal suffix person (p1; p2; p3; NA; unknown.) | dateWritten: 2021-12-09T14:18:12Z\n", "
ps (str): ✅ grammatical person (p1; p2; p3; NA; unknown.) | dateWritten: 2021-12-09T14:18:12Z\n", "
qere (str): ✅ word pointed-transliterated masoretic reading correction | dateWritten: 2021-12-09T14:23:29Z | provenance: from additional ketiv/qere file provided by the ETCBC\n", "
qere_trailer (str): ✅ interword material -pointed-transliterated (Masoretic correction) | dateWritten: 2021-12-09T14:23:29Z | provenance: from additional ketiv/qere file provided by the ETCBC\n", "
\n", "
\n", "qere_trailer_utf8\n", "
\n", "
str
\n", "
\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "qere_utf8\n", "
\n", "
str
\n", "
\n", " ✅ word pointed-Hebrew masoretic reading correction\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "rank_lex\n", "
\n", "
int
\n", "
\n", " ✅ ranking of lexemes based on frequency\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:24:46Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed on the basis of the ETCBC core set of features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "
\n", " ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:13Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "
\n", " ✅ part-of-speech (art; verb; subs; nmpr, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "st\n", "
\n", "
str
\n", "
\n", " ✅ state of a noun (a (absolute); c (construct); e (emphatic).)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:14Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "tab\n", "
\n", "
int
\n", "
\n", " ✅ clause atom: its level in the linguistic embedding\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "
\n", " ✅ interword material pointed-transliterated (& 00 05 00_P ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:01Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "trailer_utf8\n", "
\n", "
str
\n", "
\n", " ✅ interword material pointed-Hebrew (־ ׃)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:01Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "txt\n", "
\n", "
str
\n", "
\n", " ✅ text type of clause and surrounding (repetition of ? N D Q as in feature domain)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "
\n", " ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "uvf\n", "
\n", "
str
\n", "
\n", " ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vbe\n", "
\n", "
str
\n", "
\n", " ✅ verbal ending consonantal-transliterated (n/a; W; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vbs\n", "
\n", "
str
\n", "
\n", " ✅ root formation consonantal-transliterated (absent; n/a; H; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "
\n", " ✅ verse number\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "voc_lex\n", "
\n", "
str
\n", "
\n", " ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "voc_lex_utf8\n", "
\n", "
str
\n", "
\n", " ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vs\n", "
\n", "
str
\n", "
\n", " ✅ verbal stem (qal; piel; hif; apel; pael)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "vt\n", "
\n", "
str
\n", "
\n", " ✅ verbal tense (perf; impv; wayq; infc)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "mother\n", "
\n", "
none
\n", "
\n", " ✅ linguistic dependency between textual objects\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:22Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "
\n", " \n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "\n", "
Phonetic Transcriptions\n", "
\n", "\n", "
\n", "
\n", "phono\n", "
\n", "
str
\n", "
\n", " 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:25:55Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the phono notebook, see https://github.com/ETCBC/phono
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n", "phono_trailer\n", "
\n", "
str
\n", "
\n", " 🆗 interword material in phonological transcription\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:25:55Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the phono notebook, see https://github.com/ETCBC/phono
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "
\n", "\n", "
\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Text-Fabric API: names N F E L T S C TF directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# A = use(\"ETCBC/bhsa:clone\", checkout=\"clone\", hoist=globals())\n", "A = use(\"ETCBC/bhsa\", hoist=globals())" ] }, { "cell_type": "markdown", "id": "52951a6f-8b0b-4aa2-b855-20d1095a7e85", "metadata": { "tags": [] }, "source": [ "# Collect\n", "\n", "We collect the differences between pairs of words." ] }, { "cell_type": "code", "execution_count": 6, "id": "dc6133a4-a7c0-4c83-8142-7cf7adbb722c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "420166 word occurrences of 60805 distinct words\n", "6686 common words\n", "Computing 22347955 comparisons\n", " 22347955 = 100 %\n", "Stored 3957198 word pairs between 6680 words\n", "Computing 3957198 diffs between word pairs\n", " 3957198 = 100 %\n", "3417632 distinct differences\n" ] } ], "source": [ "D = collectDiffs(A, \"word\", \"phono\", frequencyThreshold=6, sizeThreshold=4)" ] }, { "cell_type": "markdown", "id": "dfb59a43-9e66-4feb-a731-1eb4187fbce9", "metadata": {}, "source": [ "Here are the top-100 differences." 
] }, { "cell_type": "code", "execution_count": 7, "id": "57c262da-1f1f-4327-bb8c-79b172a4ef3b", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "seq | freq | `-` | `+` | examples\n", "--- | --- | --- | --- | --- \n", "*1* | **1919** | `ˈ` | `ˌ` | *lˈô `~` *lˌô ` ` [yhwˈāh] `~` [yhwˌāh] ` ` [yᵊhwˈih] `~` [yᵊhwˌih]\n", "*2* | **288** | ` ` | `ˈ` | [yhwˈāh] `~` [ˈyhwˈāh] ` ` [yᵊhwāh] `~` [yᵊhwˈāh] ` ` [yᵊhwˈāh] `~` [yᵊhˈwˈāh]\n", "*3* | **126** | ` ` | `ˌ` | [yᵊhwāh] `~` [yᵊhwˌāh] ` ` bbāʔîm `~` bbāʔˌîm ` ` bbᵊḵôr `~` bbᵊḵˌôr\n", "*4* | **116** | `a` | `ā` | bbˈayiṯ `~` bbˈāyiṯ ` ` bbˈayᵊṯā `~` bbˈāyᵊṯā ` ` bbˈaʕal `~` bbˈāʕal\n", "*5* | **101** | `ô` | `ō` | ggibbôrˈîm `~` ggibbōrˈîm ` ` ggāḏˈôl `~` ggāḏˈōl ` ` ggāḏˌôl `~` ggāḏˌōl\n", "*6* | **99** | `(b` | `(v` | bal- `~` val- ` ` barzˈel `~` varzˈel ` ` baṯ- `~` vaṯ-\n", "*7* | **84** | `-)` | `ˈ` | baṯ- `~` bˈaṯ ` ` bben- `~` bbˈen ` ` bên- `~` bˈên\n", "*8* | **83** | `-)` | `ˌ` | baṯ- `~` bˌaṯ ` ` bben- `~` bbˌen ` ` bên- `~` bˌên\n", "*9* | **80** | `ˈā` | `ˌa` | bbˈāyiṯ `~` bbˌayiṯ ` ` bbˈāʕal `~` bbˌaʕal ` ` bˈāyiṯ `~` bˌayiṯ\n", "*10* | **72** | ` ` | `-)` | bˈên `~` bˈên- ` ` bˈêṯ `~` bˈêṯ- ` ` bᵊnˈê `~` bᵊnˈê-\n", "*11* | **71** | `ˌ` | `ˈ` | [yhwˌāh] `~` [ˈyhwāh] ` ` bbayˌiṯ `~` bbˈayiṯ ` ` gibbôrˌê `~` gibbˈôrê\n", "*12* | **70** | `î)` | `ô)` | bittˈî `~` bittˈô ` ` bêṯˈî `~` bêṯˈô ` ` bêṯˌî `~` bêṯˌô\n", "*13* | **67** | `y` | ` ` | yyîṭˌav `~` yîṭˌav ` ` yyôm- `~` yôm- ` ` yyônˈā `~` yônˈā\n", "*14* | **62** | `ˈ` | `ˈ` | [yhwˈāh] `~` [ˈyhwāh] ` ` bānˈû `~` bˈānû ` ` bāʔˈā `~` bˈāʔā\n", "*15* | **61** | `ˈ.-)` | `ˌ` | bˈên- `~` bˌên ` ` bˈêṯ- `~` bˌêṯ ` ` bᵊnˈê- `~` bᵊnˌê\n", "*16* | **52** | `(t` | `(y` | taʕᵃlˌeh `~` yaʕᵃlˌeh ` ` taʕᵃmˈōḏ `~` yaʕᵃmˈōḏ ` ` taʕᵃśeh- `~` yaʕᵃśeh-\n", "*17* | **52** | `h` | `ḵ` | bāhˈem `~` bāḵˈem ` ` bāhˌem `~` bāḵˌem ` ` bānˈeʸhā `~` bānˈeʸḵā\n", "*18* | **51** | ` ` | `y` | yaggˈîḏû `~` yyaggˈîḏû ` ` yammˈîm `~` yyammˈîm ` ` yardˌēn `~` yyardˌēn\n", "*19* 
| **51** | `î)` | `ā)` | bᵊḵˌî `~` bᵊḵˌā ` ` dibbˈartî `~` dibbˈartā ` ` dibbˌartî `~` dibbˌartā\n", "*20* | **51** | `ˌ` | `ˈ.ˈ` | [yhwˌāh] `~` [ˈyhwˈāh] ` ` [yᵊhwˌāh] `~` [yᵊhˈwˈāh] ` ` [yᵊhwˌāh] `~` [yᵊˈhwˈāh]\n", "*21* | **47** | `îm)` | `ā)` | bānˈîm `~` bānˈā ` ` bānˌîm `~` bānˌā ` ` bāʔˈîm `~` bāʔˈā\n", "*22* | **46** | `ô)` | `ām)` | darkˈô `~` darkˈām ` ` kullˈô `~` kullˈām ` ` leḵtˈô `~` leḵtˈām\n", "*23* | **45** | `(d` | `(ḏ` | dabbˈēr `~` ḏabbˈēr ` ` dabbˌēr `~` ḏabbˌēr ` ` dahᵃvˈā `~` ḏahᵃvˈā\n", "*24* | **44** | `(t` | `(ṯ` | taršˈîš `~` ṯaršˈîš ` ` taʕᵃśˈeh `~` ṯaʕᵃśˈeh ` ` taʕᵃśˈû `~` ṯaʕᵃśˈû\n", "*25* | **42** | `e.ḵā)` | `ā.w)` | bānˈeʸḵā `~` bānˈāʸw ` ` dᵊrāḵˈeʸḵā `~` dᵊrāḵˈāʸw ` ` dᵊvārˈeʸḵā `~` dᵊvārˈāʸw\n", "*26* | **41** | `(k` | `(ḵ` | kikkar- `~` ḵikkar- ` ` kissˈē `~` ḵissˈē ` ` kol- `~` ḵol-\n", "*27* | **41** | `(ʔ` | `(ʕ` | ʔal- `~` ʕal- ` ` ʔattˈā `~` ʕattˈā ` ` ʔattˌā `~` ʕattˌā\n", "*28* | **41** | `e` | `ā` | ddˈereḵ `~` ddˈāreḵ ` ` ddˈever `~` ddˈāver ` ` gˈever `~` gˈāver\n", "*29* | **39** | `b` | ` ` | bben- `~` ben- ` ` bbinyāmˈin `~` binyāmˈin ` ` bbāmˈôṯ `~` bāmˈôṯ\n", "*30* | **38** | ` ` | ` . s)` | [yhwˈāh] `~` [yhwˈāh] . s ` ` [yᵊhwˈih] `~` [yᵊhwˈih] . s ` ` [yᵊhwˈāh] `~` [yᵊhwˈāh] . 
s\n", "*31* | **38** | `(y` | `(ṯ` | yaʕᵃvˌōr `~` ṯaʕᵃvˌōr ` ` yaʕᵃśˈeh `~` ṯaʕᵃśˈeh ` ` yaʕᵃśˈû `~` ṯaʕᵃśˈû\n", "*32* | **37** | `(y` | `(ʔ` | yaggˈîḏ `~` ʔaggˈîḏ ` ` yakkˌeh `~` ʔakkˌeh ` ` yarbˈeh `~` ʔarbˈeh\n", "*33* | **36** | ` ` | `û)` | haggˈîḏ `~` haggˈîḏû ` ` heḥᵉzˈîq `~` heḥᵉzˈîqû ` ` hiṣṣˌîl `~` hiṣṣˌîlû\n", "*34* | **35** | `(f` | `(p` | farʕˈō `~` parʕˈō ` ` farʕˌō `~` parʕˌō ` ` fānāʸw `~` pānāʸw\n", "*35* | **35** | `î)` | `ām)` | hinnˈî `~` hinnˈām ` ` hinnˌî `~` hinnˌām ` ` libbˈî `~` libbˈām\n", "*36* | **35** | `û)` | `ā)` | bānˈû `~` bānˈā ` ` bˈāʔû `~` bˈāʔā ` ` haqšˈîvû `~` haqšˈîvā\n", "*37* | **34** | `aṯ)` | `ā)` | givʕˈaṯ `~` givʕˈā ` ` minḥˈaṯ `~` minḥˈā ` ` minḥˌaṯ `~` minḥˌā\n", "*38* | **33** | `îm)` | `āʸw)` | bānîm `~` bānāʸw ` ` bānˈîm `~` bānˈāʸw ` ` bānˌîm `~` bānˌāʸw\n", "*39* | **33** | `ˈā` | `ˌe` | ddˈāreḵ `~` ddˌereḵ ` ` ddˈāver `~` ddˌever ` ` gˈāver `~` gˌever\n", "*40* | **33** | `ˌ` | `ˌ` | bbayˌiṯ `~` bbˌayiṯ ` ` hāyˌû `~` hˌāyû ` ` hāyˌā `~` hˌāyā\n", "*41* | **32** | ` ` | `m` | malkˈā `~` mmalkˈā ` ` malʔāḵˈîm `~` mmalʔāḵˈîm ` ` malʔˈāḵ `~` mmalʔˈāḵ\n", "*42* | **32** | `ˈî)` | `ˌô)` | bêṯˈî `~` bêṯˌô ` ` bᵊnˈî `~` bᵊnˌô ` ` libbˈî `~` libbˌô\n", "*43* | **31** | `ay)` | `āʸw)` | dᵊvārˈay `~` dᵊvārˈāʸw ` ` fānˈay `~` fānˈāʸw ` ` fānˌay `~` fānˌāʸw\n", "*44* | **31** | `ā` | `ē` | ggˈār `~` ggˈēr ` ` ggˌār `~` ggˌēr ` ` hāmˈîṯ `~` hēmˈîṯ\n", "*45* | **31** | `ā` | `ᵊ` | bānˈôṯ `~` bᵊnˈôṯ ` ` kāvˈôḏ `~` kᵊvˈôḏ ` ` lāšˈôn `~` lᵊšˈôn\n", "*46* | **30** | `(b.ˌ` | `(v.ˈ` | barzˌel `~` varzˈel ` ` ben-hᵃḏˌaḏ `~` ven-hᵃḏˈaḏ ` ` biltˌî `~` viltˈî\n", "*47* | **30** | `(bb` | `(v` | bbarzˈel `~` varzˈel ` ` bben- `~` ven- ` ` bbinyāmˈin `~` vinyāmˈin\n", "*48* | **30** | `(t` | `(yy` | taʕᵃmˈōḏ `~` yyaʕᵃmˈōḏ ` ` taʕᵃśˈû `~` yyaʕᵃśˈû ` ` taʕᵃśˌû `~` yyaʕᵃśˌû\n", "*49* | **30** | `eḵā)` | `ô)` | bêṯˈeḵā `~` bêṯˈô ` ` darkˈeḵā `~` darkˈô ` ` kᵊvôḏˈeḵā `~` kᵊvôḏˈô\n", "*50* | **30** | `ˈ. . s)` | `ˌ` | [yhwˈāh] . 
s `~` [yhwˌāh] ` ` [yᵊhwˈih] . s `~` [yᵊhwˌih] ` ` [yᵊhwˈāh] . s `~` [yᵊhwˌāh]\n", "*51* | **29** | `y.ˌ` | `ˈ` | yyaʕˌal `~` yˈaʕal ` ` yyaʕˌan `~` yˈaʕan ` ` yyôšˌēv `~` yôšˈēv\n", "*52* | **29** | `êh.em)` | `ê)` | bᵊnêhˈem `~` bᵊnˈê ` ` bᵊnêhˌem `~` bᵊnˌê ` ` fᵊnêhˈem `~` fᵊnˈê\n", "*53* | **29** | `ˈô)` | `ˌî)` | bêṯˈô `~` bêṯˌî ` ` bᵊnˈô `~` bᵊnˌî ` ` bᵊrîṯˈô `~` bᵊrîṯˌî\n", "*54* | **28** | `ˈô)` | `ˌ` | bêṯˈô `~` bˌêṯ ` ` bᵊrîṯˈô `~` bᵊrˌîṯ ` ` dāmˈô `~` dˌām\n", "*55* | **27** | `(b.ˈ` | `(v.ˌ` | biltˈî `~` viltˌî ` ` binyāmˈin `~` vinyāmˌin ` ` bittˈô `~` vittˌô\n", "*56* | **27** | `(tt` | `(yy` | ttaʕᵃmˌōḏ `~` yyaʕᵃmˌōḏ ` ` ttippˈōl `~` yyippˈōl ` ` ttippˌōl `~` yyippˌōl\n", "*57* | **27** | `(š` | `(ʔ` | šammˈā `~` ʔammˈā ` ` šammˌā `~` ʔammˌā ` ` šiššˈā `~` ʔiššˈā\n", "*58* | **27** | `ay)` | `eʸḵā)` | dᵊvārˈay `~` dᵊvārˈeʸḵā ` ` fānˈay `~` fānˈeʸḵā ` ` fānˌay `~` fānˌeʸḵā\n", "*59* | **27** | `ô)` | `ᵊḵ.ā)` | libbˈô `~` libbᵊḵˈā ` ` libbˌô `~` libbᵊḵˌā ` ` llˈô `~` llᵊḵˈā\n", "*60* | **27** | `ô.ˌ` | `ō.ˈ` | hôlˌēḵ `~` hōlˈēḵ ` ` môšˌēl `~` mōšˈēl ` ` qôlˌî `~` qōlˈî\n", "*61* | **26** | `-)` | `(ˈ` | bên- `~` ˈbên ` ` bêṯ- `~` ˈbêṯ ` ` gam- `~` ˈgam\n", "*62* | **26** | `m` | ` ` | mmôʕˈēḏ `~` môʕˈēḏ ` ` mmāqôm `~` māqôm ` ` mmāqˈôm `~` māqˈôm\n", "*63* | **26** | `n` | ` ` | nnoḵrˈî `~` noḵrˈî ` ` nnôrˈā `~` nôrˈā ` ` nnāhˈār `~` nāhˈār\n", "*64* | **26** | `î)` | `ᵊḵ.ā)` | hinnˌî `~` hinnᵊḵˌā ` ` libbˈî `~` libbᵊḵˈā ` ` libbˌî `~` libbᵊḵˌā\n", "*65* | **26** | `ô` | `ē` | hˈôn `~` hˈēn ` ` kkāvˈôḏ `~` kkāvˈēḏ ` ` kāvˈôḏ `~` kāvˈēḏ\n", "*66* | **26** | `ô)` | `āh)` | bêṯˈô `~` bêṯˈāh ` ` bᵊnˈô `~` bᵊnˈāh ` ` bᵊnˌô `~` bᵊnˌāh\n", "*67* | **26** | `ô.ˈ` | `ō.ˌ` | ggᵊḏôlˈā `~` ggᵊḏōlˌā ` ` hôlˈēḵ `~` hōlˌēḵ ` ` mišpᵊḥôṯˈām `~` mišpᵊḥōṯˌām\n", "*68* | **26** | `ā` | `ō` | hālˈāḵ `~` hālˈōḵ ` ` hˈār `~` hˈōr ` ` lˈā- `~` lˈō-\n", "*69* | **25** | `(g` | `(ḡ` | gam- `~` ḡam- ` ` gibbˈôr `~` ḡibbˈôr ` ` gilʕˈāḏ `~` ḡilʕˈāḏ\n", "*70* | **25** | `(t.ˈ` | 
`(y.ˌ` | taqrˈîv `~` yaqrˌîv ` ` taʕᵃmˈōḏ `~` yaʕᵃmˌōḏ ` ` taʕᵃśˈeh `~` yaʕᵃśˌeh\n", "*71* | **24** | `eḵā)` | `î)` | bêṯˈeḵā `~` bêṯˈî ` ` libbˈeḵā `~` libbˈî ` ` lᵊvāvˈeḵā `~` lᵊvāvˈî\n", "*72* | **24** | `l` | `ś` | maʕᵃlˈē `~` maʕᵃśˈē ` ` naʕᵃlˈeh `~` naʕᵃśˈeh ` ` taʕᵃlˌeh `~` taʕᵃśˌeh\n", "*73* | **24** | `ê)` | `îm)` | bāttˌê `~` bāttˌîm ` ` hārˈê `~` hārˈîm ` ` hārˌê `~` hārˌîm\n", "*74* | **24** | `ˈ` | `y.ˌ` | yaʕᵃlˈû `~` yyaʕᵃlˌû ` ` yaʕᵃmˈōḏ `~` yyaʕᵃmˌōḏ ` ` yaʕᵃvˈōr `~` yyaʕᵃvˌōr\n", "*75* | **24** | `ˈîm)` | `ˌā)` | bānˈîm `~` bānˌā ` ` ggᵊḏōlˈîm `~` ggᵊḏōlˌā ` ` hārˈîm `~` hārˌā\n", "*76* | **24** | `ˌ` | `(ˈ` | bᵊnˌî `~` ˈbᵊnî ` ` leḥˌem `~` ˈleḥem ` ` libbˌî `~` ˈlibbî\n", "*77* | **23** | `a` | `i` | dabber- `~` dibber- ` ` dabbˈēr `~` dibbˈēr ` ` dabbᵊrˈû `~` dibbᵊrˈû\n", "*78* | **23** | `ˌ` | `y.ˈ` | yakkˌeh `~` yyakkˈeh ` ` yardˌēn `~` yyardˈēn ` ` yaʕᵃlˌû `~` yyaʕᵃlˈû\n", "*79* | **22** | `(tt` | `(y` | ttaʕᵃmˌōḏ `~` yaʕᵃmˌōḏ ` ` ttippˈōl `~` yippˈōl ` ` ttippˌōl `~` yippˌōl\n", "*80* | **22** | `m)` | `ᵊḵ` | bˈām `~` bᵊḵˈā ` ` bˌām `~` bᵊḵˌā ` ` hinnˌām `~` hinnᵊḵˌā\n", "*81* | **22** | `ā.a` | `ō.ē` | hālaḵ `~` hōlēḵ ` ` hālˈaḵ `~` hōlˈēḵ ` ` hālˌaḵ `~` hōlˌēḵ\n", "*82* | **22** | `ˈî)` | `ˌ` | bênˈî `~` bˌên ` ` bêṯˈî `~` bˌêṯ ` ` bᵊrîṯˈî `~` bᵊrˌîṯ\n", "*83* | **22** | `ˈā)` | `ˌîm)` | bānˈā `~` bānˌîm ` ` bāʔˈā `~` bāʔˌîm ` ` mmᵊlāḵˈā `~` mmᵊlāḵˌîm\n", "*84* | **21** | ` ` | `ᵊ` | [yhwˈāh] `~` [yᵊhwˈāh] ` ` [yhwˈāh] . f `~` [yᵊhwˈāh] . f ` ` [yhwˈāh] . s `~` [yᵊhwˈāh] . 
s\n", "*85* | **21** | `(t.ˌ` | `(y.ˈ` | taʕᵃlˌeh `~` yaʕᵃlˈeh ` ` taʕᵃśˌeh `~` yaʕᵃśˈeh ` ` taʕᵃśˌû `~` yaʕᵃśˈû\n", "*86* | **21** | `eʸḵā)` | `îm)` | bānˈeʸḵā `~` bānˈîm ` ` dᵊvārˈeʸḵā `~` dᵊvārˈîm ` ` fānˈeʸḵā `~` fānˈîm\n", "*87* | **21** | `h)` | `m)` | bˈāh `~` bˈām ` ` bˌāh `~` bˌām ` ` kullˈāh `~` kullˈām\n", "*88* | **21** | `î)` | `āh)` | bêṯˈî `~` bêṯˈāh ` ` bᵊnˈî `~` bᵊnˈāh ` ` bᵊnˌî `~` bᵊnˌāh\n", "*89* | **21** | `ôṯ)` | `ā)` | bānˈôṯ `~` bānˈā ` ` mamlāḵˈôṯ `~` mamlāḵˈā ` ` mmᵊḏînˈôṯ `~` mmᵊḏînˈā\n", "*90* | **21** | `û` | `ā` | hāyû- `~` hāyā- ` ` mmᵊlûḵˈā `~` mmᵊlāḵˈā ` ` mmᵊlûḵˌā `~` mmᵊlāḵˌā\n", "*91* | **20** | `(yi` | `(ʔe` | yifqˈōḏ `~` ʔefqˈōḏ ` ` yihyˈeh `~` ʔehyˈeh ` ` yihyˌeh `~` ʔehyˌeh\n", "*92* | **20** | `(yy` | `(ṯ` | yyaʕᵃvˌōr `~` ṯaʕᵃvˌōr ` ` yyaʕᵃśˈû `~` ṯaʕᵃśˈû ` ` yyaʕᵃśˌû `~` ṯaʕᵃśˌû\n", "*93* | **20** | `î` | `ē` | haggˈîḏ `~` haggˈēḏ ` ` hinnî- `~` hinnē- ` ` hôšˈîₐʕ `~` hôšˈēₐʕ\n", "*94* | **20** | `ˈ` | `(ˈ` | bᵊnˈî `~` ˈbᵊnî ` ` libbˈî `~` ˈlibbî ` ` lāhˈem `~` ˈlāhem\n", "*95* | **20** | `ˈ.û)` | `ˌ` | haggˈîḏû `~` haggˌîḏ ` ` hēšˈîvû `~` hēšˌîv ` ` mˈēṯû `~` mˌēṯ\n", "*96* | **20** | `ˈeḵā)` | `ˌô)` | bêṯˈeḵā `~` bêṯˌô ` ` libbˈeḵā `~` libbˌô ` ` lᵊvāvˈeḵā `~` lᵊvāvˌô\n", "*97* | **20** | `ˈî)` | `ˈ` | bênˈî `~` bˈên ` ` bêṯˈî `~` bˈêṯ ` ` ggilʕāḏˈî `~` ggilʕˈāḏ\n", "*98* | **20** | `ˈô)` | `ˈ` | bêṯˈô `~` bˈêṯ ` ` bᵊrîṯˈô `~` bᵊrˈîṯ ` ` dāmˈô `~` dˈām\n", "*99* | **20** | `ˌā)` | `ˈ` | gērˌā `~` gˈēr ` ` hārˌā `~` hˈār ` ` malᵊḵûṯˌā `~` malᵊḵˈûṯ\n", "*100* | **19** | ` ` | `m)` | malkˈā `~` malkˈām ` ` malkˌā `~` malkˌām ` ` mišpāṭˈî `~` mišpāṭˈîm" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "D.showDiffs(0, 100)" ] }, { "cell_type": "markdown", "id": "17ed79aa-b966-4f44-9570-dcf3371996a7", "metadata": {}, "source": [ "Now in ETCBC transcription." 
] }, { "cell_type": "code", "execution_count": 8, "id": "bab8c3b0-29d3-48ac-b9ab-d5af9e92fbd3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "426590 word occurrences of 92756 distinct words\n", "7254 common words\n", "Computing 26306631 comparisons\n", " 26306631 = 100 %\n", "Stored 1567423 word pairs between 7237 words\n", "Computing 1567423 diffs between word pairs\n", " 1567423 = 100 %\n", "1291898 distinct differences\n" ] } ], "source": [ "D = collectDiffs(A, \"word\", \"g_word\", frequencyThreshold=6, sizeThreshold=4)" ] }, { "cell_type": "markdown", "id": "535780a7-eec6-448b-8150-6e5acba4d808", "metadata": {}, "source": [ "Here are the top-100 differences." ] }, { "cell_type": "code", "execution_count": 9, "id": "d8d19c40-bfb0-4653-bf74-4c40f6baacd2", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "seq | freq | `-` | `+` | examples\n", "--- | --- | --- | --- | --- \n", "*1* | **672** | `1` | `4` | 11B.:- `~` 14B.:- ` ` 11B.A- `~` 14B.A- ` ` 11B.I- `~` 14B.I-\n", "*2* | **546** | `.` | ` ` | 11B.:- `~` 11B:- ` ` 13B.:- `~` 13B:- ` ` B.:>;71R `~` B:>;71R\n", "*3* | **541** | `3` | `4` | 13B.:- `~` 14B.:- ` ` 13B.A- `~` 14B.A- ` ` 13B.I- `~` 14B.I-\n", "*4* | **510** | `1` | `3` | 11B.:- `~` 13B.:- ` ` 11B.A- `~` 13B.A- ` ` 11B.I- `~` 13B.I-\n", "*5* | **501** | `0` | `1` | <:ADA70T `~` <:ADA71T ` ` <:AFO70WT `~` <:AFO71WT ` ` <:AFO80WT `~` <:AFO81WT\n", "*6* | **465** | `73` | `80` | <:AB@D@73JW `~` <:AB@D@80JW ` ` <:AB@DE73JK@ `~` <:AB@DE80JK@ ` ` <:AB@DI73JM `~` <:AB@DI80JM\n", "*7* | **439** | `75` | `92` | <:AB@D@75JW `~` <:AB@D@92JW ` ` <:AB@DE75JK@ `~` <:AB@DE92JK@ ` ` <:AB@DI75JM `~` <:AB@DI92JM\n", "*8* | **383** | `80` | `92` | <:AB@D@80JW `~` <:AB@D@92JW ` ` <:AB@DE80JK@ `~` <:AB@DE92JK@ ` ` <:AB@DI80JM `~` <:AB@DI92JM\n", "*9* | **368** | `75` | `80` | <:AB@D@75JW `~` <:AB@D@80JW ` ` <:AB@DE75JK@ `~` <:AB@DE80JK@ ` ` <:AB@DI75JM `~` <:AB@DI80JM\n", "*10* | **365** | `3` | `5` | <:AB@D@73JW `~` 
<:AB@D@75JW ` ` <:AB@DE73JK@ `~` <:AB@DE75JK@ ` ` <:AB@DI73JM `~` <:AB@DI75JM\n", "*11* | **356** | `73` | `92` | <:AB@D@73JW `~` <:AB@D@92JW ` ` <:AB@DE73JK@ `~` <:AB@DE92JK@ ` ` <:AB@DI73JM `~` <:AB@DI92JM\n", "*12* | **316** | `7` | `8` | <:AFO70WT `~` <:AFO80WT ` ` <:AFO71WT `~` <:AFO81WT ` ` <:AFO75WT `~` <:AFO85WT\n", "*13* | **290** | `74` | `80` | <:AB@DE74JK@ `~` <:AB@DE80JK@ ` ` <:AB@DI74JM `~` <:AB@DI80JM ` ` <:AFO74WT `~` <:AFO80WT\n", "*14* | **284** | `0` | `4` | 10;T `~` 14>;T ` ` 10D.IJ `~` 14D.IJ\n", "*15* | **260** | `73` | `81` | <:AFO73WT `~` <:AFO81WT ` ` <:AL;JHE73M `~` <:AL;JHE81M ` ` <:AL;JKE73M `~` <:AL;JKE81M\n", "*16* | **238** | `7` | `9` | <:AFO71WT `~` <:AFO91WT ` ` <;71Y `~` <;91Y ` ` <@71M `~` <@91M\n", "*17* | **235** | `0` | `3` | 10>AP `~` 13>AP ` ` 10>EREY `~` 13>EREY ` ` 10>IJC `~` 13>IJC\n", "*18* | **216** | `4` | `5` | <:AB@DE74JK@ `~` <:AB@DE75JK@ ` ` <:AB@DI74JM `~` <:AB@DI75JM ` ` <:AFIJTE74M `~` <:AFIJTE75M\n", "*19* | **215** | `71` | `80` | <:AB@D@71JW `~` <:AB@D@80JW ` ` <:AFO71WT `~` <:AFO80WT ` ` <;71Y `~` <;80Y\n", "*20* | **215** | `74` | `81` | <:AFO74WT `~` <:AFO81WT ` ` <:AL;JHE74M `~` <:AL;JHE81M ` ` <;74T `~` <;81T\n", "*21* | **212** | `74` | `92` | <:AB@DE74JK@ `~` <:AB@DE92JK@ ` ` <:AB@DI74JM `~` <:AB@DI92JM ` ` <:AF74W. 
`~` <:AF92W.\n", "*22* | **199** | `1` | `5` | <:AB@D@71JW `~` <:AB@D@75JW ` ` <:AFO71WT `~` <:AFO75WT ` ` <:AFO81WT `~` <:AFO85WT\n", "*23* | **199** | `73` | `91` | <:AFO73WT `~` <:AFO91WT ` ` <:AL;JHE73M `~` <:AL;JHE91M ` ` <:AL;JKE73M `~` <:AL;JKE91M\n", "*24* | **195** | `6` | `7` | <:AFO63WT `~` <:AFO73WT ` ` <@61M `~` <@71M ` ` <@63M `~` <@73M\n", "*25* | **188** | `81` | `92` | <:AFO81WT `~` <:AFO92WT ` ` <:AL;JHE81M `~` <:AL;JHE92M ` ` <:AL;JKE81M `~` <:AL;JKE92M\n", "*26* | **179** | `75` | `81` | <:AFO75WT `~` <:AFO81WT ` ` <:AL;JHE75M `~` <:AL;JHE81M ` ` <:AL;JKE75M `~` <:AL;JKE81M\n", "*27* | **173** | `74` | `91` | <:AFO74WT `~` <:AFO91WT ` ` <:AL;JHE74M `~` <:AL;JHE91M ` ` <;74Y `~` <;91Y\n", "*28* | **161** | `71` | `92` | <:AB@D@71JW `~` <:AB@D@92JW ` ` <:AF@R@71H `~` <:AF@R@92H ` ` <:AFO71WT `~` <:AFO92WT\n", "*29* | **157** | `63` | `71` | <:AFO63WT `~` <:AFO71WT ` ` <@63M `~` <@71M ` ` <@F63W. `~` <@F71W.\n", "*30* | **157** | `63` | `74` | <:AFO63WT `~` <:AFO74WT ` ` <@63M `~` <@74M ` ` <@F63W. `~` <@F74W.\n", "*31* | **152** | `80` | `91` | <:AFO80WT `~` <:AFO91WT ` ` <:AL;JHE80M `~` <:AL;JHE91M ` ` <:AL;JKE80M `~` <:AL;JKE91M\n", "*32* | **144** | `8` | `9` | <:AFO81WT `~` <:AFO91WT ` ` <:AL;JHE81M `~` <:AL;JHE91M ` ` <:AL;JKE81M `~` <:AL;JKE91M\n", "*33* | **137** | `63` | `70` | <:AFO63WT `~` <:AFO70WT ` ` <@63M `~` <@70M ` ` <@F63W. 
`~` <@F70W.\n", "*34* | **129** | `1` | `2` | <:AFO91WT `~` <:AFO92WT ` ` <:AL;JHE91M `~` <:AL;JHE92M ` ` <:AL;JKE91M `~` <:AL;JKE92M\n", "*35* | **125** | `75` | `91` | <:AFO75WT `~` <:AFO91WT ` ` <:AL;JHE75M `~` <:AL;JHE91M ` ` <:AL;JKE75M `~` <:AL;JKE91M\n", "*36* | **119** | `70` | `91` | <:AFO70WT `~` <:AFO91WT ` ` <:AL;JHE70M `~` <:AL;JHE91M ` ` <;70Y `~` <;91Y\n", "*37* | **115** | `70` | `81` | <:AFO70WT `~` <:AFO81WT ` ` <:AL;JHE70M `~` <:AL;JHE81M ` ` <;70T `~` <;81T\n", "*38* | **113** | `0` | `5` | <:AFO70WT `~` <:AFO75WT ` ` <:AFO80WT `~` <:AFO85WT ` ` <:AL;JHE70M `~` <:AL;JHE75M\n", "*39* | **105** | `74` | ` ` | <:AL;74J `~` <:AL;J ` ` <:AY;74J `~` <:AY;J ` ` `~` B@74>\n", "*55* | **76** | `61` | `75` | <:AL;JHE61M `~` <:AL;JHE75M ` ` <@61M `~` <@75M ` ` <@F@61H `~` <@F@75H\n", "*56* | **76** | `70` | `94` | <@F@70H `~` <@F@94H ` ` <@L@70JW `~` <@L@94JW ` ` <@R;70J `~` <@R;94J\n", "*57* | **73** | `..4` | `1` | B.:>;74R `~` B:>;71R ` ` B.:N;74J `~` B:N;71J ` ` B.:NO74WT `~` B:NO71WT\n", "*58* | **73** | `33.03)` | `80` | <:AB@DE33JK@03 `~` <:AB@DE80JK@ ` ` <;JNE33JK@03 `~` <;JNE80JK@ ` ` <@FI33JT@03 `~` <@FI80JT@\n", "*59* | **73** | `63` | `91` | <:AFO63WT `~` <:AFO91WT ` ` <@63M `~` <@91M ` ` <@F@63H `~` <@F@91H\n", "*60* | **72** | `73` | `94` | <@F@73H `~` <@F@94H ` ` <@L@73JW `~` <@L@94JW ` ` :ACE70R `~` >:ACER\n", "*72* | **61** | `45` | ` ` | >:ACE45R `~` >:ACER ` ` >:ELO45H;JHE92M `~` >:ELOH;JHE92M ` ` >:ELO45H;JKE80M `~` >:ELOH;JKE80M\n", "*73* | **61** | `63` | `80` | <:AFO63WT `~` <:AFO80WT ` ` <@63M `~` <@80M ` ` <@F63W. 
`~` <@F80W.\n", "*74* | **60** | `..92` | `75` | B.:H;M@92H `~` B:H;M@75H ` ` B.:NO92W `~` B:NO75W ` ` B.;JTO92W `~` B;JTO75W\n", "*75* | **60** | `33.03)` | `71` | <@FI33JT@03 `~` <@FI71JT@ ` ` :ANA33X:NW.03 `~` >:ANA71X:NW.\n", "*76* | **59** | `..0` | `1` | B.:N;70J `~` B:N;71J ` ` B.;70JN `~` B;71JN ` ` B.;70JT `~` B;71JT\n", "*77* | **59** | `..73` | `80` | B.:NI73J `~` B:NI80J ` ` B.:NO73W `~` B:NO80W ` ` B.;JTO73W `~` B;JTO80W\n", "*78* | **58** | `..1` | `0` | B.:N;71J `~` B:N;70J ` ` B.:NO71WT `~` B:NO70WT ` ` B.:NO81W `~` B:NO80W\n", "*79* | **58** | `..1` | `3` | B.::ABOT@73M `~` >:ABOWT@73M\n", "*81* | **55** | `H` | `K` | <:AL;JHE73M `~` <:AL;JKE73M ` ` <:AL;JHE75M `~` <:AL;JKE75M ` ` <:AL;JHE80M `~` <:AL;JKE80M\n", "*82* | **54** | `..75` | `92` | B.;JTO75W `~` B;JTO92W ` ` B.@75H. `~` B@92H. ` ` B.@75K: `~` B@92K:\n", "*83* | **53** | `(B.` | `(L` | B.:45- `~` L:45- ` ` B.:K@73 `~` L:K@73 ` ` B.;45- `~` L;45-\n", "*84* | **53** | `..8` | `7` | B.@80> `~` B@70> ` ` B.@81> `~` B@71> ` ` B.A81:ACE63R `~` >:ACER ` ` >:ANI63J `~` >:ANIJ\n", "*95* | **48** | ` ` | `03)` | :ACER `~` >:ACER03\n", "*96* | **47** | `..80` | `74` | B.@80> `~` B@74> ` ` B.@N@80JW `~` B@N@74JW ` ` B.I80J `~` BI74J\n", "*97* | **47** | `..80` | `75` | B.:H;M@80H `~` B:H;M@75H ` ` B.:NO80W `~` B:NO75W ` ` B.;JTO80W `~` B;JTO75W\n", "*98* | **47** | `63` | `92` | <:AFO63WT `~` <:AFO92WT ` ` <@63M `~` <@92M ` ` <@F63W. 
`~` <@F92W.\n", "*99* | **46** | `(B` | `(P` | B.:N;45J `~` P.:N;45J ` ` B.:N;63J `~` P.:N;63J ` ` B.:N;70J `~` P.:N;70J\n", "*100* | **45** | `(<` | `(>` | <:AL;JHE61M `~` >:AL;JHE61M ` ` <:AL;JHE73M `~` >:AL;JHE73M ` ` <:AL;JHE75M `~` >:AL;JHE75M" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "D.showDiffs(0, 100)" ] }, { "cell_type": "markdown", "id": "1358ed3c-fda9-4809-9af9-b9f052a2fe30", "metadata": {}, "source": [ "Now in fully pointed Hebrew." ] }, { "cell_type": "code", "execution_count": 10, "id": "6a6d9b55-4207-4a91-8751-e0ad5dec4de1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "420102 word occurrences of 92473 distinct words\n", "7158 common words\n", "Computing 25614903 comparisons\n", " 25614903 = 100 %\n", "Stored 2333244 word pairs between 7141 words\n", "Computing 2333244 diffs between word pairs\n", " 2333244 = 100 %\n", "2098279 distinct differences\n" ] } ], "source": [ "D = collectDiffs(A, \"word\", \"g_word_utf8\", frequencyThreshold=6, sizeThreshold=4)" ] }, { "cell_type": "markdown", "id": "2a05324a-5c68-45be-bbeb-c594ba2f0def", "metadata": {}, "source": [ "Here are the top-100 differences." 
] }, { "cell_type": "code", "execution_count": 11, "id": "8de48f3a-e1b8-4da0-820d-e976ae30a98a", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "seq | freq | `-` | `+` | examples\n", "--- | --- | --- | --- | --- \n", "*1* | **619** | `֣` | `֥` | אֱכֹ֣ל `~` אֱכֹ֥ל ` ` אֱלָהָ֣א `~` אֱלָהָ֥א ` ` אֱלֹהִ֣ים `~` אֱלֹהִ֥ים\n", "*2* | **539** | ` ` | `ּ` | בְאֵ֥ר `~` בְּאֵ֥ר ` ` בְהֵמָ֖ה `~` בְּהֵמָ֖ה ` ` בְהֵמָֽה `~` בְּהֵמָֽה\n", "*3* | **523** | `֖` | `֣` | אֱלָהָ֖א `~` אֱלָהָ֣א ` ` אֱלֹהִ֖ים `~` אֱלֹהִ֣ים ` ` אֱלֹהֵ֖י `~` אֱלֹהֵ֣י\n", "*4* | **468** | `֔` | `֖` | אֱדֹ֔ום `~` אֱדֹ֖ום ` ` אֱלֹהִ֔ים `~` אֱלֹהִ֖ים ` ` אֱלֹהֵ֔ינוּ `~` אֱלֹהֵ֖ינוּ\n", "*5* | **445** | `֖` | `֥` | אֱלָהָ֖א `~` אֱלָהָ֥א ` ` אֱלֹהִ֖ים `~` אֱלֹהִ֥ים ` ` אֱלֹהֵ֖י `~` אֱלֹהֵ֥י\n", "*6* | **442** | `֑` | `ֽ` | אֱדֹ֑ום `~` אֱדֹֽום ` ` אֱלֹהִ֑ים `~` אֱלֹהִֽים ` ` אֱלֹהֵ֑ינוּ `~` אֱלֹהֵֽינוּ\n", "*7* | **404** | `֖` | `ֽ` | אֱדֹ֖ום `~` אֱדֹֽום ` ` אֱלֹהִ֖ים `~` אֱלֹהִֽים ` ` אֱלֹהֵ֖י `~` אֱלֹהֵֽי\n", "*8* | **383** | `֑` | `֔` | אֱדֹ֑ום `~` אֱדֹ֔ום ` ` אֱלֹהִ֑ים `~` אֱלֹהִ֔ים ` ` אֱלֹהֵ֑ינוּ `~` אֱלֹהֵ֔ינוּ\n", "*9* | **378** | `֔` | `ֽ` | אֱדֹ֔ום `~` אֱדֹֽום ` ` אֱלִישָׁ֔ע `~` אֱלִישָֽׁע ` ` אֱלֹהִ֔ים `~` אֱלֹהִֽים\n", "*10* | **357** | `֑` | `֖` | אֱדֹ֑ום `~` אֱדֹ֖ום ` ` אֱלֹהִ֑ים `~` אֱלֹהִ֖ים ` ` אֱלֹהֵ֑ינוּ `~` אֱלֹהֵ֖ינוּ\n", "*11* | **293** | `֔` | `֣` | אֱלִישָׁ֔ע `~` אֱלִישָׁ֣ע ` ` אֱלֹהִ֔ים `~` אֱלֹהִ֣ים ` ` אֱמֶ֔ת `~` אֱמֶ֣ת\n", "*12* | **291** | `֣` | `ֽ` | אֱלִישָׁ֣ע `~` אֱלִישָֽׁע ` ` אֱלֹ֣הֵיכֶ֔ם `~` אֱלֹֽהֵיכֶ֔ם ` ` אֱלֹהִ֣ים `~` אֱלֹהִֽים\n", "*13* | **288** | `֣` | `֤` | אֱלֹהִ֣ים `~` אֱלֹהִ֤ים ` ` אֱלֹהֵ֣י `~` אֱלֹהֵ֤י ` ` אֲדֹנָ֣י `~` אֲדֹנָ֤י\n", "*14* | **272** | `֤` | `֥` | אֱלֹהִ֤ים `~` אֱלֹהִ֥ים ` ` אֱלֹהֵ֤י `~` אֱלֹהֵ֥י ` ` אֲדֹנָ֤י `~` אֲדֹנָ֥י\n", "*15* | **260** | `֖` | `֗` | אֱדֹ֖ום `~` אֱדֹ֗ום ` ` אֱלֹהִ֖ים `~` אֱלֹהִ֗ים ` ` אֱלֹהֵ֖ינוּ `~` אֱלֹהֵ֗ינוּ\n", "*16* | **244** | `֥` | `ֽ` | אֱלֹהִ֥ים `~` אֱלֹהִֽים ` ` אֱלֹהֵ֥י `~` אֱלֹהֵֽי ` ` אֲבִ֥י `~` 
אֲבִֽי\n", "*17* | **234** | `֔` | `֗` | אֱדֹ֔ום `~` אֱדֹ֗ום ` ` אֱלֹֽהֵיכֶ֔ם `~` אֱלֹֽהֵיכֶ֗ם ` ` אֱלֹהִ֔ים `~` אֱלֹהִ֗ים\n", "*18* | **220** | `֖` | `֤` | אֱלֹהִ֖ים `~` אֱלֹהִ֤ים ` ` אֱלֹהֵ֖י `~` אֱלֹהֵ֤י ` ` אֲדֹנָ֖י `~` אֲדֹנָ֤י\n", "*19* | **218** | `֔` | `֥` | אֱלֹהִ֔ים `~` אֱלֹהִ֥ים ` ` אֲדֹנִ֔י `~` אֲדֹנִ֥י ` ` אֲדֹנָ֔י `~` אֲדֹנָ֥י\n", "*20* | **214** | `֗` | `֣` | אֱלֹהִ֗ים `~` אֱלֹהִ֣ים ` ` אֱמֶ֗ת `~` אֱמֶ֣ת ` ` אֱמֹ֗ר `~` אֱמֹ֣ר\n", "*21* | **213** | `֑` | `֣` | אֱלֹהִ֑ים `~` אֱלֹהִ֣ים ` ` אֱמֶ֑ת `~` אֱמֶ֣ת ` ` אֲבָנִ֑ים `~` אֲבָנִ֣ים\n", "*22* | **199** | `֖` | `֛` | אֱלֹהִ֖ים `~` אֱלֹהִ֛ים ` ` אֱלֹהֵ֖ינוּ `~` אֱלֹהֵ֛ינוּ ` ` אֱלֹהֵיכֶ֖ם `~` אֱלֹהֵיכֶ֛ם\n", "*23* | **193** | `֗` | `ֽ` | אֱדֹ֗ום `~` אֱדֹֽום ` ` אֱלֹהִ֗ים `~` אֱלֹהִֽים ` ` אֱלֹהֵ֗ינוּ `~` אֱלֹהֵֽינוּ\n", "*24* | **189** | `֑` | `֗` | אֱדֹ֑ום `~` אֱדֹ֗ום ` ` אֱלֹהִ֑ים `~` אֱלֹהִ֗ים ` ` אֱלֹהֵ֑ינוּ `~` אֱלֹהֵ֗ינוּ\n", "*25* | **173** | `֛` | `֣` | אֱלֹהִ֛ים `~` אֱלֹהִ֣ים ` ` אֲנִ֛י `~` אֲנִ֣י ` ` אֲנָשִׁ֛ים `~` אֲנָשִׁ֣ים\n", "*26* | **172** | `֗` | `֥` | אֱלֹהִ֗ים `~` אֱלֹהִ֥ים ` ` אֱמֹ֗ר `~` אֱמֹ֥ר ` ` אֲדֹנָ֗י `~` אֲדֹנָ֥י\n", "*27* | **162** | `֑` | `֥` | אֱלֹהִ֑ים `~` אֱלֹהִ֥ים ` ` אֲדַבֵּ֑ר `~` אֲדַבֵּ֥ר ` ` אֲדֹנִ֑י `~` אֲדֹנִ֥י\n", "*28* | **161** | `֥` | `֨` | אֱלֹהִ֥ים `~` אֱלֹהִ֨ים ` ` אֱלֹהֵ֥י `~` אֱלֹהֵ֨י ` ` אֲדֹנָ֥י `~` אֲדֹנָ֨י\n", "*29* | **160** | `֣` | `֨` | אֱלֹהִ֣ים `~` אֱלֹהִ֨ים ` ` אֱלֹהֵ֣י `~` אֱלֹהֵ֨י ` ` אֲדֹנָ֣י `~` אֲדֹנָ֨י\n", "*30* | **159** | `֛` | `֥` | אֱלֹהִ֛ים `~` אֱלֹהִ֥ים ` ` אֲנִ֛י `~` אֲנִ֥י ` ` אֲנָשִׁ֛ים `~` אֲנָשִׁ֥ים\n", "*31* | **153** | `֔` | `֛` | אֱלֹהִ֔ים `~` אֱלֹהִ֛ים ` ` אֱלֹהֵ֔ינוּ `~` אֱלֹהֵ֛ינוּ ` ` אֱלֹהֵיכֶ֔ם `~` אֱלֹהֵיכֶ֛ם\n", "*32* | **147** | `֛` | `ֽ` | אֱלֹהִ֛ים `~` אֱלֹהִֽים ` ` אֱלֹהֵ֛ינוּ `~` אֱלֹהֵֽינוּ ` ` אֱלֹהֵיכֶ֛ם `~` אֱלֹהֵיכֶֽם\n", "*33* | **143** | `֗` | `֛` | אֱלֹהִ֗ים `~` אֱלֹהִ֛ים ` ` אֱלֹהֵ֗ינוּ `~` אֱלֹהֵ֛ינוּ ` ` אֱלֹהֶ֗יךָ `~` אֱלֹהֶ֛יךָ\n", "*34* | **142** | `֤` | `ֽ` | אֱלֹהִ֤ים `~` אֱלֹהִֽים ` ` אֱלֹהֵ֤י `~` 
אֱלֹהֵֽי ` ` אֲדֹנָ֤י `~` אֲדֹנָֽי\n", "*35* | **138** | `֤` | `֨` | אֱלֹהִ֤ים `~` אֱלֹהִ֨ים ` ` אֱלֹהֵ֤י `~` אֱלֹהֵ֨י ` ` אֲדֹנָ֤י `~` אֲדֹנָ֨י\n", "*36* | **123** | `֖` | `֨` | אֱלֹהִ֖ים `~` אֱלֹהִ֨ים ` ` אֱלֹהֵ֖י `~` אֱלֹהֵ֨י ` ` אֲדֹנָ֖י `~` אֲדֹנָ֨י\n", "*37* | **120** | `֑` | `֛` | אֱלֹהִ֑ים `~` אֱלֹהִ֛ים ` ` אֱלֹהֵ֑ינוּ `~` אֱלֹהֵ֛ינוּ ` ` אֱלֹהֵיכֶ֑ם `~` אֱלֹהֵיכֶ֛ם\n", "*38* | **119** | `֛` | `֤` | אֱלֹהִ֛ים `~` אֱלֹהִ֤ים ` ` אֲנִ֛י `~` אֲנִ֤י ` ` אֲנָשִׁ֛ים `~` אֲנָשִׁ֤ים\n", "*39* | **118** | `֔` | `֤` | אֱלֹהִ֔ים `~` אֱלֹהִ֤ים ` ` אֲדֹנָ֔י `~` אֲדֹנָ֤י ` ` אֲנַ֔חְנוּ `~` אֲנַ֤חְנוּ\n", "*40* | **114** | `֖` | `֙)` | אֱדֹ֖ום `~` אֱדֹום֙ ` ` אֱלֹהִ֖ים `~` אֱלֹהִים֙ ` ` אֲבָנִ֖ים `~` אֲבָנִים֙\n", "*41* | **114** | `֗` | `֤` | אֱלֹהִ֗ים `~` אֱלֹהִ֤ים ` ` אֲדֹנָ֗י `~` אֲדֹנָ֤י ` ` אֲנִ֗י `~` אֲנִ֤י\n", "*42* | **110** | `֣` | ` ` | אֱלֹ֣הֵיכֶ֔ם `~` אֱלֹהֵיכֶ֔ם ` ` אֲנִ֣י `~` אֲנִי ` ` אֲשֶׁ֣ר `~` אֲשֶׁר\n", "*43* | **104** | `֥` | ` ` | אֲנִ֥י `~` אֲנִי ` ` אֲשֶׁ֥ר `~` אֲשֶׁר ` ` אִ֥ישׁ `~` אִישׁ\n", "*44* | **101** | `֣` | `֙)` | אֱלֹהִ֣ים `~` אֱלֹהִים֙ ` ` אֲבָנִ֣ים `~` אֲבָנִים֙ ` ` אֲלָפִ֣ים `~` אֲלָפִים֙\n", "*45* | **100** | `֔` | `֙)` | אֱדֹ֔ום `~` אֱדֹום֙ ` ` אֱלֹהִ֔ים `~` אֱלֹהִים֙ ` ` אֲבָנִ֔ים `~` אֲבָנִים֙\n", "*46* | **97** | `ֽ` | `֙)` | אֱדֹֽום `~` אֱדֹום֙ ` ` אֱלֹהִֽים `~` אֱלֹהִים֙ ` ` אֲֽנִי `~` אֲנִי֙\n", "*47* | **96** | `֖` | `֜` | אֱדֹ֖ום `~` אֱדֹ֜ום ` ` אֱלֹהִ֖ים `~` אֱלֹהִ֜ים ` ` אֱלֹהֶ֖יךָ `~` אֱלֹהֶ֜יךָ\n", "*48* | **95** | `֖` | `֨.֙)` | אֱלֹהֵ֖ינוּ `~` אֱלֹהֵ֨ינוּ֙ ` ` אֱלֹהֶ֖יךָ `~` אֱלֹהֶ֨יךָ֙ ` ` אֲנַ֖חְנוּ `~` אֲנַ֨חְנוּ֙\n", "*49* | **91** | `ִ.י)` | `ֹ.ו)` | אִתִּ֑י `~` אִתֹּ֑ו ` ` אִתִּ֔י `~` אִתֹּ֔ו ` ` אִתִּ֖י `~` אִתֹּ֖ו\n", "*50* | **90** | `֔` | `֜` | אֱדֹ֔ום `~` אֱדֹ֜ום ` ` אֱלֹהִ֔ים `~` אֱלֹהִ֜ים ` ` אֱלֹהֶ֔יךָ `~` אֱלֹהֶ֜יךָ\n", "*51* | **89** | `֑` | `֤` | אֱלֹהִ֑ים `~` אֱלֹהִ֤ים ` ` אֲדֹנָ֑י `~` אֲדֹנָ֤י ` ` אֲנָשִׁ֑ים `~` אֲנָשִׁ֤ים\n", "*52* | **87** | `֗` | `֜` | אֱדֹ֗ום `~` אֱדֹ֜ום ` ` אֱלֹהִ֗ים `~` אֱלֹהִ֜ים 
` ` אֱלֹהֶ֗יךָ `~` אֱלֹהֶ֜יךָ\n", "*53* | **85** | `ֽ` | ` ` | אֱֽלֹהִ֗ים `~` אֱלֹהִ֗ים ` ` אֱלֹֽהֵיהֶ֑ם `~` אֱלֹהֵיהֶ֑ם ` ` אֱלֹֽהֵיכֶ֔ם `~` אֱלֹהֵיכֶ֔ם\n", "*54* | **83** | `֑` | `֙)` | אֱדֹ֑ום `~` אֱדֹום֙ ` ` אֱלֹהִ֑ים `~` אֱלֹהִים֙ ` ` אֲבָנִ֑ים `~` אֲבָנִים֙\n", "*55* | **83** | `֨` | `ֽ` | אֱלֹהִ֨ים `~` אֱלֹהִֽים ` ` אֱלֹהֵ֨י `~` אֱלֹהֵֽי ` ` אֲדֹנָ֨י `~` אֲדֹנָֽי\n", "*56* | **82** | `֥` | `֧` | אֱלֹהִ֥ים `~` אֱלֹהִ֧ים ` ` אֱלֹהֵ֥י `~` אֱלֹהֵ֧י ` ` אֲדֹנָ֥י `~` אֲדֹנָ֧י\n", "*57* | **80** | `֣` | `֧` | אֱלֹהִ֣ים `~` אֱלֹהִ֧ים ` ` אֱלֹהֵ֣י `~` אֱלֹהֵ֧י ` ` אֲדֹנָ֣י `~` אֲדֹנָ֧י\n", "*58* | **78** | `֑` | `֜` | אֱדֹ֑ום `~` אֱדֹ֜ום ` ` אֱלֹהִ֑ים `~` אֱלֹהִ֜ים ` ` אֱלֹהֶ֑יךָ `~` אֱלֹהֶ֜יךָ\n", "*59* | **78** | `֥` | `֙)` | אֱלֹהִ֥ים `~` אֱלֹהִים֙ ` ` אֲנָשִׁ֥ים `~` אֲנָשִׁים֙ ` ` אֲרֹ֥ון `~` אֲרֹון֙\n", "*60* | **77** | `֜` | `֣` | אֱלֹהִ֜ים `~` אֱלֹהִ֣ים ` ` אֲנִ֜י `~` אֲנִ֣י ` ` אֲנָשִׁ֜ים `~` אֲנָשִׁ֣ים\n", "*61* | **77** | `֜` | `ֽ` | אֱדֹ֜ום `~` אֱדֹֽום ` ` אֱלֹהִ֜ים `~` אֱלֹהִֽים ` ` אֱלֹהֶ֜יךָ `~` אֱלֹהֶֽיךָ\n", "*62* | **77** | `֣` | `֨.֙)` | אֲנַ֣חְנוּ `~` אֲנַ֨חְנוּ֙ ` ` אֵ֣לֶּה `~` אֵ֨לֶּה֙ ` ` אֵלֶ֣יךָ `~` אֵלֶ֨יךָ֙\n", "*63* | **76** | `֛` | `֜` | אֱלֹהִ֛ים `~` אֱלֹהִ֜ים ` ` אֱלֹהֶ֛יךָ `~` אֱלֹהֶ֜יךָ ` ` אֲנִ֛י `~` אֲנִ֜י\n", "*64* | **76** | `֣` | `ּ.֖` | בְנֵ֣י `~` בְּנֵ֖י ` ` בִ֣י `~` בִּ֖י ` ` בִלְתִּ֣י `~` בִּלְתִּ֖י\n", "*65* | **76** | `֤` | `֧` | אֱלֹהִ֤ים `~` אֱלֹהִ֧ים ` ` אֱלֹהֵ֤י `~` אֱלֹהֵ֧י ` ` אֲדֹנָ֤י `~` אֲדֹנָ֧י\n", "*66* | **73** | `֔` | `֨.֙)` | אֱלֹהֵ֔ינוּ `~` אֱלֹהֵ֨ינוּ֙ ` ` אֱלֹהֶ֔יךָ `~` אֱלֹהֶ֨יךָ֙ ` ` אֲנַ֔חְנוּ `~` אֲנַ֨חְנוּ֙\n", "*67* | **73** | `֛` | `֨` | אֱלֹהִ֛ים `~` אֱלֹהִ֨ים ` ` אֲנִ֛י `~` אֲנִ֨י ` ` אֲנָשִׁ֛ים `~` אֲנָשִׁ֨ים\n", "*68* | **72** | `֖` | `֧` | אֱלֹהִ֖ים `~` אֱלֹהִ֧ים ` ` אֱלֹהֵ֖י `~` אֱלֹהֵ֧י ` ` אֲדֹנָ֖י `~` אֲדֹנָ֧י\n", "*69* | **71** | `֖` | ` ` | אֲנִ֖י `~` אֲנִי ` ` אֲשֶׁ֖ר `~` אֲשֶׁר ` ` אִ֖ישׁ `~` אִישׁ\n", "*70* | **71** | `֜` | `֥` | אֱלֹהִ֜ים `~` אֱלֹהִ֥ים ` ` אֲנִ֜י `~` אֲנִ֥י ` ` 
אֲנָשִׁ֜ים `~` אֲנָשִׁ֥ים\n", "*71* | **71** | `֥` | `ּ.֣` | בְאֵ֥ר `~` בְּאֵ֣ר ` ` בְנֵ֥י `~` בְּנֵ֣י ` ` בְנֹ֥ות `~` בְּנֹ֣ות\n", "*72* | **70** | ` ` | `֙)` | אֲנִי `~` אֲנִי֙ ` ` אֲשֶׁר `~` אֲשֶׁר֙ ` ` אִישׁ `~` אִישׁ֙\n", "*73* | **70** | `֖` | `ּ.֔` | בְהֵמָ֖ה `~` בְּהֵמָ֔ה ` ` בְנִ֖י `~` בְּנִ֔י ` ` בְנֹ֖ו `~` בְּנֹ֔ו\n", "*74* | **70** | `֖` | `ּ.֣` | בְהֵמָ֖ה `~` בְּהֵמָ֣ה ` ` בְנִ֖י `~` בְּנִ֣י ` ` בְנֵ֖י `~` בְּנֵ֣י\n", "*75* | **65** | `֤` | ` ` | אֲנִ֤י `~` אֲנִי ` ` אֲשֶׁ֤ר `~` אֲשֶׁר ` ` אִ֤ישׁ `~` אִישׁ\n", "*76* | **65** | `֥` | `ּ.֖` | בְנֵ֥י `~` בְּנֵ֖י ` ` בְרִ֥ית `~` בְּרִ֖ית ` ` בִלְתִּ֥י `~` בִּלְתִּ֖י\n", "*77* | **65** | `֧` | `֨` | אֱלֹהִ֧ים `~` אֱלֹהִ֨ים ` ` אֱלֹהֵ֧י `~` אֱלֹהֵ֨י ` ` אֲדֹנָ֧י `~` אֲדֹנָ֨י\n", "*78* | **64** | `֖` | `ּ.ֽ` | בְהֵמָ֖ה `~` בְּהֵמָֽה ` ` בְנִ֖י `~` בְּנִֽי ` ` בְנֵ֖י `~` בְּֽנֵי\n", "*79* | **63** | `֔` | `֨` | אֱלֹהִ֔ים `~` אֱלֹהִ֨ים ` ` אֲדֹנָ֔י `~` אֲדֹנָ֨י ` ` אֲנָשִׁ֔ים `~` אֲנָשִׁ֨ים\n", "*80* | **63** | `֗` | `֨` | אֱלֹהִ֗ים `~` אֱלֹהִ֨ים ` ` אֲדֹנָ֗י `~` אֲדֹנָ֨י ` ` אֲנִ֗י `~` אֲנִ֨י\n", "*81* | **62** | `֣` | `ּ.֥` | בְנֵ֣י `~` בְּנֵ֥י ` ` בְנֹ֣ות `~` בְּנֹ֥ות ` ` בִ֣י `~` בִּ֥י\n", "*82* | **61** | `ֽ` | `ּ.֑` | בְהֵמָֽה `~` בְּהֵמָ֑ה ` ` בְנֹֽו `~` בְּנֹ֑ו ` ` בִֽי `~` בִּ֑י\n", "*83* | **61** | `ֽ` | `ּ.֖` | בְהֵמָֽה `~` בְּהֵמָ֖ה ` ` בְנֵֽי `~` בְּנֵ֖י ` ` בְנֹֽו `~` בְּנֹ֖ו\n", "*84* | **60** | `֗` | `֙)` | אֱדֹ֗ום `~` אֱדֹום֙ ` ` אֱלֹהִ֗ים `~` אֱלֹהִים֙ ` ` אֲנָשִׁ֗ים `~` אֲנָשִׁים֙\n", "*85* | **60** | `֥` | `֨.֙)` | אֲנַ֥חְנוּ `~` אֲנַ֨חְנוּ֙ ` ` אֵ֥לֶּה `~` אֵ֨לֶּה֙ ` ` אֵלֶ֥יךָ `~` אֵלֶ֨יךָ֙\n", "*86* | **59** | `֔` | `ּ.֖` | בְנִ֔י `~` בְּנִ֖י ` ` בְנֹ֔ו `~` בְּנֹ֖ו ` ` בִ֔י `~` בִּ֖י\n", "*87* | **57** | `(ב` | `(ל` | בְךָ֔ `~` לְךָ֔ ` ` בְךָ֖ `~` לְךָ֖ ` ` בְךָ֙ `~` לְךָ֙\n", "*88* | **55** | `֑` | `ּ.ֽ` | בִ֑י `~` בִּֽי ` ` בִנְיָמִ֑ן `~` בִּנְיָמִֽן ` ` בֵיתֹ֑ו `~` בֵּיתֹֽו\n", "*89* | **53** | `֖` | `ּ.֑` | בְהֵמָ֖ה `~` בְּהֵמָ֑ה ` ` בְנִ֖י `~` בְּנִ֑י ` ` בְנֹ֖ו `~` בְּנֹ֑ו\n", "*90* 
| **53** | `֖` | `ּ.֥` | בְנִ֖י `~` בְּנִ֥י ` ` בְנֵ֖י `~` בְּנֵ֥י ` ` בְנֹ֖ות `~` בְּנֹ֥ות\n", "*91* | **53** | `֗` | `֨.֙)` | אֱלֹהֵ֗ינוּ `~` אֱלֹהֵ֨ינוּ֙ ` ` אֱלֹהֶ֗יךָ `~` אֱלֹהֶ֨יךָ֙ ` ` אֲנַ֗חְנוּ `~` אֲנַ֨חְנוּ֙\n", "*92* | **53** | `֜` | `֤` | אֱלֹהִ֜ים `~` אֱלֹהִ֤ים ` ` אֲנִ֜י `~` אֲנִ֤י ` ` אֲנָשִׁ֜ים `~` אֲנָשִׁ֤ים\n", "*93* | **50** | `֔` | `ּ.֑` | בְנִ֔י `~` בְּנִ֑י ` ` בְנֹ֔ו `~` בְּנֹ֑ו ` ` בִ֔י `~` בִּ֑י\n", "*94* | **50** | `֔` | `ּ.ֽ` | בְנִ֔י `~` בְּנִֽי ` ` בְנֹ֔ו `~` בְּנֹֽו ` ` בִ֔י `~` בִּֽי\n", "*95* | **50** | `֨` | ` ` | אֲנִ֨י `~` אֲנִי ` ` אֲשֶׁ֨ר `~` אֲשֶׁר ` ` אִ֨ישׁ `~` אִישׁ\n", "*96* | **50** | `ֶ.ךָ)` | `ָ.ו)` | אֱלֹהֶ֑יךָ `~` אֱלֹהָ֑יו ` ` אֱלֹהֶ֔יךָ `~` אֱלֹהָ֔יו ` ` אֱלֹהֶ֖יךָ `~` אֱלֹהָ֖יו\n", "*97* | **49** | `ֽ` | `ּ.֔` | בְהֵמָֽה `~` בְּהֵמָ֔ה ` ` בְנֹֽו `~` בְּנֹ֔ו ` ` בִֽי `~` בִּ֔י\n", "*98* | **48** | `֑` | `֨` | אֱלֹהִ֑ים `~` אֱלֹהִ֨ים ` ` אֲדֹנָ֑י `~` אֲדֹנָ֨י ` ` אֲנָשִׁ֑ים `~` אֲנָשִׁ֨ים\n", "*99* | **47** | `(ב` | `(פ` | בְנֵ֖י `~` פְנֵ֖י ` ` בְנֵ֣י `~` פְנֵ֣י ` ` בְנֵ֤י `~` פְנֵ֤י\n", "*100* | **47** | `֛` | `֧` | אֱלֹהִ֛ים `~` אֱלֹהִ֧ים ` ` אֲנִ֛י `~` אֲנִ֧י ` ` אֲשֶׁ֛ר `~` אֲשֶׁ֧ר" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "D.showDiffs(0, 100)" ] }, { "cell_type": "markdown", "id": "6a47c43d-3685-4f82-8d8a-af8bdca7f8bf", "metadata": {}, "source": [ "Consonantal ETCBC transcription" ] }, { "cell_type": "code", "execution_count": 13, "id": "38baf99d-5d77-40cb-a180-e40b10f7fe42", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "420102 word occurrences of 24438 distinct words\n", "5178 common words\n", "Computing 13403253 comparisons\n", " 13403253 = 100 %\n", "Stored 2674428 word pairs between 5167 words\n", "Computing 2674428 diffs between word pairs\n", " 2674428 = 100 %\n", "2282571 distinct differences\n" ] } ], "source": [ "D = collectDiffs(A, \"word\", \"g_cons\", frequencyThreshold=6, sizeThreshold=3, distThreshold=3)" ] }, { 
"cell_type": "markdown", "id": "1cdf1b13-76bb-4b0d-a6ac-353a25d25e91", "metadata": {}, "source": [ "Here are the top-100 differences." ] }, { "cell_type": "code", "execution_count": 14, "id": "1a8b4736-7e16-4b6f-9c56-499502aa8651", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "seq | freq | `-` | `+` | examples\n", "--- | --- | --- | --- | --- \n", "*1* | **546** | ` ` | `W)` | ` | ` ` | >BDM `~` BDM ` ` >BDW `~` BDW ` ` >BJNW `~` BJNW\n", "*23* | **125** | `H)` | `M)` | ` | `(J` | >
` | `(T` | >
BCJ `~` >BJ\n", "*47* | **82** | `L` | `R` | LH ` ` >NH ` ` >R<> `~` >R>\n", "*49* | **80** | `K` | ` ` | CJT `~` >CXJT ` ` >JH `~` >XJH ` ` >JK `~` >XJK\n", "*53* | **77** | `W)` | `(T` | ` | `W)` | >B> `~` B>W ` ` >BD `~` BDW ` ` >BL `~` BLW\n", "*60* | **70** | ` ` | `L` | ` | BD ` ` BDH ` ` BDM\n", "*62* | **70** | `J` | `T` | ` | `(M` | >` | `(J.W)` | >
` | `H)` | >B> `~` B>H ` ` >BJR `~` BJRH ` ` >BN `~` BNH\n", "*69* | **66** | `M` | ` ` | ` | `(C` | >B> `~` CB> ` ` >BH `~` CBH ` ` >BJ `~` CBJ\n", "*73* | **64** | `(N` | `(T` | N` | ` ` | >R>K `~` >RK ` ` B>KH `~` BKH ` ` B>KM `~` BKM\n", "*87* | **59** | `H)` | `TJ)` | ` | `(N` | >
` | `(X` | >BL `~` XBL ` ` >CB `~` XCB ` ` >CM `~` XCM\n", "*94* | **56** | `L` | ` ` | " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "D.showDiffs(0, 100)" ] }, { "cell_type": "code", "execution_count": null, "id": "ab93cc8d-5aa4-4bff-9635-933c534c52a5", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.1" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }