{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "You might want to consider the [start](search.ipynb) of this tutorial.\n", "\n", "Short introductions to other TF datasets:\n", "\n", "* [Dead Sea Scrolls](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/dss.ipynb),\n", "* [Old Babylonian Letters](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/oldbabylonian.ipynb),\n", "or the\n", "* [Quran](https://nbviewer.jupyter.org/github/annotation/tutorials/blob/master/lorentz2020/quran.ipynb)\n" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# Sharing data features\n", "\n", "This tutorial is a companion to the Text-Fabric\n", "[documentation on data sharing](https://annotation.github.io/text-fabric/tf/about/datasharing.html).\n", "\n", "## Explore additional data\n", "The ETCBC has a few other repositories with data that work in conjunction with the BHSA data.\n", "One of them you have already seen:\n", "[phono](https://github.com/ETCBC/phono),\n", "for phonetic transcriptions.\n", "There is also\n", "[parallels](https://github.com/ETCBC/parallels)\n", "for detecting parallel passages,\n", "and\n", "[valence](https://github.com/ETCBC/valence)\n", "for studying patterns around verbs that determine their meanings.\n", "\n", "## Make your own data\n", "If you study the additional data, you can observe how that data is created and also\n", "how it is turned into a text-fabric data module.\n", "The last step is incredibly easy. You can write out every Python dictionary where the keys are numbers\n", "and the values string or numbers as a Text-Fabric feature.\n", "When you are creating data, you have already constructed those dictionaries, so writing\n", "them out is just one method call.\n", "See for example how the\n", "[flowchart](https://nbviewer.jupyter.org/github/etcbc/valence/blob/master/programs/flowchart.ipynb#Add-sense-feature-to-valence-module)\n", "notebook in valence writes out verb sense data.\n", "\n", "## Share your new data\n", "You can then easily share your new features on GitHub, so that your colleagues everywhere\n", "can try it out for themselves.\n", "\n", "Here is how you draw in other data, for example\n", "\n", "* [etcbc/valence/tf](https://github.com/etcbc/valence) :\n", " the results of the *verbal valence* work of Janet Dyk in the SYNVAR project;\n", "* [etcbc/lingo/heads/tf](https://github.com/etcbc/lingo/tree/master/heads) :\n", " head words for phrases, work done by Cody Kingham;\n", "* [ch-jensen/participants/actor/tf](https://github.com/ch-jensen/participants) :\n", " participant analysis in progress by Christian Højgaard-Jensen;\n", "* [cmerwich/bh-reference-system/tf](https://github.com/cmerwich/bh-reference-system):\n", " participant analysis in progress by Christiaan Erwich;\n", "* or whatever you have in the making!\n", "\n", "You can add such data on the fly, by passing a `mod={org}/{repo}/{path}` parameter,\n", "or a bunch of them separated by commas, or packed in a list or tuple.\n", "\n", "If the data is there, it will be auto-downloaded and stored on your machine.\n", "\n", "Let's do it." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Incantation\n", "\n", "The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are\n", "explained in the [start tutorial](start.ipynb)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2018-05-24T10:06:39.818664Z", "start_time": "2018-05-24T10:06:39.796588Z" } }, "outputs": [], "source": [ "from tf.app import use" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we are going to include the work of Cody Kingham on\n", "[heads of phrases](https://nbviewer.org/github/ETCBC/lingo/blob/master/heads/Heads2TF.ipynb) and some earlier work\n", "by Janet Dyk and Dirk Roorda on [verbal valence](https://github.com/etcbc/valence)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "rate limit is 5000 requests per hour, with 5000 left for this hour\n", "\tconnecting to online GitHub repo ETCBC/bhsa ... connected\n" ] }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/ETCBC/bhsa/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/bhsa/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/lingo/heads/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/valence/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/phono/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/parallels/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " TF: TF API 12.1.7, ETCBC/bhsa/app v3, Search Reference
\n", " Data: ETCBC - bhsa 2021, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name# of nodes# slots / node% coverage
book3910938.21100
chapter929459.19100
lex923046.22100
verse2321318.38100
half_verse451799.44100
sentence637176.70100
sentence_atom645146.61100
clause881314.84100
clause_atom907044.70100
phrase2532031.68100
phrase_atom2675321.59100
subphrase1138501.4238
word4265901.00100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Parallel Passages\n", "
\n", "\n", "
\n", "
\n", "crossref\n", "
\n", "
int
\n", "\n", " 🆗 links between similar passages\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "\n", " ✅ book name in Latin (Genesis; Numeri; Reges1; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "book@ll\n", "
\n", "
str
\n", "\n", " ✅ book name in amharic (ኣማርኛ)\n", "\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "\n", " ✅ chapter number (1; 2; 3; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "code\n", "
\n", "
int
\n", "\n", " ✅ identifier of a clause atom relationship (0; 74; 367; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "det\n", "
\n", "
str
\n", "\n", " ✅ determinedness of phrase(atom) (det; und; NA.)\n", "\n", "
\n", "\n", "
\n", "
\n", "domain\n", "
\n", "
str
\n", "\n", " ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)\n", "\n", "
\n", "\n", "
\n", "
\n", "freq_lex\n", "
\n", "
int
\n", "\n", " ✅ frequency of lexemes\n", "\n", "
\n", "\n", "
\n", "
\n", "function\n", "
\n", "
str
\n", "\n", " ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons\n", "
\n", "
str
\n", "\n", " ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons_utf8\n", "
\n", "
str
\n", "\n", " ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex\n", "
\n", "
str
\n", "\n", " ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word\n", "
\n", "
str
\n", "\n", " ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word_utf8\n", "
\n", "
str
\n", "\n", " ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)\n", "\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "\n", " 🆗 english translation of lexeme (beginning create god(s))\n", "\n", "
\n", "\n", "
\n", "
\n", "gn\n", "
\n", "
str
\n", "\n", " ✅ grammatical gender (m; f; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "label\n", "
\n", "
str
\n", "\n", " ✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)\n", "\n", "
\n", "\n", "
\n", "
\n", "language\n", "
\n", "
str
\n", "\n", " ✅ of word or lexeme (Hebrew; Aramaic.)\n", "\n", "
\n", "\n", "
\n", "
\n", "lex\n", "
\n", "
str
\n", "\n", " ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)\n", "\n", "
\n", "\n", "
\n", "
\n", "lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)\n", "\n", "
\n", "\n", "
\n", "
\n", "ls\n", "
\n", "
str
\n", "\n", " ✅ lexical set, subclassification of part-of-speech (card; ques; mult)\n", "\n", "
\n", "\n", "
\n", "
\n", "nametype\n", "
\n", "
str
\n", "\n", " ⚠️ named entity type (pers; mens; gens; topo; ppde.)\n", "\n", "
\n", "\n", "
\n", "
\n", "nme\n", "
\n", "
str
\n", "\n", " ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "nu\n", "
\n", "
str
\n", "\n", " ✅ grammatical number (sg; du; pl; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
int
\n", "\n", " ✅ sequence number of an object within its context\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "pargr\n", "
\n", "
str
\n", "\n", " 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "pdp\n", "
\n", "
str
\n", "\n", " ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "pfm\n", "
\n", "
str
\n", "\n", " ✅ preformative consonantal-transliterated (absent; n/a; J, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_gn\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix gender (m; f; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_nu\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix number (sg; du; pl; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_ps\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix person (p1; p2; p3; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "ps\n", "
\n", "
str
\n", "\n", " ✅ grammatical person (p1; p2; p3; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere\n", "
\n", "
str
\n", "\n", " ✅ word pointed-transliterated masoretic reading correction\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer\n", "
\n", "
str
\n", "\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer_utf8\n", "
\n", "
str
\n", "\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_utf8\n", "
\n", "
str
\n", "\n", " ✅ word pointed-Hebrew masoretic reading correction\n", "\n", "
\n", "\n", "
\n", "
\n", "rank_lex\n", "
\n", "
int
\n", "\n", " ✅ ranking of lexemes based on freqnuecy\n", "\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "\n", " ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "\n", " ✅ part-of-speech (art; verb; subs; nmpr, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "st\n", "
\n", "
str
\n", "\n", " ✅ state of a noun (a (absolute); c (construct); e (emphatic).)\n", "\n", "
\n", "\n", "
\n", "
\n", "tab\n", "
\n", "
int
\n", "\n", " ✅ clause atom: its level in the linguistic embedding\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "\n", " ✅ interword material pointed-transliterated (& 00 05 00_P ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer_utf8\n", "
\n", "
str
\n", "\n", " ✅ interword material pointed-Hebrew (־ ׃)\n", "\n", "
\n", "\n", "
\n", "
\n", "txt\n", "
\n", "
str
\n", "\n", " ✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)\n", "\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "\n", " ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)\n", "\n", "
\n", "\n", "
\n", "
\n", "uvf\n", "
\n", "
str
\n", "\n", " ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "vbe\n", "
\n", "
str
\n", "\n", " ✅ verbal ending consonantal-transliterated (n/a; W; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "vbs\n", "
\n", "
str
\n", "\n", " ✅ root formation consonantal-transliterated (absent; n/a; H; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "\n", " ✅ verse number\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex\n", "
\n", "
str
\n", "\n", " ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)\n", "\n", "
\n", "\n", "
\n", "
\n", "vs\n", "
\n", "
str
\n", "\n", " ✅ verbal stem (qal; piel; hif; apel; pael)\n", "\n", "
\n", "\n", "
\n", "
\n", "vt\n", "
\n", "
str
\n", "\n", " ✅ verbal tense (perf; impv; wayq; infc)\n", "\n", "
\n", "\n", "
\n", "
\n", "mother\n", "
\n", "
none
\n", "\n", " ✅ linguistic dependency between textual objects\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
Phonetic Transcriptions\n", "
\n", "\n", "
\n", "
\n", "phono\n", "
\n", "
str
\n", "\n", " 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)\n", "\n", "
\n", "\n", "
\n", "
\n", "phono_trailer\n", "
\n", "
str
\n", "\n", " 🆗 interword material in phonological transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
etcbc/lingo/heads/tf\n", "
\n", "\n", "
\n", "
\n", "heads\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "noun_heads\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "prep_obj\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
etcbc/valence/tf\n", "
\n", "\n", "
\n", "
\n", "cfunction\n", "
\n", "
str
\n", "\n", " ❗️ corrected phrase function, only present for phrases that were in a correction sheet\n", "\n", "
\n", "\n", "
\n", "
\n", "f_correction\n", "
\n", "
str
\n", "\n", " ❗️ whether the phrase function has been manually corrected\n", "\n", "
\n", "\n", "
\n", "
\n", "grammatical\n", "
\n", "
str
\n", "\n", " ❗️ constituent role main classification\n", "\n", "
\n", "\n", "
\n", "
\n", "lexical\n", "
\n", "
str
\n", "\n", " ❗️ additional lexical characteristics\n", "\n", "
\n", "\n", "
\n", "
\n", "original\n", "
\n", "
str
\n", "\n", " ❗️ default value before enrichment logic has been applied\n", "\n", "
\n", "\n", "
\n", "
\n", "predication\n", "
\n", "
str
\n", "\n", " ❗️ verbal function main classification\n", "\n", "
\n", "\n", "
\n", "
\n", "s_manual\n", "
\n", "
str
\n", "\n", " ❗️ whether the generated enrichment features have been manually changed\n", "\n", "
\n", "\n", "
\n", "
\n", "semantic\n", "
\n", "
str
\n", "\n", " ❗️ additional semantic characteristics\n", "\n", "
\n", "\n", "
\n", "
\n", "sense\n", "
\n", "
str
\n", "\n", " ❗️ sense label of verb occurrences (d-; i.; -p; d-; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "valence\n", "
\n", "
str
\n", "\n", " ❗️ verbal valence main classification\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: ETCBC/bhsa
  3. appPath: /Users/me/text-fabric-data/github/ETCBC/bhsa/app
  4. commit: gb112c161cfd21eae403d51a2733740d8743460e7
  5. css: ''
  6. dataDisplay:
    • exampleSectionHtml:<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
    • excludedFeatures:
      • g_uvf_utf8
      • g_vbs
      • kq_hybrid
      • languageISO
      • g_nme
      • lex0
      • is_root
      • g_vbs_utf8
      • g_uvf
      • dist
      • root
      • suffix_person
      • g_vbe
      • dist_unit
      • suffix_number
      • distributional_parent
      • kq_hybrid_utf8
      • crossrefSET
      • instruction
      • g_prs
      • lexeme_count
      • rank_occ
      • g_pfm_utf8
      • freq_occ
      • crossrefLCS
      • functional_parent
      • g_pfm
      • g_nme_utf8
      • g_vbe_utf8
      • kind
      • g_prs_utf8
      • suffix_gender
      • mother_object_type
    • noneValues:
      • absent
      • n/a
      • none
      • unknown
      • no value
      • NA
  7. docs:
    • docBase: {docRoot}/{repo}
    • docExt: ''
    • docPage: ''
    • docRoot: https://{org}.github.io
    • featurePage: 0_home
  8. interfaceDefaults: {}
  9. isCompatible: True
  10. local: no value
  11. localDir: /Users/me/text-fabric-data/github/ETCBC/bhsa/_temp
  12. provenanceSpec:
    • corpus: BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
    • doi: 10.5281/zenodo.1007624
    • extraData: ner
    • moduleSpecs:
      • :
        • backend: no value
        • corpus: Phonetic Transcriptions
        • docUrl:https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
        • doi: 10.5281/zenodo.1007636
        • org: ETCBC
        • relative: /tf
        • repo: phono
      • :
        • backend: no value
        • corpus: Parallel Passages
        • docUrl:https://nbviewer.jupyter.org/github/ETCBC/parallels/blob/master/programs/parallels.ipynb
        • doi: 10.5281/zenodo.1007642
        • org: ETCBC
        • relative: /tf
        • repo: parallels
    • org: ETCBC
    • relative: /tf
    • repo: bhsa
    • version: 2021
    • webBase: https://shebanq.ancient-data.org/hebrew
    • webHint: Show this on SHEBANQ
    • webLang: la
    • webLexId: True
    • webUrl:{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
    • webUrlLex: {webBase}/word?version={version}&id=<lid>
  13. release: v1.8.1
  14. typeDisplay:
    • clause:
      • label: {typ} {rela}
      • style: ''
    • clause_atom:
      • hidden: True
      • label: {code}
      • level: 1
      • style: ''
    • half_verse:
      • hidden: True
      • label: {label}
      • style: ''
      • verselike: True
    • lex:
      • featuresBare: gloss
      • label: {voc_lex_utf8}
      • lexOcc: word
      • style: orig
      • template: {voc_lex_utf8}
    • phrase:
      • label: {typ} {function}
      • style: ''
    • phrase_atom:
      • hidden: True
      • label: {typ} {rela}
      • level: 1
      • style: ''
    • sentence:
      • label: {number}
      • style: ''
    • sentence_atom:
      • hidden: True
      • label: {number}
      • level: 1
      • style: ''
    • subphrase:
      • hidden: True
      • label: {number}
      • style: ''
    • word:
      • features: pdp vs vt
      • featuresBare: lex:gloss
  15. writing: hbo
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A = use('ETCBC/bhsa', mod=\"etcbc/lingo/heads/tf,etcbc/valence/tf\", hoist=globals())" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "You see that the features from the **etcbc/valence/tf** and **etcbc/lingo/heads/tf** modules have been added to the mix.\n", "\n", "## ETCBC Valence\n", "\n", "Click the triangle before **etcbc/valence/tf** to see what features have been contributed.\n", "\n", "Note that edge features are in ***bold italic***.\n", "\n", "Let's find out more about *sense*.\n", "\n", "You can start with clicking the triangle after \"sense str\" above.\n", "It tells you where the feature comes from, and it shows you the context where it has been constructed.\n", "You might go there to see additional documentation.\n", "\n", "But we can also dive directly into its data:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(('--', 17941),\n", " ('d-', 9975),\n", " ('-p', 6537),\n", " ('-i', 3604),\n", " ('-c', 3231),\n", " ('dp', 1899),\n", " ('dc', 1002),\n", " ('di', 918),\n", " ('l.', 876),\n", " ('i.', 630),\n", " ('n.', 532),\n", " ('-b', 64),\n", " ('db', 61),\n", " ('c.', 57),\n", " ('k.', 54))" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.sense.freqList()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which nodes have a sense feature?" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'word'}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{F.otype.v(n) for n in N.walk() if F.sense.v(n)}" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.16s 47381 results\n" ] } ], "source": [ "results = A.search(\n", " \"\"\"\n", "word sense\n", "\"\"\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's show some of the rarer sense values:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.16s 54 results\n" ] } ], "source": [ "results = A.search(\n", " \"\"\"\n", "word sense=k.\n", "\"\"\"\n", ")" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "
npword
1Genesis 4:17יִּקְרָא֙
2Genesis 13:16שַׂמְתִּ֥י
3Genesis 32:13שַׂמְתִּ֤י
4Genesis 34:31יַעֲשֶׂ֖ה
5Genesis 48:20יְשִֽׂמְךָ֣
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.table(results, end=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we do a pretty display, the `sense` feature shows up." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 1" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

verse:1414485
sentence:1172591 53
clause:427946 WayX NA
phrase:652729 CP Conj
1943 וַ
phrase:652730 VP Pred
sense=d-
phrase:652731 PrNP Subj
phrase:652732 PP Objc
sentence:1172592 54
clause:427947 Way0 NA
phrase:652733 CP Conj
1948 וַ
phrase:652734 VP Pred
sense=--
sentence:1172593 55
clause:427948 Way0 NA
phrase:652735 CP Conj
1950 וַ
phrase:652736 VP Pred
sense=d-
phrase:652737 PP Objc
sentence:1172594 56
clause:427949 WayX NA
phrase:652738 CP Conj
1954 וַֽ
phrase:652739 VP Pred
sense=--
clause:427950 Ptcp Subj
phrase:652740 VP PreC
sense=d-
phrase:652741 NP Objc
sentence:1172595 57
clause:427951 Way0 NA
phrase:652742 CP Conj
1958 וַ
phrase:652743 VP Pred
sense=k.
phrase:652744 NP Objc
phrase:652745 PP Cmpl
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.show(results, start=1, end=1, withNodes=True)" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Lingo heads\n", "If you click the triangle before **etcbc/lingo/heads/tf** you see what features it contributes.\n", "Unfortunately, the authors have not provided a description of this feature, but if you click\n", "on the triangle after *heads* none, you see where the feature comes from and who has made it.\n", "\n", "Moreover, the fact that *heads* is in italics makes clear that it is an edge feature.\n", "\n", "Let's use it in a query:\n", "Now, `heads` is an edge feature, we cannot directly make it visible in pretty displays, but we can use it in queries.\n", "\n", "We also want to make the feature `sense` visible, so we mention the feature in the query, without restricting the results." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.42s 402 results\n" ] } ], "source": [ "results = A.search(\n", " \"\"\"\n", "book book=Genesis\n", " chapter chapter=1\n", " clause\n", " phrase\n", " -heads> word sense*\n", "\"\"\"\n", ")" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 1" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Genesis
book=Genesis
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
chapter Genesis 1
book=Genesischapter=1
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Genesischapter=1
sentence 1
clause xQtX NA
phrase PP Time
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase VP Pred
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "sense=d-
phrase NP Subj
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase PP Objc
heads•\n", "\n", " \n", "\n", "
heads•\n", "\n", " \n", "\n", "
heads•\n", "\n", " \n", "\n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

result 2" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Genesis
book=Genesis
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
chapter Genesis 1
book=Genesischapter=1
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Genesischapter=1
sentence 1
clause xQtX NA
phrase PP Time
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase VP Pred
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "sense=d-
phrase NP Subj
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase PP Objc
heads•\n", "\n", " \n", "\n", "
heads•\n", "\n", " \n", "\n", "
heads•\n", "\n", " \n", "\n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.show(results, start=1, end=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note how the words that are ***heads*** of their phrases are highlighted within their phrases." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Participants\n", "\n", "Now we are going to add another promising module, provided by Christian Canu Højgaard, from this repo:\n", "[participants](https://github.com/ch-jensen/participants).\n", "\n", "Let's do it in the straightforward way:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/ETCBC/bhsa/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/bhsa/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/lingo/heads/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/valence/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "The requested data is not available offline\n", "\t~/text-fabric-data/github/ch-jensen/participants/actor/tf/2021 not found\n", "rate limit is 5000 requests per hour, with 5000 left for this hour\n", "\tconnecting to online GitHub repo ch-jensen/participants ... connected\n", "No directory /actor/tf/2021 in #9671910a329c069cfd3d366526ea816de57666dcWill try something else\n", "\tFailed\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "No directory /actor/tf/2021 in #9671910a329c069cfd3d366526ea816de57666dc\tFailed\n" ] }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/phono/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/parallels/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "There were problems with loading data.\n", "The TF API has not been loaded!\n", "The app \"ETCBC/bhsa\" will not work!\n" ] } ], "source": [ "A = use(\n", " 'ETCBC/bhsa',\n", " mod=(\n", " \"ETCBC/lingo/heads/tf\",\n", " \"ETCBC/valence/tf\",\n", " \"ch-jensen/participants/actor/tf\"\n", " ),\n", " hoist=globals(),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The features are not there!\n", "\n", "If we have a look on GitHub in this repo we see under\n", "[actor/tf](https://github.com/ch-jensen/participants/tree/master/actor/tf)\n", "the directory `c` only. Christian has produced his features against version `c` of the BHSA.\n", "\n", "Ok, then we go back, and run our command for version `c`." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/ETCBC/bhsa/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/bhsa/tf/c" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/lingo/heads/tf/c" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/valence/tf/c" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ch-jensen/participants/actor/tf/c" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/phono/tf/c" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/ETCBC/parallels/tf/c" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " TF: TF API 12.1.7, ETCBC/bhsa/app v3, Search Reference
\n", " Data: ETCBC - bhsa c, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name# of nodes# slots / node% coverage
book3910938.05100
chapter929459.19100
lex923346.20100
verse2321318.38100
half_verse451809.44100
sentence637276.69100
sentence_atom645256.61100
clause881214.84100
clause_atom906884.70100
phrase2532071.68100
phrase_atom2675411.59100
subphrase1138121.4238
word4265841.00100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Parallel Passages\n", "
\n", "\n", "
\n", "
\n", "crossref\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "book@ll\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "code\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "det\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "domain\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "freq_lex\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "function\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "g_word\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "g_word_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "gn\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "label\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "language\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "lex\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "lex_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "ls\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "nametype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "nme\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "nu\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "pargr\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "pdp\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "pfm\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "prs\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "prs_gn\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "prs_nu\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "prs_ps\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "ps\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "qere\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "qere_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "rank_lex\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "st\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "tab\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "trailer_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "txt\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "uvf\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "vbe\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "vbs\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex_utf8\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "vs\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "vt\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "mother\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
ETCBC/lingo/heads/tf\n", "
\n", "\n", "
\n", "
\n", "heads\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "noun_heads\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "prep_obj\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
Phonetic Transcriptions\n", "
\n", "\n", "
\n", "
\n", "phono\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "phono_trailer\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
ETCBC/valence/tf\n", "
\n", "\n", "
\n", "
\n", "cfunction\n", "
\n", "
str
\n", "\n", " corrected phrase function, only present for phrases that were in a correction sheet\n", "\n", "
\n", "\n", "
\n", "
\n", "f_correction\n", "
\n", "
str
\n", "\n", " whether the phrase function has been manually corrected\n", "\n", "
\n", "\n", "
\n", "
\n", "grammatical\n", "
\n", "
str
\n", "\n", " constituent role main classification\n", "\n", "
\n", "\n", "
\n", "
\n", "lexical\n", "
\n", "
str
\n", "\n", " additional lexical characteristics\n", "\n", "
\n", "\n", "
\n", "
\n", "original\n", "
\n", "
str
\n", "\n", " default value before enrichment logic has been applied\n", "\n", "
\n", "\n", "
\n", "
\n", "predication\n", "
\n", "
str
\n", "\n", " verbal function main classification\n", "\n", "
\n", "\n", "
\n", "
\n", "s_manual\n", "
\n", "
str
\n", "\n", " whether the generated enrichment features have been manually changed\n", "\n", "
\n", "\n", "
\n", "
\n", "semantic\n", "
\n", "
str
\n", "\n", " additional semantic characteristics\n", "\n", "
\n", "\n", "
\n", "
\n", "sense\n", "
\n", "
str
\n", "\n", " sense label verb occurrences, computed by the flowchart algorithm, see https://github.com/ETCBC/valence/wiki/Legend\n", "\n", "
\n", "\n", "
\n", "
\n", "valence\n", "
\n", "
str
\n", "\n", " verbal valence main classification\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
ch-jensen/participants/actor/tf\n", "
\n", "\n", "
\n", "
\n", "actor\n", "
\n", "
str
\n", "\n", " Participant references for words, subphrases and phrases. The references are adapted from Eep Talstra's work on participant tracking. http://doi.org/10.5281/zenodo.1479491\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_actor\n", "
\n", "
str
\n", "\n", " Participant references for pronominal suffixes. The references are adapted from Eep Talstra's work on participant tracking. http://doi.org/10.5281/zenodo.1479491\n", "\n", "
\n", "\n", "
\n", "
\n", "coref\n", "
\n", "
none
\n", "\n", " Edges to co-referring actors on chapter-level. The references are adapted from Eep Talstra's work on participant tracking. http://doi.org/10.5281/zenodo.1479491\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: ETCBC/bhsa
  3. appPath: /Users/me/text-fabric-data/github/ETCBC/bhsa/app
  4. commit: gb112c161cfd21eae403d51a2733740d8743460e7
  5. css: ''
  6. dataDisplay:
    • exampleSectionHtml:<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
    • excludedFeatures:
      • g_uvf_utf8
      • g_vbs
      • kq_hybrid
      • languageISO
      • g_nme
      • lex0
      • is_root
      • g_vbs_utf8
      • g_uvf
      • dist
      • root
      • suffix_person
      • g_vbe
      • dist_unit
      • suffix_number
      • distributional_parent
      • kq_hybrid_utf8
      • crossrefSET
      • instruction
      • g_prs
      • lexeme_count
      • rank_occ
      • g_pfm_utf8
      • freq_occ
      • crossrefLCS
      • functional_parent
      • g_pfm
      • g_nme_utf8
      • g_vbe_utf8
      • kind
      • g_prs_utf8
      • suffix_gender
      • mother_object_type
    • noneValues:
      • absent
      • n/a
      • none
      • unknown
      • no value
      • NA
  7. docs:
    • docBase: {docRoot}/{repo}
    • docExt: ''
    • docPage: ''
    • docRoot: https://{org}.github.io
    • featurePage: 0_home
  8. interfaceDefaults: {}
  9. isCompatible: True
  10. local: local
  11. localDir: /Users/me/text-fabric-data/github/ETCBC/bhsa/_temp
  12. provenanceSpec:
    • corpus: BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
    • doi: 10.5281/zenodo.1007624
    • extraData: ner
    • moduleSpecs:
      • :
        • backend: no value
        • corpus: Phonetic Transcriptions
        • docUrl:https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
        • doi: 10.5281/zenodo.1007636
        • org: ETCBC
        • relative: /tf
        • repo: phono
      • :
        • backend: no value
        • corpus: Parallel Passages
        • docUrl:https://nbviewer.jupyter.org/github/ETCBC/parallels/blob/master/programs/parallels.ipynb
        • doi: 10.5281/zenodo.1007642
        • org: ETCBC
        • relative: /tf
        • repo: parallels
    • org: ETCBC
    • relative: /tf
    • repo: bhsa
    • version: c
    • webBase: https://shebanq.ancient-data.org/hebrew
    • webHint: Show this on SHEBANQ
    • webLang: la
    • webLexId: True
    • webUrl:{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
    • webUrlLex: {webBase}/word?version={version}&id=<lid>
  13. release: v1.8.1
  14. typeDisplay:
    • clause:
      • label: {typ} {rela}
      • style: ''
    • clause_atom:
      • hidden: True
      • label: {code}
      • level: 1
      • style: ''
    • half_verse:
      • hidden: True
      • label: {label}
      • style: ''
      • verselike: True
    • lex:
      • featuresBare: gloss
      • label: {voc_lex_utf8}
      • lexOcc: word
      • style: orig
      • template: {voc_lex_utf8}
    • phrase:
      • label: {typ} {function}
      • style: ''
    • phrase_atom:
      • hidden: True
      • label: {typ} {rela}
      • level: 1
      • style: ''
    • sentence:
      • label: {number}
      • style: ''
    • sentence_atom:
      • hidden: True
      • label: {number}
      • level: 1
      • style: ''
    • subphrase:
      • hidden: True
      • label: {number}
      • style: ''
    • word:
      • features: pdp vs vt
      • featuresBare: lex:gloss
  15. writing: hbo
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A = use(\n", " 'ETCBC/bhsa',\n", " version=\"c\",\n", " mod=(\n", " \"ETCBC/lingo/heads/tf\",\n", " \"ETCBC/valence/tf\",\n", " \"ch-jensen/participants/actor/tf\"\n", " ),\n", " hoist=globals(),\n", ")" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " TF: TF API 12.1.7, ETCBC/bhsa/app v3, Search Reference
\n", " Data: ETCBC - bhsa 2021, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name# of nodes# slots / node% coverage
book3910938.21100
chapter929459.19100
lex923046.22100
verse2321318.38100
half_verse451799.44100
sentence637176.70100
sentence_atom645146.61100
clause881314.84100
clause_atom907044.70100
phrase2532031.68100
phrase_atom2675321.59100
subphrase1138501.4238
word4265901.00100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Parallel Passages\n", "
\n", "\n", "
\n", "
\n", "crossref\n", "
\n", "
int
\n", "\n", "
\n", " 🆗 links between similar passages\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Parallels Notebook: Dirk Roorda, Martijn Naaijer
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:40:46Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
Parallels notebook, see https://github.com/ETCBC/parallels
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "\n", "
\n", " ✅ book name in Latin (Genesis; Numeri; Reges1; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:55Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "book@ll\n", "
\n", "
str
\n", "\n", "
\n", " ✅ book name in amharic (ኣማርኛ)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:20:27Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
language:
\n", "
ኣማርኛ
\n", "
\n", "\n", "
\n", "
languageCode:
\n", "
am
\n", "
\n", "\n", "
\n", "
languageEnglish:
\n", "
amharic
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
book names from wikipedia and other sources
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "\n", "
\n", " ✅ chapter number (1; 2; 3; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:55Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "code\n", "
\n", "
int
\n", "\n", "
\n", " ✅ identifier of a clause atom relationship (0; 74; 367; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:56Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "det\n", "
\n", "
str
\n", "\n", "
\n", " ✅ determinedness of phrase(atom) (det; und; NA.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:56Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "domain\n", "
\n", "
str
\n", "\n", "
\n", " ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:57Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "freq_lex\n", "
\n", "
int
\n", "\n", "
\n", " ✅ frequency of lexemes\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:24:45Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed on the basis of the ETCBC core set of features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "function\n", "
\n", "
str
\n", "\n", "
\n", " ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:57Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons\n", "
\n", "
str
\n", "\n", "
\n", " ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:57Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:58Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex\n", "
\n", "
str
\n", "\n", "
\n", " ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:58Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:17:59Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word\n", "
\n", "
str
\n", "\n", "
\n", " ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:04Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:04Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "\n", "
\n", " 🆗 english translation of lexeme (beginning create god(s))\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:13Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "gn\n", "
\n", "
str
\n", "\n", "
\n", " ✅ grammatical gender (m; f; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:05Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "label\n", "
\n", "
str
\n", "\n", "
\n", " ✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:06Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "language\n", "
\n", "
str
\n", "\n", "
\n", " ✅ of word or lexeme (Hebrew; Aramaic.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:13Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "lex\n", "
\n", "
str
\n", "\n", "
\n", " ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:14Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "lex_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "ls\n", "
\n", "
str
\n", "\n", "
\n", " ✅ lexical set, subclassification of part-of-speech (card; ques; mult)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "nametype\n", "
\n", "
str
\n", "\n", "
\n", " ⚠️ named entity type (pers; mens; gens; topo; ppde.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "nme\n", "
\n", "
str
\n", "\n", "
\n", " ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:08Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "nu\n", "
\n", "
str
\n", "\n", "
\n", " ✅ grammatical number (sg; du; pl; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:08Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
int
\n", "\n", "
\n", " ✅ sequence number of an object within its context\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:09Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", "
\n", " \n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:15Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "pargr\n", "
\n", "
str
\n", "\n", "
\n", " 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:22:50Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional paragraph file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "pdp\n", "
\n", "
str
\n", "\n", "
\n", " ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:10Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "pfm\n", "
\n", "
str
\n", "\n", "
\n", " ✅ preformative consonantal-transliterated (absent; n/a; J, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:11Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "prs\n", "
\n", "
str
\n", "\n", "
\n", " ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:11Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_gn\n", "
\n", "
str
\n", "\n", "
\n", " ✅ pronominal suffix gender (m; f; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:11Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_nu\n", "
\n", "
str
\n", "\n", "
\n", " ✅ pronominal suffix number (sg; du; pl; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:12Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_ps\n", "
\n", "
str
\n", "\n", "
\n", " ✅ pronominal suffix person (p1; p2; p3; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:12Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "ps\n", "
\n", "
str
\n", "\n", "
\n", " ✅ grammatical person (p1; p2; p3; NA; unknown.)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:12Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "qere\n", "
\n", "
str
\n", "\n", "
\n", " ✅ word pointed-transliterated masoretic reading correction\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer\n", "
\n", "
str
\n", "\n", "
\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ word pointed-Hebrew masoretic reading correction\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:23:29Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional ketiv/qere file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "rank_lex\n", "
\n", "
int
\n", "\n", "
\n", " ✅ ranking of lexemes based on freqnuecy\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:24:46Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed on the basis of the ETCBC core set of features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "\n", "
\n", " ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:13Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "\n", "
\n", " ✅ part-of-speech (art; verb; subs; nmpr, ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "st\n", "
\n", "
str
\n", "\n", "
\n", " ✅ state of a noun (a (absolute); c (construct); e (emphatic).)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:14Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "tab\n", "
\n", "
int
\n", "\n", "
\n", " ✅ clause atom: its level in the linguistic embedding\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "\n", "
\n", " ✅ interword material pointed-transliterated (& 00 05 00_P ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:01Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ interword material pointed-Hebrew (־ ׃)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:01Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "txt\n", "
\n", "
str
\n", "\n", "
\n", " ✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "\n", "
\n", " ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "uvf\n", "
\n", "
str
\n", "\n", "
\n", " ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "vbe\n", "
\n", "
str
\n", "\n", "
\n", " ✅ verbal ending consonantal-transliterated (n/a; W; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "vbs\n", "
\n", "
str
\n", "\n", "
\n", " ✅ root formation consonantal-transliterated (absent; n/a; H; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "\n", "
\n", " ✅ verse number\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex\n", "
\n", "
str
\n", "\n", "
\n", " ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:16Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex_utf8\n", "
\n", "
str
\n", "\n", "
\n", " ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
from additional lexicon file provided by the ETCBC
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "vs\n", "
\n", "
str
\n", "\n", "
\n", " ✅ verbal stem (qal; piel; hif; apel; pael)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "vt\n", "
\n", "
str
\n", "\n", "
\n", " ✅ verbal tense (perf; impv; wayq; infc)\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:18Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "mother\n", "
\n", "
none
\n", "\n", "
\n", " ✅ linguistic dependency between textual objects\n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:18:22Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", "
\n", " \n", "
\n", "\n", "
\n", "
author:
\n", "
Eep Talstra Centre for Bible and Computer
\n", "
\n", "\n", "
\n", "
dataset:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
datasetName:
\n", "
Biblia Hebraica Stuttgartensia Amstelodamensis
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:21:17Z
\n", "
\n", "\n", "
\n", "
email:
\n", "
shebanq@ancient-data.org
\n", "
\n", "\n", "
\n", "
encoders:
\n", "
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
website:
\n", "
https://shebanq.ancient-data.org
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
Phonetic Transcriptions\n", "
\n", "\n", "
\n", "
\n", "phono\n", "
\n", "
str
\n", "\n", "
\n", " 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:25:55Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the phono notebook, see https://github.com/ETCBC/phono
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "phono_trailer\n", "
\n", "
str
\n", "\n", "
\n", " 🆗 interword material in phonological transcription\n", "
\n", "\n", "
\n", "
author:
\n", "
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:25:55Z
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the phono notebook, see https://github.com/ETCBC/phono
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
etcbc/lingo/heads/tf\n", "
\n", "\n", "
\n", "
\n", "heads\n", "
\n", "
none
\n", "\n", "
\n", " \n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
coreVersion:
\n", "
2021
\n", "
\n", "\n", "
\n", "
created_by:
\n", "
Cody Kingham
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-13T11:38:35Z
\n", "
\n", "\n", "
\n", "
source:
\n", "
see the notebook at https://github.com/etcbc/lingo/heads
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "noun_heads\n", "
\n", "
none
\n", "\n", "
\n", " \n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
coreVersion:
\n", "
2021
\n", "
\n", "\n", "
\n", "
created_by:
\n", "
Cody Kingham
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-13T11:38:36Z
\n", "
\n", "\n", "
\n", "
source:
\n", "
see the notebook at https://github.com/etcbc/lingo/heads
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "prep_obj\n", "
\n", "
none
\n", "\n", "
\n", " \n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
coreVersion:
\n", "
2021
\n", "
\n", "\n", "
\n", "
created_by:
\n", "
Cody Kingham
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-13T11:38:37Z
\n", "
\n", "\n", "
\n", "
source:
\n", "
see the notebook at https://github.com/etcbc/lingo/heads
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
etcbc/valence/tf\n", "
\n", "\n", "
\n", "
\n", "cfunction\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ corrected phrase function, only present for phrases that were in a correction sheet\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:56Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "f_correction\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ whether the phrase function has been manually corrected\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:56Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "grammatical\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ constituent role main classification\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:56Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "lexical\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ additional lexical characteristics\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:57Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "original\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ default value before enrichment logic has been applied\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:57Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "predication\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ verbal function main classification\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:57Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "s_manual\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ whether the generated enrichment features have been manually changed\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:57Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "semantic\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ additional semantic characteristics\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:57Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "sense\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ sense label of verb occurrences (d-; i.; -p; d-; ...)\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:39:56Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "valence\n", "
\n", "
str
\n", "\n", "
\n", " ❗️ verbal valence main classification\n", "
\n", "\n", "
\n", "
author:
\n", "
The content and nature of the features are by Janet Dyk, the workflow is by Dirk Roorda
\n", "
\n", "\n", "
\n", "
coreData:
\n", "
BHSA
\n", "
\n", "\n", "
\n", "
dateWritten:
\n", "
2021-12-09T14:38:58Z
\n", "
\n", "\n", "
\n", "
method:
\n", "
Generated blank correction and enrichment spreadsheets with selected clauses
\n", "
\n", "\n", "
\n", "
provenance:
\n", "
computed by the enrich and flowchart notebooks, see https://github.com/ETCBC/valence
\n", "
\n", "\n", "
\n", "
purpose:
\n", "
Support the decision process of assigning valence to verbs
\n", "
\n", "\n", "
\n", "
steps:
\n", "
sheets filled out by researcher; read back in by program; generated new features based on contents
\n", "
\n", "\n", "
\n", "
title:
\n", "
Correction and enrichment features
\n", "
\n", "\n", "
\n", "
version:
\n", "
2021
\n", "
\n", "\n", "
\n", "
writtenBy:
\n", "
Text-Fabric
\n", "
\n", "\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: ETCBC/bhsa
  3. appPath: /Users/me/text-fabric-data/github/ETCBC/bhsa/app
  4. commit: gb112c161cfd21eae403d51a2733740d8743460e7
  5. css: ''
  6. dataDisplay:
    • exampleSectionHtml:<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
    • excludedFeatures:
      • g_uvf_utf8
      • g_vbs
      • kq_hybrid
      • languageISO
      • g_nme
      • lex0
      • is_root
      • g_vbs_utf8
      • g_uvf
      • dist
      • root
      • suffix_person
      • g_vbe
      • dist_unit
      • suffix_number
      • distributional_parent
      • kq_hybrid_utf8
      • crossrefSET
      • instruction
      • g_prs
      • lexeme_count
      • rank_occ
      • g_pfm_utf8
      • freq_occ
      • crossrefLCS
      • functional_parent
      • g_pfm
      • g_nme_utf8
      • g_vbe_utf8
      • kind
      • g_prs_utf8
      • suffix_gender
      • mother_object_type
    • noneValues:
      • absent
      • n/a
      • none
      • unknown
      • no value
      • NA
  7. docs:
    • docBase: {docRoot}/{repo}
    • docExt: ''
    • docPage: ''
    • docRoot: https://{org}.github.io
    • featurePage: 0_home
  8. interfaceDefaults: {}
  9. isCompatible: True
  10. local: local
  11. localDir: /Users/me/text-fabric-data/github/ETCBC/bhsa/_temp
  12. provenanceSpec:
    • corpus: BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
    • doi: 10.5281/zenodo.1007624
    • extraData: ner
    • moduleSpecs:
      • :
        • backend: no value
        • corpus: Phonetic Transcriptions
        • docUrl:https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
        • doi: 10.5281/zenodo.1007636
        • org: ETCBC
        • relative: /tf
        • repo: phono
      • :
        • backend: no value
        • corpus: Parallel Passages
        • docUrl:https://nbviewer.jupyter.org/github/ETCBC/parallels/blob/master/programs/parallels.ipynb
        • doi: 10.5281/zenodo.1007642
        • org: ETCBC
        • relative: /tf
        • repo: parallels
    • org: ETCBC
    • relative: /tf
    • repo: bhsa
    • version: 2021
    • webBase: https://shebanq.ancient-data.org/hebrew
    • webHint: Show this on SHEBANQ
    • webLang: la
    • webLexId: True
    • webUrl:{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
    • webUrlLex: {webBase}/word?version={version}&id=<lid>
  13. release: v1.8.1
  14. typeDisplay:
    • clause:
      • label: {typ} {rela}
      • style: ''
    • clause_atom:
      • hidden: True
      • label: {code}
      • level: 1
      • style: ''
    • half_verse:
      • hidden: True
      • label: {label}
      • style: ''
      • verselike: True
    • lex:
      • featuresBare: gloss
      • label: {voc_lex_utf8}
      • lexOcc: word
      • style: orig
      • template: {voc_lex_utf8}
    • phrase:
      • label: {typ} {function}
      • style: ''
    • phrase_atom:
      • hidden: True
      • label: {typ} {rela}
      • level: 1
      • style: ''
    • sentence:
      • label: {number}
      • style: ''
    • sentence_atom:
      • hidden: True
      • label: {number}
      • level: 1
      • style: ''
    • subphrase:
      • hidden: True
      • label: {number}
      • style: ''
    • word:
      • features: pdp vs vt
      • featuresBare: lex:gloss
  15. writing: hbo
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.header(allMeta=True)" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "While this succeeded, there are scenarios where you have more trouble.\n", "For example, you decide that you really, really need the bhsa data as in release 1.7.1.\n", "\n", "Then you discover that this does note work:\n", "\n", "```\n", "A = use(\n", " 'etcbc/bhsa',\n", " version=\"c\",\n", " checkout=\"v1.7.1\",\n", " mod=(\"etcbc/lingo/heads/tf\" ,\"etcbc/valence/tf\", \"ch-jensen/participants/actor/tf\"), \n", " hoist=globals(),\n", ")\n", "```\n", "\n", "because the BHSA invokes two standard modules, `etcbc/phono/tf` and `etcbc/parallels/tf` and if you go to their\n", "GitHub repos, you see that they do not have a release `v1.7.1`.\n", "You have to walk through their releases and find one with the right data version.\n", "Having found them, you can then get it all like this:\n", "\n", "```\n", "A = use(\n", " 'etcbc/bhsa',\n", " version=\"c\",\n", " checkout=\"v1.7.1\",\n", " mod=(\n", " \"etcbc/phono/tf:1.2\",\n", " \"etcbc/parallels/tf:v1.2\",\n", " \"etcbc/lingo/heads/tf\",\n", " \"etcbc/valence/tf\",\n", " \"ch-jensen/participants/actor/tf\",\n", " ),\n", " hoist=globals(),\n", ")\n", "```" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Semantic actors\n", "\n", "Let's find out about *actor*.\n", "\n", "Again, we can click on the triangles and see information about the features.\n", "Christian has provided descriptions in the metadata of the features.\n", "\n", "And we can look into the data itself." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "415" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fl = F.actor.freqList()\n", "len(fl)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(('JHWH', 358),\n", " ('BN JFR>L', 205),\n", " ('>JC', 101),\n", " ('2sm\"YOUSgmas\"', 67),\n", " ('MCH', 60),\n", " ('>RY', 58),\n", " ('>TM', 45),\n", " ('>X \"YOUSgmas\"', 36),\n", " ('JFR>L', 35),\n", " ('KHN', 33))" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fl[0:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which nodes have an actor feature?" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'phrase_atom', 'subphrase'}" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{F.otype.v(n) for n in N.walk() if F.actor.v(n)}" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.08s 2062 results\n" ] } ], "source": [ "results = A.search(\n", " \"\"\"\n", "phrase_atom actor\n", "\"\"\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's show some of the rarer actor values:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.10s 30 results\n" ] } ], "source": [ "results = A.search(\n", " \"\"\"\n", "phrase_atom actor=KHN\n", "\"\"\"\n", ")" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
npphrase_atom
1Leviticus 17:5אֶל־הַכֹּהֵ֑ן
2Leviticus 17:6זָרַ֨ק
3Leviticus 17:6הַכֹּהֵ֤ן
4Leviticus 17:6הִקְטִ֣יר
5Leviticus 19:22כִפֶּר֩
6Leviticus 19:22הַכֹּהֵ֜ן
7Leviticus 21:1אֶל־הַכֹּהֲנִ֖ים
8Leviticus 21:1בְּנֵ֣י אַהֲרֹ֑ן
9Leviticus 21:5יִקְרְח֤וּ
10Leviticus 21:5יְגַלֵּ֑חוּ
11Leviticus 21:5יִשְׂרְט֖וּ
12Leviticus 21:6קְדֹשִׁ֤ים
13Leviticus 21:6יִהְיוּ֙
14Leviticus 21:6יְחַלְּל֔וּ
15Leviticus 21:6הֵ֥ם
16Leviticus 21:6מַקְרִיבִ֖ם
17Leviticus 21:6הָ֥יוּ
18Leviticus 21:6קֹֽדֶשׁ׃
19Leviticus 21:7יִקָּ֔חוּ
20Leviticus 21:7יִקָּ֑חוּ
21Leviticus 22:11כֹהֵ֗ן
22Leviticus 22:11יִקְנֶ֥ה
23Leviticus 22:14לַכֹּהֵ֖ן
24Leviticus 23:10אֶל־הַכֹּהֵֽן׃
25Leviticus 23:11הֵנִ֧יף
26Leviticus 23:11יְנִיפֶ֖נּוּ
27Leviticus 23:11הַכֹּהֵֽן׃
28Leviticus 23:20הֵנִ֣יף
29Leviticus 23:20הַכֹּהֵ֣ן׀
30Leviticus 23:20לַכֹּהֵֽן׃
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.table(results)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 1" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

verse
sentence 7
clause Ptcp Attr
phrase CP Rela
phrase PPrP Subj
phrase VP PreC
clause WQt0 Coor
clause WQt0 Coor
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.show(results, start=1, end=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see no highlights!\n", "That is because phrase atoms are hidden by default. So let's unhide:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "A.displaySetup(hiddenTypes=\"subphrase clause_atom sentence_atom half_verse\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next calls to `show()` will work as if `hiddenTypes=\"subphrase clause_atom sentence_atom half_verse\"` is passed to them. " ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 1" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

verse
sentence 7
clause xYqX Adju
phrase CP Conj
phrase VP Pred
phrase_atom VP NA
actor=BN JFR>L
phrase NP Subj
phrase_atom NP NA
actor=BN JFR>L
phrase PP Objc
phrase_atom PP NA
actor=ZBX BN JFR>L
clause Ptcp Attr
phrase CP Rela
phrase_atom CP NA
phrase PPrP Subj
phrase_atom PPrP NA
actor=BN JFR>L
phrase VP PreC
phrase_atom VP NA
actor=BN JFR>L
phrase PP Loca
clause WQt0 Coor
phrase CP Conj
phrase_atom CP NA
phrase VP PreO
phrase_atom VP NA
actor=BN JFR>L
phrase PP Cmpl
phrase_atom PP NA
actor=JHWH
phrase PP Cmpl
phrase_atom PP NA
actor=PTX >HL MW<D
phrase PP Cmpl
phrase_atom PP NA
actor=KHN
clause WQt0 Coor
phrase CP Conj
phrase_atom CP NA
phrase VP Pred
phrase_atom VP NA
actor=BN JFR>L
phrase NP Objc
phrase_atom PP Spec
actor=JHWH
phrase PP Objc
phrase_atom PP NA
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.show(results, start=1, end=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We make the feature `sense` from the valence module visible:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 1" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

verse:1417594
sentence:1181377 7
clause:439665 xYqX Adju
phrase:688387 CP Conj
phrase_atom:943218 CP NA
phrase:688388 VP Pred
phrase_atom:943219 VP NA
actor=BN JFR>L
phrase:688389 NP Subj
phrase_atom:943220 NP NA
actor=BN JFR>L
phrase:688390 PP Objc
phrase_atom:943221 PP NA
actor=ZBX BN JFR>L
clause:439666 Ptcp Attr
phrase:688391 CP Rela
phrase_atom:943222 CP NA
phrase:688392 PPrP Subj
phrase_atom:943223 PPrP NA
actor=BN JFR>L
63102 הֵ֣ם
phrase:688393 VP PreC
phrase_atom:943224 VP NA
actor=BN JFR>L
sense=-p
phrase:688394 PP Loca
phrase_atom:943225 PP NA
clause:439667 WQt0 Coor
phrase:688395 CP Conj
phrase_atom:943226 CP NA
63108 וֶֽ
phrase:688396 VP PreO
phrase_atom:943227 VP NA
actor=BN JFR>L
phrase:688397 PP Cmpl
phrase_atom:943228 PP NA
actor=JHWH
phrase:688398 PP Cmpl
phrase_atom:943229 PP NA
actor=PTX >HL MW<D
phrase:688399 PP Cmpl
phrase_atom:943230 PP NA
actor=KHN
63116 אֶל־
63117 הַ
clause:439668 WQt0 Coor
phrase:688400 CP Conj
phrase_atom:943231 CP NA
63119 וְ
phrase:688401 VP Pred
phrase_atom:943232 VP NA
actor=BN JFR>L
sense=n.
phrase:688402 NP Objc
phrase_atom:943233 NP NA
phrase_atom:943234 PP Spec
actor=JHWH
phrase:688403 PP Objc
phrase_atom:943235 PP NA
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

result 2" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

verse:1417595
sentence:1181378 8
clause:439669 WQtX NA
phrase:688404 CP Conj
phrase_atom:943236 CP NA
63126 וְ
phrase:688405 VP Pred
phrase_atom:943237 VP NA
actor=KHN
sense=dp
phrase:688406 NP Subj
phrase_atom:943238 NP NA
actor=KHN
phrase:688407 PP Objc
phrase_atom:943239 PP NA
actor=DM
63130 אֶת־
63131 הַ
phrase:688408 PP Cmpl
phrase_atom:943240 PP NA
actor=MZBX JHWH
phrase_atom:943241 NP Spec
actor=PTX >HL MW<D
sentence:1181379 9
clause:439670 WQt0 NA
phrase:688409 CP Conj
phrase_atom:943242 CP NA
63139 וְ
phrase:688410 VP Pred
phrase_atom:943243 VP NA
actor=KHN
phrase:688411 NP Objc
phrase_atom:943244 NP NA
63141 הַ
phrase:688412 PP Cmpl
phrase_atom:943245 PP NA
phrase_atom:943246 PP Spec
actor=JHWH
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

result 3" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

verse:1417595
sentence:1181378 8
clause:439669 WQtX NA
phrase:688404 CP Conj
phrase_atom:943236 CP NA
63126 וְ
phrase:688405 VP Pred
phrase_atom:943237 VP NA
actor=KHN
sense=dp
phrase:688406 NP Subj
phrase_atom:943238 NP NA
actor=KHN
phrase:688407 PP Objc
phrase_atom:943239 PP NA
actor=DM
63130 אֶת־
63131 הַ
phrase:688408 PP Cmpl
phrase_atom:943240 PP NA
actor=MZBX JHWH
phrase_atom:943241 NP Spec
actor=PTX >HL MW<D
sentence:1181379 9
clause:439670 WQt0 NA
phrase:688409 CP Conj
phrase_atom:943242 CP NA
63139 וְ
phrase:688410 VP Pred
phrase_atom:943243 VP NA
actor=KHN
phrase:688411 NP Objc
phrase_atom:943244 NP NA
63141 הַ
phrase:688412 PP Cmpl
phrase_atom:943245 PP NA
phrase_atom:943246 PP Spec
actor=JHWH
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.show(results, start=1, end=3, withNodes=True, extraFeatures=\"sense\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# All together!\n", "\n", "Here is a query that shows results with all features." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.26s 30 results\n" ] } ], "source": [ "results = A.search(\n", " \"\"\"\n", "book book=Leviticus\n", " phrase sense*\n", " phrase_atom actor=KHN\n", " -heads> word\n", "\"\"\"\n", ")" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

verse 8" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

verse
book=Leviticus
sentence 27
clause CPen NA
phrase CP Conj
heads•\n", "\n", "
phrase_atom CP NA
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase NP Frnt
heads•\n", "\n", "
phrase_atom NP NA
actor=KHNheads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
clause xYq0 Resu
phrase CP Conj
heads•\n", "\n", "
phrase_atom CP NA
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase VP Pred
heads•\n", "\n", "
phrase_atom VP NA
actor=KHNheads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "sense=d-
phrase NP Objc
heads•\n", "\n", "
phrase_atom NP NA
actor=NPC_3heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase NP Adju
heads•\n", "\n", "
phrase_atom NP NA
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
clause XYqt Coor
phrase PPrP Subj
heads•\n", "\n", "
phrase_atom PPrP NA
actor=NPC_3heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase VP Pred
heads•\n", "\n", "
phrase_atom VP NA
actor=NPC_3heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "sense=-p
phrase PP Cmpl
heads•\n", "\n", "
phrase_atom PP NA
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
sentence 28
clause CPen NA
phrase CP Conj
heads•\n", "\n", "
phrase_atom CP NA
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase NP Frnt
heads•\n", "\n", "
phrase_atom NP NA
actor=JLJD BJT KHNheads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
clause XYqt Resu
phrase PPrP Subj
heads•\n", "\n", "
phrase_atom PPrP NA
actor=JLJD BJT KHNheads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
phrase VP Pred
heads•\n", "\n", "
phrase_atom VP NA
actor=JLJD BJT KHNheads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "sense=-p
phrase PP Cmpl
heads•\n", "\n", "
phrase_atom PP NA
heads•\n", "\n", "
heads•\n", "\n", " \n", "\n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.displaySetup(\n", " condensed=True,\n", " condenseType=\"verse\",\n", " hiddenTypes=\"subphrase clause_atom sentence_atom half_verse\",\n", ")\n", "A.show(results, start=8, end=8)\n", "A.displaySetup()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise\n", "\n", "See whether you can find the quote in the Easter egg that is in\n", "`etcbc/lingo/easter/tf` !" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# All steps\n", "\n", "* **[start](start.ipynb)** your first step in mastering the bible computationally\n", "* **[display](display.ipynb)** become an expert in creating pretty displays of your text structures\n", "* **[search](search.ipynb)** turbo charge your hand-coding with search templates\n", "* **[export Excel](exportExcel.ipynb)** make tailor-made spreadsheets out of your results\n", "* **share** draw in other people's data and let them use yours\n", "* **[export](export.ipynb)** export your dataset as an Emdros database\n", "* **[annotate](annotate.ipynb)** annotate plain text by means of other tools and import the annotations as TF features\n", "* **[map](map.ipynb)** map somebody else's annotations to a new version of the corpus\n", "* **[volumes](volumes.ipynb)** work with selected books only\n", "* **[trees](trees.ipynb)** work with the BHSA data as syntax trees\n", "\n", "CC-BY Dirk Roorda" ] } ], "metadata": { "jupytext": { "encoding": "# -*- coding: utf-8 -*-" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.1" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }