{ "cells": [ { "cell_type": "markdown", "id": "b412ef40-3d6a-422d-a658-0f7f9474e423", "metadata": {}, "source": [ "# The 'center' of the Torah (BHSA)" ] }, { "cell_type": "markdown", "id": "8da22526-375c-406c-a63c-bc425007d7e3", "metadata": {}, "source": [ "## Table of contents (ToC)\n", "\n", "* 1 - Introduction\n", "* 2 - Load Text-Fabric app and data\n", "* 3 - Performing the queries\n", "    * 3.1 - Center book (Leviticus)\n", "    * 3.2 - Center chapter (Leviticus Chapter 4)\n", "    * 3.3 - Center verse (Lev 8:9)\n", "    * 3.4 - Center sentence (Ex 36:11)\n", "    * 3.5 - Center clause (Lev 4:35)\n", "    * 3.6 - Center phrase (Lev 4:32)\n", "    * 3.7 - Center word based upon word node (Lev 8:21)\n", "    * 3.8 - Center word based on spaces and maqaf (Lev 8:15)\n", "    * 3.9 - Center word based upon the feature 'wordboundary' (Lev 8:15)\n", "    * 3.10 - Center word based on spaces (Lev 8:22)\n", "    * 3.11 - Center word based on selected part of speech (Lev 8:21)\n", "    * 3.12 - Other opinion - Stone Tenach (Lev 10:16)\n", "* 4 - Attribution and footnotes\n", "* 5 - Required libraries\n", "* 6 - Notebook details\n" ] }, { "cell_type": "markdown", "id": "06785b1d-4bde-40bd-813d-9381f48e8aa4", "metadata": {}, "source": [ "# 1 - Introduction \n", "##### [Back to ToC](#TOC)" ] }, { "cell_type": "markdown", "id": "69c871f8-82b9-47b3-b3cd-95c9a3b7ed12", "metadata": {}, "source": [ "It is a common belief that the center of a text segment, a book, or a specific set of books contains the central message. This notebook explores various methods to answer the question, 'What is the center of the Torah?' The main prerequisite for answering this is determining by what measure this center is to be established." 
] }, { "cell_type": "markdown", "id": "525e2cf9-08e0-4d20-a5f1-b7b7a15662fa", "metadata": {}, "source": [ "# 2 - Load Text-Fabric app and data \n", "##### [Back to ToC](#TOC)" ] }, { "cell_type": "markdown", "id": "1e3b8848-0ece-4ee1-98b7-bb47d34cb45a", "metadata": {}, "source": [ "This notebook uses the ETCBC BHSA as the dataset representing the Hebrew text of the TeNaCh." ] }, { "cell_type": "code", "execution_count": 2, "id": "ef428d41-2caa-4522-95cf-ef39e4f0e8da", "metadata": { "tags": [] }, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 3, "id": "77b1cb10-629c-4653-b4d0-afc1d13e9d7e", "metadata": {}, "outputs": [], "source": [ "# Loading the Text-Fabric code\n", "# Note: it is assumed Text-Fabric is installed in your environment.\n", "from tf.fabric import Fabric\n", "from tf.app import use" ] }, { "cell_type": "code", "execution_count": 4, "id": "e256d50f-a0d1-4c4c-819d-33bab7fb75c7", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/etcbc/BHSA/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/BHSA/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/phono/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/parallels/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " TF: TF API 12.6.2, etcbc/BHSA/app v3, Search Reference
\n", " Data: etcbc - BHSA 2021, Character table, Feature docs
\n", "
Node types\n", "\n", "
Name | # of nodes | # slots / node | % coverage
--- | --- | --- | ---
book | 39 | 10938.21 | 100
chapter | 929 | 459.19 | 100
lex | 9230 | 46.22 | 100
verse | 23213 | 18.38 | 100
half_verse | 45179 | 9.44 | 100
sentence | 63717 | 6.70 | 100
sentence_atom | 64514 | 6.61 | 100
clause | 88131 | 4.84 | 100
clause_atom | 90704 | 4.70 | 100
phrase | 253203 | 1.68 | 100
phrase_atom | 267532 | 1.59 | 100
subphrase | 113850 | 1.42 | 38
word | 426590 | 1.00 | 100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Parallel Passages\n", "
\n", "\n", "
\n", "
\n", "wordboundary\n", "
\n", "
str
\n", "\n", " indicates wordboudaries (spaces OR maqaf)\n", "\n", "
\n", "\n", "
\n", "
\n", "crossref\n", "
\n", "
int
\n", "\n", " 🆗 links between similar passages\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "\n", " ✅ book name in Latin (Genesis; Numeri; Reges1; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "book@ll\n", "
\n", "
str
\n", "\n", " ✅ book name in amharic (ኣማርኛ)\n", "\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "\n", " ✅ chapter number (1; 2; 3; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "code\n", "
\n", "
int
\n", "\n", " ✅ identifier of a clause atom relationship (0; 74; 367; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "det\n", "
\n", "
str
\n", "\n", " ✅ determinedness of phrase(atom) (det; und; NA.)\n", "\n", "
\n", "\n", "
\n", "
\n", "domain\n", "
\n", "
str
\n", "\n", " ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)\n", "\n", "
\n", "\n", "
\n", "
\n", "freq_lex\n", "
\n", "
int
\n", "\n", " ✅ frequency of lexemes\n", "\n", "
\n", "\n", "
\n", "
\n", "function\n", "
\n", "
str
\n", "\n", " ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons\n", "
\n", "
str
\n", "\n", " ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons_utf8\n", "
\n", "
str
\n", "\n", " ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex\n", "
\n", "
str
\n", "\n", " ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word\n", "
\n", "
str
\n", "\n", " ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word_utf8\n", "
\n", "
str
\n", "\n", " ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)\n", "\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "\n", " 🆗 english translation of lexeme (beginning create god(s))\n", "\n", "
\n", "\n", "
\n", "
\n", "gn\n", "
\n", "
str
\n", "\n", " ✅ grammatical gender (m; f; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "label\n", "
\n", "
str
\n", "\n", " ✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)\n", "\n", "
\n", "\n", "
\n", "
\n", "language\n", "
\n", "
str
\n", "\n", " ✅ of word or lexeme (Hebrew; Aramaic.)\n", "\n", "
\n", "\n", "
\n", "
\n", "lex\n", "
\n", "
str
\n", "\n", " ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)\n", "\n", "
\n", "\n", "
\n", "
\n", "lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)\n", "\n", "
\n", "\n", "
\n", "
\n", "ls\n", "
\n", "
str
\n", "\n", " ✅ lexical set, subclassification of part-of-speech (card; ques; mult)\n", "\n", "
\n", "\n", "
\n", "
\n", "nametype\n", "
\n", "
str
\n", "\n", " ⚠️ named entity type (pers; mens; gens; topo; ppde.)\n", "\n", "
\n", "\n", "
\n", "
\n", "nme\n", "
\n", "
str
\n", "\n", " ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "nu\n", "
\n", "
str
\n", "\n", " ✅ grammatical number (sg; du; pl; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
int
\n", "\n", " ✅ sequence number of an object within its context\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "pargr\n", "
\n", "
str
\n", "\n", " 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "pdp\n", "
\n", "
str
\n", "\n", " ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "pfm\n", "
\n", "
str
\n", "\n", " ✅ preformative consonantal-transliterated (absent; n/a; J, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_gn\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix gender (m; f; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_nu\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix number (sg; du; pl; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_ps\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix person (p1; p2; p3; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "ps\n", "
\n", "
str
\n", "\n", " ✅ grammatical person (p1; p2; p3; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere\n", "
\n", "
str
\n", "\n", " ✅ word pointed-transliterated masoretic reading correction\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer\n", "
\n", "
str
\n", "\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer_utf8\n", "
\n", "
str
\n", "\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_utf8\n", "
\n", "
str
\n", "\n", " ✅ word pointed-Hebrew masoretic reading correction\n", "\n", "
\n", "\n", "
\n", "
\n", "rank_lex\n", "
\n", "
int
\n", "\n", " ✅ ranking of lexemes based on freqnuecy\n", "\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "\n", " ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "\n", " ✅ part-of-speech (art; verb; subs; nmpr, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "st\n", "
\n", "
str
\n", "\n", " ✅ state of a noun (a (absolute); c (construct); e (emphatic).)\n", "\n", "
\n", "\n", "
\n", "
\n", "tab\n", "
\n", "
int
\n", "\n", " ✅ clause atom: its level in the linguistic embedding\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "\n", " ✅ interword material pointed-transliterated (& 00 05 00_P ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer_utf8\n", "
\n", "
str
\n", "\n", " ✅ interword material pointed-Hebrew (־ ׃)\n", "\n", "
\n", "\n", "
\n", "
\n", "txt\n", "
\n", "
str
\n", "\n", " ✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)\n", "\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "\n", " ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)\n", "\n", "
\n", "\n", "
\n", "
\n", "uvf\n", "
\n", "
str
\n", "\n", " ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "vbe\n", "
\n", "
str
\n", "\n", " ✅ verbal ending consonantal-transliterated (n/a; W; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "vbs\n", "
\n", "
str
\n", "\n", " ✅ root formation consonantal-transliterated (absent; n/a; H; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "\n", " ✅ verse number\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex\n", "
\n", "
str
\n", "\n", " ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)\n", "\n", "
\n", "\n", "
\n", "
\n", "vs\n", "
\n", "
str
\n", "\n", " ✅ verbal stem (qal; piel; hif; apel; pael)\n", "\n", "
\n", "\n", "
\n", "
\n", "vt\n", "
\n", "
str
\n", "\n", " ✅ verbal tense (perf; impv; wayq; infc)\n", "\n", "
\n", "\n", "
\n", "
\n", "mother\n", "
\n", "
none
\n", "\n", " ✅ linguistic dependency between textual objects\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
Phonetic Transcriptions\n", "
\n", "\n", "
\n", "
\n", "phono\n", "
\n", "
str
\n", "\n", " 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)\n", "\n", "
\n", "\n", "
\n", "
\n", "phono_trailer\n", "
\n", "
str
\n", "\n", " 🆗 interword material in phonological transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: etcbc/BHSA
  3. appPath: C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/app
  4. commit: gd905e3fb6e80d0fa537600337614adc2af157309
  5. css: ''
  6. dataDisplay:
    • exampleSectionHtml:<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
    • excludedFeatures:
      • g_uvf_utf8
      • g_vbs
      • kq_hybrid
      • languageISO
      • g_nme
      • lex0
      • is_root
      • g_vbs_utf8
      • g_uvf
      • dist
      • root
      • suffix_person
      • g_vbe
      • dist_unit
      • suffix_number
      • distributional_parent
      • kq_hybrid_utf8
      • crossrefSET
      • instruction
      • g_prs
      • lexeme_count
      • rank_occ
      • g_pfm_utf8
      • freq_occ
      • crossrefLCS
      • functional_parent
      • g_pfm
      • g_nme_utf8
      • g_vbe_utf8
      • kind
      • g_prs_utf8
      • suffix_gender
      • mother_object_type
    • noneValues:
      • none
      • unknown
      • no value
      • NA
  7. docs:
    • docBase: {docRoot}/{repo}
    • docExt: ''
    • docPage: ''
    • docRoot: https://{org}.github.io
    • featurePage: 0_home
  8. interfaceDefaults: {}
  9. isCompatible: True
  10. local: local
  11. localDir: C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/_temp
  12. provenanceSpec:
    • corpus: BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
    • doi: 10.5281/zenodo.1007624
    • moduleSpecs:
      • :
        • backend: no value
        • corpus: Phonetic Transcriptions
        • docUrl:https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
        • doi: 10.5281/zenodo.1007636
        • org: etcbc
        • relative: /tf
        • repo: phono
      • :
        • backend: no value
        • corpus: Parallel Passages
        • docUrl:https://nbviewer.jupyter.org/github/etcbc/parallels/blob/master/programs/parallels.ipynb
        • doi: 10.5281/zenodo.1007642
        • org: etcbc
        • relative: /tf
        • repo: parallels
    • org: etcbc
    • relative: /tf
    • repo: BHSA
    • version: 2021
    • webBase: https://shebanq.ancient-data.org/hebrew
    • webHint: Show this on SHEBANQ
    • webLang: la
    • webLexId: True
    • webUrl:{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
    • webUrlLex: {webBase}/word?version={version}&id=<lid>
  13. release: v1.8
  14. typeDisplay:
    • clause:
      • label: {typ} {rela}
      • style: ''
    • clause_atom:
      • hidden: True
      • label: {code}
      • level: 1
      • style: ''
    • half_verse:
      • hidden: True
      • label: {label}
      • style: ''
      • verselike: True
    • lex:
      • featuresBare: gloss
      • label: {voc_lex_utf8}
      • lexOcc: word
      • style: orig
      • template: {voc_lex_utf8}
    • phrase:
      • label: {typ} {function}
      • style: ''
    • phrase_atom:
      • hidden: True
      • label: {typ} {rela}
      • level: 1
      • style: ''
    • sentence:
      • label: {number}
      • style: ''
    • sentence_atom:
      • hidden: True
      • label: {number}
      • level: 1
      • style: ''
    • subphrase:
      • hidden: True
      • label: {number}
      • style: ''
    • word:
      • features: pdp vs vt
      • featuresBare: lex:gloss
  15. writing: hbo
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# load the BHSA app and data\n", "BHS = use(\"etcbc/BHSA\", hoist=globals())" ] }, { "cell_type": "markdown", "id": "d32502e9-c6ae-45d6-ac6f-e298b4315bde", "metadata": {}, "source": [ "Note: The Text-Fabric feature documentation can be found at [ETCBC GitHub](https://github.com/ETCBC/bhsa/blob/master/docs/features/0_home.md) " ] }, { "cell_type": "code", "execution_count": 4, "id": "20826b6e-5511-448d-a0da-8abdbd67eb77", "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)\n", "BHS.dh(BHS.getCss())" ] }, { "cell_type": "markdown", "id": "947d7fd8-20c9-4b82-8b37-45ca74d18ef4", "metadata": {}, "source": [ "# 3 - Performing the queries \n", "##### [Back to ToC](#TOC)" ] }, { "cell_type": "markdown", "id": "1ac06c2b-91bc-47a9-9ac8-1c574ce9473e", "metadata": {}, "source": [ "An important feature used in the queries is 'number', whose values start at 1. How objects are numbered differs per object type. The following are of interest for this research:\n", "\n", "type | numbering\n", "--- | ---\n", "phrase_atom | within the book\n", "clause_atom | within the book\n", "sentence_atom | within the book\n", "word | within the book\n", "\n", "Note: Full Text-Fabric feature documentation is found [here](https://github.com/ETCBC/bhsa/blob/master/docs/features/number.md)" ] }, { "cell_type": "markdown", "id": "b85b2729-5f44-47fd-bfd2-eb1d176f82ff", "metadata": {}, "source": [ "**Important observation:** The BHSA inserts nodes for implicit articles (which are only visible in the vocalisation). 
See example below:\n", "\n", "" ] }, { "cell_type": "markdown", "id": "57699b9b-8c44-464f-a264-380ddbcf8c1e", "metadata": {}, "source": [ "## 3.1 - Center book\n", "##### [Back to TOC](#TOC)" ] }, { "cell_type": "markdown", "id": "eb211e70-8eca-4629-91cc-38018991fb65", "metadata": {}, "source": [ "Rather trivially, Leviticus constitutes the center of the five books of the Torah." ] }, { "cell_type": "markdown", "id": "30adfe82-5d55-446b-9472-c0089c6b5541", "metadata": { "tags": [] }, "source": [ "## 3.2 - Center chapter \n", "##### [Back to TOC](#TOC)" ] }, { "cell_type": "markdown", "id": "08064c7e-5d49-4a54-ab33-a8511bd2670c", "metadata": {}, "source": [ "The following method is based upon the center chapter." ] }, { "cell_type": "code", "execution_count": 5, "id": "d12239fb-ad50-4dd0-b349-2302abc96be3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.05s 187 results\n" ] } ], "source": [ "# number of chapters in Torah\n", "ChapterQuery = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " chapter \n", "'''\n", "\n", "ChapterResults = BHS.search(ChapterQuery)" ] }, { "cell_type": "code", "execution_count": 6, "id": "4b612d9f-9a35-439a-a604-28025472262e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(426630, 427558)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.otype.sInterval('chapter')" ] }, { "cell_type": "code", "execution_count": 7, "id": "cbd00395-9dc8-42e0-a757-8d0fd94eb6da", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Leviticus', 4)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# start + delta: 426630 + int(187/2) = 426630 + 93 = 426723\n", "T.sectionFromNode(426723)" ] }, { "cell_type": "markdown", "id": "744154c9-69a4-44ae-994f-905936faf2f1", "metadata": {}, "source": [ "## 3.3 - Center verse " ] }, { "cell_type": "markdown", "id": "7c2e8831-a1e3-4531-a53c-f2b671b1238f", 
"metadata": {}, "source": [ "This method is based upon the middle verse in the Torah." ] }, { "cell_type": "code", "execution_count": 8, "id": "e361025d-7166-4e20-afbf-0189fabc2d0f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.05s 5853 results\n" ] } ], "source": [ "# number of verses in Torah\n", "VerseQuery = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " verse \n", "'''\n", "\n", "VerseResults = BHS.search(VerseQuery)" ] }, { "cell_type": "markdown", "id": "e9993cb4-13c8-441e-8376-949852048b12", "metadata": {}, "source": [ "Determine boundaries of the verse node-numbers." ] }, { "cell_type": "code", "execution_count": 9, "id": "d4d84db1-8411-4011-9fdf-b4d85a406960", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1414389, 1437601)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.otype.sInterval('verse')" ] }, { "cell_type": "code", "execution_count": 10, "id": "dff076df-be15-4ba0-b0a6-4ade13c9bef3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Leviticus', 8, 9)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# start + delta: 1414389 + int(5853/2) = 1414389 + 2926 = 1417315\n", "T.sectionFromNode(1417315)" ] }, { "cell_type": "code", "execution_count": 11, "id": "13888b1c-20c5-4107-9fa1-943d614929a4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'וַיָּ֥שֶׂם אֶת־הַמִּצְנֶ֖פֶת עַל־רֹאשֹׁ֑ו וַיָּ֨שֶׂם עַֽל־הַמִּצְנֶ֜פֶת אֶל־מ֣וּל פָּנָ֗יו אֵ֣ת צִ֤יץ הַזָּהָב֙ נֵ֣זֶר הַקֹּ֔דֶשׁ כַּאֲשֶׁ֛ר צִוָּ֥ה יְהוָ֖ה אֶת־מֹשֶֽׁה׃ '" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T.text(1417315)" ] }, { "cell_type": "code", "execution_count": 12, "id": "84a4378c-60c3-418f-b8f4-3f5adff650e7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 2926" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Leviticus
book=Leviticus
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Leviticus
sentence 16
clause Way0 NA
phrase CP Conj
phrase VP Pred
phrase PP Cmpl
sentence 17
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "BHS.show(VerseResults,start=2926,end=2926, multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "0caff32b-2ab5-4976-ae32-f9145df01b25", "metadata": {}, "source": [ "This verse in the King James Version:\n", "> And he put the mitre upon his head; also upon the mitre, even upon his forefront, did he put the golden plate, the holy crown; as the Lord commanded Moses." ] }, { "cell_type": "markdown", "id": "935c6015-0b94-479e-890c-2e49eb4db512", "metadata": {}, "source": [ "## 3.4 - Center sentence " ] }, { "cell_type": "markdown", "id": "dba783d1-b399-4c20-a175-6a1d5f3518af", "metadata": {}, "source": [ "The following method is based upon the center sentence. In this method the sentence definition used is the one according to the ETCBC database, which at places differs from other databases." ] }, { "cell_type": "code", "execution_count": 13, "id": "66b221aa-04d4-42da-9a32-10bb2057ba63", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.10s 15088 results\n" ] } ], "source": [ "# number of sentences in Torah\n", "SentenceQuery = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " sentence \n", "'''\n", "\n", "SentenceResults = BHS.search(SentenceQuery)" ] }, { "cell_type": "markdown", "id": "52d70ff4-441f-43a5-87fe-e498dcb4fde2", "metadata": {}, "source": [ "Determining the interval of sentence node-numbers." 
] }, { "cell_type": "code", "execution_count": 14, "id": "384fc0c6-c270-4a0a-bd8d-cb5c8428ab7e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1172308, 1236024)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.otype.sInterval('sentence')" ] }, { "cell_type": "code", "execution_count": 15, "id": "2d4c1485-48a5-474d-9479-9966c1c13c2d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Exodus', 36, 11)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# start + delta: 1172308 + int(15088/2) = 1172308 + 7544 = 1179852\n", "T.sectionFromNode(1179852)" ] }, { "cell_type": "code", "execution_count": 16, "id": "9b0ea95f-1584-4c26-83b5-2231a906d80f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'וַיַּ֜עַשׂ לֻֽלְאֹ֣ת תְּכֵ֗לֶת עַ֣ל שְׂפַ֤ת הַיְרִיעָה֙ הָֽאֶחָ֔ת מִקָּצָ֖ה בַּמַּחְבָּ֑רֶת '" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T.text(1179852)" ] }, { "cell_type": "code", "execution_count": 17, "id": "2420b17d-bb54-42e4-b57c-72085f5c271d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 7544" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Exodus
book=Exodus
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Exodus
sentence 19
clause Way0 NA
clause Ellp Adju
phrase NP Objc
phrase PP Cmpl
sentence 20
clause WxQ0 NA
phrase CP Conj
phrase VP Pred
clause Ellp Adju
phrase NP Objc
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# 15088 results / 2 = 7544 \n", "BHS.show(SentenceResults,start=7544,end=7544,multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "1d3be2a3-6d32-4efa-a351-5dd2761ade62", "metadata": {}, "source": [ "This sentence in the King James Version:\n", "> and the other five curtains he coupled one unto another.\n", "\n", "Note that in the KJV this is a subsentence." ] }, { "cell_type": "markdown", "id": "2ab19500-9c29-42bc-b185-4f1131541796", "metadata": {}, "source": [ "## 3.5 - Center clause " ] }, { "cell_type": "markdown", "id": "7fe30026-44e9-4b36-a0e0-fc8b7df63479", "metadata": {}, "source": [ "The following method is based upon the center clause. In this method the clause definition used is the one according to the ETCBC database, which may slightly differ in other implementations." ] }, { "cell_type": "code", "execution_count": 18, "id": "a3294875-8052-41d1-8ca9-44f7cbe6cd0b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.11s 21181 results\n" ] } ], "source": [ "# number of clauses in Torah\n", "ClauseQuery = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " clause \n", "'''\n", "\n", "ClauseResults = BHS.search(ClauseQuery)" ] }, { "cell_type": "markdown", "id": "28264f93-2fa7-4d47-853d-28c86cfba2bf", "metadata": {}, "source": [ "Determining the interval of clause node-numbers." 
] }, { "cell_type": "code", "execution_count": 19, "id": "db9f9a08-7e9e-4fea-92f8-bc9baf2220d4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(427559, 515689)" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.otype.sInterval('clause')" ] }, { "cell_type": "code", "execution_count": 20, "id": "92827185-0ce5-495a-bece-a17ac430e081", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Leviticus', 4, 35)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# start + delta: 427559 + int(21181/2) = 427559 + 10590 = 438149\n", "T.sectionFromNode(438149)" ] }, { "cell_type": "code", "execution_count": 21, "id": "e901a175-6c56-4b25-950f-f0eaa71d5df5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'וְאֶת־כָּל־חֶלְבָּ֣ה יָסִ֗יר '" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T.text(438149)" ] }, { "cell_type": "code", "execution_count": 22, "id": "561be90b-a335-4fcd-b9ed-4251e181f9b7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 10590" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Leviticus
book=Leviticus
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Leviticus
sentence 75
clause WQtX NA
phrase CP Conj
phrase VP Pred
phrase NP Subj
sentence 76
clause WQt0 NA
sentence 77
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# 21181 results / 2 = 10590,5 -> midpoint = 10590\n", "BHS.show(ClauseResults,start=10590,end=10590, multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "bdb286ff-5449-4c3f-b5b8-ec8e25d19a04", "metadata": {}, "source": [ "In the King James Version:\n", "> and shall pour out all the blood thereof at the bottom of the altar\n", "\n", "Note that while in the ETCBC BHSA sentences often contain multiple clauses, this clause constitutes a full sentence." ] }, { "cell_type": "markdown", "id": "653d6375-57b3-4863-bd0a-98c4010c698d", "metadata": {}, "source": [ "## 3.6 - Center phrase " ] }, { "cell_type": "markdown", "id": "cc4eb0c7-4ac0-45c3-86a1-03764367f169", "metadata": {}, "source": [ "The following method is based upon the center phrase. In this method the phrase definition used is the one according to the ETCBC database, which follows a more-or-less general understanding of what constitutes a phrase." ] }, { "cell_type": "code", "execution_count": 23, "id": "30182b63-2ede-46dc-8837-e84ba46dccb6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.30s 64195 results\n" ] } ], "source": [ "# number of phrases in Torah\n", "PhraseQuery = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " phrase \n", "'''\n", "\n", "PhraseResults = BHS.search(PhraseQuery)" ] }, { "cell_type": "markdown", "id": "6ba3dd7e-6cc2-45f5-a669-3f5bc9c4b0d6", "metadata": {}, "source": [ "Determining the interval of phrase node-numbers." 
] }, { "cell_type": "code", "execution_count": 24, "id": "91c6c321-3910-4c53-b210-3a0cfddc8f4b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(651573, 904775)" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.otype.sInterval('phrase')" ] }, { "cell_type": "code", "execution_count": 25, "id": "84aac874-62aa-47f8-9be8-b48cf06ed1cb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Leviticus', 4, 32)" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# start + delta: 651573 + int(64195/2) = 651573 + 32097 = 683670\n", "T.sectionFromNode(683670)" ] }, { "cell_type": "code", "execution_count": 26, "id": "81906af7-5a76-424a-8371-31302885101e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'נְקֵבָ֥ה תְמִימָ֖ה '" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T.text(683670)" ] }, { "cell_type": "code", "execution_count": 27, "id": "9c7280c5-4931-409f-8e7b-4fb17bbf22c3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 32098" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Leviticus
book=Leviticus
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Leviticus
sentence 71
clause WxY0 NA
phrase CP Conj
phrase CP Conj
phrase NP Objc
phrase VP Pred
sentence 72
clause xYq0 NA
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# 64195 results /2 = 32097,5 -> midpoint = 32098\n", "BHS.show(PhraseResults,start=32098,end=32098,multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "51d802b5-92cc-4ebe-9ad4-620c7f5500f5", "metadata": {}, "source": [ "In the King James Version:\n", "\n", "> a female without blemish" ] }, { "cell_type": "markdown", "id": "d8098a6c-bc55-4a38-8d92-093d246ba36d", "metadata": {}, "source": [ "## 3.7 - Center word - based upon center word node" ] }, { "cell_type": "markdown", "id": "9257ec4c-1c27-4dab-b6b3-352b8fea19d3", "metadata": {}, "source": [ "This method assumes the mathematical center of the list of word nodes provides us the center of the Torah." ] }, { "cell_type": "code", "execution_count": 28, "id": "45f0cfbf-86e9-47c4-89be-3b31a1b6b85f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.48s 112927 results\n" ] } ], "source": [ "# number of words in Torah (WARNING: as per ETCBC definition!) \n", "WordQuery = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " word \n", "'''\n", "\n", "WordResults = BHS.search(WordQuery)" ] }, { "cell_type": "markdown", "id": "76187fe5-9cde-46ec-b6a5-0d1d33e56b04", "metadata": {}, "source": [ "The following code validates that the word nodes are numbered starting from '1'." 
] }, { "cell_type": "code", "execution_count": 29, "id": "7504d521-55bb-412f-a26b-a761ff33ab74", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1, 426590)" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.otype.sInterval('word')" ] }, { "cell_type": "markdown", "id": "0741e7a3-00e3-45b9-b34b-15da722a8067", "metadata": {}, "source": [ "Find the middle word node:" ] }, { "cell_type": "code", "execution_count": 30, "id": "1c903e0c-946d-438e-92e3-d42d33419d62", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Leviticus', 8, 21)" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# start + delta: 1 + int(112927/2) = 1 + 56463 = 56464\n", "T.sectionFromNode(56464)" ] }, { "cell_type": "code", "execution_count": 31, "id": "8bbe56f8-8640-41ab-99d5-7a755fd9a185", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'בַּ'" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T.text(56464)" ] }, { "cell_type": "code", "execution_count": 32, "id": "9c16893f-9d60-40ba-860c-6560e2e6c9df", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 56464" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Leviticus
book=Leviticus
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Leviticus
sentence 48
clause WxQ0 NA
sentence 49
clause WayX NA
phrase CP Conj
phrase VP Pred
phrase PrNP Subj
sentence 50
clause NmCl NA
phrase NP PreC
phrase PPrP Subj
sentence 51
clause NmCl NA
phrase NP PreC
phrase PPrP Subj
phrase PP Cmpl
clause xQtX Adju
phrase CP Conj
phrase VP Pred
phrase PrNP Subj
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# 112927 results /2 = 56463,5 -> midpoint = 56464\n", "BHS.show(WordResults,start=56464,end=56464,multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "302013d5-2a96-4133-a28b-a687d129914c", "metadata": {}, "source": [ "If this would be 'translated' into a meaningfull 'center' clause, it could be:\n", "> 'wash in the water'. " ] }, { "cell_type": "markdown", "id": "03849162-6e7e-45ee-904b-70e3206c31ba", "metadata": { "tags": [] }, "source": [ "## 3.8 - Center word based on spaces and maqaf" ] }, { "cell_type": "markdown", "id": "aa074722-acfe-42e8-b820-36f817c63111", "metadata": {}, "source": [ "Here the number of words in the Torah is determined by items separeted by spaces OR maqaf (diacritical mark indicating a strong connection between words). \n", "\n", "First check what can be placed after an individual word" ] }, { "cell_type": "code", "execution_count": 33, "id": "cc6a238f-88b0-4fe1-b0a3-9df6bfb91f0d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((' ', 236930),\n", " ('', 121801),\n", " ('&', 42275),\n", " ('00 ', 20146),\n", " ('05 ', 2266),\n", " ('00_S ', 1892),\n", " ('00_P ', 1165),\n", " ('_S ', 76),\n", " (' 05 ', 17),\n", " ('_P ', 13),\n", " ('00_N ', 7),\n", " ('00_N_P ', 1),\n", " ('00_N_S ', 1))" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# note: this is for the full TeNaCH!\n", "F.trailer.freqList()" ] }, { "cell_type": "markdown", "id": "b147fb63-c711-4341-bd1e-7ca061e87008", "metadata": {}, "source": [ "In this list, the ' ' value (i.e. a space) is used when the word is joined to the next word, while '&' indicates a maqqef (־), a diacritical mark indicating a strong connection between words. We consider both as word separators. Examining the frequency list above there are two methods to determine the word boundaries. 
The first uses the fact that all feature values indicating a word boundary have a length of 1 or more, so the regular expression `(.+)` excludes all cases where the trailer is empty. The other option is to look explicitly for spaces and maqqefs, using `[\\s&]` as the regular expression. As expected, both produce the same outcome. The following query determines the number of words in the Torah based on this method of counting." ] }, { "cell_type": "code", "execution_count": 34, "id": "5694148c-f4e0-40a1-bf01-a02fe891ab31", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.68s 79886 results\n" ] } ], "source": [ "# define query template\n", "# The 'r' prefix makes the template a raw string, preventing Python from altering the regex.\n", "\n", "WordQuery2 = r'''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " word trailer~[\\s&]\n", "'''\n", "\n", "WordResults2 = BHS.search(WordQuery2)" ] }, { "cell_type": "markdown", "id": "aaf17e1b-d375-4044-8fad-e15e7dd7ae84", "metadata": {}, "source": [ "Find the midpoint: 79886/2 = 39943. The display cell below shows result 39948." ] }, { "cell_type": "code", "execution_count": 36, "id": "d565efe2-25ae-469f-ab76-7635ded28348", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'תַעֲשׂ֖וּן '" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T.text(39949)" ] }, { "cell_type": "code", "execution_count": 37, "id": "bde4c27a-ea69-4889-9600-c88754950cfa", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 39948" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Leviticus
book=Leviticus
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Leviticus
sentence 33
clause Way0 NA
phrase CP Conj
trailer=
phrase VP Pred
sentence 34
clause WayX NA
phrase CP Conj
trailer=
phrase VP Pred
phrase PrNP Subj
trailer= 
phrase PP Objc
trailer=&
trailer=
trailer= 
sentence 35
clause Way0 NA
phrase CP Conj
trailer=
phrase VP Pred
trailer= 
phrase PP Cmpl
trailer=&
trailer=
phrase AdvP Modi
trailer= 
phrase PP Adju
sentence 36
clause Way0 NA
phrase CP Conj
trailer=
phrase VP Pred
phrase PP Objc
trailer=&
trailer=
sentence 37
clause WxQ0 NA
phrase CP Conj
trailer=
phrase PP Objc
trailer=&
trailer=
trailer= 
phrase VP Pred
trailer= 
phrase PP Cmpl
trailer=&
trailer= 
trailer=
sentence 38
clause Way0 NA
phrase CP Conj
trailer=
phrase VP PreO
clause InfC Adju
phrase VP Pred
trailer=
trailer= 
phrase PP Cmpl
trailer=00 
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "BHS.show(WordResults2,start=39948,end=39948,multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "74fc9706-dfb8-4283-962d-adb59fa37c11", "metadata": { "tags": [] }, "source": [ "Following this method, the center would be: \n", ">and be holy" ] }, { "cell_type": "markdown", "id": "271b118e-639b-4b3c-9890-ea235bc58b02", "metadata": {}, "source": [ "## 3.9 - Center word based upon using feature 'wordboundary'" ] }, { "cell_type": "markdown", "id": "214d9347-6a57-4f28-a471-1ff9c1d50e5e", "metadata": {}, "source": [ "In this section we will use some of the additonal features made available by the [BHSaddons](https://github.com/tonyjurg/BHSaddons/) dataset." ] }, { "cell_type": "code", "execution_count": 61, "id": "01f6db11-9436-42c4-b739-717de8faa4a0", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/etcbc/BHSA/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/BHSA/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "rate limit is 5000 requests per hour, with 4943 left for this hour\n", "\tconnecting to online GitHub repo tonyjurg/BHSaddons ... 
connected\n" ] }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/tonyjurg/BHSaddons/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/phono/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "The requested data is not available offline\n", "\t~/text-fabric-data/github/etcbc/parallels/tf/2021 not found\n" ] }, { "data": { "text/html": [ "Status: latest release online v2.1 versus None locally" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "downloading app, main data and requested additions ..." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "File is not a zip file\n", "\tcould not save corpus data to ~/text-fabric-data/github " ] }, { "name": "stdout", "output_type": "stream", "text": [ "rate limit is 5000 requests per hour, with 4940 left for this hour\n", "\tconnecting to online GitHub repo etcbc/parallels ... connected\n", "\tdownloading from https:/github.com/ETCBC/parallels/releases/download/v2.1/tf-2021.zip ... \n", "\tsaving data\n" ] }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/etcbc/parallels/tf/2021" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " | 0.11s T crossref from ~/text-fabric-data/github/etcbc/parallels/tf/2021\n" ] }, { "data": { "text/html": [ "\n", " TF: TF API 12.6.2, etcbc/BHSA/app v3, Search Reference
\n", " Data: etcbc - BHSA 2021, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name# of nodes# slots / node% coverage
book3910938.21100
chapter929459.19100
lex923046.22100
verse2321318.38100
half_verse451799.44100
sentence637176.70100
sentence_atom645146.61100
clause881314.84100
clause_atom907044.70100
phrase2532031.68100
phrase_atom2675321.59100
subphrase1138501.4238
word4265901.00100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Parallel Passages\n", "
\n", "\n", "
\n", "
\n", "crossref\n", "
\n", "
int
\n", "\n", " 🆗 links between similar passages\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "\n", " ✅ book name in Latin (Genesis; Numeri; Reges1; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "book@ll\n", "
\n", "
str
\n", "\n", " ✅ book name in amharic (ኣማርኛ)\n", "\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "\n", " ✅ chapter number (1; 2; 3; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "code\n", "
\n", "
int
\n", "\n", " ✅ identifier of a clause atom relationship (0; 74; 367; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "det\n", "
\n", "
str
\n", "\n", " ✅ determinedness of phrase(atom) (det; und; NA.)\n", "\n", "
\n", "\n", "
\n", "
\n", "domain\n", "
\n", "
str
\n", "\n", " ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)\n", "\n", "
\n", "\n", "
\n", "
\n", "freq_lex\n", "
\n", "
int
\n", "\n", " ✅ frequency of lexemes\n", "\n", "
\n", "\n", "
\n", "
\n", "function\n", "
\n", "
str
\n", "\n", " ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons\n", "
\n", "
str
\n", "\n", " ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_cons_utf8\n", "
\n", "
str
\n", "\n", " ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex\n", "
\n", "
str
\n", "\n", " ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word\n", "
\n", "
str
\n", "\n", " ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)\n", "\n", "
\n", "\n", "
\n", "
\n", "g_word_utf8\n", "
\n", "
str
\n", "\n", " ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)\n", "\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "\n", " 🆗 english translation of lexeme (beginning create god(s))\n", "\n", "
\n", "\n", "
\n", "
\n", "gn\n", "
\n", "
str
\n", "\n", " ✅ grammatical gender (m; f; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "label\n", "
\n", "
str
\n", "\n", " ✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)\n", "\n", "
\n", "\n", "
\n", "
\n", "language\n", "
\n", "
str
\n", "\n", " ✅ of word or lexeme (Hebrew; Aramaic.)\n", "\n", "
\n", "\n", "
\n", "
\n", "lex\n", "
\n", "
str
\n", "\n", " ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)\n", "\n", "
\n", "\n", "
\n", "
\n", "lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)\n", "\n", "
\n", "\n", "
\n", "
\n", "ls\n", "
\n", "
str
\n", "\n", " ✅ lexical set, subclassification of part-of-speech (card; ques; mult)\n", "\n", "
\n", "\n", "
\n", "
\n", "nametype\n", "
\n", "
str
\n", "\n", " ⚠️ named entity type (pers; mens; gens; topo; ppde.)\n", "\n", "
\n", "\n", "
\n", "
\n", "nme\n", "
\n", "
str
\n", "\n", " ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "nu\n", "
\n", "
str
\n", "\n", " ✅ grammatical number (sg; du; pl; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
int
\n", "\n", " ✅ sequence number of an object within its context\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "pargr\n", "
\n", "
str
\n", "\n", " 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "pdp\n", "
\n", "
str
\n", "\n", " ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "pfm\n", "
\n", "
str
\n", "\n", " ✅ preformative consonantal-transliterated (absent; n/a; J, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_gn\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix gender (m; f; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_nu\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix number (sg; du; pl; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "prs_ps\n", "
\n", "
str
\n", "\n", " ✅ pronominal suffix person (p1; p2; p3; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "ps\n", "
\n", "
str
\n", "\n", " ✅ grammatical person (p1; p2; p3; NA; unknown.)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere\n", "
\n", "
str
\n", "\n", " ✅ word pointed-transliterated masoretic reading correction\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer\n", "
\n", "
str
\n", "\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_trailer_utf8\n", "
\n", "
str
\n", "\n", " ✅ interword material -pointed-transliterated (Masoretic correction)\n", "\n", "
\n", "\n", "
\n", "
\n", "qere_utf8\n", "
\n", "
str
\n", "\n", " ✅ word pointed-Hebrew masoretic reading correction\n", "\n", "
\n", "\n", "
\n", "
\n", "rank_lex\n", "
\n", "
int
\n", "\n", " ✅ ranking of lexemes based on freqnuecy\n", "\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "\n", " ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "\n", " ✅ part-of-speech (art; verb; subs; nmpr, ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "st\n", "
\n", "
str
\n", "\n", " ✅ state of a noun (a (absolute); c (construct); e (emphatic).)\n", "\n", "
\n", "\n", "
\n", "
\n", "tab\n", "
\n", "
int
\n", "\n", " ✅ clause atom: its level in the linguistic embedding\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "\n", " ✅ interword material pointed-transliterated (& 00 05 00_P ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer_utf8\n", "
\n", "
str
\n", "\n", " ✅ interword material pointed-Hebrew (־ ׃)\n", "\n", "
\n", "\n", "
\n", "
\n", "txt\n", "
\n", "
str
\n", "\n", " ✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)\n", "\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "\n", " ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)\n", "\n", "
\n", "\n", "
\n", "
\n", "uvf\n", "
\n", "
str
\n", "\n", " ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "vbe\n", "
\n", "
str
\n", "\n", " ✅ verbal ending consonantal-transliterated (n/a; W; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "vbs\n", "
\n", "
str
\n", "\n", " ✅ root formation consonantal-transliterated (absent; n/a; H; ...)\n", "\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "\n", " ✅ verse number\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex\n", "
\n", "
str
\n", "\n", " ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)\n", "\n", "
\n", "\n", "
\n", "
\n", "voc_lex_utf8\n", "
\n", "
str
\n", "\n", " ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)\n", "\n", "
\n", "\n", "
\n", "
\n", "vs\n", "
\n", "
str
\n", "\n", " ✅ verbal stem (qal; piel; hif; apel; pael)\n", "\n", "
\n", "\n", "
\n", "
\n", "vt\n", "
\n", "
str
\n", "\n", " ✅ verbal tense (perf; impv; wayq; infc)\n", "\n", "
\n", "\n", "
\n", "
\n", "mother\n", "
\n", "
none
\n", "\n", " ✅ linguistic dependency between textual objects\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
Phonetic Transcriptions\n", "
\n", "\n", "
\n", "
\n", "phono\n", "
\n", "
str
\n", "\n", " 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)\n", "\n", "
\n", "\n", "
\n", "
\n", "phono_trailer\n", "
\n", "
str
\n", "\n", " 🆗 interword material in phonological transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", "
tonyjurg/BHSaddons/tf\n", "
\n", "\n", "
\n", "
\n", "aliyotnum\n", "
\n", "
str
\n", "\n", " The sequence number of the aliyot within the parasha\n", "\n", "
\n", "\n", "
\n", "
\n", "maftir\n", "
\n", "
str
\n", "\n", " Set to 1 if this verse is part of a maftir\n", "\n", "
\n", "\n", "
\n", "
\n", "parashahebr\n", "
\n", "
str
\n", "\n", " The name of the parasha in Hebrew\n", "\n", "
\n", "\n", "
\n", "
\n", "parashanum\n", "
\n", "
int
\n", "\n", " The sequence number of the parasha\n", "\n", "
\n", "\n", "
\n", "
\n", "parashatrans\n", "
\n", "
str
\n", "\n", " Transliteration of the Hebrew parasha name\n", "\n", "
\n", "\n", "
\n", "
\n", "parashaverse\n", "
\n", "
str
\n", "\n", " The sequence number of the verse within the parasha\n", "\n", "
\n", "\n", "
\n", "
\n", "wordboundary\n", "
\n", "
str
\n", "\n", " indicates wordboudaries (spaces OR maqaf)\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: etcbc/BHSA
  3. appPath: C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/app
  4. commit: gd905e3fb6e80d0fa537600337614adc2af157309
  5. css: ''
  6. dataDisplay:
    • exampleSectionHtml:<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
    • excludedFeatures:
      • g_uvf_utf8
      • g_vbs
      • kq_hybrid
      • languageISO
      • g_nme
      • lex0
      • is_root
      • g_vbs_utf8
      • g_uvf
      • dist
      • root
      • suffix_person
      • g_vbe
      • dist_unit
      • suffix_number
      • distributional_parent
      • kq_hybrid_utf8
      • crossrefSET
      • instruction
      • g_prs
      • lexeme_count
      • rank_occ
      • g_pfm_utf8
      • freq_occ
      • crossrefLCS
      • functional_parent
      • g_pfm
      • g_nme_utf8
      • g_vbe_utf8
      • kind
      • g_prs_utf8
      • suffix_gender
      • mother_object_type
    • noneValues:
      • none
      • unknown
      • no value
      • NA
  7. docs:
    • docBase: {docRoot}/{repo}
    • docExt: ''
    • docPage: ''
    • docRoot: https://{org}.github.io
    • featurePage: 0_home
  8. interfaceDefaults: {}
  9. isCompatible: True
  10. local: local
  11. localDir: C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/_temp
  12. provenanceSpec:
    • corpus: BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
    • doi: 10.5281/zenodo.1007624
    • moduleSpecs:
      • :
        • backend: no value
        • corpus: Phonetic Transcriptions
        • docUrl:https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
        • doi: 10.5281/zenodo.1007636
        • org: etcbc
        • relative: /tf
        • repo: phono
      • :
        • backend: no value
        • corpus: Parallel Passages
        • docUrl:https://nbviewer.jupyter.org/github/etcbc/parallels/blob/master/programs/parallels.ipynb
        • doi: 10.5281/zenodo.1007642
        • org: etcbc
        • relative: /tf
        • repo: parallels
    • org: etcbc
    • relative: /tf
    • repo: BHSA
    • version: 2021
    • webBase: https://shebanq.ancient-data.org/hebrew
    • webHint: Show this on SHEBANQ
    • webLang: la
    • webLexId: True
    • webUrl:{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
    • webUrlLex: {webBase}/word?version={version}&id=<lid>
  13. release: v1.8
  14. typeDisplay:
    • clause:
      • label: {typ} {rela}
      • style: ''
    • clause_atom:
      • hidden: True
      • label: {code}
      • level: 1
      • style: ''
    • half_verse:
      • hidden: True
      • label: {label}
      • style: ''
      • verselike: True
    • lex:
      • featuresBare: gloss
      • label: {voc_lex_utf8}
      • lexOcc: word
      • style: orig
      • template: {voc_lex_utf8}
    • phrase:
      • label: {typ} {function}
      • style: ''
    • phrase_atom:
      • hidden: True
      • label: {typ} {rela}
      • level: 1
      • style: ''
    • sentence:
      • label: {number}
      • style: ''
    • sentence_atom:
      • hidden: True
      • label: {number}
      • level: 1
      • style: ''
    • subphrase:
      • hidden: True
      • label: {number}
      • style: ''
    • word:
      • features: pdp vs vt
      • featuresBare: lex:gloss
  15. writing: hbo
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# load the app and data with additial features (removed the hoist here)\n", "BHSAadd = use (\"etcbc/BHSA\", mod=\"tonyjurg/BHSaddons/tf/:hot\")" ] }, { "cell_type": "code", "execution_count": 62, "id": "eb3313c8-dbbf-4b75-bf3b-ec4977a5e8ba", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.41s 79886 results\n" ] } ], "source": [ "# find all 'end-of-word' word nodes within any parasha\n", "wordboundaryQuery = '''\n", "verse parashanum\n", " word wordboundary=1\n", "'''\n", "wordboundaryResult = BHSAadd.search(wordboundaryQuery)" ] }, { "cell_type": "code", "execution_count": 63, "id": "d4a65412-12e0-4b6e-95b6-7ad2b3d35744", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.44s 112927 results\n" ] } ], "source": [ "# find all word nodes within any parasha\n", "wordboundaryQuery = '''\n", "verse parashanum\n", " word \n", "'''\n", "wordboundaryResult = BHSAadd.search(wordboundaryQuery)" ] }, { "cell_type": "markdown", "id": "742d9ecb-e393-4a12-9318-848616d9736f", "metadata": {}, "source": [ "As can be seen from these queries, the result is (as expected) the same as for the previous section (3.8)." ] }, { "cell_type": "markdown", "id": "245c727b-d068-4fc2-8c84-540b19522aeb", "metadata": {}, "source": [ "## 3.10 - Center word based upon spaces" ] }, { "cell_type": "markdown", "id": "9ca66313-c42a-4fef-9856-96505bd5c9b4", "metadata": {}, "source": [ "In the following method words are defined as items separeted by spaces. 
" ] }, { "cell_type": "code", "execution_count": 38, "id": "8d421dff-ba72-42ea-b122-3249940118eb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.60s 68434 results\n" ] } ], "source": [ "# following regexp selects for values of feature trailer that are 1 or more characters in length {alternative regex: (.+) }\n", "\n", "wordQuery3 = r'''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " word trailer~\\ $\n", "'''\n", "\n", "wordResults3 = BHS.search(wordQuery3)" ] }, { "cell_type": "code", "execution_count": 39, "id": "aaa0f50c-67e4-4435-b8d9-d65ef40c6f73", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.34s 11452 results\n" ] } ], "source": [ "# Just to check: query for maqafs\n", "\n", "maqafQuery = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", "\n", " word trailer=&\n", "'''\n", "\n", "maqafResults = BHS.search(maqafQuery)" ] }, { "cell_type": "markdown", "id": "7ae7a4bf-e921-4f42-9b25-b07d941df1e0", "metadata": {}, "source": [ "Check if the numbers do add up: 68434 (spaces) + 11452 (maqafs) =? 79886 (total) YES!" 
] }, { "cell_type": "markdown", "id": "028840b1-eae8-4305-aa94-ee7eda51e38d", "metadata": {}, "source": [ "Find the midpoint in wordResults3: 68434/2 = 34217 and print its tuple:" ] }, { "cell_type": "code", "execution_count": 40, "id": "647013d9-bdee-4ce0-bb05-e4f2486e94a6", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "(426593, 56509)" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wordResults3[34216]" ] }, { "cell_type": "markdown", "id": "0182be1c-158c-4f3c-9d1f-dc450bb1d0cd", "metadata": {}, "source": [ "Print associated text (we need second element in tuple):" ] }, { "cell_type": "code", "execution_count": 41, "id": "bd8b1eda-2e2d-45d1-af91-eff4d40c4a24", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'רֹ֥אשׁ '" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T.text(wordResults3[34216][1])" ] }, { "cell_type": "markdown", "id": "539a6e66-bbb3-49d4-a17b-9c2c80e5dffd", "metadata": {}, "source": [ "Displaying the syntax tree of the relevant verse:" ] }, { "cell_type": "code", "execution_count": 43, "id": "68ced52d-edc9-4c1b-8956-ad83dc25ba94", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

result 34217" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

book Leviticus
book=Leviticus
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
verse
book=Leviticus
sentence 52
clause Way0 NA
phrase CP Conj
trailer=
phrase VP Pred
phrase PP Objc
trailer=&
trailer=
trailer= 
trailer=
trailer= 
trailer=
sentence 53
clause WayX NA
phrase CP Conj
trailer=
phrase VP Pred
phrase PrNP Subj
trailer=
trailer= 
phrase PP Objc
trailer=&
phrase PP Cmpl
trailer=&
trailer= 
trailer=
trailer=00 
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "BHS.show(wordResults3,start=34217,end=34217,multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "b0799e5e-d4e2-4f9a-be67-3c651f6f2573", "metadata": {}, "source": [ "Following this method, the center would be: \n", "> (on the) head of the ram" ] }, { "cell_type": "markdown", "id": "a4092288-38c6-4f24-b9e9-86362971b3ff", "metadata": {}, "source": [ "## 3.11 - Center word based upon selected part of speech" ] }, { "cell_type": "markdown", "id": "5c1f2176-a872-4f5e-9c7d-5fac3a3262dd", "metadata": {}, "source": [ "The following method is intended to exclude items like the Nota Accusativus / object marker (את) where they have a purely gramatical function only. " ] }, { "cell_type": "code", "execution_count": null, "id": "5081579a-e727-4ccd-8c4c-bb548ec3148c", "metadata": {}, "outputs": [], "source": [ "wordQuery4 = '''\n", "book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n", " word sp=adjv|advb|art|conj|intj|inrg|nega|nmpr|prep|prde|prin|prps|subs|verb\n", "'''\n", "\n", "wordResults4 = BHS.search(wordQuery4)" ] }, { "cell_type": "markdown", "id": "ea94e760-3e76-4954-bac3-3f02536db2e8", "metadata": {}, "source": [ "midpoint: int(112927/2)=56463" ] }, { "cell_type": "code", "execution_count": null, "id": "14f69e17-0602-4434-9131-91e0ca84cbe2", "metadata": {}, "outputs": [], "source": [ "BHS.show(wordResults4,start=56463,end=56463,multiFeatures=False)" ] }, { "cell_type": "markdown", "id": "9453d90a-b558-400b-a931-e0014bdcfceb", "metadata": {}, "source": [ "Following this method, the center would be: \n", ">he washed in the water" ] }, { "cell_type": "markdown", "id": "6f1dd61f-f539-4f47-8ade-36554da40ac8", "metadata": {}, "source": [ "## 3.12 - Other opinion - Stone Tenach" ] }, { "cell_type": "markdown", "id": "de0dab7c-f6d3-4fe3-a5b8-b01e41e512d7", "metadata": { "tags": [] }, "source": [ "According to the 'Stone Tanach':1\n", ">[Lev] 10:16 דָּרֹ֥שׁ דָּרַ֛שׁ - 
*inquired insistently* \\[lit. *inquire he inquired*\\]. This is the exact halfway mark of the word of the Torah. This teaches us that one must always *inquire;* one must never stop seeking an ever deeper and broader understanding of the Torah (*Degel Machaneh Ephraim*). " ] }, { "cell_type": "markdown", "id": "a2a735c7-0f66-4167-87d9-0c1d2c21a760", "metadata": {}, "source": [ "# 4 - Attribution and footnotes\n", "##### [Back to ToC](#TOC)\n", "\n", "#### Footnotes:\n", "\n", "<sup>1</sup>Rabbi Nosson Scherman (ed.), *The Stone Edition Tanach*, Hebrew and English Edition (Brooklyn NY: Mesorah Publications Ltd, 1996), 266." ] }, { "cell_type": "markdown", "id": "5004cc5a-f4fb-4cdc-876b-22b4f6b8b145", "metadata": { "tags": [] }, "source": [ "# 5 - Required libraries\n", "##### [Back to ToC](#TOC)\n", "\n", "The scripts in this notebook require (besides `text-fabric`) the following Python libraries to be installed in the environment:\n", "\n", " {none}\n", "\n", "You can install any missing library from within Jupyter Notebook using either `pip` or `pip3`." ] }, { "cell_type": "markdown", "id": "b4b81ee0-f72c-46ae-9ee2-e98584588b06", "metadata": {}, "source": [ "# 6 - Notebook details\n", "##### [Back to ToC](#TOC)\n", "\n", "
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AuthorTony Jurg
Version1.1
Date14 Novermber 2024
\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 5 }