| Name | # of nodes | # slots / node | % coverage |
|---|---|---|---|
| book | 39 | 10938.21 | 100 |
| chapter | 929 | 459.19 | 100 |
| lex | 9230 | 46.22 | 100 |
| verse | 23213 | 18.38 | 100 |
| half_verse | 45179 | 9.44 | 100 |
| sentence | 63717 | 6.70 | 100 |
| sentence_atom | 64514 | 6.61 | 100 |
| clause | 88131 | 4.84 | 100 |
| clause_atom | 90704 | 4.70 | 100 |
| phrase | 253203 | 1.68 | 100 |
| phrase_atom | 267532 | 1.59 | 100 |
| subphrase | 113850 | 1.42 | 38 |
| word | 426590 | 1.00 | 100 |
Data generated by `hapax.ipynb` at `github.com/tonyjurg/Parashot`
'" ] }, { "cell_type": "markdown", "id": "53742352-a904-43b6-8ebe-a254dfba3be2", "metadata": {}, "source": [ "The following cell defines lookup tables that expand the abbreviated feature values into readable descriptions for the table, **as annotated in the BHSA dataset**. See also the caution on feature [nametype](https://github.com/ETCBC/bhsa/blob/master/docs/features/nametype.md): \n", "> It is unclear how completely and correctly this feature has been assigned." ] }, { "cell_type": "code", "execution_count": 6, "id": "47a7f4d7-8eae-4af5-8988-058534ec34b1", "metadata": {}, "outputs": [], "source": [ "# part of speech expansion table \n", "# https://github.com/ETCBC/bhsa/blob/master/docs/features/sp.md\n", "posMapping = {\n", " # \"abbreviation\" (key) : \"description\"\n", " 'art':\t'article',\n", " 'verb':\t'verb',\n", " 'subs':\t'noun',\n", " 'nmpr':\t'proper noun',\n", " 'advb':\t'adverb',\n", " 'prep':\t'preposition',\n", " 'conj':\t'conjunction',\n", " 'prps':\t'personal pronoun',\n", " 'prde':\t'demonstrative pronoun',\n", " 'prin':\t'interrogative pronoun',\n", " 'intj':\t'interjection',\n", " 'nega':\t'negative particle',\n", " 'inrg':\t'interrogative particle',\n", " 'adjv':\t'adjective'\n", "}\n", "\n", "# Subclassification of part of speech (feature ls on word and lex nodes)\n", "# https://github.com/ETCBC/bhsa/blob/master/docs/features/ls.md\n", "subclassMapping = {\n", " # \"abbreviation\" (key) : \"description\"\n", " 'nmdi':\t'distributive noun',\n", " 'nmcp':\t'copulative noun',\n", " 'padv':\t'potential adverb',\n", " 'afad':\t'anaphoric adverb',\n", " 'ppre':\t'potential preposition',\n", " 'cjad':\t'conjunctive adverb',\n", " 'ordn':\t'ordinal',\n", " 'vbcp':\t'copulative verb',\n", " 'mult':\t'noun of multitude',\n", " 'focp':\t'focus particle',\n", " 'ques':\t'interrogative particle',\n", " 'gntl':\t'gentilic',\n", " 'quot':\t'quotation verb',\n", " 'card':\t'cardinal',\n", " 'none': ''\n", "}\n", "\n", "# expand information in feature nametype (a comma separated 
list)\n", "# https://github.com/ETCBC/bhsa/blob/master/docs/features/nametype.md\n", "nametypeExpantions = {\n", " 'pers':\t'person',\n", " 'mens':\t'measurement unit',\n", " 'gens':\t'people',\n", " 'topo':\t'place',\n", " 'ppde':\t'demonstrative personal pronoun'\n", "}\n", "def expandNametype(inputText):\n", " outputText = inputText\n", " if inputText is not None:\n", " for old, new in nametypeExpantions.items():\n", " outputText = outputText.replace(old, new)\n", " return outputText" ] }, { "cell_type": "markdown", "id": "5e1cfcb4-f66d-43d3-98a8-9f0afb033d3a", "metadata": {}, "source": [ "The following cell performs the actual gathering of the hapax legomena:" ] }, { "cell_type": "code", "execution_count": 7, "id": "53bffc02-51ba-477b-b97f-2b70e3b8ec66", "metadata": {}, "outputs": [ { "data": { "text/html": [ "| Verse | Hebrew Word | English Gloss | Part of Speech | Subclass | Name Type |
|---|---|---|---|---|---|
| Exodus 6:9 | קֹּ֣צֶר | shortness | noun | | |
| Exodus 6:22 | סִתְרִֽי | Sithri | proper noun | | person |
| Exodus 6:23 | אֱלִישֶׁ֧בַע | Elisheba | proper noun | | person |
| Exodus 6:24 | אֲבִיאָסָ֑ף | Abiasaph | proper noun | | person |
| Exodus 6:25 | פּֽוּטִיאֵל֙ | Putiel | proper noun | | person |
| Exodus 7:11 | לַהֲטֵיהֶ֖ם | enchantments | noun | | |
| Exodus 9:31 | גִּבְעֹֽל | flower-bud | noun | | |
| Exodus 9:32 | אֲפִילֹ֖ת | late | adjective | | |
8 hapaxes found.
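
The Name Type values in the table above (for example `person`) come from expanding the abbreviated codes stored in the `nametype` feature, which holds a comma-separated list. The following is a minimal stand-alone sketch of that expansion, not the notebook's own code. It assumes a token-based lookup (split on commas, map each code) rather than the substring replacement used in `expandNametype`, and it returns an empty string for a missing value; the names `NAMETYPE_LABELS` and `expand_nametype` are hypothetical.

```python
# Hypothetical stand-alone sketch; the code-to-label pairs mirror the
# nametype mapping defined earlier in this notebook.
NAMETYPE_LABELS = {
    "pers": "person",
    "mens": "measurement unit",
    "gens": "people",
    "topo": "place",
    "ppde": "demonstrative personal pronoun",
}


def expand_nametype(raw):
    """Expand a comma-separated nametype value, one code at a time."""
    if raw is None:
        # Assumption: a missing feature value is shown as an empty cell.
        return ""
    codes = [code.strip() for code in raw.split(",")]
    # Unknown codes are passed through unchanged rather than dropped.
    return ", ".join(NAMETYPE_LABELS.get(code, code) for code in codes)


print(expand_nametype("pers"))
print(expand_nametype("pers,topo"))
```

Splitting on commas before the lookup means a code is only expanded when it matches a whole list entry, so a label that happens to contain another code as a substring cannot be expanded twice.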
" ], "text/plain": [ "| Verse | Hebrew Word | English Gloss | Part of Speech | Subclass | Name Type |
|---|---|---|---|---|---|
| {linkSTEPbible} | \"\n", " f\"{wordLink} | \"\n", " f\"{escapeMarkdown(F.gloss.v(node))} | \"\n", " f\"{escapeMarkdown(posMapping.get(F.sp.v(node), ''))} | \"\n", " f\"{escapeMarkdown(subclassMapping.get(F.ls.v(node), ''))} | \"\n", " f\"{escapeMarkdown(expandNametype(F.nametype.v(node)))} | \"\n", " f\"
{numberOfHapax} hapaxes found.
\"\n", "\n", "# Save the content to an HTML file\n", "fileName = f\"hapax_legomena({parashaNameEnglish.replace(' ','%20')}).html\"\n", "with open(fileName, \"w\", encoding=\"utf-8\") as file:\n", " file.write(htmlContent)\n", "\n", "# Display the HTML content in the notebook\n", "display(HTML(htmlContent))\n", "\n", "# wrap html header and footer and display a download button\n", "htmlContentFull = f'{htmlStart}{htmlContent}{htmlFooter}'\n", "downloadButton = f\"\"\"\n", "', '>').replace('\"', '"').replace(\"'\", ''')}\" target=\"_blank\">\n", " \n", "\n", "\"\"\"\n", "display(HTML(downloadButton))" ] }, { "cell_type": "markdown", "id": "93852912-fa5c-420a-88ed-8be3b090eb3a", "metadata": { "tags": [] }, "source": [ "# 4 - Required libraries \n", "##### [Back to ToC](#TOC)\n", "\n", "Besides `text-fabric`, the scripts in this notebook require the following Python library to be installed in the environment:\n", "\n", " IPython\n", "\n", "You can install any missing library from within Jupyter Notebook using either `pip` or `pip3`." ] }, { "cell_type": "markdown", "id": "bc5b6c04-4855-4d2d-aa9a-a1dac0256074", "metadata": {}, "source": [ "# 5 - Further reading \n", "##### [Back to ToC](#TOC)\n", "\n", "A discussion of hapax legomena, including details about ten hapaxes in the Hebrew Bible, can be found at [TheTorah.com](https://www.thetorah.com/article/hapax-legomena-ten-biblical-examples)." ] }, { "cell_type": "markdown", "id": "68573424-b71f-4596-95e7-468cf9ef9c1e", "metadata": {}, "source": [ "# 6 - Notebook version details\n", "##### [Back to ToC](#TOC)\n", "\n", "| Author | Tony Jurg |
| Version | 1.2 |
| Date | 5 March 2025 |