"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)\n",
"BHS.dh(BHS.getCss())"
]
},
{
"cell_type": "markdown",
"id": "947d7fd8-20c9-4b82-8b37-45ca74d18ef4",
"metadata": {},
"source": [
"# 3 - Performing the queries \n",
"##### [Back to ToC](#TOC)"
]
},
{
"cell_type": "markdown",
"id": "1ac06c2b-91bc-47a9-9ac8-1c574ce9473e",
"metadata": {},
"source": [
"An important feature used in the queries will be 'number'. This starts with 1. The manner of numbering objects differs per object type. The following are of interest for this research:\n",
"\n",
"type | numbering\n",
"--- | ---\n",
"phrase_atom | within the book\n",
"clause_atom | within the book\n",
"sentence_atom | within the book\n",
"word | within the book\n",
"\n",
"Note: Full Text-Fabric feature documentation is found [here](https://github.com/ETCBC/bhsa/blob/master/docs/features/number.md)"
]
},
{
"cell_type": "markdown",
"id": "b85b2729-5f44-47fd-bfd2-eb1d176f82ff",
"metadata": {},
"source": [
"**Important observation:** The BHSA inserts nodes for implicit articles (which are only visable in the vocalisation). See example below:\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "57699b9b-8c44-464f-a264-380ddbcf8c1e",
"metadata": {},
"source": [
"## 3.1 - Center book\n",
"##### [Back to TOC](#TOC)"
]
},
{
"cell_type": "markdown",
"id": "eb211e70-8eca-4629-91cc-38018991fb65",
"metadata": {},
"source": [
"Rather trivially, Leviticus constitutes the center of the five books of the Torah."
]
},
{
"cell_type": "markdown",
"id": "30adfe82-5d55-446b-9472-c0089c6b5541",
"metadata": {
"tags": []
},
"source": [
"## 3.2 - Center chapter \n",
"##### [Back to TOC](#TOC)"
]
},
{
"cell_type": "markdown",
"id": "08064c7e-5d49-4a54-ab33-a8511bd2670c",
"metadata": {},
"source": [
"The following method is based upon the center chapter."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "d12239fb-ad50-4dd0-b349-2302abc96be3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.05s 187 results\n"
]
}
],
"source": [
"# number of chapters in Torah\n",
"ChapterQuery = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" chapter \n",
"'''\n",
"\n",
"ChapterResults = BHS.search(ChapterQuery)"
]
},
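{
"cell_type": "markdown",
"id": "torah-query-sketch",
"metadata": {},
"source": [
"The queries in this section all follow the same two-line template and differ only in the object type. As a minimal sketch (not part of the original analysis), the template could be generated by a small helper. The names `TORAH_BOOKS` and `torah_query` are hypothetical and introduced here for illustration only.\n",
"\n",
"```python\n",
"# Latin book names as used in the 'book' feature by the queries of this notebook.\n",
"TORAH_BOOKS = ['Genesis', 'Exodus', 'Leviticus', 'Numeri', 'Deuteronomium']\n",
"\n",
"\n",
"def torah_query(otype, condition=''):\n",
"    # Hypothetical helper: build a template with a book line and an indented object line.\n",
"    book_line = 'book book=' + '|'.join(TORAH_BOOKS)\n",
"    object_line = ' ' + (otype + ' ' + condition).strip()\n",
"    return '\\n' + book_line + '\\n' + object_line + '\\n'\n",
"\n",
"\n",
"print(torah_query('chapter'))\n",
"```\n",
"\n",
"The resulting string can be passed to `BHS.search` in the same way as the hand-written templates."
]
},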
{
"cell_type": "code",
"execution_count": 6,
"id": "4b612d9f-9a35-439a-a604-28025472262e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(426630, 427558)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"F.otype.sInterval('chapter')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "cbd00395-9dc8-42e0-a757-8d0fd94eb6da",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('Leviticus', 4)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# start + delta: 426630 + int(187/2) = 426630 + 93 = 426723\n",
"T.sectionFromNode(426723)"
]
},
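{
"cell_type": "markdown",
"id": "center-node-sketch",
"metadata": {},
"source": [
"The midpoint arithmetic written out in the code comments of this section can be sketched as a small helper. This is an illustration only, not part of the original analysis, and the function names are hypothetical. It assumes that the nodes of one type within the Torah form a contiguous range starting at the first node of that type (the Torah being the first five books), and that positions in a result list are counted from 1.\n",
"\n",
"```python\n",
"def center_offset(count):\n",
"    # Zero-based offset of the middle item among 'count' items.\n",
"    # For an even count this is the second of the two middle items.\n",
"    return count // 2\n",
"\n",
"\n",
"def center_node(first_node, count):\n",
"    # Node number of the middle item, given the first node of the range.\n",
"    return first_node + center_offset(count)\n",
"\n",
"\n",
"def center_result_index(count):\n",
"    # One-based position of that same middle item in a result list.\n",
"    return center_offset(count) + 1\n",
"\n",
"\n",
"# Values taken from this notebook: 187 chapters, first chapter node 426630.\n",
"print(center_node(426630, 187))    # 426723\n",
"print(center_result_index(187))    # 94\n",
"```\n",
"\n",
"Note that a zero-based offset `k` from the first node corresponds to position `k + 1` in a one-based result list."
]
},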
{
"cell_type": "markdown",
"id": "744154c9-69a4-44ae-994f-905936faf2f1",
"metadata": {},
"source": [
"## 3.3 - Center verse "
]
},
{
"cell_type": "markdown",
"id": "7c2e8831-a1e3-4531-a53c-f2b671b1238f",
"metadata": {},
"source": [
"This method is based upon the middle verse in the Torah."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e361025d-7166-4e20-afbf-0189fabc2d0f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.05s 5853 results\n"
]
}
],
"source": [
"# number of verses in Torah\n",
"VerseQuery = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" verse \n",
"'''\n",
"\n",
"VerseResults = BHS.search(VerseQuery)"
]
},
{
"cell_type": "markdown",
"id": "e9993cb4-13c8-441e-8376-949852048b12",
"metadata": {},
"source": [
"Determine boundaries of the verse node-numbers."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d4d84db1-8411-4011-9fdf-b4d85a406960",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1414389, 1437601)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"F.otype.sInterval('verse')"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "dff076df-be15-4ba0-b0a6-4ade13c9bef3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('Leviticus', 8, 9)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# start + delta: 1414389 + int(5853/2) = 1414389 + 2926 = 1417315\n",
"T.sectionFromNode(1417315)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "13888b1c-20c5-4107-9fa1-943d614929a4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'וַיָּ֥שֶׂם אֶת־הַמִּצְנֶ֖פֶת עַל־רֹאשֹׁ֑ו וַיָּ֨שֶׂם עַֽל־הַמִּצְנֶ֜פֶת אֶל־מ֣וּל פָּנָ֗יו אֵ֣ת צִ֤יץ הַזָּהָב֙ נֵ֣זֶר הַקֹּ֔דֶשׁ כַּאֲשֶׁ֛ר צִוָּ֥ה יְהוָ֖ה אֶת־מֹשֶֽׁה׃ '"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.text(1417315)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "84a4378c-60c3-418f-b8f4-3f5adff650e7",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 2926"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"book Leviticus
book=Leviticus
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"BHS.show(VerseResults,start=2926,end=2926, multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "0caff32b-2ab5-4976-ae32-f9145df01b25",
"metadata": {},
"source": [
"This verse in the King James Version:\n",
"> And he put the mitre upon his head; also upon the mitre, even upon his forefront, did he put the golden plate, the holy crown; as the Lord commanded Moses."
]
},
{
"cell_type": "markdown",
"id": "935c6015-0b94-479e-890c-2e49eb4db512",
"metadata": {},
"source": [
"## 3.4 - Center sentence "
]
},
{
"cell_type": "markdown",
"id": "dba783d1-b399-4c20-a175-6a1d5f3518af",
"metadata": {},
"source": [
"The following method is based upon the center sentence. In this method the sentence definition used is the one according to the ETCBC database, which at places differs from other databases."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "66b221aa-04d4-42da-9a32-10bb2057ba63",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.10s 15088 results\n"
]
}
],
"source": [
"# number of sentences in Torah\n",
"SentenceQuery = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" sentence \n",
"'''\n",
"\n",
"SentenceResults = BHS.search(SentenceQuery)"
]
},
{
"cell_type": "markdown",
"id": "52d70ff4-441f-43a5-87fe-e498dcb4fde2",
"metadata": {},
"source": [
"Determining the interval of sentence node-numbers."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "384fc0c6-c270-4a0a-bd8d-cb5c8428ab7e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1172308, 1236024)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"F.otype.sInterval('sentence')"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "2d4c1485-48a5-474d-9479-9966c1c13c2d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('Exodus', 36, 11)"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# start + delta: 1172308 + int(15088/2) = 1172308 + 7544 = 1179852\n",
"T.sectionFromNode(1179852)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "9b0ea95f-1584-4c26-83b5-2231a906d80f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'וַיַּ֜עַשׂ לֻֽלְאֹ֣ת תְּכֵ֗לֶת עַ֣ל שְׂפַ֤ת הַיְרִיעָה֙ הָֽאֶחָ֔ת מִקָּצָ֖ה בַּמַּחְבָּ֑רֶת '"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.text(1179852)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "2420b17d-bb54-42e4-b57c-72085f5c271d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 7544"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# 15088 results / 2 = 7544 \n",
"BHS.show(SentenceResults,start=7544,end=7544,multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "1d3be2a3-6d32-4efa-a351-5dd2761ade62",
"metadata": {},
"source": [
"This sentence in the King James Version:\n",
"> and the other five curtains he coupled one unto another.\n",
"\n",
"Note that in the KJV this is a subsentence."
]
},
{
"cell_type": "markdown",
"id": "2ab19500-9c29-42bc-b185-4f1131541796",
"metadata": {},
"source": [
"## 3.5 - Center clause "
]
},
{
"cell_type": "markdown",
"id": "7fe30026-44e9-4b36-a0e0-fc8b7df63479",
"metadata": {},
"source": [
"The following method is based upon the center clause. In this method the clause definition used is the one according to the ETCBC database, which may slightly differ in other implementations."
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "a3294875-8052-41d1-8ca9-44f7cbe6cd0b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.11s 21181 results\n"
]
}
],
"source": [
"# number of clauses in Torah\n",
"ClauseQuery = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" clause \n",
"'''\n",
"\n",
"ClauseResults = BHS.search(ClauseQuery)"
]
},
{
"cell_type": "markdown",
"id": "28264f93-2fa7-4d47-853d-28c86cfba2bf",
"metadata": {},
"source": [
"Determining the interval of clause node-numbers."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "db9f9a08-7e9e-4fea-92f8-bc9baf2220d4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(427559, 515689)"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"F.otype.sInterval('clause')"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "92827185-0ce5-495a-bece-a17ac430e081",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('Leviticus', 4, 35)"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# start + delta: 427559 + int(21181/2) = 427559 + 10590 = 438149\n",
"T.sectionFromNode(438149)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "e901a175-6c56-4b25-950f-f0eaa71d5df5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'וְאֶת־כָּל־חֶלְבָּ֣ה יָסִ֗יר '"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.text(438149)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "561be90b-a335-4fcd-b9ed-4251e181f9b7",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 10590"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"book Leviticus
book=Leviticus
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# 21181 results / 2 = 10590,5 -> midpoint = 10590\n",
"BHS.show(ClauseResults,start=10590,end=10590, multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "bdb286ff-5449-4c3f-b5b8-ec8e25d19a04",
"metadata": {},
"source": [
"In the King James Version:\n",
"> and shall pour out all the blood thereof at the bottom of the altar\n",
"\n",
"Note that while in the ETCBC BHSA sentences often contain multiple clauses, this clause constitutes a full sentence."
]
},
{
"cell_type": "markdown",
"id": "653d6375-57b3-4863-bd0a-98c4010c698d",
"metadata": {},
"source": [
"## 3.6 - Center phrase "
]
},
{
"cell_type": "markdown",
"id": "cc4eb0c7-4ac0-45c3-86a1-03764367f169",
"metadata": {},
"source": [
"The following method is based upon the center phrase. In this method the clause definition used is the one according to the ETCBC database, following a more-or-less general understanding of what does constitute a phrase."
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "30182b63-2ede-46dc-8837-e84ba46dccb6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.30s 64195 results\n"
]
}
],
"source": [
"# number of phrases in Torah\n",
"PhraseQuery = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" phrase \n",
"'''\n",
"\n",
"PhraseResults = BHS.search(PhraseQuery)"
]
},
{
"cell_type": "markdown",
"id": "6ba3dd7e-6cc2-45f5-a669-3f5bc9c4b0d6",
"metadata": {},
"source": [
"Determining the interval of phrase node-numbers."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "91c6c321-3910-4c53-b210-3a0cfddc8f4b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(651573, 904775)"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"F.otype.sInterval('phrase')"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "84aac874-62aa-47f8-9be8-b48cf06ed1cb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('Leviticus', 4, 32)"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# start + delta: 651573 + int(64195/2) = 651573 + 32097 = 683670\n",
"T.sectionFromNode(683670)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "81906af7-5a76-424a-8371-31302885101e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'נְקֵבָ֥ה תְמִימָ֖ה '"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.text(683670)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "9c7280c5-4931-409f-8e7b-4fb17bbf22c3",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 32098"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"book Leviticus
book=Leviticus
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# 64195 results /2 = 32097,5 -> midpoint = 32098\n",
"BHS.show(PhraseResults,start=32098,end=32098,multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "51d802b5-92cc-4ebe-9ad4-620c7f5500f5",
"metadata": {},
"source": [
"In the King James Version:\n",
"\n",
"> a female without blemish"
]
},
{
"cell_type": "markdown",
"id": "d8098a6c-bc55-4a38-8d92-093d246ba36d",
"metadata": {},
"source": [
"## 3.7 - Center word - based upon center word node"
]
},
{
"cell_type": "markdown",
"id": "9257ec4c-1c27-4dab-b6b3-352b8fea19d3",
"metadata": {},
"source": [
"This method assumes the mathematical center of the list of word nodes provides us the center of the Torah."
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "45f0cfbf-86e9-47c4-89be-3b31a1b6b85f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.48s 112927 results\n"
]
}
],
"source": [
"# number of words in Torah (WARNING: as per ETCBC definition!) \n",
"WordQuery = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" word \n",
"'''\n",
"\n",
"WordResults = BHS.search(WordQuery)"
]
},
{
"cell_type": "markdown",
"id": "76187fe5-9cde-46ec-b6a5-0d1d33e56b04",
"metadata": {},
"source": [
"The following code validates that the word nodes are numbered starting from '1'."
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "7504d521-55bb-412f-a26b-a761ff33ab74",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1, 426590)"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"F.otype.sInterval('word')"
]
},
{
"cell_type": "markdown",
"id": "0741e7a3-00e3-45b9-b34b-15da722a8067",
"metadata": {},
"source": [
"Find the midle word node "
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "1c903e0c-946d-438e-92e3-d42d33419d62",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('Leviticus', 8, 21)"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# start + delta: 1 + int(112927/2) = 1 + 56463 = 56464\n",
"T.sectionFromNode(56464)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "8bbe56f8-8640-41ab-99d5-7a755fd9a185",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'בַּ'"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.text(56464)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "9c16893f-9d60-40ba-860c-6560e2e6c9df",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 56464"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"book Leviticus
book=Leviticus
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# 112927 results /2 = 56463,5 -> midpoint = 56464\n",
"BHS.show(WordResults,start=56464,end=56464,multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "302013d5-2a96-4133-a28b-a687d129914c",
"metadata": {},
"source": [
"If this would be 'translated' into a meaningfull 'center' clause, it could be:\n",
"> 'wash in the water'. "
]
},
{
"cell_type": "markdown",
"id": "03849162-6e7e-45ee-904b-70e3206c31ba",
"metadata": {
"tags": []
},
"source": [
"## 3.8 - Center word based on spaces and maqaf"
]
},
{
"cell_type": "markdown",
"id": "aa074722-acfe-42e8-b820-36f817c63111",
"metadata": {},
"source": [
"Here the number of words in the Torah is determined by items separeted by spaces OR maqaf (diacritical mark indicating a strong connection between words). \n",
"\n",
"First check what can be placed after an individual word"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "cc6a238f-88b0-4fe1-b0a3-9df6bfb91f0d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((' ', 236930),\n",
" ('', 121801),\n",
" ('&', 42275),\n",
" ('00 ', 20146),\n",
" ('05 ', 2266),\n",
" ('00_S ', 1892),\n",
" ('00_P ', 1165),\n",
" ('_S ', 76),\n",
" (' 05 ', 17),\n",
" ('_P ', 13),\n",
" ('00_N ', 7),\n",
" ('00_N_P ', 1),\n",
" ('00_N_S ', 1))"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# note: this is for the full TeNaCH!\n",
"F.trailer.freqList()"
]
},
{
"cell_type": "markdown",
"id": "b147fb63-c711-4341-bd1e-7ca061e87008",
"metadata": {},
"source": [
"In this list, the ' ' value (i.e. a space) is used when the word is joined to the next word, while '&' indicates a maqqef (־), a diacritical mark indicating a strong connection between words. We consider both as word separators. Examining the frequency list above there are two methods to determine the word boundaries. The first is utilizing the fact that all feature values indicating a wordboundary are of lenght 1 or higher, allowing the string `(.+)` to exclude all cases where the lenght is less than 1 character. The other option is to explicitly look for spaces and maqqefs, by using `[\\s&]` as regex expression. As expected, both product the same outcome. The following query determines the number of words in the torah based on this methond of counting."
]
},
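{
"cell_type": "markdown",
"id": "trailer-regex-sketch",
"metadata": {},
"source": [
"The equivalence of the two patterns can be checked against the trailer values of the frequency list above without running a query. The sketch below is illustrative only; it assumes that the pattern may match anywhere in the value, as Python's `re.search` does, and the variable names are hypothetical.\n",
"\n",
"```python\n",
"import re\n",
"\n",
"# Trailer values and frequencies copied from the frequency list above\n",
"# (full Tanakh, not just the Torah).\n",
"trailer_freq = [\n",
"    (' ', 236930), ('', 121801), ('&', 42275), ('00 ', 20146),\n",
"    ('05 ', 2266), ('00_S ', 1892), ('00_P ', 1165), ('_S ', 76),\n",
"    (' 05 ', 17), ('_P ', 13), ('00_N ', 7), ('00_N_P ', 1),\n",
"    ('00_N_S ', 1),\n",
"]\n",
"\n",
"non_empty = re.compile(r'.+')            # any trailer of length 1 or more\n",
"space_or_maqqef = re.compile(r'[\\s&]')   # a whitespace character or '&'\n",
"\n",
"by_length = {value for value, _ in trailer_freq if non_empty.search(value)}\n",
"by_chars = {value for value, _ in trailer_freq if space_or_maqqef.search(value)}\n",
"\n",
"# Both patterns select every trailer value except the empty one.\n",
"print(by_length == by_chars)    # True\n",
"\n",
"# Number of words followed by a word boundary in the full Tanakh.\n",
"boundary_words = sum(n for value, n in trailer_freq if value in by_chars)\n",
"print(boundary_words)           # 304789\n",
"```\n",
"\n",
"Both patterns agree here because every non-empty trailer value in the list contains either a space or '&'."
]
},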
{
"cell_type": "code",
"execution_count": 34,
"id": "5694148c-f4e0-40a1-bf01-a02fe891ab31",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.68s 79886 results\n"
]
}
],
"source": [
"# define query template\n",
"# The preceding 'r' before the template allows for a raw strings, preventing Python from altering the regex.\n",
"\n",
"WordQuery2 = r'''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" word trailer~[\\s&]\n",
"'''\n",
"\n",
"WordResults2 = BHS.search(WordQuery2)"
]
},
{
"cell_type": "markdown",
"id": "aaf17e1b-d375-4044-8fad-e15e7dd7ae84",
"metadata": {},
"source": [
"Find the midpoint: 79886/2 = 39948"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "d565efe2-25ae-469f-ab76-7635ded28348",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'תַעֲשׂ֖וּן '"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.text(39949)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "bde4c27a-ea69-4889-9600-c88754950cfa",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 39948"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"book Leviticus
book=Leviticus
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"BHS.show(WordResults2,start=39948,end=39948,multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "74fc9706-dfb8-4283-962d-adb59fa37c11",
"metadata": {
"tags": []
},
"source": [
"Following this method, the center would be: \n",
">and be holy"
]
},
{
"cell_type": "markdown",
"id": "271b118e-639b-4b3c-9890-ea235bc58b02",
"metadata": {},
"source": [
"## 3.9 - Center word based upon using feature 'wordboundary'"
]
},
{
"cell_type": "markdown",
"id": "214d9347-6a57-4f28-a471-1ff9c1d50e5e",
"metadata": {},
"source": [
"In this section we will use some of the additonal features made available by the [BHSaddons](https://github.com/tonyjurg/BHSaddons/) dataset."
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "01f6db11-9436-42c4-b739-717de8faa4a0",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"**Locating corpus resources ...**"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"app: ~/text-fabric-data/github/etcbc/BHSA/app"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/etcbc/BHSA/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"rate limit is 5000 requests per hour, with 4943 left for this hour\n",
"\tconnecting to online GitHub repo tonyjurg/BHSaddons ... connected\n"
]
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/tonyjurg/BHSaddons/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/etcbc/phono/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"The requested data is not available offline\n",
"\t~/text-fabric-data/github/etcbc/parallels/tf/2021 not found\n"
]
},
{
"data": {
"text/html": [
"Status: latest release online v2.1 versus None locally"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"downloading app, main data and requested additions ..."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"File is not a zip file\n",
"\tcould not save corpus data to ~/text-fabric-data/github "
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"rate limit is 5000 requests per hour, with 4940 left for this hour\n",
"\tconnecting to online GitHub repo etcbc/parallels ... connected\n",
"\tdownloading from https:/github.com/ETCBC/parallels/releases/download/v2.1/tf-2021.zip ... \n",
"\tsaving data\n"
]
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/etcbc/parallels/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" | 0.11s T crossref from ~/text-fabric-data/github/etcbc/parallels/tf/2021\n"
]
},
{
"data": {
"text/html": [
"\n",
" TF: TF API 12.6.2, etcbc/BHSA/app v3, Search Reference
\n",
" Data: etcbc - BHSA 2021, Character table, Feature docs
\n",
" Node types
\n",
"\n",
" \n",
" Name | \n",
" # of nodes | \n",
" # slots / node | \n",
" % coverage | \n",
"
\n",
"\n",
"\n",
" book | \n",
" 39 | \n",
" 10938.21 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" chapter | \n",
" 929 | \n",
" 459.19 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" lex | \n",
" 9230 | \n",
" 46.22 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" verse | \n",
" 23213 | \n",
" 18.38 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" half_verse | \n",
" 45179 | \n",
" 9.44 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" sentence | \n",
" 63717 | \n",
" 6.70 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" sentence_atom | \n",
" 64514 | \n",
" 6.61 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" clause | \n",
" 88131 | \n",
" 4.84 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" clause_atom | \n",
" 90704 | \n",
" 4.70 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" phrase | \n",
" 253203 | \n",
" 1.68 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" phrase_atom | \n",
" 267532 | \n",
" 1.59 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" subphrase | \n",
" 113850 | \n",
" 1.42 | \n",
" 38 | \n",
"
\n",
"\n",
"\n",
" word | \n",
" 426590 | \n",
" 1.00 | \n",
" 100 | \n",
"
\n",
"
\n",
" Sets: no custom sets
\n",
" Features:
\n",
"Parallel Passages
\n",
" \n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
🆗 links between similar passages\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ book name in Latin (Genesis; Numeri; Reges1; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ book name in amharic (ኣማርኛ)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ chapter number (1; 2; 3; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ identifier of a clause atom relationship (0; 74; 367; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ determinedness of phrase(atom) (det; und; NA.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ frequency of lexemes\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 english translation of lexeme (beginning create god(s))\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ grammatical gender (m; f; NA; unknown.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ of word or lexeme (Hebrew; Aramaic.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ lexical set, subclassification of part-of-speech (card; ques; mult)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
⚠️ named entity type (pers; mens; gens; topo; ppde.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ grammatical number (sg; du; pl; NA; unknown.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ sequence number of an object within its context\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ preformative consonantal-transliterated (absent; n/a; J, ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ pronominal suffix gender (m; f; NA; unknown.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ pronominal suffix number (sg; du; pl; NA; unknown.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ pronominal suffix person (p1; p2; p3; NA; unknown.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ grammatical person (p1; p2; p3; NA; unknown.)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ word pointed-transliterated masoretic reading correction\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ interword material -pointed-transliterated (Masoretic correction)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ interword material -pointed-transliterated (Masoretic correction)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ word pointed-Hebrew masoretic reading correction\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ ranking of lexemes based on freqnuecy\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ part-of-speech (art; verb; subs; nmpr, ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ state of a noun (a (absolute); c (construct); e (emphatic).)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ clause atom: its level in the linguistic embedding\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ interword material pointed-transliterated (& 00 05 00_P ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ interword material pointed-Hebrew (־ ׃)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ verbal ending consonantal-transliterated (n/a; W; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ root formation consonantal-transliterated (absent; n/a; H; ...)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ verse number\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ verbal stem (qal; piel; hif; apel; pael)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ verbal tense (perf; impv; wayq; infc)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
✅ linguistic dependency between textual objects\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"Phonetic Transcriptions
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 interword material in phonological transcription\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"tonyjurg/BHSaddons/tf
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
The sequence number of the aliyot within the parasha\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
Set to 1 if this verse is part of a maftir\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
The name of the parasha in Hebrew\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
The sequence number of the parasha\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
Transliteration of the Hebrew parasha name\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
The sequence number of the verse within the parasha\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
indicates wordboudaries (spaces OR maqaf)\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" Settings:
specified
- apiVersion:
3
- appName:
etcbc/BHSA
- appPath:
C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/app
- commit:
gd905e3fb6e80d0fa537600337614adc2af157309
- css:
''
dataDisplay:
exampleSectionHtml:
<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
excludedFeatures:
g_uvf_utf8
g_vbs
kq_hybrid
languageISO
g_nme
lex0
is_root
g_vbs_utf8
g_uvf
dist
root
suffix_person
g_vbe
dist_unit
suffix_number
distributional_parent
kq_hybrid_utf8
crossrefSET
instruction
g_prs
lexeme_count
rank_occ
g_pfm_utf8
freq_occ
crossrefLCS
functional_parent
g_pfm
g_nme_utf8
g_vbe_utf8
kind
g_prs_utf8
suffix_gender
mother_object_type
noneValues:
docs:
- docBase:
{docRoot}/{repo}
- docExt:
''
- docPage:
''
- docRoot:
https://{org}.github.io
- featurePage:
0_home
- interfaceDefaults:
{}
- isCompatible:
True
- local:
local
- localDir:
C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/_temp
provenanceSpec:
- corpus:
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
- doi:
10.5281/zenodo.1007624
moduleSpecs:
:
- backend: no value
- corpus:
Phonetic Transcriptions
docUrl:
https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
- doi:
10.5281/zenodo.1007636
- org:
etcbc
- relative:
/tf
- repo:
phono
:
- backend: no value
- corpus:
Parallel Passages
docUrl:
https://nbviewer.jupyter.org/github/etcbc/parallels/blob/master/programs/parallels.ipynb
- doi:
10.5281/zenodo.1007642
- org:
etcbc
- relative:
/tf
- repo:
parallels
- org:
etcbc
- relative:
/tf
- repo:
BHSA
- version:
2021
- webBase:
https://shebanq.ancient-data.org/hebrew
- webHint:
Show this on SHEBANQ
- webLang:
la
- webLexId:
True
webUrl:
{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
- webUrlLex:
{webBase}/word?version={version}&id=<lid>
- release:
v1.8
typeDisplay:
clause:
- label:
{typ} {rela}
- style:
''
clause_atom:
- hidden:
True
- label:
{code}
- level:
1
- style:
''
half_verse:
- hidden:
True
- label:
{label}
- style:
''
- verselike:
True
lex:
- featuresBare:
gloss
- label:
{voc_lex_utf8}
- lexOcc:
word
- style:
orig
- template:
{voc_lex_utf8}
phrase:
- label:
{typ} {function}
- style:
''
phrase_atom:
- hidden:
True
- label:
{typ} {rela}
- level:
1
- style:
''
sentence:
sentence_atom:
- hidden:
True
- label:
{number}
- level:
1
- style:
''
subphrase:
- hidden:
True
- label:
{number}
- style:
''
word:
- features:
pdp vs vt
- featuresBare:
lex:gloss
- writing:
hbo
\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# load the app and data with additional features (removed the hoist here)\n",
"BHSAadd = use(\"etcbc/BHSA\", mod=\"tonyjurg/BHSaddons/tf/:hot\")"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "eb3313c8-dbbf-4b75-bf3b-ec4977a5e8ba",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.41s 79886 results\n"
]
}
],
"source": [
"# find all 'end-of-word' word nodes within any parasha\n",
"wordboundaryQuery = '''\n",
"verse parashanum\n",
" word wordboundary=1\n",
"'''\n",
"wordboundaryResult = BHSAadd.search(wordboundaryQuery)"
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "d4a65412-12e0-4b6e-95b6-7ad2b3d35744",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.44s 112927 results\n"
]
}
],
"source": [
"# find all word nodes within any parasha\n",
"wordboundaryQuery = '''\n",
"verse parashanum\n",
" word \n",
"'''\n",
"wordboundaryResult = BHSAadd.search(wordboundaryQuery)"
]
},
{
"cell_type": "markdown",
"id": "742d9ecb-e393-4a12-9318-848616d9736f",
"metadata": {},
"source": [
"As these queries show, the counts (79886 word boundaries and 112927 word nodes) are, as expected, the same as in the previous section (3.8)."
]
},
{
"cell_type": "markdown",
"id": "245c727b-d068-4fc2-8c84-540b19522aeb",
"metadata": {},
"source": [
"## 3.10 - Center word based upon spaces"
]
},
{
"cell_type": "markdown",
"id": "9ca66313-c42a-4fef-9856-96505bd5c9b4",
"metadata": {},
"source": [
"In the following method, words are defined as items separated by spaces."
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "8d421dff-ba72-42ea-b122-3249940118eb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.60s 68434 results\n"
]
}
],
"source": [
"# The regexp below selects words whose trailer feature ends in a space\n",
"\n",
"wordQuery3 = r'''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" word trailer~\\ $\n",
"'''\n",
"\n",
"wordResults3 = BHS.search(wordQuery3)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "aaa0f50c-67e4-4435-b8d9-d65ef40c6f73",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.34s 11452 results\n"
]
}
],
"source": [
"# Just to check: query for maqafs\n",
"\n",
"maqafQuery = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
"\n",
" word trailer=&\n",
"'''\n",
"\n",
"maqafResults = BHS.search(maqafQuery)"
]
},
{
"cell_type": "markdown",
"id": "7ae7a4bf-e921-4f42-9b25-b07d941df1e0",
"metadata": {},
"source": [
"Check that the numbers add up: 68434 (spaces) + 11452 (maqafs) = 79886 (total), which matches the number of word boundaries found by the word-boundary query above."
]
},
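{
"cell_type": "markdown",
"id": "c3f1a2b4-5d6e-4f70-8a91-b2c3d4e5f601",
"metadata": {},
"source": [
"The two queries above sort word nodes by the value of their `trailer` feature. As a minimal sketch of that classification (not part of the original analysis), the cell below applies the same two tests to plain strings. The names `classify_trailer` and `sampleTrailers` are hypothetical, and the sample values are invented for illustration rather than read from the corpus. Words whose trailer is neither a maqaf nor ends in a space (for example, words with an empty trailer) fall into a third class, which is why spaces plus maqafs add up to the number of word boundaries and not to the number of word nodes."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4e2b3c5-6e7f-4a81-9ba2-c3d4e5f6a702",
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: classify trailer strings with the same tests as the two queries above\n",
"import re\n",
"from collections import Counter\n",
"\n",
"# Same test as in the search template: trailer~\\ $ (value ends in a space)\n",
"ENDS_IN_SPACE = re.compile(r' $')\n",
"\n",
"def classify_trailer(trailer):\n",
"    # Return 'maqaf', 'space' or 'other' for one transliterated trailer value\n",
"    if trailer == '&':\n",
"        return 'maqaf'\n",
"    if ENDS_IN_SPACE.search(trailer):\n",
"        return 'space'\n",
"    return 'other'\n",
"\n",
"# Hypothetical sample values, for illustration only (not taken from the corpus)\n",
"sampleTrailers = [' ', '&', '', '00 ', ' ', '&']\n",
"\n",
"# Tally how many sample values fall into each class\n",
"Counter(classify_trailer(t) for t in sampleTrailers)"
]
},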
{
"cell_type": "markdown",
"id": "028840b1-eae8-4305-aa94-ee7eda51e38d",
"metadata": {},
"source": [
"Find the midpoint in wordResults3: 68434/2 = 34217. Python lists are zero-based, so the 34217th result has index 34216; print its tuple:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "647013d9-bdee-4ce0-bb05-e4f2486e94a6",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"(426593, 56509)"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"wordResults3[34216]"
]
},
{
"cell_type": "markdown",
"id": "0182be1c-158c-4f3c-9d1f-dc450bb1d0cd",
"metadata": {},
"source": [
"Print the associated text (the word node is the second element of the tuple):"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "bd8b1eda-2e2d-45d1-af91-eff4d40c4a24",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'רֹ֥אשׁ '"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.text(wordResults3[34216][1])"
]
},
{
"cell_type": "markdown",
"id": "539a6e66-bbb3-49d4-a17b-9c2c80e5dffd",
"metadata": {},
"source": [
"Displaying the syntax tree of the relevant verse:"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "68ced52d-edc9-4c1b-8956-ad83dc25ba94",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 34217"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"book Leviticus
book=Leviticus
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"BHS.show(wordResults3,start=34217,end=34217,multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "b0799e5e-d4e2-4f9a-be67-3c651f6f2573",
"metadata": {},
"source": [
"Following this method, the center would be: \n",
"> (on the) head of the ram"
]
},
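{
"cell_type": "markdown",
"id": "e5a3c4d6-7f80-4b92-8cb3-d4e5f6a7b803",
"metadata": {},
"source": [
"The midpoint arithmetic used in this section can be sketched as a small helper. This is an illustration under the assumption that the 'center' is the middle position of the result list; `middle_positions` is a hypothetical name, not a Text-Fabric function. For an even count such as 68434 there are two candidate positions, and this notebook uses the first one (position 34217, list index 34216). For an odd count such as 112927 there is a single middle position."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6b4d5e7-8091-4ca3-9dc4-e5f6a7b8c904",
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: 1-based position(s) of the middle item(s) in a list of n results\n",
"def middle_positions(n):\n",
"    if n < 1:\n",
"        raise ValueError('n must be a positive integer')\n",
"    if n % 2:\n",
"        # odd count: exactly one middle item\n",
"        return (n // 2 + 1,)\n",
"    # even count: two neighbouring candidates\n",
"    return (n // 2, n // 2 + 1)\n",
"\n",
"for n in (68434, 112927):\n",
"    positions = middle_positions(n)\n",
"    # Python list indices are zero-based, so subtract 1 from each position\n",
"    indices = tuple(p - 1 for p in positions)\n",
"    print(n, positions, indices)"
]
},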
{
"cell_type": "markdown",
"id": "a4092288-38c6-4f24-b9e9-86362971b3ff",
"metadata": {},
"source": [
"## 3.11 - Center word based upon selected part of speech"
]
},
{
"cell_type": "markdown",
"id": "5c1f2176-a872-4f5e-9c7d-5fac3a3262dd",
"metadata": {},
"source": [
"The following method is intended to exclude items like the nota accusativi / object marker (את) where they have a purely grammatical function."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5081579a-e727-4ccd-8c4c-bb548ec3148c",
"metadata": {},
"outputs": [],
"source": [
"wordQuery4 = '''\n",
"book book=Genesis|Exodus|Leviticus|Numeri|Deuteronomium\n",
" word sp=adjv|advb|art|conj|intj|inrg|nega|nmpr|prep|prde|prin|prps|subs|verb\n",
"'''\n",
"\n",
"wordResults4 = BHS.search(wordQuery4)"
]
},
{
"cell_type": "markdown",
"id": "ea94e760-3e76-4954-bac3-3f02536db2e8",
"metadata": {},
"source": [
"Midpoint: int(112927/2) = 56463"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14f69e17-0602-4434-9131-91e0ca84cbe2",
"metadata": {},
"outputs": [],
"source": [
"BHS.show(wordResults4,start=56463,end=56463,multiFeatures=False)"
]
},
{
"cell_type": "markdown",
"id": "9453d90a-b558-400b-a931-e0014bdcfceb",
"metadata": {},
"source": [
"Following this method, the center would be: \n",
"> he washed in the water"
]
},
{
"cell_type": "markdown",
"id": "6f1dd61f-f539-4f47-8ade-36554da40ac8",
"metadata": {},
"source": [
"## 3.12 - Other opinion - Stone Tanach"
]
},
{
"cell_type": "markdown",
"id": "de0dab7c-f6d3-4fe3-a5b8-b01e41e512d7",
"metadata": {
"tags": []
},
"source": [
"According to the 'Stone Tanach':<sup>1</sup>\n",
">[Lev] 10:16 דָּרֹ֥שׁ דָּרַ֛שׁ - *inquired insistently* \\[lit. *inquire he inquired*\\]. This is the exact halfway mark of the word of the Torah. This teaches us that one must always *inquire;* one must never stop seeking an ever deeper and broader understanding of the Torah (*Degel Machaneh Ephraim*). "
]
},
{
"cell_type": "markdown",
"id": "a2a735c7-0f66-4167-87d9-0c1d2c21a760",
"metadata": {},
"source": [
"# 4 - Attribution and footnotes\n",
"##### [Back to ToC](#TOC)\n",
"\n",
"#### Footnotes:\n",
"\n",
"<sup>1</sup> Rabbi Nosson Scherman (ed.), *The Stone Edition Tanach*, Hebrew and English Edition (Brooklyn NY: Mesorah Publications Ltd, 1996), 266."
]
},
{
"cell_type": "markdown",
"id": "5004cc5a-f4fb-4cdc-876b-22b4f6b8b145",
"metadata": {
"tags": []
},
"source": [
"# 5 - Required libraries\n",
"##### [Back to ToC](#TOC)\n",
"\n",
"The scripts in this notebook require (besides `text-fabric`) the following Python libraries to be installed in the environment:\n",
"\n",
" {none}\n",
"\n",
"You can install any missing library from within Jupyter Notebook using either `pip` or `pip3`."
]
},
{
"cell_type": "markdown",
"id": "b4b81ee0-f72c-46ae-9ee2-e98584588b06",
"metadata": {},
"source": [
"# 6 - Notebook details\n",
"##### [Back to ToC](#TOC)\n",
"\n",
"\n",
"Field | Value\n",
"--- | ---\n",
"Author | Tony Jurg\n",
"Version | 1.1\n",
"Date | 14 November 2024"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}