{
"cells": [
{
"cell_type": "markdown",
"id": "5306bdc6-1d2d-4d2b-80ab-5d64efe92003",
"metadata": {},
"source": [
"# Missing verses (N1904LFT)"
]
},
{
"cell_type": "markdown",
"id": "c9bf60c4-73e3-43b4-a60a-314ebbbf1426",
"metadata": {},
"source": [
"## Table of content \n",
"* 1 - Introduction\n",
"* 2 - Load Text-Fabric app and data\n",
"* 3 - Identifying the holes ('missing' verses) "
]
},
{
"cell_type": "markdown",
"id": "fc1cd1c6-7d00-47b4-9749-8b7f2f703fbf",
"metadata": {},
"source": [
"# 1 - Introduction \n",
"##### [Back to TOC](#TOC)\n",
"\n",
"When using verse numbers inside a script, it is not save to assume verse number within a chapter are sequential without gaps. The folling script will produce a list of 'missing' verses."
]
},
{
"cell_type": "markdown",
"id": "31b88ba9-ee5e-4dc9-ae32-83c92c6a51b6",
"metadata": {},
"source": [
"# 2 - Load Text-Fabric app and data \n",
"##### [Back to TOC](#TOC)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4c667b6f-cdfa-4fde-87e9-4ac0c82cfe3b",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "26fbe0c2-5a35-4ae1-8fee-f0e3553c08aa",
"metadata": {},
"outputs": [],
"source": [
"# Loading the Text-Fabric code\n",
"# Note: it is assumed Text-Fabric is installed in your environment\n",
"from tf.fabric import Fabric\n",
"from tf.app import use"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "be4fe0bd-ddc5-4c28-833a-0114c3e5ce8e",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"**Locating corpus resources ...**"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"app: ~/text-fabric-data/github/tonyjurg/Nestle1904LFT/app"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/tonyjurg/Nestle1904LFT/tf/0.6"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" TF: TF API 12.2.2, tonyjurg/Nestle1904LFT/app v3, Search Reference
\n",
" Data: tonyjurg - Nestle1904LFT 0.6, Character table, Feature docs
\n",
" Node types
\n",
"\n",
" \n",
" Name | \n",
" # of nodes | \n",
" # slots / node | \n",
" % coverage | \n",
"
\n",
"\n",
"\n",
" book | \n",
" 27 | \n",
" 5102.93 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" chapter | \n",
" 260 | \n",
" 529.92 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" verse | \n",
" 7943 | \n",
" 17.35 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" sentence | \n",
" 8011 | \n",
" 17.20 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" wg | \n",
" 105430 | \n",
" 6.85 | \n",
" 524 | \n",
"
\n",
"\n",
"\n",
" word | \n",
" 137779 | \n",
" 1.00 | \n",
" 100 | \n",
"
\n",
"
\n",
" Sets: no custom sets
\n",
" Features:
\n",
"Nestle 1904 (Low Fat Tree)
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Characters (eg. punctuations) following the word\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Book name (in English language)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ NT book number (Matthew=1, Mark=2, ..., Revelation=27)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Book name (abbreviated)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical case (Nominative, Genitive, Dative, Accusative, Vocative)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ Chapter number inside book\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Clause type details (e.g. Verbless, Minor)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 Contained clause (WG number)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Degree (e.g. Comparitative, Superlative)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ English gloss\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical gender (Masculine, Feminine, Neuter)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Start verse number of a sentence\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Junction data related to a wordgroup\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Lexeme (lemma)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Lexical domain according to Semantic Dictionary of Biblical Greek, SDBG (not present everywhere?)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Lauw-Nida lexical classification (not present everywhere?)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 Text critical marker after word\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 Text critical marker before word\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
Order of punctuation and text critical marker\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ Monad (smallest token matching word order in the corpus)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical mood of the verb (passive, etc)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Morphological tag (Sandborg-Petersen morphology)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Node ID (as in the XML source data)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Surface word with accents normalized and trailing punctuations removed\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical number (Singular, Plural)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical number of the verb (e.g. singular, plural)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical person of the verb (first, second, third)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Punctuation after word\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Value of the ref ID (taken from XML sourcedata)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Reference (to nodeID in XML source data, not yet post-processes)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
⚠️ Distance to the wordgroup defining the syntactical role of this word\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ Sentence number (counted per chapter)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Part of Speech (abbreviated)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Part of Speech (long description)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Strongs number\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 Subject reference (to nodeID in XML source data, not yet post-processes)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical tense of the verb (e.g. Present, Aorist)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical type of noun or pronoun (e.g. Common, Personal)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Word as it apears in the text in Unicode (incl. punctuations)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ Verse number inside chapter\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Gramatical voice of the verb (e.g. active,passive)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Class of the wordgroup (e.g. cl, np, vp)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
🆗 Number of the parent wordgroups for a wordgroup\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
✅ Wordgroup number (counted per book)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Syntactical role of the wordgroup (abbreviated)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Syntactical role of the wordgroup (full)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Wordgroup rule information (e.g. Np-Appos, ClCl2, PrepNp)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Wordgroup type details (e.g. group, apposition)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Word as it appears in the text (excl. punctuations)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 Number of the parent wordgroups for a word\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Syntactical role of the word (abbreviated)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Syntactical role of the word (full)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
🆗 Transliteration of the text (in latin letters, excl. punctuations)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
✅ Word without accents (excl. punctuations)\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" Settings:
specified
- apiVersion:
3
- appName:
tonyjurg/Nestle1904LFT
appPath:
C:/Users/tonyj/text-fabric-data/github/tonyjurg/Nestle1904LFT/app
- commit:
e68bd68c7c4c862c1464d995d51e27db7691254f
- css:
''
dataDisplay:
excludedFeatures:
orig_order
verse
book
chapter
noneValues:
- showVerseInTuple:
0
- textFormat:
text-orig-full
docs:
- docBase:
https://github.com/tonyjurg/Nestle1904LFT/blob/main/docs/
- docPage:
about
- docRoot:
https://github.com/tonyjurg/Nestle1904LFT
featureBase:
https://github.com/tonyjurg/Nestle1904LFT/blob/main/docs/features/<feature>.md
- interfaceDefaults: {fmt:
layout-orig-full
} - isCompatible:
True
- local:
local
localDir:
C:/Users/tonyj/text-fabric-data/github/tonyjurg/Nestle1904LFT/_temp
provenanceSpec:
- corpus:
Nestle 1904 (Low Fat Tree)
- doi:
10.5281/zenodo.10182594
- org:
tonyjurg
- relative:
/tf
- repo:
Nestle1904LFT
- repro:
Nestle1904LFT
- version:
0.6
- webBase:
https://learner.bible/text/show_text/nestle1904/
- webHint:
Show this on the Bible Online Learner website
- webLang:
en
webUrl:
https://learner.bible/text/show_text/nestle1904/<1>/<2>/<3>
- webUrlLex:
{webBase}/word?version={version}&id=<lid>
- release:
v0.6
typeDisplay:
book:
- condense:
True
- hidden:
True
- label:
{book}
- style:
''
chapter:
- condense:
True
- hidden:
True
- label:
{chapter}
- style:
''
sentence:
- hidden:
0
- label:
#{sentence} (start: {book} {chapter}:{headverse})
- style:
''
verse:
- condense:
True
- excludedFeatures:
chapter verse
- label:
{book} {chapter}:{verse}
- style:
''
wg:
- hidden:
0
label:
#{wgnum}: {wgtype} {wgclass} {clausetype} {wgrole} {wgrule} {junction}
- style:
''
word:
- base:
True
- features:
lemma
- featuresBare:
gloss
- surpress:
chapter verse
- writing:
grc
\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# load the N1904 app and data\n",
"N1904 = use (\"tonyjurg/Nestle1904LFT\", version=\"0.6\", hoist=globals())"
]
},
{
"cell_type": "markdown",
"id": "39fec5a4-f3d5-4bdf-a4eb-44c0420ea6f4",
"metadata": {},
"source": [
"# 3 - Identifying the holes ('missing' verses) \n",
"##### [Back to TOC](#TOC)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "a5c06c66-49cd-4e8c-a1b7-3ee8ddcdd588",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hole of 1 verse(s) between ('Matthew', 17, 20) and ('Matthew', 17, 22)\n",
"Hole of 1 verse(s) between ('Matthew', 18, 10) and ('Matthew', 18, 12)\n",
"Hole of 1 verse(s) between ('Matthew', 23, 13) and ('Matthew', 23, 15)\n",
"Hole of 1 verse(s) between ('Mark', 7, 15) and ('Mark', 7, 17)\n",
"Hole of 1 verse(s) between ('Mark', 9, 43) and ('Mark', 9, 45)\n",
"Hole of 1 verse(s) between ('Mark', 9, 45) and ('Mark', 9, 47)\n",
"Hole of 1 verse(s) between ('Mark', 11, 25) and ('Mark', 11, 27)\n",
"Hole of 1 verse(s) between ('Mark', 15, 27) and ('Mark', 15, 29)\n",
"Hole of 78 verse(s) between ('Mark', 16, 20) and ('Mark', 16, 99)\n",
"Hole of 1 verse(s) between ('Luke', 17, 35) and ('Luke', 17, 37)\n",
"Hole of 1 verse(s) between ('Luke', 23, 16) and ('Luke', 23, 18)\n",
"Hole of 1 verse(s) between ('Acts', 8, 36) and ('Acts', 8, 38)\n",
"Hole of 1 verse(s) between ('Acts', 15, 33) and ('Acts', 15, 35)\n",
"Hole of 1 verse(s) between ('Acts', 24, 6) and ('Acts', 24, 8)\n",
"Hole of 1 verse(s) between ('Acts', 28, 28) and ('Acts', 28, 30)\n",
"Hole of 1 verse(s) between ('Romans', 16, 23) and ('Romans', 16, 25)\n"
]
}
],
"source": [
"# Initialize variables for tracking the previous verse and node\n",
"previousVerse = 1\n",
"previousNode = 0 # Start with a dummy value for the previous node\n",
"\n",
"# Iterate over all verse nodes in the dataset\n",
"for verseNode in F.otype.s('verse'):\n",
" # Retrieve the verse number for the current node\n",
" ThisVerse = F.verse.v(verseNode)\n",
"\n",
" # Check if the current verse is different from the previous one\n",
" if ThisVerse != previousVerse:\n",
" # Check for a gap in verse numbering that is not at the start\n",
" if ThisVerse != previousVerse + 1 and ThisVerse != 1:\n",
" # Calculate the size of the gap and print details\n",
" print(f'Hole of {ThisVerse - previousVerse - 1} verse(s) between {T.sectionFromNode(previousNode)} and {T.sectionFromNode(verseNode)}')\n",
"\n",
" # Update the previous verse and node to the current ones\n",
" previousVerse = ThisVerse\n",
" previousNode = verseNode"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8da41ec1-61e1-4d86-94d1-14802659ade5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}