{ "cells": [ { "cell_type": "markdown", "id": "a522cef7-5dc1-4ffd-9f82-cbb0d6589eff", "metadata": {}, "source": [ "# Position of generally postpositive conjunctions in a clause (Nestle1904LFT)" ] }, { "cell_type": "markdown", "id": "401af193-cbf1-47db-a4f4-2d9381c38d42", "metadata": {}, "source": [ "## Table of contents \n", "* 1 - Introduction\n", "* 2 - Load Text-Fabric app and data\n", "* 3 - Performing the queries\n", "    * 3.1 - Identifying the occurrences of the lemmata\n", "    * 3.2 - Position of conjunction γάρ within a clause\n", "    * 3.3 - Position of conjunction δέ within a clause\n", "    * 3.4 - Position of conjunction μέν within a clause\n", "    * 3.5 - Position of conjunction οὖν within a clause\n", "* 4 - Attribution and footnotes" ] }, { "cell_type": "markdown", "id": "1aeeb0b9-9e57-4e0c-9db9-6df7935b4396", "metadata": {}, "source": [ "# 1 - Introduction \n", "##### [Back to TOC](#TOC)\n", "\n", "In ancient Greek, postpositive conjunctions like δέ and γάρ usually occupy the second position in a (sub)clause, following the first significant word. This placement not only structures the syntax but also subtly nuances the meaning and flow of the text. This notebook determines how often each of these conjunctions occurs at each position within a (sub)clause across the corpus of the Greek New Testament (based on the LowFat treebank).\n", "\n", "According to Stanley E. 
Porter *et al.*, the following conjunctions can be regarded as postpositive: γάρ, δέ, μέν, and οὖν.1\n" ] }, { "cell_type": "markdown", "id": "07bd0541-8f54-425d-a5d9-5dd91acac36c", "metadata": {}, "source": [ "# 2 - Load Text-Fabric app and data \n", "##### [Back to TOC](#TOC)" ] }, { "cell_type": "code", "execution_count": 1, "id": "0fb41f21-a71c-44f1-a253-8508dd69779a", "metadata": { "tags": [] }, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "id": "6c5469c2-2d60-4b56-9168-deb24692551c", "metadata": {}, "outputs": [], "source": [ "# Loading the Text-Fabric code\n", "# Note: it is assumed Text-Fabric is installed in your environment\n", "from tf.fabric import Fabric\n", "from tf.app import use" ] }, { "cell_type": "code", "execution_count": 3, "id": "6502aec4-b794-4669-94ee-7929041eebea", "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/tonyjurg/Nestle1904LFT/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/tonyjurg/Nestle1904LFT/tf/0.6" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " TF: TF API 12.2.2, tonyjurg/Nestle1904LFT/app v3, Search Reference
\n", " Data: tonyjurg - Nestle1904LFT 0.6, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name# of nodes# slots / node% coverage
book275102.93100
chapter260529.92100
verse794317.35100
sentence801117.20100
wg1054306.85524
word1377791.00100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Nestle 1904 (Low Fat Tree)\n", "
\n", "\n", "
\n", "
\n", "after\n", "
\n", "
str
\n", "\n", " ✅ Characters (eg. punctuations) following the word\n", "\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "\n", " ✅ Book name (in English language)\n", "\n", "
\n", "\n", "
\n", "
\n", "booknumber\n", "
\n", "
int
\n", "\n", " ✅ NT book number (Matthew=1, Mark=2, ..., Revelation=27)\n", "\n", "
\n", "\n", "
\n", "
\n", "bookshort\n", "
\n", "
str
\n", "\n", " ✅ Book name (abbreviated)\n", "\n", "
\n", "\n", "
\n", "
\n", "case\n", "
\n", "
str
\n", "\n", " ✅ Gramatical case (Nominative, Genitive, Dative, Accusative, Vocative)\n", "\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "\n", " ✅ Chapter number inside book\n", "\n", "
\n", "\n", "
\n", "
\n", "clausetype\n", "
\n", "
str
\n", "\n", " ✅ Clause type details (e.g. Verbless, Minor)\n", "\n", "
\n", "\n", "
\n", "
\n", "containedclause\n", "
\n", "
str
\n", "\n", " 🆗 Contained clause (WG number)\n", "\n", "
\n", "\n", "
\n", "
\n", "degree\n", "
\n", "
str
\n", "\n", " ✅ Degree (e.g. Comparitative, Superlative)\n", "\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "\n", " ✅ English gloss\n", "\n", "
\n", "\n", "
\n", "
\n", "gn\n", "
\n", "
str
\n", "\n", " ✅ Gramatical gender (Masculine, Feminine, Neuter)\n", "\n", "
\n", "\n", "
\n", "
\n", "headverse\n", "
\n", "
str
\n", "\n", " ✅ Start verse number of a sentence\n", "\n", "
\n", "\n", "
\n", "
\n", "junction\n", "
\n", "
str
\n", "\n", " ✅ Junction data related to a wordgroup\n", "\n", "
\n", "\n", "
\n", "
\n", "lemma\n", "
\n", "
str
\n", "\n", " ✅ Lexeme (lemma)\n", "\n", "
\n", "\n", "
\n", "
\n", "lex_dom\n", "
\n", "
str
\n", "\n", " ✅ Lexical domain according to Semantic Dictionary of Biblical Greek, SDBG (not present everywhere?)\n", "\n", "
\n", "\n", "
\n", "
\n", "ln\n", "
\n", "
str
\n", "\n", " ✅ Lauw-Nida lexical classification (not present everywhere?)\n", "\n", "
\n", "\n", "
\n", "
\n", "markafter\n", "
\n", "
str
\n", "\n", " 🆗 Text critical marker after word\n", "\n", "
\n", "\n", "
\n", "
\n", "markbefore\n", "
\n", "
str
\n", "\n", " 🆗 Text critical marker before word\n", "\n", "
\n", "\n", "
\n", "
\n", "markorder\n", "
\n", "
str
\n", "\n", "  Order of punctuation and text critical marker\n", "\n", "
\n", "\n", "
\n", "
\n", "monad\n", "
\n", "
int
\n", "\n", " ✅ Monad (smallest token matching word order in the corpus)\n", "\n", "
\n", "\n", "
\n", "
\n", "mood\n", "
\n", "
str
\n", "\n", " ✅ Gramatical mood of the verb (passive, etc)\n", "\n", "
\n", "\n", "
\n", "
\n", "morph\n", "
\n", "
str
\n", "\n", " ✅ Morphological tag (Sandborg-Petersen morphology)\n", "\n", "
\n", "\n", "
\n", "
\n", "nodeID\n", "
\n", "
str
\n", "\n", " ✅ Node ID (as in the XML source data)\n", "\n", "
\n", "\n", "
\n", "
\n", "normalized\n", "
\n", "
str
\n", "\n", " ✅ Surface word with accents normalized and trailing punctuations removed\n", "\n", "
\n", "\n", "
\n", "
\n", "nu\n", "
\n", "
str
\n", "\n", " ✅ Gramatical number (Singular, Plural)\n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
str
\n", "\n", " ✅ Gramatical number of the verb (e.g. singular, plural)\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "person\n", "
\n", "
str
\n", "\n", " ✅ Gramatical person of the verb (first, second, third)\n", "\n", "
\n", "\n", "
\n", "
\n", "punctuation\n", "
\n", "
str
\n", "\n", " ✅ Punctuation after word\n", "\n", "
\n", "\n", "
\n", "
\n", "ref\n", "
\n", "
str
\n", "\n", " ✅ Value of the ref ID (taken from XML sourcedata)\n", "\n", "
\n", "\n", "
\n", "
\n", "reference\n", "
\n", "
str
\n", "\n", " ✅ Reference (to nodeID in XML source data, not yet post-processes)\n", "\n", "
\n", "\n", "
\n", "
\n", "roleclausedistance\n", "
\n", "
str
\n", "\n", " ⚠️ Distance to the wordgroup defining the syntactical role of this word\n", "\n", "
\n", "\n", "
\n", "
\n", "sentence\n", "
\n", "
int
\n", "\n", " ✅ Sentence number (counted per chapter)\n", "\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "\n", " ✅ Part of Speech (abbreviated)\n", "\n", "
\n", "\n", "
\n", "
\n", "sp_full\n", "
\n", "
str
\n", "\n", " ✅ Part of Speech (long description)\n", "\n", "
\n", "\n", "
\n", "
\n", "strongs\n", "
\n", "
str
\n", "\n", " ✅ Strongs number\n", "\n", "
\n", "\n", "
\n", "
\n", "subj_ref\n", "
\n", "
str
\n", "\n", " 🆗 Subject reference (to nodeID in XML source data, not yet post-processes)\n", "\n", "
\n", "\n", "
\n", "
\n", "tense\n", "
\n", "
str
\n", "\n", " ✅ Gramatical tense of the verb (e.g. Present, Aorist)\n", "\n", "
\n", "\n", "
\n", "
\n", "type\n", "
\n", "
str
\n", "\n", " ✅ Gramatical type of noun or pronoun (e.g. Common, Personal)\n", "\n", "
\n", "\n", "
\n", "
\n", "unicode\n", "
\n", "
str
\n", "\n", " ✅ Word as it apears in the text in Unicode (incl. punctuations)\n", "\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "\n", " ✅ Verse number inside chapter\n", "\n", "
\n", "\n", "
\n", "
\n", "voice\n", "
\n", "
str
\n", "\n", " ✅ Gramatical voice of the verb (e.g. active,passive)\n", "\n", "
\n", "\n", "
\n", "
\n", "wgclass\n", "
\n", "
str
\n", "\n", " ✅ Class of the wordgroup (e.g. cl, np, vp)\n", "\n", "
\n", "\n", "
\n", "
\n", "wglevel\n", "
\n", "
int
\n", "\n", " 🆗 Number of the parent wordgroups for a wordgroup\n", "\n", "
\n", "\n", "
\n", "
\n", "wgnum\n", "
\n", "
int
\n", "\n", " ✅ Wordgroup number (counted per book)\n", "\n", "
\n", "\n", "
\n", "
\n", "wgrole\n", "
\n", "
str
\n", "\n", " ✅ Syntactical role of the wordgroup (abbreviated)\n", "\n", "
\n", "\n", "
\n", "
\n", "wgrolelong\n", "
\n", "
str
\n", "\n", " ✅ Syntactical role of the wordgroup (full)\n", "\n", "
\n", "\n", "
\n", "
\n", "wgrule\n", "
\n", "
str
\n", "\n", " ✅ Wordgroup rule information (e.g. Np-Appos, ClCl2, PrepNp)\n", "\n", "
\n", "\n", "
\n", "
\n", "wgtype\n", "
\n", "
str
\n", "\n", " ✅ Wordgroup type details (e.g. group, apposition)\n", "\n", "
\n", "\n", "
\n", "
\n", "word\n", "
\n", "
str
\n", "\n", " ✅ Word as it appears in the text (excl. punctuations)\n", "\n", "
\n", "\n", "
\n", "
\n", "wordlevel\n", "
\n", "
str
\n", "\n", " 🆗 Number of the parent wordgroups for a word\n", "\n", "
\n", "\n", "
\n", "
\n", "wordrole\n", "
\n", "
str
\n", "\n", " ✅ Syntactical role of the word (abbreviated)\n", "\n", "
\n", "\n", "
\n", "
\n", "wordrolelong\n", "
\n", "
str
\n", "\n", " ✅ Syntactical role of the word (full)\n", "\n", "
\n", "\n", "
\n", "
\n", "wordtranslit\n", "
\n", "
str
\n", "\n", " 🆗 Transliteration of the text (in latin letters, excl. punctuations)\n", "\n", "
\n", "\n", "
\n", "
\n", "wordunacc\n", "
\n", "
str
\n", "\n", " ✅ Word without accents (excl. punctuations)\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: tonyjurg/Nestle1904LFT
  3. appPath:C:/Users/tonyj/text-fabric-data/github/tonyjurg/Nestle1904LFT/app
  4. commit: e68bd68c7c4c862c1464d995d51e27db7691254f
  5. css: ''
  6. dataDisplay:
    • excludedFeatures:
      • orig_order
      • verse
      • book
      • chapter
    • noneValues:
      • none
      • unknown
      • no value
      • NA
      • ''
    • showVerseInTuple: 0
    • textFormat: text-orig-full
  7. docs:
    • docBase: https://github.com/tonyjurg/Nestle1904LFT/blob/main/docs/
    • docPage: about
    • docRoot: https://github.com/tonyjurg/Nestle1904LFT
    • featureBase:https://github.com/tonyjurg/Nestle1904LFT/blob/main/docs/features/<feature>.md
  8. interfaceDefaults: {fmt: layout-orig-full}
  9. isCompatible: True
  10. local: local
  11. localDir:C:/Users/tonyj/text-fabric-data/github/tonyjurg/Nestle1904LFT/_temp
  12. provenanceSpec:
    • corpus: Nestle 1904 (Low Fat Tree)
    • doi: 10.5281/zenodo.10182594
    • org: tonyjurg
    • relative: /tf
    • repo: Nestle1904LFT
    • repro: Nestle1904LFT
    • version: 0.6
    • webBase: https://learner.bible/text/show_text/nestle1904/
    • webHint: Show this on the Bible Online Learner website
    • webLang: en
    • webUrl:https://learner.bible/text/show_text/nestle1904/<1>/<2>/<3>
    • webUrlLex: {webBase}/word?version={version}&id=<lid>
  13. release: v0.6
  14. typeDisplay:
    • book:
      • condense: True
      • hidden: True
      • label: {book}
      • style: ''
    • chapter:
      • condense: True
      • hidden: True
      • label: {chapter}
      • style: ''
    • sentence:
      • hidden: 0
      • label: #{sentence} (start: {book} {chapter}:{headverse})
      • style: ''
    • verse:
      • condense: True
      • excludedFeatures: chapter verse
      • label: {book} {chapter}:{verse}
      • style: ''
    • wg:
      • hidden: 0
      • label:#{wgnum}: {wgtype} {wgclass} {clausetype} {wgrole} {wgrule} {junction}
      • style: ''
    • word:
      • base: True
      • features: lemma
      • featuresBare: gloss
      • surpress: chapter verse
  15. writing: grc
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# load the N1904 app and data\n", "N1904 = use(\"tonyjurg/Nestle1904LFT\", version=\"0.6\", hoist=globals())" ] }, { "cell_type": "code", "execution_count": 19, "id": "918e0518-5f14-4785-9e11-58e5657ccc96", "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display of tables with notebook viewer)\n", "N1904.dh(N1904.getCss())" ] }, { "cell_type": "markdown", "id": "994fea19-a622-44ce-9b7a-35c0a124a384", "metadata": {}, "source": [ "# 3 - Performing the queries \n", "##### [Back to TOC](#TOC)" ] }, { "cell_type": "markdown", "id": "7bd746f0-4a5d-4611-818d-ff2418281d84", "metadata": { "tags": [] }, "source": [ "## 3.1 - Identifying the occurrences of the lemmata\n", "##### [Back to TOC](#TOC)\n", "\n", "Identifying the occurrences of the conjunctions under investigation can be done with a straightforward query. It yields the node numbers of the word nodes for each lemma, which allows for further processing." 
] }, { "cell_type": "code", "execution_count": 20, "id": "9c1684e7-7175-4e6b-9957-b3bdebfe066d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "γάρ: 0.10s 1038 results\n", "δέ: 0.11s 2787 results\n", "μέν: 0.10s 180 results\n", "οὖν: 0.10s 496 results\n" ] } ], "source": [ "# Define the query templates, one per lemma\n", "GarQuery= '''\n", "word lemma=γάρ\n", "'''\n", "\n", "DeQuery= '''\n", "word lemma=δέ\n", "'''\n", "\n", "MenQuery= '''\n", "word lemma=μέν\n", "'''\n", "\n", "OunQuery='''\n", "word lemma=οὖν\n", "'''\n", "\n", "# Each search returns a list of tuples of node numbers, one tuple per match, ordered as the items appear in the query\n", "print('γάρ:',end='')\n", "GarResult = N1904.search(GarQuery)\n", "print('δέ: ',end='')\n", "DeResult = N1904.search(DeQuery)\n", "print('μέν:',end='')\n", "MenResult = N1904.search(MenQuery)\n", "print('οὖν:',end='')\n", "OunResult = N1904.search(OunQuery)" ] }, { "cell_type": "markdown", "id": "57d906ea-eb97-4b7f-ab0c-fcd9e2e94904", "metadata": { "tags": [] }, "source": [ "## 3.2 - Position of γάρ within a clause\n", "##### [Back to TOC](#TOC)" ] }, { "cell_type": "markdown", "id": "bc04d82f-7ff2-443c-8515-76f154b44b21", "metadata": {}, "source": [ "The conjunction γάρ is generally postpositive, appearing as the second word in a (sub)clause of the surface text. Its primary function is to provide explanation or justification for a statement. This script determines how often γάρ occurs at each position within a clause (wordgroup)." 
] }, { "cell_type": "code", "execution_count": 21, "id": "8451505b-682f-44e3-96f3-d1455220ac52", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of occurrences of γάρ: 1038\n" ] }, { "data": { "text/markdown": [ "Position | Frequency | Percentage \n", " --- | --- | ---\n", " 2 | 959 | 92.39%\n", "3 | 74 | 7.13%\n", "4 | 4 | 0.39%\n", "5 | 1 | 0.10%\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from unidecode import unidecode\n", "\n", "def remove_punctuation(input_string):\n", "    # Punctuation characters to strip from the transliterated text\n", "    punctuation_chars = \".,*;\"\n", "\n", "    # Use str.translate to delete the punctuation characters\n", "    result_string = input_string.translate(str.maketrans(\"\", \"\", punctuation_chars))\n", "\n", "    return result_string\n", "\n", "# small function to find the position of a word\n", "def find_word_position(sentence, target_word):\n", "    words = sentence.split()\n", "    try:\n", "        # Adding 1 to make it more 'natural' (i.e. 1-based index)\n", "        position = words.index(target_word) + 1\n", "        return position\n", "    except ValueError:\n", "        # the following print reveals any occurrence of the target word that is not accounted for\n", "        print('NOT:', sentence)\n", "        return -1  # Word not found in the sentence\n", "\n", "target_word = unidecode('γάρ')\n", "position_frequency = {}\n", "number_results = 0\n", "\n", "# GarResult is a list of tuples, each holding the node number of one matching word\n", "for word in GarResult:\n", "    # the first upward node of the word is taken as its parent wordgroup\n", "    parent_wg = L.u(word[0])[0]\n", "    number_results += 1\n", "    # transliterated text of the parent wordgroup with punctuation removed\n", "    parent_wg_text = remove_punctuation(unidecode(T.text(parent_wg)))\n", "    position = find_word_position(parent_wg_text, target_word)\n", "    # Check if the position is found\n", "    if position != -1:\n", "        # Update the frequency dictionary\n", "        position_frequency[position] = position_frequency.get(position, 0) + 1\n", "\n", "print('Total number of occurrences of γάρ:', number_results)\n", "\n", "# Calculate percentages\n", "total_positions = sum(position_frequency.values())\n", "position_percentage = {pos: count / total_positions * 100 for pos, count in position_frequency.items()}\n", "\n", "# Print the table\n", "table_output=\"Position | Frequency | Percentage \\n --- | --- | ---\\n \"\n", "for pos in sorted(position_percentage.keys()):\n", "    table_output +=f\"{pos} | {position_frequency.get(pos, 0)} | {position_percentage.get(pos, 0):.2f}%\\n\"\n", "N1904.dm(table_output)" ] }, { "cell_type": "markdown", "id": "88d1c3bd-0a94-4b39-9e6f-c913f2234eb7", "metadata": { "tags": [] }, "source": [ "## 3.3 - Position of δέ within a clause\n", "##### [Back to TOC](#TOC)" ] }, { "cell_type": "markdown", "id": "3db84821-80c5-4af4-812a-27d8c0664181", "metadata": {}, "source": [ "The conjunction δέ is generally postpositive, appearing as the second word in a (sub)clause of the surface text. 
Although its functions are diverse, it plays a crucial role in the structure and flow of Greek sentences. This script determines how often δέ occurs at each position within a clause (wordgroup)." ] }, { "cell_type": "code", "execution_count": 22, "id": "7c5fb2a9-3862-4d39-a92c-6e6400afcee5", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of occurrences of δέ: 2787\n" ] }, { "data": { "text/markdown": [ "Position | Frequency | Percentage \n", " --- | --- | ---\n", " 1 | 1 | 0.04%\n", "2 | 2687 | 96.41%\n", "3 | 75 | 2.69%\n", "4 | 14 | 0.50%\n", "5 | 2 | 0.07%\n", "6 | 1 | 0.04%\n", "7 | 3 | 0.11%\n", "9 | 2 | 0.07%\n", "11 | 1 | 0.04%\n", "12 | 1 | 0.04%\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from unidecode import unidecode\n", "\n", "def remove_punctuation(input_string):\n", "    # Punctuation characters to strip from the transliterated text\n", "    punctuation_chars = \".,*;\"\n", "\n", "    # Use str.translate to delete the punctuation characters\n", "    result_string = input_string.translate(str.maketrans(\"\", \"\", punctuation_chars))\n", "\n", "    return result_string\n", "\n", "def fix_abbreviated(input_string):\n", "    # restore the elided form d' to the full form de\n", "    fixed_string = input_string.replace(\"d'\", \"de\")\n", "    return fixed_string\n", "\n", "\n", "# small function to find the position of a word\n", "def find_word_position(sentence, target_word):\n", "    words = sentence.split()\n", "    try:\n", "        # Adding 1 to make it more 'natural' (i.e. 1-based index)\n", "        position = words.index(target_word) + 1\n", "        return position\n", "    except ValueError:\n", "        # the following print reveals any occurrence of 'de' that is not accounted for\n", "        print('NOT:', sentence)\n", "        return -1  # Word not found in the sentence\n", "\n", "target_word = unidecode('δέ')\n", "position_frequency = {}\n", "number_results = 0\n", "\n", "# DeResult is a list of tuples, each holding the node number of one matching word\n", "for word in DeResult:\n", "    # the first upward node of the word is taken as its parent wordgroup\n", "    parent_wg = L.u(word[0])[0]\n", "    number_results += 1\n", "    # transliterated text of the parent wordgroup with punctuation removed and abbreviations 'repaired'\n", "    parent_wg_text = fix_abbreviated(remove_punctuation(unidecode(T.text(parent_wg))))\n", "    position = find_word_position(parent_wg_text, target_word)\n", "    # Check if the position is found\n", "    if position != -1:\n", "        # Update the frequency dictionary\n", "        position_frequency[position] = position_frequency.get(position, 0) + 1\n", "\n", "print('Total number of occurrences of δέ:', number_results)\n", "\n", "# Calculate percentages\n", "total_positions = sum(position_frequency.values())\n", "position_percentage = {pos: count / total_positions * 100 for pos, count in position_frequency.items()}\n", "\n", "# Print the table\n", "table_output=\"Position | Frequency | Percentage \\n --- | --- | ---\\n \"\n", "for pos in sorted(position_percentage.keys()):\n", "    table_output +=f\"{pos} | {position_frequency.get(pos, 0)} | {position_percentage.get(pos, 0):.2f}%\\n\"\n", "N1904.dm(table_output)" ] }, { "cell_type": "markdown", "id": "4044bcf8-e5ac-4e34-859b-ba1d942b3bb9", "metadata": {}, "source": [ "## 3.4 - Position of μέν within a clause\n", "##### [Back to TOC](#TOC)" ] }, { "cell_type": "markdown", "id": "29b7599b-80b1-4491-8a08-9024a4c293a3", "metadata": {}, "source": [ "The conjunction μέν is generally postpositive, appearing as the second word in a (sub)clause of the surface text. 
Often used in contrast with δέ, μέν does not have a direct English equivalent but is used to set up a contrast or comparison, functioning similarly to \"on the one hand.\" This script determines how often μέν occurs at each position within a clause (wordgroup)." ] }, { "cell_type": "code", "execution_count": 23, "id": "a93f43ba-c33c-463d-a031-72597e9c6606", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of occurrences of μέν: 180\n" ] }, { "data": { "text/markdown": [ "Position | Frequency | Percentage \n", " --- | --- | ---\n", " 1 | 2 | 1.11%\n", "2 | 151 | 83.89%\n", "3 | 18 | 10.00%\n", "4 | 4 | 2.22%\n", "5 | 1 | 0.56%\n", "6 | 3 | 1.67%\n", "7 | 1 | 0.56%\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from unidecode import unidecode\n", "\n", "def remove_punctuation(input_string):\n", "    # Punctuation characters to strip from the transliterated text\n", "    punctuation_chars = \".,*;\"\n", "\n", "    # Use str.translate to delete the punctuation characters\n", "    result_string = input_string.translate(str.maketrans(\"\", \"\", punctuation_chars))\n", "\n", "    return result_string\n", "\n", "# small function to find the position of a word\n", "def find_word_position(sentence, target_word):\n", "    words = sentence.split()\n", "    try:\n", "        # Adding 1 to make it more 'natural' (i.e. 1-based index)\n", "        position = words.index(target_word) + 1\n", "        return position\n", "    except ValueError:\n", "        # the following print reveals any occurrence of the target word that is not accounted for\n", "        print('NOT:', sentence)\n", "        return -1  # Word not found in the sentence\n", "\n", "target_word = unidecode('μέν')\n", "position_frequency = {}\n", "number_results = 0\n", "\n", "# MenResult is a list of tuples, each holding the node number of one matching word\n", "for word in MenResult:\n", "    # the first upward node of the word is taken as its parent wordgroup\n", "    parent_wg = L.u(word[0])[0]\n", "    number_results += 1\n", "    # transliterated text of the parent wordgroup with punctuation removed\n", "    parent_wg_text = remove_punctuation(unidecode(T.text(parent_wg)))\n", "    position = find_word_position(parent_wg_text, target_word)\n", "    # Check if the position is found\n", "    if position != -1:\n", "        # Update the frequency dictionary\n", "        position_frequency[position] = position_frequency.get(position, 0) + 1\n", "\n", "print('Total number of occurrences of μέν:', number_results)\n", "\n", "# Calculate percentages\n", "total_positions = sum(position_frequency.values())\n", "position_percentage = {pos: count / total_positions * 100 for pos, count in position_frequency.items()}\n", "\n", "# Print the table\n", "table_output=\"Position | Frequency | Percentage \\n --- | --- | ---\\n \"\n", "for pos in sorted(position_percentage.keys()):\n", "    table_output +=f\"{pos} | {position_frequency.get(pos, 0)} | {position_percentage.get(pos, 0):.2f}%\\n\"\n", "N1904.dm(table_output)" ] }, { "cell_type": "markdown", "id": "2209de87-f1ba-441d-9e12-e0d265d24d03", "metadata": {}, "source": [ "# 4 - Attribution and footnotes\n", "##### [Back to TOC](#TOC)\n", "\n", "#### Footnotes:\n", "\n", "1 Porter, Stanley E., Jeffrey T. Reed, and Matthew Brook O’Donnell. *Fundamentals of New Testament Greek* (Grand Rapids, MI; Cambridge: William B. Eerdmans Publishing Company, 2010), 181." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }