# Position of generally postpositive conjunctions in a clause (Nestle1904LFT)

## Table of content <a class="anchor" id="TOC"></a>
* <a href="#bullet1">1 - Introduction</a>
* <a href="#bullet2">2 - Load Text-Fabric app and data</a>
* <a href="#bullet3">3 - Performing the queries</a>
    * <a href="#bullet3x1">3.1 - Identifying the occurences of the lemmata</a>
    * <a href="#bullet3x2">3.2 - Position of conjunction Œ≥Œ¨œÅ within a clause</a>
    * <a href="#bullet3x3">3.3 - Position of conjunction Œ¥Œ≠ within a clause</a>
    * <a href="#bullet3x4">3.4 - Position of conjunction ŒºŒ≠ŒΩ within a clause</a>
    * <a href="#bullet3x4">3.4 - Position of conjunction Œø·ΩñŒΩ within a clause</a>
* <a href="#bullet4">4 - Attribution and footnotes</a>

# 1 - Introduction <a class="anchor" id="bullet1"></a>
##### [Back to TOC](#TOC)

In ancient Greek, postpositive conjunctions like Œ¥Œ≠ and Œ≥Œ¨œÅ often occupy the second position in a (sub)clause, following the first significant word. This placement not only structures the syntax but also subtly nuances the meaning and flow of the text. This notebook determines the positional frequency of these conjunctions within a (sub)clause within the corpus of the Greek New Testament (based upon the LowFat treebank).

According to Stanley E. Porter *et.al.* the following conjuctions can be regarded to be postpositive: Œ≥Œ¨œÅ, Œ¥Œ≠, ŒºŒ≠ŒΩ, and Œø·ΩñŒΩ.<a href="#note1"><sup>1</sup></a>


# 2 - Load Text-Fabric app and data <a class="anchor" id="bullet2"></a>
##### [Back to TOC](#TOC)

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use

In [3]:
# load the N1904 app and data
N1904 = use ("tonyjurg/Nestle1904LFT", version="0.6", hoist=globals())

**Locating corpus resources ...**

Name,# of nodes,# slots / node,% coverage
book,27,5102.93,100
chapter,260,529.92,100
verse,7943,17.35,100
sentence,8011,17.2,100
wg,105430,6.85,524
word,137779,1.0,100


In [19]:
# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display of tables with notebook viewer)
N1904.dh(N1904.getCss())

# 3 - Performing the queries <a class="anchor" id="bullet3"></a>
##### [Back to TOC](#TOC)

## 3.1 - Identifying the occurences of the lemmata<a class="anchor" id="bullet3x1"></a>
##### [Back to TOC](#TOC)

Identifing the occurences of the conjunction under investigation can be done using a straight forward query. This will provide us with the node numbers of the word nodes containing the various lemmata which will allow for further processing.

In [20]:
# Define the query template
GarQuery= '''
word lemma=Œ≥Œ¨œÅ
'''

DeQuery= '''
word lemma=Œ¥Œ≠
'''

MenQuery= '''
word lemma=ŒºŒ≠ŒΩ
'''

OunQuery='''
word lemma=Œø·ΩñŒΩ
'''

# The following will create a list containing ordered tuples consisting of node numbers of the items as they appear in the query
print('Œ≥Œ¨œÅ:',end='')
GarResult = N1904.search(GarQuery)
print('Œ¥Œ≠: ',end='')
DeResult = N1904.search(DeQuery)
print('ŒºŒ≠ŒΩ:',end='')
MenResult = N1904.search(MenQuery)
print('Œø·ΩñŒΩ:',end='')
OunResult = N1904.search(OunQuery)

Œ≥Œ¨œÅ:  0.10s 1038 results
Œ¥Œ≠:   0.11s 2787 results
ŒºŒ≠ŒΩ:  0.10s 180 results
Œø·ΩñŒΩ:  0.10s 496 results


## 3.2 - Position of Œ≥Œ¨œÅ within a clause<a class="anchor" id="bullet3x2"></a>
##### [Back to TOC](#TOC)

The conjunctions Œ≥Œ¨œÅ is generally postpositive, appearing as the second word in a (sub)clause of the surface text. Its primary function is to provide explanation or justification for a statement. This script will determine the frequency of the positions of the conjunction Œ≥Œ¨œÅ within a clause (wordgroup).

In [21]:
import unicodedata
import string
from unidecode import unidecode

def remove_punctuation(input_string):
    # Create a string of all punctuation characters
    punctuation_chars = ".,*;"
    
    # Use str.translate to replace punctuation characters with empty string
    result_string = input_string.translate(str.maketrans("", "", punctuation_chars))
    
    return result_string

# small function to find position of a word
def find_word_position(sentence, target_word):
    words = sentence.split()
    try:
        position = words.index(target_word) + 1  
        # Adding 1 to make it more 'natural' (i.e. 1-based index)
        return position
    except ValueError:
        # following print reveals any occurence of 'de' which is not accounted for
        print ('NOT:',sentence)
        return -1  # Word not found in the sentence
    
target_word = unidecode('Œ≥Œ¨œÅ')
position_frequency = {}
number_results=0
    
# DeResult is a list of tuples each consisting of two integers, we need the second one  _,
for word in GarResult:
    # get first item from tuple of integers 
    parent_wg=L.u(word[0])[0]
    number_results+=1
    # decoded text of the parent wordgroup with punctuations removed and abreviations 'repaired'
    parent_wg_text=remove_punctuation(unidecode(T.text(parent_wg)))
    position = find_word_position(parent_wg_text, target_word)
    # Check if the position is found
    if position != -1:
        # Update the frequency dictionary
        position_frequency[position] = position_frequency.get(position, 0) + 1

print('Total number of occurances of Œ≥Œ¨œÅ:',number_results)

# Calculate percentages
total_positions = sum(position_frequency.values())
position_percentage = {pos: count / total_positions * 100 for pos, count in position_frequency.items()}

# Print the table
table_output="Position | Frequency | Percentage \n --- | --- | ---\n "
for pos in sorted(position_percentage.keys()):
   table_output +=f"{pos} | {position_frequency.get(pos, 0)} | {position_percentage.get(pos, 0):.2f}%\n"
N1904.dm(table_output)

Total number of occurances of Œ≥Œ¨œÅ: 1038


Position | Frequency | Percentage 
 --- | --- | ---
 2 | 959 | 92.39%
3 | 74 | 7.13%
4 | 4 | 0.39%
5 | 1 | 0.10%


## 3.3 - Position of Œ¥Œ≠ within a clause<a class="anchor" id="bullet3x3"></a>
##### [Back to TOC](#TOC)

The conjunctions Œ¥Œ≠ is generally postpositive, appearing as the second word in a (sub)clause of the surface text. Although its functions are diverse, it plays a crucial role in the structure and flow of Greek sentences. This script will determine the frequency of the positions of the conjunction Œ¥Œ≠ within a clause (wordgroup).

In [22]:
import unicodedata
import string
from unidecode import unidecode

def remove_punctuation(input_string):
    # Create a string of all punctuation characters
    punctuation_chars = ".,*;"
    
    # Use str.translate to replace punctuation characters with empty string
    result_string = input_string.translate(str.maketrans("", "", punctuation_chars))
    
    return result_string

def fix_abbreviated(input_string):
    fixed_string = input_string.replace("d'", "de")
    return fixed_string


# small function to find position of a word
def find_word_position(sentence, target_word):
    words = sentence.split()
    try:
        position = words.index(target_word) + 1  
        # Adding 1 to make it more 'natural' (i.e. 1-based index)
        return position
    except ValueError:
        # following print reveals any occurence of 'de' which is not accounted for
        print ('NOT:',sentence)
        return -1  # Word not found in the sentence
    
target_word = unidecode('Œ¥Œ≠')
position_frequency = {}
number_results=0
    
# DeResult is a list of tuples each consisting of two integers, we need the second one  _,
for word in DeResult:
    # get first item from tuple of integers 
    parent_wg=L.u(word[0])[0]
    number_results+=1
    # decoded text of the parent wordgroup with punctuations removed and abreviations 'repaired'
    parent_wg_text=fix_abbreviated(remove_punctuation(unidecode(T.text(parent_wg))))
    position = find_word_position(parent_wg_text, target_word)
    # Check if the position is found
    if position != -1:
        # Update the frequency dictionary
        position_frequency[position] = position_frequency.get(position, 0) + 1

print('Total number of occurances of Œ¥Œ≠:',number_results)

# Calculate percentages
total_positions = sum(position_frequency.values())
position_percentage = {pos: count / total_positions * 100 for pos, count in position_frequency.items()}

# Print the table
table_output="Position | Frequency | Percentage \n --- | --- | ---\n "
for pos in sorted(position_percentage.keys()):
   table_output +=f"{pos} | {position_frequency.get(pos, 0)} | {position_percentage.get(pos, 0):.2f}%\n"
N1904.dm(table_output)

Total number of occurances of Œ¥Œ≠: 2787


Position | Frequency | Percentage 
 --- | --- | ---
 1 | 1 | 0.04%
2 | 2687 | 96.41%
3 | 75 | 2.69%
4 | 14 | 0.50%
5 | 2 | 0.07%
6 | 1 | 0.04%
7 | 3 | 0.11%
9 | 2 | 0.07%
11 | 1 | 0.04%
12 | 1 | 0.04%


## 3.4 - Position of ŒºŒ≠ŒΩ within a clause<a class="anchor" id="bullet3x4"></a>
##### [Back to TOC](#TOC)

The conjunctions ŒºŒ≠ŒΩ is generally postpositive, appearing as the second word in a (sub)clause of the surface text. Often used in contrast with Œ¥Œ≠, ŒºŒ≠ŒΩ does not have a direct English equivalent but is used to set up a contrast or comparison, functioning similarly to "on the one hand." This script will determine the frequency of the positions of the conjunction ŒºŒ≠ŒΩ within a clause (wordgroup).

In [23]:
import unicodedata
import string
from unidecode import unidecode

def remove_punctuation(input_string):
    # Create a string of all punctuation characters
    punctuation_chars = ".,*;"
    
    # Use str.translate to replace punctuation characters with empty string
    result_string = input_string.translate(str.maketrans("", "", punctuation_chars))
    
    return result_string

# small function to find position of a word
def find_word_position(sentence, target_word):
    words = sentence.split()
    try:
        position = words.index(target_word) + 1  
        # Adding 1 to make it more 'natural' (i.e. 1-based index)
        return position
    except ValueError:
        # following print reveals any occurence of 'de' which is not accounted for
        print ('NOT:',sentence)
        return -1  # Word not found in the sentence
    
target_word = unidecode('ŒºŒ≠ŒΩ')
position_frequency = {}
number_results=0
    
# DeResult is a list of tuples each consisting of two integers, we need the second one  _,
for word in MenResult:
    # get first item from tuple of integers 
    parent_wg=L.u(word[0])[0]
    number_results+=1
    # decoded text of the parent wordgroup with punctuations removed and abreviations 'repaired'
    parent_wg_text=remove_punctuation(unidecode(T.text(parent_wg)))
    position = find_word_position(parent_wg_text, target_word)
    # Check if the position is found
    if position != -1:
        # Update the frequency dictionary
        position_frequency[position] = position_frequency.get(position, 0) + 1

print('Total number of occurances of ŒºŒ≠ŒΩ:',number_results)

# Calculate percentages
total_positions = sum(position_frequency.values())
position_percentage = {pos: count / total_positions * 100 for pos, count in position_frequency.items()}

# Print the table
table_output="Position | Frequency | Percentage \n --- | --- | ---\n "
for pos in sorted(position_percentage.keys()):
   table_output +=f"{pos} | {position_frequency.get(pos, 0)} | {position_percentage.get(pos, 0):.2f}%\n"
N1904.dm(table_output)

Total number of occurances of ŒºŒ≠ŒΩ: 180


Position | Frequency | Percentage 
 --- | --- | ---
 1 | 2 | 1.11%
2 | 151 | 83.89%
3 | 18 | 10.00%
4 | 4 | 2.22%
5 | 1 | 0.56%
6 | 3 | 1.67%
7 | 1 | 0.56%


# 4 - Attribution and footnotes<a class="anchor" id="bullet4"></a>
##### [Back to TOC](#TOC)

#### Footnotes:

<a class="anchor" id="note1"></a><sup>1</sup> Porter, Stanley E., Jeffrey T. Reed, and Matthew Brook O‚ÄôDonnell. *Fundamentals of New Testament Greek* (Grand Rapids, MI; Cambridge: William B. Eerdmans Publishing Company, 2010), 181.