"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.table(results, end=5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we do a pretty display, the `sense` feature shows up."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 1"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.show(results, start=1, end=1, withNodes=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Lingo heads\n",
"If you click the triangle before **etcbc/lingo/heads/tf** you see what features it contributes.\n",
"Unfortunately, the authors have not provided a description of this feature, but if you click\n",
"on the triangle after *heads* none, you see where the feature comes from and who has made it.\n",
"\n",
"Moreover, the fact that *heads* is in italics makes clear that it is an edge feature.\n",
"\n",
"Let's use it in a query:\n",
"Now, `heads` is an edge feature, we cannot directly make it visible in pretty displays, but we can use it in queries.\n",
"\n",
"We also want to make the feature `sense` visible, so we mention the feature in the query, without restricting the results."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.42s 402 results\n"
]
}
],
"source": [
"results = A.search(\n",
" \"\"\"\n",
"book book=Genesis\n",
" chapter chapter=1\n",
" clause\n",
" phrase\n",
" -heads> word sense*\n",
"\"\"\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 1"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"chapter Genesis 1
book=Genesischapter=1
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"verse
book=Genesischapter=1
sentence 1
clause xQtX NA
phrase PP Time
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase VP Pred
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"sense=d-
phrase NP Subj
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase PP Objc
heads•\n",
"↦\n",
" \n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"result 2"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"chapter Genesis 1
book=Genesischapter=1
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"verse
book=Genesischapter=1
sentence 1
clause xQtX NA
phrase PP Time
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase VP Pred
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"sense=d-
phrase NP Subj
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase PP Objc
heads•\n",
"↦\n",
" \n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.show(results, start=1, end=2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note how the words that are ***heads*** of their phrases are highlighted within their phrases."
]
},
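{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the notion of an edge feature concrete, here is a minimal sketch in plain Python.\n",
"It assumes a toy model in which an edge feature maps each source node to the set of nodes it points to.\n",
"The node numbers and the helpers `heads_of` and `phrase_head_pairs` are hypothetical; they are not part of the Text-Fabric API.\n",
"\n",
"```python\n",
"# Toy model of an edge feature: each source node maps to its target nodes.\n",
"# All node numbers below are invented for illustration.\n",
"heads = {\n",
"    100: {2},      # phrase 100 has word 2 as its head\n",
"    101: {3},\n",
"    102: {5, 7},   # a phrase may have more than one head\n",
"}\n",
"\n",
"\n",
"# Return the head words of a phrase in the toy model, in node order.\n",
"def heads_of(phrase):\n",
"    return sorted(heads.get(phrase, set()))\n",
"\n",
"\n",
"# Enumerate all (phrase, word) pairs connected by a heads edge.\n",
"def phrase_head_pairs():\n",
"    return [(p, w) for p in sorted(heads) for w in sorted(heads[p])]\n",
"\n",
"\n",
"print(heads_of(102))\n",
"print(phrase_head_pairs())\n",
"```\n",
"\n",
"The second helper mirrors what the query pattern `phrase -heads> word` asks for: every phrase paired with each word it points to."
]
},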
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Participants\n",
"\n",
"Now we are going to add another promising module, provided by Christian Canu Højgaard, from this repo:\n",
"[participants](https://github.com/ch-jensen/participants).\n",
"\n",
"Let's do it in the straightforward way:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"**Locating corpus resources ...**"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"app: ~/text-fabric-data/github/ETCBC/bhsa/app"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/bhsa/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/lingo/heads/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/valence/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"The requested data is not available offline\n",
"\t~/text-fabric-data/github/ch-jensen/participants/actor/tf/2021 not found\n",
"rate limit is 5000 requests per hour, with 5000 left for this hour\n",
"\tconnecting to online GitHub repo ch-jensen/participants ... connected\n",
"No directory /actor/tf/2021 in #9671910a329c069cfd3d366526ea816de57666dcWill try something else\n",
"\tFailed\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"No directory /actor/tf/2021 in #9671910a329c069cfd3d366526ea816de57666dc\tFailed\n"
]
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/phono/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/parallels/tf/2021"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"There were problems with loading data.\n",
"The TF API has not been loaded!\n",
"The app \"ETCBC/bhsa\" will not work!\n"
]
}
],
"source": [
"A = use(\n",
" 'ETCBC/bhsa',\n",
" mod=(\n",
" \"ETCBC/lingo/heads/tf\",\n",
" \"ETCBC/valence/tf\",\n",
" \"ch-jensen/participants/actor/tf\"\n",
" ),\n",
" hoist=globals(),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The features are not there!\n",
"\n",
"If we have a look on GitHub in this repo we see under\n",
"[actor/tf](https://github.com/ch-jensen/participants/tree/master/actor/tf)\n",
"the directory `c` only. Christian has produced his features against version `c` of the BHSA.\n",
"\n",
"Ok, then we go back, and run our command for version `c`."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"**Locating corpus resources ...**"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"app: ~/text-fabric-data/github/ETCBC/bhsa/app"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/bhsa/tf/c"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/lingo/heads/tf/c"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/valence/tf/c"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ch-jensen/participants/actor/tf/c"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/phono/tf/c"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"data: ~/text-fabric-data/github/ETCBC/parallels/tf/c"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" TF: TF API 12.1.7, ETCBC/bhsa/app v3, Search Reference
\n",
" Data: ETCBC - bhsa c, Character table, Feature docs
\n",
" Node types
\n",
"\n",
" \n",
" Name | \n",
" # of nodes | \n",
" # slots / node | \n",
" % coverage | \n",
"
\n",
"\n",
"\n",
" book | \n",
" 39 | \n",
" 10938.05 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" chapter | \n",
" 929 | \n",
" 459.19 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" lex | \n",
" 9233 | \n",
" 46.20 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" verse | \n",
" 23213 | \n",
" 18.38 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" half_verse | \n",
" 45180 | \n",
" 9.44 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" sentence | \n",
" 63727 | \n",
" 6.69 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" sentence_atom | \n",
" 64525 | \n",
" 6.61 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" clause | \n",
" 88121 | \n",
" 4.84 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" clause_atom | \n",
" 90688 | \n",
" 4.70 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" phrase | \n",
" 253207 | \n",
" 1.68 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" phrase_atom | \n",
" 267541 | \n",
" 1.59 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" subphrase | \n",
" 113812 | \n",
" 1.42 | \n",
" 38 | \n",
"
\n",
"\n",
"\n",
" word | \n",
" 426584 | \n",
" 1.00 | \n",
" 100 | \n",
"
\n",
"
\n",
" Sets: no custom sets
\n",
" Features:
\n",
"Parallel Passages
\n",
" \n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"ETCBC/lingo/heads/tf
\n",
" \n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"Phonetic Transcriptions
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"ETCBC/valence/tf
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
corrected phrase function, only present for phrases that were in a correction sheet\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
whether the phrase function has been manually corrected\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
constituent role main classification\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
additional lexical characteristics\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
default value before enrichment logic has been applied\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
verbal function main classification\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
whether the generated enrichment features have been manually changed\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
additional semantic characteristics\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
sense label verb occurrences, computed by the flowchart algorithm, see https://github.com/ETCBC/valence/wiki/Legend\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
verbal valence main classification\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"ch-jensen/participants/actor/tf
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
Participant references for words, subphrases and phrases. The references are adapted from Eep Talstra's work on participant tracking. http://doi.org/10.5281/zenodo.1479491\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
Participant references for pronominal suffixes. The references are adapted from Eep Talstra's work on participant tracking. http://doi.org/10.5281/zenodo.1479491\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
Edges to co-referring actors on chapter-level. The references are adapted from Eep Talstra's work on participant tracking. http://doi.org/10.5281/zenodo.1479491\n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" Settings:
specified
- apiVersion:
3
- appName:
ETCBC/bhsa
- appPath:
/Users/me/text-fabric-data/github/ETCBC/bhsa/app
- commit:
gb112c161cfd21eae403d51a2733740d8743460e7
- css:
''
dataDisplay:
exampleSectionHtml:
<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
excludedFeatures:
g_uvf_utf8
g_vbs
kq_hybrid
languageISO
g_nme
lex0
is_root
g_vbs_utf8
g_uvf
dist
root
suffix_person
g_vbe
dist_unit
suffix_number
distributional_parent
kq_hybrid_utf8
crossrefSET
instruction
g_prs
lexeme_count
rank_occ
g_pfm_utf8
freq_occ
crossrefLCS
functional_parent
g_pfm
g_nme_utf8
g_vbe_utf8
kind
g_prs_utf8
suffix_gender
mother_object_type
noneValues:
absent
n/a
none
unknown
- no value
NA
docs:
- docBase:
{docRoot}/{repo}
- docExt:
''
- docPage:
''
- docRoot:
https://{org}.github.io
- featurePage:
0_home
- interfaceDefaults:
{}
- isCompatible:
True
- local:
local
- localDir:
/Users/me/text-fabric-data/github/ETCBC/bhsa/_temp
provenanceSpec:
- corpus:
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
- doi:
10.5281/zenodo.1007624
- extraData:
ner
moduleSpecs:
:
- backend: no value
- corpus:
Phonetic Transcriptions
docUrl:
https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
- doi:
10.5281/zenodo.1007636
- org:
ETCBC
- relative:
/tf
- repo:
phono
:
- backend: no value
- corpus:
Parallel Passages
docUrl:
https://nbviewer.jupyter.org/github/ETCBC/parallels/blob/master/programs/parallels.ipynb
- doi:
10.5281/zenodo.1007642
- org:
ETCBC
- relative:
/tf
- repo:
parallels
- org:
ETCBC
- relative:
/tf
- repo:
bhsa
- version:
c
- webBase:
https://shebanq.ancient-data.org/hebrew
- webHint:
Show this on SHEBANQ
- webLang:
la
- webLexId:
True
webUrl:
{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
- webUrlLex:
{webBase}/word?version={version}&id=<lid>
- release:
v1.8.1
typeDisplay:
clause:
- label:
{typ} {rela}
- style:
''
clause_atom:
- hidden:
True
- label:
{code}
- level:
1
- style:
''
half_verse:
- hidden:
True
- label:
{label}
- style:
''
- verselike:
True
lex:
- featuresBare:
gloss
- label:
{voc_lex_utf8}
- lexOcc:
word
- style:
orig
- template:
{voc_lex_utf8}
phrase:
- label:
{typ} {function}
- style:
''
phrase_atom:
- hidden:
True
- label:
{typ} {rela}
- level:
1
- style:
''
sentence:
sentence_atom:
- hidden:
True
- label:
{number}
- level:
1
- style:
''
subphrase:
- hidden:
True
- label:
{number}
- style:
''
word:
- features:
pdp vs vt
- featuresBare:
lex:gloss
- writing:
hbo
\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A = use(\n",
" 'ETCBC/bhsa',\n",
" version=\"c\",\n",
" mod=(\n",
" \"ETCBC/lingo/heads/tf\",\n",
" \"ETCBC/valence/tf\",\n",
" \"ch-jensen/participants/actor/tf\"\n",
" ),\n",
" hoist=globals(),\n",
")"
]
},
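{
"cell_type": "markdown",
"metadata": {},
"source": [
"The version mismatch we ran into above can be reasoned about before loading anything.\n",
"The following is a minimal sketch, assuming you have noted by hand which version directories each module offers.\n",
"The listing only contains what this notebook has shown so far, and the helpers `common_versions` and `missing` are hypothetical, not Text-Fabric functions.\n",
"\n",
"```python\n",
"# Version directories per module, as far as this notebook shows them.\n",
"# The listing is incomplete on purpose: the real repos may hold more versions.\n",
"available = {\n",
"    'ETCBC/bhsa/tf': {'2021', 'c'},\n",
"    'ETCBC/lingo/heads/tf': {'2021', 'c'},\n",
"    'ETCBC/valence/tf': {'2021', 'c'},\n",
"    'ch-jensen/participants/actor/tf': {'c'},\n",
"}\n",
"\n",
"\n",
"# Return the versions that every module in the listing provides.\n",
"def common_versions(listing):\n",
"    return sorted(set.intersection(*listing.values()))\n",
"\n",
"\n",
"# Return the modules that lack the given version.\n",
"def missing(listing, version):\n",
"    return sorted(m for m, vs in listing.items() if version not in vs)\n",
"\n",
"\n",
"print(common_versions(available))\n",
"print(missing(available, '2021'))\n",
"```\n",
"\n",
"A version that appears in the intersection is one you can pass as `version=` without any module failing to load for lack of data."
]
},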
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" TF: TF API 12.1.7, ETCBC/bhsa/app v3, Search Reference
\n",
" Data: ETCBC - bhsa 2021, Character table, Feature docs
\n",
" Node types
\n",
"\n",
" \n",
" Name | \n",
" # of nodes | \n",
" # slots / node | \n",
" % coverage | \n",
"
\n",
"\n",
"\n",
" book | \n",
" 39 | \n",
" 10938.21 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" chapter | \n",
" 929 | \n",
" 459.19 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" lex | \n",
" 9230 | \n",
" 46.22 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" verse | \n",
" 23213 | \n",
" 18.38 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" half_verse | \n",
" 45179 | \n",
" 9.44 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" sentence | \n",
" 63717 | \n",
" 6.70 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" sentence_atom | \n",
" 64514 | \n",
" 6.61 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" clause | \n",
" 88131 | \n",
" 4.84 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" clause_atom | \n",
" 90704 | \n",
" 4.70 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" phrase | \n",
" 253203 | \n",
" 1.68 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" phrase_atom | \n",
" 267532 | \n",
" 1.59 | \n",
" 100 | \n",
"
\n",
"\n",
"\n",
" subphrase | \n",
" 113850 | \n",
" 1.42 | \n",
" 38 | \n",
"
\n",
"\n",
"\n",
" word | \n",
" 426590 | \n",
" 1.00 | \n",
" 100 | \n",
"
\n",
"
\n",
" Sets: no custom sets
\n",
" Features:
\n",
"Parallel Passages
\n",
" \n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" 🆗 links between similar passages
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ book name in Latin (Genesis; Numeri; Reges1; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ book name in amharic (ኣማርኛ)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" ✅ chapter number (1; 2; 3; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" ✅ identifier of a clause atom relationship (0; 74; 367; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ determinedness of phrase(atom) (det; und; NA.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" ✅ frequency of lexemes
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" 🆗 english translation of lexeme (beginning create god(s))
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ grammatical gender (m; f; NA; unknown.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ of word or lexeme (Hebrew; Aramaic.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ lexical set, subclassification of part-of-speech (card; ques; mult)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ⚠️ named entity type (pers; mens; gens; topo; ppde.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ grammatical number (sg; du; pl; NA; unknown.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" ✅ sequence number of an object within its context
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
"
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" 🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ preformative consonantal-transliterated (absent; n/a; J, ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ pronominal suffix gender (m; f; NA; unknown.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ pronominal suffix number (sg; du; pl; NA; unknown.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ pronominal suffix person (p1; p2; p3; NA; unknown.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ grammatical person (p1; p2; p3; NA; unknown.)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ word pointed-transliterated masoretic reading correction
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ interword material -pointed-transliterated (Masoretic correction)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ interword material -pointed-transliterated (Masoretic correction)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ word pointed-Hebrew masoretic reading correction
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" ✅ ranking of lexemes based on freqnuecy
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ part-of-speech (art; verb; subs; nmpr, ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ state of a noun (a (absolute); c (construct); e (emphatic).)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" ✅ clause atom: its level in the linguistic embedding
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ interword material pointed-transliterated (& 00 05 00_P ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ interword material pointed-Hebrew (־ ׃)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ verbal ending consonantal-transliterated (n/a; W; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ root formation consonantal-transliterated (absent; n/a; H; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
int
\n",
"\n",
"
\n",
" ✅ verse number
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ verbal stem (qal; piel; hif; apel; pael)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ✅ verbal tense (perf; impv; wayq; infc)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
" ✅ linguistic dependency between textual objects
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"Phonetic Transcriptions
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" 🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" 🆗 interword material in phonological transcription
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"etcbc/lingo/heads/tf
\n",
" \n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
none
\n",
"\n",
"
\n",
"
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
"etcbc/valence/tf
\n",
" \n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ corrected phrase function, only present for phrases that were in a correction sheet
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ whether the phrase function has been manually corrected
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ constituent role main classification
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ additional lexical characteristics
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ default value before enrichment logic has been applied
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ verbal function main classification
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ whether the generated enrichment features have been manually changed
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ additional semantic characteristics
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ sense label of verb occurrences (d-; i.; -p; d-; ...)
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
str
\n",
"\n",
"
\n",
" ❗️ verbal valence main classification
\n",
" \n",
" \n",
"\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" Settings:
specified
- apiVersion:
3
- appName:
ETCBC/bhsa
- appPath:
/Users/me/text-fabric-data/github/ETCBC/bhsa/app
- commit:
gb112c161cfd21eae403d51a2733740d8743460e7
- css:
''
dataDisplay:
exampleSectionHtml:
<code>Genesis 1:1</code> (use <a href=\"https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf\" target=\"_blank\">English book names</a>)
excludedFeatures:
g_uvf_utf8
g_vbs
kq_hybrid
languageISO
g_nme
lex0
is_root
g_vbs_utf8
g_uvf
dist
root
suffix_person
g_vbe
dist_unit
suffix_number
distributional_parent
kq_hybrid_utf8
crossrefSET
instruction
g_prs
lexeme_count
rank_occ
g_pfm_utf8
freq_occ
crossrefLCS
functional_parent
g_pfm
g_nme_utf8
g_vbe_utf8
kind
g_prs_utf8
suffix_gender
mother_object_type
noneValues:
absent
n/a
none
unknown
- no value
NA
docs:
- docBase:
{docRoot}/{repo}
- docExt:
''
- docPage:
''
- docRoot:
https://{org}.github.io
- featurePage:
0_home
- interfaceDefaults:
{}
- isCompatible:
True
- local:
local
- localDir:
/Users/me/text-fabric-data/github/ETCBC/bhsa/_temp
provenanceSpec:
- corpus:
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
- doi:
10.5281/zenodo.1007624
- extraData:
ner
moduleSpecs:
:
- backend: no value
- corpus:
Phonetic Transcriptions
docUrl:
https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
- doi:
10.5281/zenodo.1007636
- org:
ETCBC
- relative:
/tf
- repo:
phono
:
- backend: no value
- corpus:
Parallel Passages
docUrl:
https://nbviewer.jupyter.org/github/ETCBC/parallels/blob/master/programs/parallels.ipynb
- doi:
10.5281/zenodo.1007642
- org:
ETCBC
- relative:
/tf
- repo:
parallels
- org:
ETCBC
- relative:
/tf
- repo:
bhsa
- version:
2021
- webBase:
https://shebanq.ancient-data.org/hebrew
- webHint:
Show this on SHEBANQ
- webLang:
la
- webLexId:
True
webUrl:
{webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
- webUrlLex:
{webBase}/word?version={version}&id=<lid>
- release:
v1.8.1
typeDisplay:
clause:
- label:
{typ} {rela}
- style:
''
clause_atom:
- hidden:
True
- label:
{code}
- level:
1
- style:
''
half_verse:
- hidden:
True
- label:
{label}
- style:
''
- verselike:
True
lex:
- featuresBare:
gloss
- label:
{voc_lex_utf8}
- lexOcc:
word
- style:
orig
- template:
{voc_lex_utf8}
phrase:
- label:
{typ} {function}
- style:
''
phrase_atom:
- hidden:
True
- label:
{typ} {rela}
- level:
1
- style:
''
sentence:
sentence_atom:
- hidden:
True
- label:
{number}
- level:
1
- style:
''
subphrase:
- hidden:
True
- label:
{number}
- style:
''
word:
- features:
pdp vs vt
- featuresBare:
lex:gloss
- writing:
hbo
\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.header(allMeta=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": []
},
"source": [
"While this succeeded, there are scenarios where you have more trouble.\n",
"For example, you decide that you really, really need the bhsa data as in release 1.7.1.\n",
"\n",
"Then you discover that this does note work:\n",
"\n",
"```\n",
"A = use(\n",
" 'etcbc/bhsa',\n",
" version=\"c\",\n",
" checkout=\"v1.7.1\",\n",
" mod=(\"etcbc/lingo/heads/tf\" ,\"etcbc/valence/tf\", \"ch-jensen/participants/actor/tf\"), \n",
" hoist=globals(),\n",
")\n",
"```\n",
"\n",
"because the BHSA invokes two standard modules, `etcbc/phono/tf` and `etcbc/parallels/tf` and if you go to their\n",
"GitHub repos, you see that they do not have a release `v1.7.1`.\n",
"You have to walk through their releases and find one with the right data version.\n",
"Having found them, you can then get it all like this:\n",
"\n",
"```\n",
"A = use(\n",
" 'etcbc/bhsa',\n",
" version=\"c\",\n",
" checkout=\"v1.7.1\",\n",
" mod=(\n",
" \"etcbc/phono/tf:1.2\",\n",
" \"etcbc/parallels/tf:v1.2\",\n",
" \"etcbc/lingo/heads/tf\",\n",
" \"etcbc/valence/tf\",\n",
" \"ch-jensen/participants/actor/tf\",\n",
" ),\n",
" hoist=globals(),\n",
")\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Semantic actors\n",
"\n",
"Let's find out about *actor*.\n",
"\n",
"Again, we can click on the triangles and see information about the features.\n",
"Christian has provided descriptions in the metadata of the features.\n",
"\n",
"And we can look into the data itself."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"415"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fl = F.actor.freqList()\n",
"len(fl)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(('JHWH', 358),\n",
" ('BN JFR>L', 205),\n",
" ('>JC', 101),\n",
" ('2sm\"YOUSgmas\"', 67),\n",
" ('MCH', 60),\n",
" ('>RY', 58),\n",
" ('>TM', 45),\n",
" ('>X \"YOUSgmas\"', 36),\n",
" ('JFR>L', 35),\n",
" ('KHN', 33))"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fl[0:10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Which nodes have an actor feature?"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'phrase_atom', 'subphrase'}"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{F.otype.v(n) for n in N.walk() if F.actor.v(n)}"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.08s 2062 results\n"
]
}
],
"source": [
"results = A.search(\n",
" \"\"\"\n",
"phrase_atom actor\n",
"\"\"\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's show some of the rarer actor values:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.10s 30 results\n"
]
}
],
"source": [
"results = A.search(\n",
" \"\"\"\n",
"phrase_atom actor=KHN\n",
"\"\"\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"n | p | phrase_atom |
\n",
"1 | Leviticus 17:5 | אֶל־הַכֹּהֵ֑ן |
\n",
"2 | Leviticus 17:6 | זָרַ֨ק |
\n",
"3 | Leviticus 17:6 | הַכֹּהֵ֤ן |
\n",
"4 | Leviticus 17:6 | הִקְטִ֣יר |
\n",
"5 | Leviticus 19:22 | כִפֶּר֩ |
\n",
"6 | Leviticus 19:22 | הַכֹּהֵ֜ן |
\n",
"7 | Leviticus 21:1 | אֶל־הַכֹּהֲנִ֖ים |
\n",
"8 | Leviticus 21:1 | בְּנֵ֣י אַהֲרֹ֑ן |
\n",
"9 | Leviticus 21:5 | יִקְרְח֤וּ |
\n",
"10 | Leviticus 21:5 | יְגַלֵּ֑חוּ |
\n",
"11 | Leviticus 21:5 | יִשְׂרְט֖וּ |
\n",
"12 | Leviticus 21:6 | קְדֹשִׁ֤ים |
\n",
"13 | Leviticus 21:6 | יִהְיוּ֙ |
\n",
"14 | Leviticus 21:6 | יְחַלְּל֔וּ |
\n",
"15 | Leviticus 21:6 | הֵ֥ם |
\n",
"16 | Leviticus 21:6 | מַקְרִיבִ֖ם |
\n",
"17 | Leviticus 21:6 | הָ֥יוּ |
\n",
"18 | Leviticus 21:6 | קֹֽדֶשׁ׃ |
\n",
"19 | Leviticus 21:7 | יִקָּ֔חוּ |
\n",
"20 | Leviticus 21:7 | יִקָּ֑חוּ |
\n",
"21 | Leviticus 22:11 | כֹהֵ֗ן |
\n",
"22 | Leviticus 22:11 | יִקְנֶ֥ה |
\n",
"23 | Leviticus 22:14 | לַכֹּהֵ֖ן |
\n",
"24 | Leviticus 23:10 | אֶל־הַכֹּהֵֽן׃ |
\n",
"25 | Leviticus 23:11 | הֵנִ֧יף |
\n",
"26 | Leviticus 23:11 | יְנִיפֶ֖נּוּ |
\n",
"27 | Leviticus 23:11 | הַכֹּהֵֽן׃ |
\n",
"28 | Leviticus 23:20 | הֵנִ֣יף |
\n",
"29 | Leviticus 23:20 | הַכֹּהֵ֣ן׀ |
\n",
"30 | Leviticus 23:20 | לַכֹּהֵֽן׃ |
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.table(results)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 1"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.show(results, start=1, end=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We see no highlights!\n",
"That is because phrase atoms are hidden by default. So let's unhide:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"A.displaySetup(hiddenTypes=\"subphrase clause_atom sentence_atom half_verse\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next calls to `show()` will work as if `hiddenTypes=\"subphrase clause_atom sentence_atom half_verse\"` is passed to them. "
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 1"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"verse
sentence 7
clause xYqX Adju
phrase VP Pred
phrase_atom VP NA
actor=BN JFR>L
phrase NP Subj
phrase_atom NP NA
actor=BN JFR>L
phrase PP Objc
phrase_atom PP NA
actor=ZBX BN JFR>L
clause Ptcp Attr
phrase PPrP Subj
phrase_atom PPrP NA
actor=BN JFR>L
phrase VP PreC
phrase_atom VP NA
actor=BN JFR>L
clause WQt0 Coor
phrase VP PreO
phrase_atom VP NA
actor=BN JFR>L
phrase PP Cmpl
phrase_atom PP NA
actor=JHWH
phrase PP Cmpl
phrase_atom PP NA
actor=PTX >HL MW<D
phrase PP Cmpl
phrase_atom PP NA
actor=KHN
clause WQt0 Coor
phrase VP Pred
phrase_atom VP NA
actor=BN JFR>L
phrase NP Objc
phrase_atom PP Spec
actor=JHWH
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.show(results, start=1, end=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We make the feature `sense` from the valence module visible:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"result 1"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"verse:1417594
sentence:1181377 7
clause:439665 xYqX Adju
phrase:688388 VP Pred
phrase_atom:943219 VP NA
actor=BN JFR>L
phrase:688389 NP Subj
phrase_atom:943220 NP NA
actor=BN JFR>L
phrase:688390 PP Objc
phrase_atom:943221 PP NA
actor=ZBX BN JFR>L
clause:439666 Ptcp Attr
phrase:688392 PPrP Subj
phrase_atom:943223 PPrP NA
actor=BN JFR>L
phrase:688393 VP PreC
phrase_atom:943224 VP NA
actor=BN JFR>L
clause:439667 WQt0 Coor
phrase:688396 VP PreO
phrase_atom:943227 VP NA
actor=BN JFR>L
phrase:688397 PP Cmpl
phrase_atom:943228 PP NA
actor=JHWH
phrase:688398 PP Cmpl
phrase_atom:943229 PP NA
actor=PTX >HL MW<D
phrase:688399 PP Cmpl
phrase_atom:943230 PP NA
actor=KHN
clause:439668 WQt0 Coor
phrase:688401 VP Pred
phrase_atom:943232 VP NA
actor=BN JFR>L
phrase:688402 NP Objc
phrase_atom:943234 PP Spec
actor=JHWH
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"result 2"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"verse:1417595
sentence:1181378 8
clause:439669 WQtX NA
phrase:688405 VP Pred
phrase_atom:943237 VP NA
actor=KHN
phrase:688406 NP Subj
phrase_atom:943238 NP NA
actor=KHN
phrase:688407 PP Objc
phrase_atom:943239 PP NA
actor=DM
phrase:688408 PP Cmpl
phrase_atom:943240 PP NA
actor=MZBX JHWH
phrase_atom:943241 NP Spec
actor=PTX >HL MW<D
sentence:1181379 9
clause:439670 WQt0 NA
phrase:688410 VP Pred
phrase_atom:943243 VP NA
actor=KHN
phrase:688412 PP Cmpl
phrase_atom:943246 PP Spec
actor=JHWH
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"result 3"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"verse:1417595
sentence:1181378 8
clause:439669 WQtX NA
phrase:688405 VP Pred
phrase_atom:943237 VP NA
actor=KHN
phrase:688406 NP Subj
phrase_atom:943238 NP NA
actor=KHN
phrase:688407 PP Objc
phrase_atom:943239 PP NA
actor=DM
phrase:688408 PP Cmpl
phrase_atom:943240 PP NA
actor=MZBX JHWH
phrase_atom:943241 NP Spec
actor=PTX >HL MW<D
sentence:1181379 9
clause:439670 WQt0 NA
phrase:688410 VP Pred
phrase_atom:943243 VP NA
actor=KHN
phrase:688412 PP Cmpl
phrase_atom:943246 PP Spec
actor=JHWH
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.show(results, start=1, end=3, withNodes=True, extraFeatures=\"sense\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# All together!\n",
"\n",
"Here is a query that shows results with all features."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.26s 30 results\n"
]
}
],
"source": [
"results = A.search(\n",
" \"\"\"\n",
"book book=Leviticus\n",
" phrase sense*\n",
" phrase_atom actor=KHN\n",
" -heads> word\n",
"\"\"\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"verse 8"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"verse
book=Leviticus
sentence 27
clause CPen NA
phrase CP Conj
heads•\n",
"↦\n",
"
phrase_atom CP NA
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase NP Frnt
heads•\n",
"↦\n",
"
phrase_atom NP NA
actor=KHNheads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
clause xYq0 Resu
phrase CP Conj
heads•\n",
"↦\n",
"
phrase_atom CP NA
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase VP Pred
heads•\n",
"↦\n",
"
phrase_atom VP NA
actor=KHNheads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"sense=d-
phrase NP Objc
heads•\n",
"↦\n",
"
phrase_atom NP NA
actor=NPC_3heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase NP Adju
heads•\n",
"↦\n",
"
phrase_atom NP NA
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
clause XYqt Coor
phrase PPrP Subj
heads•\n",
"↦\n",
"
phrase_atom PPrP NA
actor=NPC_3heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase VP Pred
heads•\n",
"↦\n",
"
phrase_atom VP NA
actor=NPC_3heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"sense=-p
phrase PP Cmpl
heads•\n",
"↦\n",
"
phrase_atom PP NA
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
sentence 28
clause CPen NA
phrase CP Conj
heads•\n",
"↦\n",
"
phrase_atom CP NA
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase NP Frnt
heads•\n",
"↦\n",
"
phrase_atom NP NA
actor=JLJD BJT KHNheads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
clause XYqt Resu
phrase PPrP Subj
heads•\n",
"↦\n",
"
phrase_atom PPrP NA
actor=JLJD BJT KHNheads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
phrase VP Pred
heads•\n",
"↦\n",
"
phrase_atom VP NA
actor=JLJD BJT KHNheads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"sense=-p
phrase PP Cmpl
heads•\n",
"↦\n",
"
phrase_atom PP NA
heads•\n",
"↦\n",
"
heads•\n",
"⇥\n",
" \n",
"⇥\n",
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.displaySetup(\n",
" condensed=True,\n",
" condenseType=\"verse\",\n",
" hiddenTypes=\"subphrase clause_atom sentence_atom half_verse\",\n",
")\n",
"A.show(results, start=8, end=8)\n",
"A.displaySetup()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercise\n",
"\n",
"See whether you can find the quote in the Easter egg that is in\n",
"`etcbc/lingo/easter/tf` !"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# All steps\n",
"\n",
"* **[start](start.ipynb)** your first step in mastering the bible computationally\n",
"* **[display](display.ipynb)** become an expert in creating pretty displays of your text structures\n",
"* **[search](search.ipynb)** turbo charge your hand-coding with search templates\n",
"* **[export Excel](exportExcel.ipynb)** make tailor-made spreadsheets out of your results\n",
"* **share** draw in other people's data and let them use yours\n",
"* **[export](export.ipynb)** export your dataset as an Emdros database\n",
"* **[annotate](annotate.ipynb)** annotate plain text by means of other tools and import the annotations as TF features\n",
"* **[map](map.ipynb)** map somebody else's annotations to a new version of the corpus\n",
"* **[volumes](volumes.ipynb)** work with selected books only\n",
"* **[trees](trees.ipynb)** work with the BHSA data as syntax trees\n",
"\n",
"CC-BY Dirk Roorda"
]
}
],
"metadata": {
"jupytext": {
"encoding": "# -*- coding: utf-8 -*-"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.1"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}