{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Heads2TF\n", "\n", "In this NB, we produce two text-fabric features on the BHSA data using the get_heads method developed in [getting_heads.ipynb](getting_heads.ipynb). See that notebook for a detailed description of the motivation, method, and shortcomings for this data.\n", "\n", "N.B. this data is experimental and a work in progress!\n", "\n", "## Production\n", "\n", "Three features are produced herein:\n", "* heads.tf - an edge feature from a phrase(atom) node to its phrase head + its coordinated head words.\n", "* prep_obj.tf - an edge feature from a prepositional phrase type to its noun object.\n", "* noun_heads.tf - an edge feature from a phrase(atom) node to its noun heads, regardless of whether the phrase is a prepositional phrase or not. \"noun\" is meant loosely and includes adjectives and other parts of speech." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Export\n", "\n", "### Updates\n", "\n", "#### 06.11.18\n", "Added a new feature, `noun_heads`, to pluck noun heads from both noun phrases or prepositional phrases.\n", "\n", "#### 23.10.18\n", "New export for the updated C version of BHSA data.\n", "\n", "#### 21.04.18\n", "A new function has been added to double check phrase heads. Prepositional phrases whose objects are also prepositions have resulted in some false heads being assigned. This is because prepositional objects receive no subphrase relations in BHSA and appeared to the algorithm as independent. An additional check is required to make sure that a given preposition does not serve as the head of its phrase. The new function, `check_preposition`, looks one word behind a candidate head noun (within the phrase boundaries) and validates only those cases that are not immediately preceded by another preposition.\n", "\n", "#### 20.04.18\n", "In discussion with Stephen Ku, I've decided to apply the `quantifier` algorithm to prepositional objects so that we retrieve the head of the prepositonal object noun phrase rather than a quantifier. For good measure, I will also apply the `attributed` function (see [getting_heads.ipynb](getting_heads.ipynb) for a description of both functions)." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os, collections, random\n", "from tf.fabric import Fabric\n", "from tf.extra.bhsa import Bhsa\n", "from heads import get_heads, find_quantified, find_attributed" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "processing version c \n", "\n", "This is Text-Fabric 6.4.4\n", "Api reference : https://dans-labs.github.io/text-fabric/Api/General/\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "114 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.01s B book from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.00s B chapter from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.01s B verse from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.10s B lex from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.18s B typ from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.10s B pdp from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.19s B rela from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.17s B mother from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.06s B function from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.10s B sp from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.10s B ls from /Users/cody/github/etcbc/bhsa/tf/c\n", " 5.03s All features loaded/computed - for details use loadLog()\n", "\n", "processing heads...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 0.00s Feature \"otype\" not available in\n", "/Users/cody/github/etcbc/lingo/heads/tf/c\n", " 0.00s Not all features could be loaded/computed\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "exporting TF...\n", " | 1.11s T heads to /Users/cody/github/etcbc/lingo/heads/tf/c\n", " | 1.05s T noun_heads to /Users/cody/github/etcbc/lingo/heads/tf/c\n", " | 0.14s T prep_obj to /Users/cody/github/etcbc/lingo/heads/tf/c\n", "\n", "done with c\n", "processing version 2017 \n", "\n", "This is Text-Fabric 6.4.4\n", "Api reference : https://dans-labs.github.io/text-fabric/Api/General/\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "115 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.01s B book from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.00s B chapter from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.01s B verse from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.19s B typ from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.11s B pdp from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.19s B rela from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.12s B mother from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.06s B function from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.11s B lex from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.11s B sp from /Users/cody/github/etcbc/bhsa/tf/2017\n", " | 0.10s B ls from /Users/cody/github/etcbc/bhsa/tf/2017\n", " 5.76s All features loaded/computed - for details use loadLog()\n", "\n", "processing heads...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 0.00s Feature \"otype\" not available in\n", "/Users/cody/github/etcbc/lingo/heads/tf/2017\n", " 0.00s Not all features could be loaded/computed\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "exporting TF...\n", " | 1.13s T 
heads to /Users/cody/github/etcbc/lingo/heads/tf/2017\n", " | 1.07s T noun_heads to /Users/cody/github/etcbc/lingo/heads/tf/2017\n", " | 0.14s T prep_obj to /Users/cody/github/etcbc/lingo/heads/tf/2017\n", "\n", "done with 2017\n", "processing version 2016 \n", "\n", "This is Text-Fabric 6.4.4\n", "Api reference : https://dans-labs.github.io/text-fabric/Api/General/\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "109 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.01s B book from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.00s B chapter from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.00s B verse from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.18s B typ from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.10s B pdp from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.18s B rela from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.63s B mother from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.06s B function from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.11s B lex from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.11s B sp from /Users/cody/github/etcbc/bhsa/tf/2016\n", " | 0.10s B ls from /Users/cody/github/etcbc/bhsa/tf/2016\n", " 5.59s All features loaded/computed - for details use loadLog()\n", "\n", "processing heads...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 0.00s Feature \"otype\" not available in\n", "/Users/cody/github/etcbc/lingo/heads/tf/2016\n", " 0.00s Not all features could be loaded/computed\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "exporting TF...\n", " | 1.14s T heads to /Users/cody/github/etcbc/lingo/heads/tf/2016\n", " | 1.08s T noun_heads to /Users/cody/github/etcbc/lingo/heads/tf/2016\n", " | 0.14s T prep_obj to /Users/cody/github/etcbc/lingo/heads/tf/2016\n", "\n", "done with 2016\n" ] } ], "source": [ "# export heads.tf & prep_obj.tf for all TF versions\n", "for version in ['c', '2017', '2016']:\n", " \n", " print('processing version ', version, '\\n')\n", " \n", " # load Text-Fabric and data\n", " TF = Fabric(locations='~/github/etcbc/bhsa/tf', modules=version)\n", " api = TF.load('''\n", " book chapter verse\n", " typ pdp rela mother \n", " function lex sp ls\n", " ''')\n", "\n", " F, E, T, L = api.F, api.E, api.T, api.L # TF data methods\n", " \n", " # get heads\n", " heads_features = collections.defaultdict(dict)\n", " \n", " print('\\nprocessing heads...')\n", " \n", " for phrase in list(F.otype.s('phrase')) + list(F.otype.s('phrase_atom')):\n", " \n", " heads = get_heads(phrase, api)\n", " \n", " if heads:\n", " heads_features['heads'][phrase] = set(heads)\n", " \n", " # make noun heads part 1\n", " if F.typ.v(phrase) != 'PP' and heads: \n", " heads_features['noun_heads'][phrase] = set(heads)\n", " \n", " # do prep objects and noun heads part 2\n", " if F.typ.v(phrase) == 'PP' and heads:\n", " for head in heads:\n", " obj = head + 1 if F.pdp.v(head + 1) != 'art' else head + 2\n", " phrase_bounds = L.d(phrase, 'word')\n", " if obj in phrase_bounds:\n", " obj = find_quantified(obj, api) or find_attributed(obj, api) or obj\n", " heads_features['prep_obj'][head] = set([obj])\n", " heads_features['noun_heads'][phrase] = set([obj]) # make noun heads part 2\n", " \n", " # export TF data\n", " print('\\nexporting TF...')\n", " meta = {'': {'created_by': 'Cody Kingham',\n", " 'coreData': 'BHSA',\n", " 'coreVersion': version\n", " },\n", " 'heads' : 
{'source': 'see the notebook at https://github.com/etcbc/lingo/heads',\n", " 'valueType': 'int',\n", " 'edgeValues': False},\n", " 'prep_obj': {'source': 'see the notebook at https://github.com/etcbc/lingo/heads',\n", " 'valueType': 'int',\n", " 'edgeValues': False},\n", " 'noun_heads': {'source': 'see the notebook at https://github.com/etcbc/lingo/heads',\n", " 'valueType': 'int',\n", " 'edgeValues': False}\n", " }\n", "\n", " save_tf = Fabric(locations='~/github/etcbc/lingo/heads/tf', modules=version, silent=True)\n", " save_api = save_tf.load('', silent=True)\n", " save_tf.save(nodeFeatures={}, edgeFeatures=heads_features, metaData=meta)\n", " \n", " print(f'\\ndone with {version}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Tests" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This is Text-Fabric 6.4.4\n", "Api reference : https://dans-labs.github.io/text-fabric/Api/General/\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "117 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.01s B book from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.00s B chapter from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.01s B verse from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.10s B lex from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.18s B typ from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.10s B pdp from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.18s B rela from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.11s B mother from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.06s B function from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.11s B sp from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 0.10s B ls from /Users/cody/github/etcbc/bhsa/tf/c\n", " | 2.09s T heads from /Users/cody/github/etcbc/lingo/heads/tf/c\n", " | 0.20s T prep_obj from /Users/cody/github/etcbc/lingo/heads/tf/c\n", " | 2.10s T noun_heads from /Users/cody/github/etcbc/lingo/heads/tf/c\n", " 11s All features loaded/computed - for details use loadLog()\n" ] }, { "data": { "text/markdown": [ "**Documentation:** BHSA Character table Feature docs BHSA API Text-Fabric API 6.4.4 Search Reference" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Loaded features:\n", "book@ll book chapter function gloss label language lex ls number otype pdp rela sp typ verse voc_lex voc_lex_utf8 vs vt heads mother noun_heads oslots prep_obj
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "\n", "This notebook online:\n", "NBViewer\n", "GitHub\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "data_locs = ['~/github/etcbc/bhsa/tf',\n", " '~/github/etcbc/lingo/heads/tf']\n", "\n", "# load Text-Fabric and data\n", "TF = Fabric(locations=data_locs, modules='c')\n", "\n", "api = TF.load('''\n", " book chapter verse\n", " typ pdp rela mother \n", " function lex sp ls\n", " heads prep_obj noun_heads\n", " ''')\n", "\n", "F, E, T, L = api.F, api.E, api.T, api.L # TF data methods\n", "\n", "B = Bhsa(api, name='Heads2TF', version='c')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## noun_heads" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.63s 45182 results\n" ] }, { "data": { "text/markdown": [ "\n", "\n", "**verse** *1*\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "
\n", " \n", " \n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 1\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause xQtX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Time PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep in
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs beginning
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb create qal perf
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs god(s)
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Objc PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep <object marker>
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs heavens
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
prep <object marker>
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs earth
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "\n", "\n", "**verse** *2*\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "
\n", " \n", " \n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 2\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WXQt\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs earth
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb be qal perf
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase PreC NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs emptiness
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs emptiness
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 3\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause NmCl\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs darkness
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase PreC PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep upon
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs face
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs primeval ocean
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 4\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause Ptcp\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs wind
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs god(s)
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase PreC VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb shake piel ptca
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Cmpl PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep upon
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs face
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs water
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "\n", "\n", "**verse** *3*\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "
\n", " \n", " \n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 8\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WayX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb see qal wayq
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs god(s)
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Objc PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep <object marker>
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs light
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " clause Objc xQt0\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj that
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb be good qal perf
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 9\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WayX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb separate hif wayq
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs god(s)
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Cmpl PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep interval
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs light
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
prep interval
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs darkness
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "\n", "\n", "**verse** *4*\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "
\n", " \n", " \n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 10\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WayX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb call qal wayq
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs god(s)
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Cmpl PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep to
\n", "\n", "\n", "
\n", "\n", "
\n", "
\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs light
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Objc NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs day
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 11\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WxQ0\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Cmpl PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep to
\n", "\n", "\n", "
\n", "\n", "
\n", "
\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs darkness
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb call qal perf
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Objc NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs night
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 12\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WayX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb be qal wayq
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs evening
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 13\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WayX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb be qal wayq
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs morning
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 14\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause NmCl\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase PreC NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs day
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs one
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "\n", "\n", "**verse** *5*\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "
\n", " \n", " \n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 15\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WayX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb say qal wayq
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs god(s)
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 16\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause ZYqX\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb be qal impf
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Subj NP\n", "
\n", "
\n", "\n", "
\n", "\n", "
subs firmament
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase PreC PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep in
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs midst
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
art the
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs water
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " sentence 17\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " clause WYq0\n", "
\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Conj CP\n", "
\n", "
\n", "\n", "
\n", "\n", "
conj and
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Pred VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb be qal impf
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase PreC VP\n", "
\n", "
\n", "\n", "
\n", "\n", "
verb separate hif ptca
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Cmpl PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep interval
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs water
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", " phrase Cmpl PP\n", "
\n", "
\n", "\n", "
\n", "\n", "
prep to
\n", "\n", "\n", "
\n", "\n", "
\n", "\n", "
subs water
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "B.show(\n", "\n", "B.search('''\n", "\n", "phrase typ=PP\n", " -noun_heads> word\n", "\n", "''')[:10]\n", "\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## prep_obj.tf" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "test_prep = []\n", "\n", "for ph in F.typ.s('PP'):\n", " heads = E.heads.f(ph)\n", " objs = [E.prep_obj.f(prep)[0] for prep in heads\n", " if E.prep_obj.f(prep)]\n", " test_prep.append(tuple(objs))\n", " \n", "random.shuffle(test_prep)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "#B.show(test_prep[:50]) # uncomment me" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See what the prepositional object looks like for Genesis 1:21:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "example phrase 651768 phrase number 14 in verse\n", "אֵ֨ת כָּל־עֹ֤וף כָּנָף֙ \n", "\n", "Gen 1:21 phrase 14's heads, a preposition:\n", "אֵ֨ת \n", "\n", "Gen 1:21 phrase 14's prepositional object:\n", "עֹ֤וף \n" ] } ], "source": [ "gen_121_case = L.d(T.nodeFromSection(('Genesis', 1, 21)), 'phrase')[13]\n", "\n", "print('example phrase', gen_121_case, 'phrase number 14 in verse')\n", "print(T.text(L.d(gen_121_case, 'word')))\n", "\n", "print('\\nGen 1:21 phrase 14\\'s heads, a preposition:')\n", "heads = E.heads.f(gen_121_case)\n", "print(T.text(heads))\n", "\n", "\n", "print('\\nGen 1:21 phrase 14\\'s prepositional object:')\n", "print(T.text(E.prep_obj.f(heads[0])))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## heads.tf" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "heads = [E.heads.f(ph) for ph in F.otype.s('phrase') if F.typ.v(ph) == 'NP']\n", "random.shuffle(heads)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "#B.show(heads[:50]) # uncomment me" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 2 }