{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "\n", "# Verbal valence\n", "\n", "*Verbal valence* is a kind of signature of a verb, not unlike overloading in programming languages.\n", "The meaning of a verb depends on the number and kind of its complements, i.e. the linguistic entities that act as arguments for the semantic function of the verb.\n", "\n", "We will use a set of flowcharts to specify and compute the sense of a verb in specific contexts depending on the verbal valence. The flowcharts have been composed by Janet Dyk. Although they are not difficult to understand, it takes a good deal of ingenuity to apply them in all the real world situations that we encounter in our corpus.\n", "\n", "Read more in the [wiki](https://github.com/ETCBC/valence/wiki)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Pipeline\n", "See [operation](https://github.com/ETCBC/pipeline/blob/master/README.md#operation)\n", "for how to run this script in the pipeline." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "import sys\n", "import os\n", "import collections\n", "import yaml\n", "from copy import deepcopy\n", "import utils\n", "from tf.fabric import Fabric\n", "from tf.core.helpers import formatMeta" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "if \"SCRIPT\" not in locals():\n", " SCRIPT = False\n", " FORCE = True\n", " CORE_NAME = \"bhsa\"\n", " NAME = \"valence\"\n", " VERSION = \"c\"\n", " CORE_MODULE = \"core\"" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "def stop(good=False):\n", " if SCRIPT:\n", " sys.exit(0 if good else 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Authors\n", "\n", "[Janet Dyk and Dirk Roorda](https://github.com/ETCBC/valence/wiki/Authors)\n", "\n", "Last modified 2017-09-13." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[References](https://github.com/ETCBC/valence/wiki/References)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data\n", "We have carried out the valence project against the Hebrew Text Database of the BHSA, version `4b`.\n", "See the description of the [sources](https://github.com/ETCBC/valence/wiki/Sources).\n", "\n", "However, we can run our stuff also against the newer versions.\n", "\n", "We also make use of corrected and enriched data delivered by the\n", "[enrich notebook](enrich.ipynb).\n", "The features of that data module are specified\n", "[here](https://github.com/ETCBC/valence/wiki/Data)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Results\n", "\n", "We produce a text-fabric feature `sense` with the sense labels per verb occurrence, and add\n", "this to the *valence* data module created in the\n", "[enrich](enrich.ipynb) notebook.\n", "\n", "We also show the results in\n", "[SHEBANQ](https://shebanq.ancient-data.org), the website of the ETCBC that exposes its Hebrew Text Database in such a way\n", "that users can query it, save their queries, add manual annotations and even upload bulks of generated annotations.\n", "That is exactly what we do: the valency results are visible in SHEBANQ in notes view, so that every outcome can be viewed in context." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Flowchart logic\n", "\n", "Valence flowchart logic translates the verb context into a label that is characteristic for the context.\n", "You could say, it is a fingerprint of the context.\n", "Verb meanings are complex, depending on context. It turns out that we can organize\n", "the meaning selection of verbs around these finger prints.\n", "\n", "For each verb, the we can specify a *flowchart* as a mapping of fingerprints to concrete meanings.\n", "We have flowcharts for a limited, but open set of verbs.\n", "They are listed in the\n", "[wiki](https://github.com/ETCBC/valence/wiki),\n", "and will be referred to from the resulting valence annotations in SHEBANQ.\n", "\n", "For each verb, the flowchart is represented as a mapping of *sense labels* to meaning templates.\n", "A sense label is a code for the presence and nature of direct objects and complements that are present in the context.\n", "See the [legend](https://github.com/ETCBC/valence/wiki/Legend) of sense labels.\n", "\n", "The interesting part is the *sense template*,\n", "which consist of a translation text augmented with placeholders for the direct objects and complements.\n", "\n", "See for example the flowchart of [NTN](https://github.com/ETCBC/valence/wiki/FC_NTN).\n", "\n", "* `{verb}` the verb occurrence in question\n", "* `{pdos}` principal direct objects (phrase)\n", "* `{kdos}` K-objects (phrase)\n", "* `{ldos}` L-objects (phrase)\n", "* `{ndos}` direct objects (phrase) (none of the above)\n", "* `{idos}` infinitive construct (clause) objects\n", "* `{cdos}` direct objects (clause) (none of the above)\n", "* `{inds}` indirect objects\n", "* `{bens}` benefactive adjuncts\n", "* `{locs}` locatives\n", "* `{cpls}` complements, not marked as either indirect object or locative\n", "\n", "In case there are multiple entities, the algorithm returns them chunked as phrases/clauses.\n", "\n", "Apart from the template, there is also a *status* and an optional *account*.\n", "\n", "The status is ``!`` in normal cases, ``?`` in dubious cases, and ``-`` in erroneous cases.\n", "In SHEBANQ these statuses are translated into `colors` of the notes (blue/orange/red).\n", "\n", "The account contains information about the grounds of which the algorithm has arrived at its conclusions." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "senses = set(\n", " \"\"\"\n", "\n", "CJT\n", "DBQ\n", "FJM\n", "NTN\n", "QR>\n", "ZQN\n", "\"\"\".strip().split()\n", ")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "senseLabels = \"\"\"\n", "--\n", "-i\n", "-b\n", "-p\n", "-c\n", "d-\n", "di\n", "db\n", "dp\n", "dc\n", "n.\n", "l.\n", "k.\n", "i.\n", "c.\n", "\"\"\".strip().split()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "constKindSpecs = \"\"\"\n", "verb:verb\n", "dos:direct object\n", "pdos:principal direct object\n", "kdos:K-object\n", "ldos:L-object\n", "ndos:NP-object\n", "idos:infinitive object clause\n", "cdos:direct object clause\n", "inds:indirect object\n", "bens:benefactive\n", "locs:locative\n", "cpls:complement\n", "\"\"\".strip().split(\n", " \"\\n\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Results\n", "\n", "The complete set of results is in SHEBANQ.\n", "It is the note set\n", "[valence](https://shebanq.ancient-data.org/hebrew/note?version=4b&id=Mnx2YWxlbmNl&tp=txt_tb1&nget=v)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Firing up the engines" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Setting up the context: source file and target directories\n", "\n", "The conversion is executed in an environment of directories, so that sources, temp files and\n", "results are in convenient places and do not have to be shifted around." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[4]:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "repoBase = os.path.expanduser(\"~/github/etcbc\")\n", "coreRepo = \"{}/{}\".format(repoBase, CORE_NAME)\n", "thisRepo = \"{}/{}\".format(repoBase, NAME)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "coreTf = \"{}/tf/{}\".format(coreRepo, VERSION)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "thisSource = \"{}/source/{}\".format(thisRepo, VERSION)\n", "thisTemp = \"{}/_temp/{}\".format(thisRepo, VERSION)\n", "thisTempTf = \"{}/tf\".format(thisTemp)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "thisTf = \"{}/tf/{}\".format(thisRepo, VERSION)\n", "thisNotes = \"{}/shebanq/{}\".format(thisRepo, VERSION)" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[5]:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "notesFile = \"valenceNotes.csv\"\n", "flowchartBase = \"https://github.com/ETCBC/valence/wiki\"" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "if not os.path.exists(thisNotes):\n", " os.makedirs(thisNotes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Test\n", "\n", "Check whether this conversion is needed in the first place.\n", "Only when run as a script." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[6]:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "if SCRIPT:\n", " (good, work) = utils.mustRun(\n", " None, \"{}/.tf/{}.tfx\".format(thisTf, \"sense\"), force=FORCE\n", " )\n", " if not good:\n", " stop(good=False)\n", " if not work:\n", " stop(good=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Loading the feature data\n", "\n", "We load the features we need from the BHSA core database and from the valence module,\n", "as far as generated by the\n", "[enrich](https://github.com/ETCBC/valence/blob/master/programs/enrich.ipynb) notebook." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[7]:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 
0.00s Load the existing TF dataset .\n", "..............................................................................................\n", "This is Text-Fabric 9.2.0\n", "Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html\n", "\n", "124 features found and 0 ignored\n" ] } ], "source": [ "utils.caption(4, \"Load the existing TF dataset\")\n", "TF = Fabric(locations=[coreTf, thisTf], modules=[\"\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We instruct the API to load data." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[8]:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1.44s Dataset without structure sections in otext:no structure functions in the T-API\n", " | | 1.17s C __characters__ from otext\n", " | | 1.04s T f_correction from ~/github/etcbc/valence/tf/c\n", " | | 1.16s T grammatical from ~/github/etcbc/valence/tf/c\n", " | | 1.06s T lexical from ~/github/etcbc/valence/tf/c\n", " | | 1.01s T original from ~/github/etcbc/valence/tf/c\n", " | | 1.19s T predication from ~/github/etcbc/valence/tf/c\n", " | | 1.03s T s_manual from ~/github/etcbc/valence/tf/c\n", " | | 1.08s T semantic from ~/github/etcbc/valence/tf/c\n", " | | 1.17s T valence from ~/github/etcbc/valence/tf/c\n", " 21s All features loaded/computed - for details use TF.isLoaded()\n" ] }, { "data": { "text/plain": [ "[('Computed',\n", " 'computed-data',\n", " ('C Computed', 'Call AllComputeds', 'Cs ComputedString')),\n", " ('Features', 'edge-features', ('E Edge', 'Eall AllEdges', 'Es EdgeString')),\n", " ('Fabric', 'loading', ('TF',)),\n", " ('Locality', 'locality', ('L Locality',)),\n", " ('Nodes', 'navigating-nodes', ('N Nodes',)),\n", " ('Features',\n", " 'node-features',\n", " ('F Feature', 'Fall AllFeatures', 'Fs FeatureString')),\n", " ('Search', 'search', ('S Search',)),\n", " ('Text', 'text', ('T Text',))]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "api = TF.load(\n", " \"\"\"\n", " function rela typ\n", " g_word_utf8 trailer_utf8\n", " lex prs uvf sp pdp ls vs vt nametype gloss\n", " book chapter verse label number\n", " s_manual f_correction\n", " valence predication grammatical original lexical semantic\n", " mother\n", "\"\"\"\n", ")\n", "api.makeAvailableIn(globals())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Indicators\n", "\n", "Here we specify by what features we recognize key constituents.\n", "We use predominantly features that come from the correction/enrichment workflow." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[9]:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`pf_`... : predication feature\n", "`gf_`... : grammatical feature\n", "`vf_`... : valence feature\n", "`sf_`... : lexical feature\n", "`of_`... 
: original feature" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "pf_predicate = {\n", " \"regular\",\n", "}\n", "gf_direct_object = {\n", " \"principal_direct_object\",\n", " \"NP_direct_object\",\n", " \"direct_object\",\n", " \"L_object\",\n", " \"K_object\",\n", " \"infinitive_object\",\n", "}\n", "gf_indirect_object = {\n", " \"indirect_object\",\n", "}\n", "gf_complement = {\n", " \"*\",\n", "}\n", "sf_locative = {\n", " \"location\",\n", "}\n", "sf_benefactive = {\n", " \"benefactive\",\n", "}\n", "vf_locative = {\n", " \"complement\",\n", " \"adjunct\",\n", "}" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "verbal_stems = set(\n", " \"\"\"\n", " qal\n", "\"\"\".strip().split()\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Pronominal suffixes\n", "We collect the information to determine how to render pronominal suffixes on words.\n", "On verbs, they must be rendered *accusatively*, like `see him`.\n", "But on nouns, they must be rendered *genitively*, like `hand my`.\n", "So we make an inventory of part of speech types and the pronominal suffixes that occur on them.\n", "On that basis we make the translation dictionaries `pronominal suffix` and `switch_prs`.\n", "\n", "Finally, we define a function `get_prs_info` that for each word delivers the pronominal suffix info and gloss,\n", "if there is any, and else `(None, None)`." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[10]:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "adjv H : 16\n", "adjv HM : 10\n", "adjv J : 25\n", "adjv K : 35\n", "adjv K= : 3\n", "adjv KM : 7\n", "adjv M : 8\n", "adjv MW : 1\n", "adjv NW : 5\n", "adjv W : 59\n", "adjv absent : 9273\n", "advb n/a : 4550\n", "art n/a : 30386\n", "conj n/a : 62722\n", "inrg K : 1\n", "inrg M : 2\n", "inrg W : 5\n", "inrg absent : 1277\n", "intj K : 13\n", "intj K= : 7\n", "intj KM : 2\n", "intj M : 37\n", "intj NJ : 181\n", "intj NW : 8\n", "intj W : 3\n", "intj absent : 1634\n", "nega n/a : 6053\n", "nmpr n/a : 33081\n", "prde n/a : 2660\n", "prep H : 1019\n", "prep H= : 36\n", "prep HJ : 13\n", "prep HM : 1499\n", "prep HN : 74\n", "prep HW : 174\n", "prep HWN : 19\n", "prep J : 1853\n", "prep K : 1634\n", "prep K= : 353\n", "prep KM : 1181\n", "prep KN : 2\n", "prep KWN : 1\n", "prep M : 684\n", "prep MW : 68\n", "prep N : 3\n", "prep N> : 4\n", "prep NJ : 105\n", "prep NW : 539\n", "prep W : 3247\n", "prep absent : 60765\n", "prin n/a : 1021\n", "prps n/a : 5011\n", "subs H : 1635\n", "subs H= : 108\n", "subs HJ : 58\n", "subs HM : 1417\n", "subs HN : 114\n", "subs HW : 340\n", "subs HWN : 32\n", "subs J : 4332\n", "subs K : 4362\n", "subs K= : 744\n", "subs KM : 1335\n", "subs KN : 16\n", "subs KWN : 7\n", "subs M : 1919\n", "subs MW : 25\n", "subs N : 29\n", "subs N> : 3\n", "subs NJ : 19\n", "subs NW : 809\n", "subs W : 7653\n", "subs absent : 96548\n", "verb H : 682\n", "verb H= : 17\n", "verb HJ : 6\n", "verb HM : 121\n", "verb HN : 4\n", "verb HW : 1097\n", "verb J : 356\n", "verb K : 1089\n", "verb K= : 201\n", "verb KM : 132\n", "verb KN : 1\n", "verb KWN : 2\n", "verb M : 1288\n", "verb MW : 23\n", "verb N : 15\n", "verb N> : 3\n", "verb NJ : 1016\n", "verb NW : 274\n", "verb W : 938\n", "verb absent : 66445\n" ] } ], "source": [ "prss = 
collections.defaultdict(lambda: collections.defaultdict(lambda: 0))\n", "for w in F.otype.s(\"word\"):\n", " prss[F.sp.v(w)][F.prs.v(w)] += 1\n", "if not SCRIPT:\n", " for sp in sorted(prss):\n", " for prs in sorted(prss[sp]):\n", " print(\"{:<5} {:<3} : {:>5}\".format(sp, prs, prss[sp][prs]))" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[11]:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "pronominal_suffix = {\n", " \"accusative\": {\n", " \"W\": (\"p3-sg-m\", \"him\"),\n", " \"K\": (\"p2-sg-m\", \"you:m\"),\n", " \"J\": (\"p1-sg-\", \"me\"),\n", " \"M\": (\"p3-pl-m\", \"them:mm\"),\n", " \"H\": (\"p3-sg-f\", \"her\"),\n", " \"HM\": (\"p3-pl-m\", \"them:mm\"),\n", " \"KM\": (\"p2-pl-m\", \"you:mm\"),\n", " \"NW\": (\"p1-pl-\", \"us\"),\n", " \"HW\": (\"p3-sg-m\", \"him\"),\n", " \"NJ\": (\"p1-sg-\", \"me\"),\n", " \"K=\": (\"p2-sg-f\", \"you:f\"),\n", " \"HN\": (\"p3-pl-f\", \"them:ff\"),\n", " \"MW\": (\"p3-pl-m\", \"them:mm\"),\n", " \"N\": (\"p3-pl-f\", \"them:ff\"),\n", " \"KN\": (\"p2-pl-f\", \"you:ff\"),\n", " },\n", " \"genitive\": {\n", " \"W\": (\"p3-sg-m\", \"his\"),\n", " \"K\": (\"p2-sg-m\", \"your:m\"),\n", " \"J\": (\"p1-sg-\", \"my\"),\n", " \"M\": (\"p3-pl-m\", \"their:mm\"),\n", " \"H\": (\"p3-sg-f\", \"her\"),\n", " \"HM\": (\"p3-pl-m\", \"their:mm\"),\n", " \"KM\": (\"p2-pl-m\", \"your:mm\"),\n", " \"NW\": (\"p1-pl-\", \"our\"),\n", " \"HW\": (\"p3-sg-m\", \"his\"),\n", " \"NJ\": (\"p1-sg-\", \"my\"),\n", " \"K=\": (\"p2-sg-f\", \"your:f\"),\n", " \"HN\": (\"p3-pl-f\", \"their:ff\"),\n", " \"MW\": (\"p3-pl-m\", \"their:mm\"),\n", " \"N\": (\"p3-pl-f\", \"their:ff\"),\n", " \"KN\": (\"p2-pl-f\", \"your:ff\"),\n", " },\n", "}\n", "switch_prs = dict(\n", " subs=\"genitive\",\n", " verb=\"accusative\",\n", " prep=\"accusative\",\n", " conj=None,\n", " nmpr=None,\n", " art=None,\n", " adjv=\"genitive\",\n", " nega=None,\n", " prps=None,\n", " advb=None,\n", " prde=None,\n", " intj=\"accusative\",\n", " inrg=\"genitive\",\n", " prin=None,\n", ")" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "def get_prs_info(w):\n", " sp = F.sp.v(w)\n", " prs = F.prs.v(w)\n", " switch = switch_prs[sp]\n", " return pronominal_suffix.get(switch, {}).get(prs, (None, None))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Making a verb-clause index\n", "\n", "We generate an index which gives for each verb lexeme a list of clauses that have that lexeme as the main verb.\n", "In the index we store the clause node together with the word node(s) that carries the main verb(s).\n", "\n", "Clauses may have multiple verbs. In many cases it is a copula plus an other verb.\n", "In those cases, we are interested in the other verb, so we exclude copulas.\n", "\n", "Yet, there are also sentences with more than one main verb.\n", "In those cases, we treat both verbs separately as main verb of one and the same clause." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[12]:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 
1m 01s Making the verb-clause index .\n", "..............................................................................................\n" ] } ], "source": [ "utils.caption(4, \"Making the verb-clause index\")\n", "occs = collections.defaultdict(\n", " list\n", ") # dictionary of all verb occurrence nodes per verb lexeme\n", "verb_clause = collections.defaultdict(\n", " list\n", ") # dictionary of all verb occurrence nodes per clause node\n", "clause_verb = (\n", " collections.OrderedDict()\n", ") # idem but for the occurrences of selected verbs" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "| 1m 03s \tDone (69439 clauses)\n" ] } ], "source": [ "for w in F.otype.s(\"word\"):\n", " if F.sp.v(w) != \"verb\":\n", " continue\n", " lex = F.lex.v(w).rstrip(\"[\")\n", " pf = F.predication.v(L.u(w, \"phrase\")[0])\n", " if pf in pf_predicate:\n", " cn = L.u(w, \"clause\")[0]\n", " clause_verb.setdefault(cn, []).append(w)\n", " verb_clause[lex].append((cn, w))\n", "utils.caption(0, \"\\tDone ({} clauses)\".format(len(clause_verb)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# (Indirect) Objects, Locatives, Benefactives" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[13]:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 1m 03s Finding key constituents .\n", "..............................................................................................\n" ] } ], "source": [ "utils.caption(4, \"Finding key constituents\")\n", "constituents = collections.defaultdict(lambda: collections.defaultdict(set))\n", "ckinds = \"\"\"\n", " dos pdos ndos kdos ldos idos cdos inds locs cpls bens\n", "\"\"\".strip().split()" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# go through all relevant clauses and collect all types of direct objects\n", "for c in clause_verb:\n", " these_constituents = collections.defaultdict(set)\n", " # phrase like constituents\n", " for p in L.d(c, \"phrase\"):\n", " gf = F.grammatical.v(p)\n", " of = F.original.v(p)\n", " sf = F.semantic.v(p)\n", " vf = F.valence.v(p)\n", " ckind = None\n", " if gf in gf_direct_object:\n", " if gf == \"principal_direct_object\":\n", " ckind = \"pdos\"\n", " elif gf == \"NP_direct_object\":\n", " ckind = \"ndos\"\n", " elif gf == \"L_object\":\n", " ckind = \"ldos\"\n", " elif gf == \"K_object\":\n", " ckind = \"kdos\"\n", " else:\n", " ckind = \"dos\"\n", " elif gf in gf_indirect_object:\n", " ckind = \"inds\"\n", " elif sf and sf in sf_benefactive:\n", " ckind = \"bens\"\n", " elif sf in sf_locative and vf in vf_locative:\n", " ckind = \"locs\"\n", " elif gf in gf_complement:\n", " ckind = \"cpls\"\n", " if ckind:\n", " these_constituents[ckind].add(p)\n", "\n", " # clause like constituents: only look for object clauses dependent on this clause\n", " for ac in L.d(L.u(c, \"sentence\")[0], \"clause\"):\n", " dep = list(E.mother.f(ac))\n", " if len(dep) and dep[0] == c:\n", " gf = F.grammatical.v(ac)\n", " ckind = None\n", " if gf in gf_direct_object:\n", " if gf == \"direct_object\":\n", " ckind = \"cdos\"\n", " elif gf == \"infinitive_object\":\n", " ckind = \"idos\"\n", " if ckind:\n", " these_constituents[ckind].add(ac)\n", "\n", " for 
ckind in these_constituents:\n", " constituents[c][ckind] |= these_constituents[ckind]" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "| 1m 05s \tDone, 47571 clauses with relevant constituents\n" ] } ], "source": [ "utils.caption(\n", " 0, \"\\tDone, {} clauses with relevant constituents\".format(len(constituents))\n", ")" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[14]:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def makegetGloss():\n", " if \"lex\" in F.otype.all:\n", "\n", " def _getGloss(w):\n", " gloss = F.gloss.v(L.u(w, \"lex\")[0])\n", " return \"?\" if gloss is None else gloss\n", "\n", " else:\n", "\n", " def _getGloss(w):\n", " gloss = F.gloss.v(w)\n", " return \"?\" if gloss is None else gloss\n", "\n", " return _getGloss" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "getGloss = makegetGloss()" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[15]:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "testcases = (\n", " # 426955,\n", " # 427654,\n", " # 428420,\n", " # 429412,\n", " # 429501,\n", " # 429862,\n", " # 431695,\n", " # 431893,\n", " # 430372,\n", ")" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "def showcase(n):\n", " otype = F.otype.v(n)\n", " verseNode = L.u(n, \"verse\")[0]\n", " place = T.sectionFromNode(verseNode)\n", " print(\n", " \"\"\"CASE {}={} ({}-{})\\nCLAUSE: {}\\nVERSE\\n{} {}\\nGLOSS {}\\n\"\"\".format(\n", " n,\n", " otype,\n", " F.rela.v(n),\n", " F.typ.v(n),\n", " T.text(L.d(n, \"word\"), fmt=\"text-trans-plain\"),\n", " \"{} {}:{}\".format(*place),\n", " T.text(L.d(verseNode, \"word\"), fmt=\"text-trans-plain\"),\n", " \" \".join(getGloss(w) for w in L.d(verseNode, \"word\")),\n", " )\n", " )\n", " print(\"PHRASES\\n\")\n", " for p in L.d(n, \"phrase\"):\n", " print(\n", " '''{} ({}-{}) {} \"{}\"'''.format(\n", " p,\n", " F.function.v(p),\n", " F.typ.v(n),\n", " T.text(L.d(p, \"word\"), fmt=\"text-trans-plain\"),\n", " \" \".join(getGloss(w) for w in L.d(p, \"word\")),\n", " )\n", " )\n", " print(\n", " \"valence = {}; grammatical = {}; lexical = {}; semantic = {}\\n\".format(\n", " F.valence.v(p),\n", " F.grammatical.v(p),\n", " F.lexical.v(p),\n", " F.semantic.v(p),\n", " )\n", " )\n", " print(\"SUBCLAUSES\\n\")\n", " for ac in L.d(L.u(n, \"sentence\")[0], \"clause\"):\n", " dep = list(E.mother.f(ac))\n", " if not (len(dep) and dep[0] == n):\n", " continue\n", " print(\n", " '''{} ({}-{}) {} \"{}\"'''.format(\n", " ac,\n", " F.rela.v(ac),\n", " F.typ.v(ac),\n", " T.text(L.d(ac, \"word\"), fmt=\"text-trans-plain\"),\n", " \" \".join(getGloss(w) for w in L.d(ac, \"word\")),\n", " )\n", " )\n", " print(\n", " \"valence = {}; grammatical = {}; lexical = {}; semantic = {}\\n\".format(\n", " F.valence.v(ac),\n", " F.grammatical.v(ac),\n", " F.lexical.v(ac),\n", " F.semantic.v(ac),\n", " )\n", " )\n", "\n", " print(\"CONSTITUENTS\")\n", " for ckind in ckinds:\n", " print(\n", " \"{:<4}: {}\".format(\n", " ckind, \",\".join(str(x) for x in sorted(constituents[n][ckind]))\n", " )\n", " )\n", " print(\"================\\n\")" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "if 
not SCRIPT:\n", " for n in testcases:\n", " showcase(n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Overview of quantities" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[16]:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 1m 08s Counting constituents .\n", "..............................................................................................\n" ] } ], "source": [ "utils.caption(4, \"Counting constituents\")" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "constituents_count = collections.defaultdict(collections.Counter)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "for c in constituents:\n", " for ckind in ckinds:\n", " n = len(constituents[c][ckind])\n", " constituents_count[ckind][n] += 1" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "| 1m 10s \t22375 clauses with 1 dos constituents\n", "| 1m 10s \t25196 clauses with 0 dos constituents\n", "| 1m 10s \t22375 clauses with a dos constituent\n", "| 1m 10s \t 3557 clauses with 1 pdos constituents\n", "| 1m 10s \t44014 clauses with 0 pdos constituents\n", "| 1m 10s \t 3557 clauses with a pdos constituent\n", "| 1m 10s \t 991 clauses with 1 ndos constituents\n", "| 1m 10s \t46580 clauses with 0 ndos constituents\n", "| 1m 10s \t 991 clauses with a ndos constituent\n", "| 1m 10s \t 111 clauses with 1 kdos constituents\n", "| 1m 10s \t47460 clauses with 0 kdos constituents\n", "| 1m 10s \t 111 clauses with a kdos constituent\n", "| 1m 10s \t 33 clauses with 2 ldos constituents\n", "| 1m 10s \t 3788 clauses with 1 ldos constituents\n", "| 1m 10s \t43750 clauses with 0 ldos constituents\n", "| 1m 10s \t 3821 clauses with a ldos constituent\n", "| 1m 10s \t 1 clauses with 3 idos constituents\n", "| 1m 10s \t 18 clauses with 2 idos constituents\n", "| 1m 10s \t 1193 clauses with 1 idos constituents\n", "| 1m 10s \t46359 clauses with 0 idos constituents\n", "| 1m 10s \t 1212 clauses with a idos constituent\n", "| 1m 10s \t 1305 clauses with 1 cdos constituents\n", "| 1m 10s \t46266 clauses with 0 cdos constituents\n", "| 1m 10s \t 1305 clauses with a cdos constituent\n", "| 1m 10s \t 56 clauses with 2 inds constituents\n", "| 1m 10s \t 5223 clauses with 1 inds constituents\n", "| 1m 10s \t42292 clauses with 0 inds constituents\n", "| 1m 10s \t 5279 clauses with a inds constituent\n", "| 1m 10s \t 1 clauses with 6 locs constituents\n", "| 1m 10s \t 1 clauses with 4 locs constituents\n", "| 1m 10s \t 16 clauses with 3 locs constituents\n", "| 1m 10s \t 330 clauses with 2 locs constituents\n", "| 1m 10s \t12164 clauses with 1 locs constituents\n", "| 1m 10s \t35059 clauses with 0 locs constituents\n", "| 1m 10s \t12512 clauses with a locs constituent\n", "| 1m 10s \t 3 clauses with 3 cpls constituents\n", "| 1m 10s \t 87 clauses with 2 cpls constituents\n", "| 1m 10s \t 8704 clauses with 1 cpls constituents\n", "| 1m 10s \t38777 clauses with 0 cpls constituents\n", "| 1m 10s \t 8794 clauses with a cpls constituent\n", "| 1m 10s \t 2 clauses with 2 bens constituents\n", "| 1m 10s \t 171 clauses with 1 bens constituents\n", "| 1m 10s \t47398 clauses with 0 bens constituents\n", "| 1m 10s 
\t 173 clauses with a bens constituent\n", "| 1m 10s \t69439 clauses\n" ] } ], "source": [ "for ckind in ckinds:\n", " total = 0\n", " for (count, n) in sorted(constituents_count[ckind].items(), key=lambda y: -y[0]):\n", " if count:\n", " total += n\n", " utils.caption(\n", " 0, \"\\t{:>5} clauses with {:>2} {:<10} constituents\".format(n, count, ckind)\n", " )\n", " utils.caption(\n", " 0, \"\\t{:>5} clauses with {:>2} {:<10} constituent\".format(total, \"a\", ckind)\n", " )\n", "utils.caption(0, \"\\t{:>5} clauses\".format(len(clause_verb)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Applying the flowchart\n", "\n", "We can now apply the flowchart in a straightforward manner.\n", "\n", "We output the results as a comma separated file that can be imported directly into SHEBANQ as a set of notes, so that the reader can check results within SHEBANQ. This has the benefit that the full context is available, and also data view can be called up easily to inspect the coding situation for each particular instance." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[17]:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "glossHacks = {\n", " \"XQ/\": \"law/precept\",\n", "}" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[23]:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "def reptext(\n", " label,\n", " ckind,\n", " v,\n", " phrases,\n", " num=False,\n", " txt=False,\n", " gloss=False,\n", " textformat=\"text-trans-plain\",\n", "):\n", " if phrases is None:\n", " return \"\"\n", " phrases_rep = []\n", " for p in sorted(phrases, key=N.sortKey):\n", " ptext = \"[{}|\".format(F.number.v(p) if num else \"[\")\n", " if txt:\n", " ptext += T.text(L.d(p, \"word\"), fmt=textformat)\n", " if gloss:\n", " words = L.d(p, \"word\")\n", " if ckind == \"ldos\" and F.lex.v(words[0]) == \"L\":\n", " words = words[1:]\n", "\n", " wtexts = []\n", " for w in words:\n", " g = glossHacks.get(F.lex.v(w), getGloss(w)).replace(\n", " \"\", \"&\"\n", " )\n", " if F.lex.v(w) == \"BJN/\" and F.pdp.v(w) == \"prep\":\n", " g = \"between\"\n", " prs_g = get_prs_info(w)[1]\n", " uvf = F.uvf.v(w)\n", " wtext = \"\"\n", " if uvf == \"H\":\n", " ptext += \"toward \"\n", " wtext += (\n", " g if w != v else \"\"\n", " ) # we do not have to put in the gloss of the verb in question\n", " wtext += (\"~\" + prs_g) if prs_g is not None else \"\"\n", " wtexts.append(wtext)\n", " ptext += \" \".join(wtexts)\n", " ptext += \"]\"\n", " phrases_rep.append(ptext)\n", " return \" \".join(phrases_rep)" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[24]:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "debug_messages = collections.defaultdict(lambda: collections.defaultdict(list))" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "constKinds = collections.OrderedDict()" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "for constKindSpec in constKindSpecs:\n", " (constKind, constKindName) = constKindSpec.strip().split(\":\", 1)\n", " constKinds[constKind] = constKindName" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "def flowchart(v, lex, verb, consts):\n", " 
consts = deepcopy(consts)\n", " n_ = collections.defaultdict(lambda: 0)\n", " for ckind in ckinds:\n", " n_[ckind] = len(consts[ckind])\n", " char1 = None\n", " char2 = None\n", " # determine char 1 of the sense label\n", " if n_[\"pdos\"] > 0:\n", " if n_[\"ndos\"] > 0:\n", " char1 = \"n\"\n", " elif n_[\"cdos\"] > 0:\n", " char1 = \"c\"\n", " elif n_[\"ldos\"] > 0:\n", " char1 = \"l\"\n", " elif n_[\"kdos\"] > 0:\n", " char1 = \"k\"\n", " elif n_[\"idos\"] > 0:\n", " char1 = \"i\"\n", " else:\n", " # in trouble: if there is a principal direct object, there should be an other object as well\n", " # and the other one should be an NP, object clause, L_object, K_object, or I_object\n", " # If this happens, it is probably the result of manual correction\n", " # We warn, and remedy\n", " msg_rep = \"; \".join(\"{} {}\".format(n_[ckind], ckind) for ckind in ckinds)\n", " if n_[\"dos\"] > 0:\n", " # there is an other object (dos should only be used if there is a single object)\n", " # we'll put the dos in the ndos (which was empty)\n", " # This could be caused by a manual enrichment sheet that has been generated\n", " # before the concept of NP_direct_object had been introduced\n", " char1 = \"n\"\n", " consts[\"ndos\"] = consts[\"dos\"]\n", " del consts[\"dos\"]\n", " debug_messages[lex][\"pdos with dos\"].append(\n", " \"{}: {}\".format(T.sectionFromNode(v), msg_rep)\n", " )\n", " else:\n", " # there is not another object, we treat this as a single object, so as a dos\n", " char1 = \"d\"\n", " consts[\"dos\"] = consts[\"pdos\"]\n", " del consts[\"pdos\"]\n", " debug_messages[lex][\"lonely pdos\"].append(\n", " \"{}: {}\".format(T.sectionFromNode(v), msg_rep)\n", " )\n", " else:\n", " if n_[\"cdos\"] > 0:\n", " # in the case of a single object, the clause objects act as ordinary objects\n", " char1 = \"d\"\n", " consts[\"dos\"] |= consts[\"cdos\"]\n", " del consts[\"cdos\"]\n", " if n_[\"ndos\"] > 0:\n", " # in the case of a single object, the np_objects act as ordinary objects\n", " char1 = \"d\"\n", " consts[\"dos\"] |= consts[\"ndos\"]\n", " del consts[\"ndos\"]\n", "\n", " n_ = collections.defaultdict(lambda: 0)\n", " for ckind in ckinds:\n", " n_[ckind] = len(consts[ckind])\n", "\n", " if n_[\"pdos\"] == 0 and n_[\"dos\"] > 0:\n", " char1 = \"d\"\n", " if n_[\"pdos\"] == 0 and n_[\"dos\"] == 0:\n", " char1 = \"-\"\n", "\n", " # determine char 2 of the sense label\n", " if char1 in \"nclki\":\n", " char2 = \".\"\n", " else:\n", " if n_[\"inds\"] > 0:\n", " char2 = \"i\"\n", " elif n_[\"bens\"] > 0:\n", " char2 = \"b\"\n", " elif n_[\"locs\"] > 0:\n", " char2 = \"p\"\n", " elif n_[\"cpls\"] > 0:\n", " char2 = \"c\"\n", " else:\n", " char2 = \"-\"\n", "\n", " sense_label = char1 + char2\n", " sense = lex if lex in senses else None\n", " status = \"*\" if lex in senses else \"?\"\n", "\n", " consts_rep = dict(\n", " (ckind, reptext(\"\", ckind, v, consts[ckind], num=True, gloss=True))\n", " for ckind in consts\n", " )\n", "\n", " return (sense_label, sense, status, consts_rep)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "sfields = \"\"\"\n", " version\n", " book\n", " chapter\n", " verse\n", " clause_atom\n", " is_shared\n", " is_published\n", " status\n", " keywords\n", " ntext\n", "\"\"\".strip().split()" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "sfields_fmt = (\"{}\\t\" * (len(sfields) - 1)) + \"{}\\n\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
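"The notes file is written as one tab-separated row per verb occurrence, following the `sfields` listed above; `sfields_fmt` joins the ten fields with tabs and terminates the row with a newline. A minimal sketch of a single row, with made-up field values:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Illustration only: build one note row the way the main loop below does.\n", "# All field values here are made up.\n", "_demo_row = sfields_fmt.format(\n", "    \"c\",  # version\n", "    \"Isaiah\",  # book\n", "    5,  # chapter\n", "    20,  # verse\n", "    12,  # clause_atom\n", "    \"T\",  # is_shared\n", "    \"\",  # is_published\n", "    \"?\",  # status\n", "    \"valence\",  # keywords\n", "    \"verb [2|...] has sense `l.`\",  # ntext\n", ")\n", "print(repr(_demo_row))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 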
"# Running the flowchart\n", "\n", "The next cell finally performs all the flowchart computations for all verbs in all contexts." ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 1m 15s Checking the flowcharts .\n", "..............................................................................................\n" ] } ], "source": [ "utils.caption(4, \"Checking the flowcharts\")\n", "missingFlowcharts = set()" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "| 1m 16s \tNo flowchart for 1543 verbs, e.g. 5} clauses\".format(i))\n", " book = F.book.v(L.u(v, \"book\")[0])\n", " chapter = F.chapter.v(L.u(v, \"chapter\")[0])\n", " verse = F.verse.v(L.u(v, \"verse\")[0])\n", " sentence_n = F.number.v(L.u(v, \"sentence\")[0])\n", " clause_n = F.number.v(c)\n", " clause_atom_n = F.number.v(L.u(v, \"clause_atom\")[0])\n", "\n", " verb = [L.u(v, \"phrase\")[0]]\n", " consts = constituents[c]\n", " n_ = collections.defaultdict(lambda: 0)\n", " for ckind in ckinds:\n", " n_[ckind] = len(consts[ckind])\n", "\n", " (sense_label, sense, status, constsRep) = flowchart(v, lex, verb, consts)\n", " senseRep = \"legend\" if sense is None else sense\n", " senseDoc = (\n", " \"Legend\"\n", " if sense is None\n", " else \"FC_{}\".format(sense.replace(\">\", \"A\").replace(\"<\", \"O\"))\n", " )\n", " senseLink = \"{}/{}\".format(flowchartBase, senseDoc)\n", "\n", " senseFeature[v] = sense_label\n", "\n", " constElems = []\n", " for (constKind, constKindName) in constKinds.items():\n", " if constKind not in constsRep:\n", " continue\n", " material = constsRep[constKind]\n", " if not material:\n", " continue\n", " constElems.append(\"*{}*={}\".format(constKindName, material))\n", "\n", " outcome_lab[sense_label] += 1\n", " outcome_lab_l[lex][sense_label] += 1\n", " decisions[lex][sense_label][c] = sense_label\n", "\n", " ofs.write(\n", " sfields_fmt.format(\n", " VERSION,\n", " book,\n", " chapter,\n", " verse,\n", " clause_atom_n,\n", " \"T\",\n", " \"\",\n", " status,\n", " note_keyword_base,\n", " \"verb [{nm}|{vb}] has sense `{sl}` [{sn}]({slink}) {cs}\".format(\n", " nm=F.number.v(L.u(v, \"phrase\")[0]),\n", " vb=F.g_word_utf8.v(v),\n", " sn=senseRep,\n", " slink=senseLink,\n", " sl=sense_label,\n", " cs=\"; \".join(constElems),\n", " ),\n", " )\n", " )\n", " nnotes[note_keyword_base] += 1\n", "utils.caption(0, \"\\t{:>5} clauses\".format(i))\n", "ofs.close()" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "show_limit = 20\n", "for lex in debug_messages:\n", " TF.error(lex, continuation=True)\n", " for kind in debug_messages[lex]:\n", " utils.caption(0, \"\\tERROR: {}\".format(kind), continuation=True)\n", " messages = debug_messages[lex][kind]\n", " lm = len(messages)\n", " utils.caption(\n", " 0,\n", " \"\\tERROR: \\t{}{}\".format(\n", " \"\\n\\t\\t\".join(messages[0:show_limit]),\n", " \"\" if lm <= show_limit else \"\\n\\t\\tAND {} more\".format(lm - show_limit),\n", " ),\n", " continuation=True,\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Add sense feature to valence module\n", "\n", "We create a new TF feature `sense`, being a mapping from verb word nodes to sense labels, as computed by the flowchart algorithm above.\n", "\n", "We add 
this feature to the valence module, which has been constructed by the [enrich](enrich.ipynb) notebook." ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "genericMetaPath = f\"{thisRepo}/yaml/generic.yaml\"\n", "flowchartMetaPath = f\"{thisRepo}/yaml/flowchart.yaml\"\n", "\n", "with open(genericMetaPath) as fh:\n", " genericMeta = yaml.load(fh, Loader=yaml.FullLoader)\n", " genericMeta[\"version\"] = VERSION\n", "with open(flowchartMetaPath) as fh:\n", " flowchartMeta = formatMeta(yaml.load(fh, Loader=yaml.FullLoader))\n", "\n", "metaData = {\"\": genericMeta, **flowchartMeta}" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "nodeFeatures = dict(sense=senseFeature)\n", "\n", "for f in nodeFeatures:\n", " metaData[f][\"valueType\"] = \"str\"" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 1m 40s Writing sense feature to TF .\n", "..............................................................................................\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "utils.caption(4, \"Writing sense feature to TF\")\n", "TF = Fabric(locations=thisTempTf, silent=True)\n", "TF.save(nodeFeatures=nodeFeatures, edgeFeatures={}, metaData=metaData)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Diffs\n", "\n", "Check differences with previous versions." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[30]:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 1m 41s Check differences with previous version .\n", "..............................................................................................\n", "| 1m 41s \tno features to add\n", "| 1m 41s \tno features to delete\n", "| 1m 41s \t1 features in common\n", "| 1m 41s sense ... no changes\n", "| 1m 41s Done\n" ] } ], "source": [ "utils.checkDiffs(thisTempTf, thisTf, only=set(nodeFeatures))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Deliver\n", "\n", "Copy the new TF feature from the temporary location where it has been created to its final destination." ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 1m 44s Deliver features to /Users/werk/github/etcbc/valence/tf/c .\n", "..............................................................................................\n", "| 1m 44s \tsense\n" ] } ], "source": [ "utils.deliverFeatures(thisTempTf, thisTf, nodeFeatures)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Compile TF" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 
1m 47s Load and compile the new TF features .\n", "..............................................................................................\n" ] } ], "source": [ "utils.caption(4, \"Load and compile the new TF features\")" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This is Text-Fabric 9.2.0\n", "Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html\n", "\n", "124 features found and 0 ignored\n", " 4.08s Dataset without structure sections in otext:no structure functions in the T-API\n", " | 0.31s T sense from ~/github/etcbc/valence/tf/c\n", " 15s All features loaded/computed - for details use TF.isLoaded()\n" ] }, { "data": { "text/plain": [ "[('Computed',\n", " 'computed-data',\n", " ('C Computed', 'Call AllComputeds', 'Cs ComputedString')),\n", " ('Features', 'edge-features', ('E Edge', 'Eall AllEdges', 'Es EdgeString')),\n", " ('Fabric', 'loading', ('TF',)),\n", " ('Locality', 'locality', ('L Locality',)),\n", " ('Nodes', 'navigating-nodes', ('N Nodes',)),\n", " ('Features',\n", " 'node-features',\n", " ('F Feature', 'Fall AllFeatures', 'Fs FeatureString')),\n", " ('Search', 'search', ('S Search',)),\n", " ('Text', 'text', ('T Text',))]" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "TF = Fabric(locations=[coreTf, thisTf], modules=[\"\"])\n", "api = TF.load(\n", " \"\"\"\n", " lex sp vs\n", " predication gloss\n", "\"\"\"\n", " + \" \".join(nodeFeatures)\n", ")\n", "api.makeAvailableIn(globals())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Examples" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 2m 09s Show sense counts .\n", "..............................................................................................\n", "| 2m 09s \tSense labels = -- -b -c -i -p c. d- db dc di dp i. k. l. n.\n" ] } ], "source": [ "utils.caption(4, \"Show sense counts\")\n", "senseLabels = sorted({F.sense.v(v) for v in F.otype.s(\"word\")} - {None})\n", "utils.caption(0, \"\\tSense labels = {}\".format(\" \".join(senseLabels)))" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [], "source": [ "senseCount = collections.Counter()\n", "noSense = []\n", "isPredicate = {\"regular\", \"copula\"}" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "| 2m 09s \tCounted 47381 senses\n", "| 2m 09s \tAll relevant verbs have been assigned a sense\n" ] } ], "source": [ "for v in F.sp.s(\"verb\"):\n", " sense = F.sense.v(v)\n", " if sense is None:\n", " # skip words that are not verbs in the qal\n", " if F.vs.v(v) != \"qal\":\n", " continue\n", " # skip verbs in a phrase that is not a verb phrase, e.g. 
some participles\n", " # the criterion here is whether the value of feature `predication` is non trivial\n", " p = L.u(v, \"phrase\")\n", " if F.predication.v(p) not in isPredicate:\n", " continue\n", " noSense.append(v)\n", " continue\n", " senseCount[sense] += 1\n", "utils.caption(0, \"\\tCounted {} senses\".format(sum(senseCount.values())))\n", "if noSense:\n", " utils.caption(\n", " 0, \"\\tWARNING: {} verb occurrences do not have a sense\".format(len(noSense))\n", " )\n", " for v in noSense[0:10]:\n", " utils.caption(\n", " 0,\n", " \"\\t\\t{:<20} word {:>6} phrase {:>6} = {:<5}\".format(\n", " \"{} {}:{}\".format(*T.sectionFromNode(v)),\n", " v,\n", " L.u(v, \"phrase\")[0],\n", " F.lex.v(v),\n", " ),\n", " )\n", "else:\n", " utils.caption(0, \"\\tAll relevant verbs have been assigned a sense\")" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "| 2m 10s \t\t-- occurs 17999x\n", "| 2m 10s \t\td- occurs 9979x\n", "| 2m 10s \t\t-p occurs 6193x\n", "| 2m 10s \t\t-c occurs 4250x\n", "| 2m 10s \t\t-i occurs 2869x\n", "| 2m 10s \t\tdp occurs 1853x\n", "| 2m 10s \t\tdc occurs 1073x\n", "| 2m 10s \t\tdi occurs 889x\n", "| 2m 10s \t\tl. occurs 876x\n", "| 2m 10s \t\ti. occurs 629x\n", "| 2m 10s \t\tn. occurs 533x\n", "| 2m 10s \t\t-b occurs 66x\n", "| 2m 10s \t\tdb occurs 61x\n", "| 2m 10s \t\tc. occurs 57x\n", "| 2m 10s \t\tk. occurs 54x\n" ] } ], "source": [ "for x in sorted(senseCount.items(), key=lambda x: (-x[1], x[0])):\n", " utils.caption(0, \"\\t\\t{:<2} occurs {:>6}x\".format(*x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more fine grained overview with graphics, see the\n", "[senses](senses.ipynb)\n", "notebook." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[34]:" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "if SCRIPT:\n", " stop(good=True)" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[65]:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "| 2m 11s \tReporting flowchart application\n", "valence notes: 47381\n", "Total notes: 47381\n", "All lexemes\n", " Sense -- : 17999 clauses\n", " Sense -b : 66 clauses\n", " Sense -c : 4250 clauses\n", " Sense -i : 2869 clauses\n", " Sense -p : 6193 clauses\n", " Sense c. : 57 clauses\n", " Sense d- : 9979 clauses\n", " Sense db : 61 clauses\n", " Sense dc : 1073 clauses\n", " Sense di : 889 clauses\n", " Sense dp : 1853 clauses\n", " Sense i. : 629 clauses\n", " Sense k. : 54 clauses\n", " Sense l. : 876 clauses\n", " Sense n. : 533 clauses\n", " All senses : 47381 clauses\n", " \n", "\n", " Sense -- : 4 clauses\n", " Sense -b : 0 clauses\n", " Sense -c : 0 clauses\n", " Sense -i : 0 clauses\n", " Sense -p : 0 clauses\n", " Sense c. : 0 clauses\n", " Sense d- : 25 clauses\n", " Sense db : 0 clauses\n", " Sense dc : 0 clauses\n", " Sense di : 0 clauses\n", " Sense dp : 2 clauses\n", " Sense i. : 1 clauses\n", " Sense k. : 0 clauses\n", " Sense l. : 0 clauses\n", " Sense n. : 4 clauses\n", " All senses : 36 clauses\n", " \n", "CJT\n", " Sense -- : 3 clauses\n", " Sense -b : 0 clauses\n", " Sense -c : 7 clauses\n", " Sense -i : 1 clauses\n", " Sense -p : 2 clauses\n", " Sense c. 
: 0 clauses\n", " Sense d- : 7 clauses\n", " Sense db : 1 clauses\n", " Sense dc : 10 clauses\n", " Sense di : 3 clauses\n", " Sense dp : 18 clauses\n", " Sense i. : 3 clauses\n", " Sense k. : 5 clauses\n", " Sense l. : 10 clauses\n", " Sense n. : 12 clauses\n", " All senses : 82 clauses\n", " \n", "DBQ\n", " Sense -- : 6 clauses\n", " Sense -b : 0 clauses\n", " Sense -c : 5 clauses\n", " Sense -i : 0 clauses\n", " Sense -p : 26 clauses\n", " Sense c. : 0 clauses\n", " Sense d- : 1 clauses\n", " Sense db : 0 clauses\n", " Sense dc : 0 clauses\n", " Sense di : 0 clauses\n", " Sense dp : 1 clauses\n", " Sense i. : 0 clauses\n", " Sense k. : 0 clauses\n", " Sense l. : 0 clauses\n", " Sense n. : 0 clauses\n", " All senses : 39 clauses\n", " \n", "FJM\n", " Sense -- : 14 clauses\n", " Sense -b : 2 clauses\n", " Sense -c : 31 clauses\n", " Sense -i : 2 clauses\n", " Sense -p : 29 clauses\n", " Sense c. : 2 clauses\n", " Sense d- : 47 clauses\n", " Sense db : 4 clauses\n", " Sense dc : 85 clauses\n", " Sense di : 24 clauses\n", " Sense dp : 156 clauses\n", " Sense i. : 23 clauses\n", " Sense k. : 19 clauses\n", " Sense l. : 78 clauses\n", " Sense n. : 61 clauses\n", " All senses : 577 clauses\n", " \n", "NTN\n", " Sense -- : 133 clauses\n", " Sense -b : 0 clauses\n", " Sense -c : 51 clauses\n", " Sense -i : 156 clauses\n", " Sense -p : 57 clauses\n", " Sense c. : 6 clauses\n", " Sense d- : 188 clauses\n", " Sense db : 2 clauses\n", " Sense dc : 132 clauses\n", " Sense di : 305 clauses\n", " Sense dp : 357 clauses\n", " Sense i. : 89 clauses\n", " Sense k. : 18 clauses\n", " Sense l. : 326 clauses\n", " Sense n. : 92 clauses\n", " All senses : 1912 clauses\n", " \n", "QR>\n", " Sense -- : 149 clauses\n", " Sense -b : 1 clauses\n", " Sense -c : 56 clauses\n", " Sense -i : 69 clauses\n", " Sense -p : 69 clauses\n", " Sense c. : 1 clauses\n", " Sense d- : 102 clauses\n", " Sense db : 0 clauses\n", " Sense dc : 8 clauses\n", " Sense di : 37 clauses\n", " Sense dp : 23 clauses\n", " Sense i. : 8 clauses\n", " Sense k. : 1 clauses\n", " Sense l. : 30 clauses\n", " Sense n. : 98 clauses\n", " All senses : 652 clauses\n", " \n", "ZQN\n", " Sense -- : 22 clauses\n", " Sense -b : 0 clauses\n", " Sense -c : 0 clauses\n", " Sense -i : 0 clauses\n", " Sense -p : 0 clauses\n", " Sense c. : 0 clauses\n", " Sense d- : 0 clauses\n", " Sense db : 0 clauses\n", " Sense dc : 0 clauses\n", " Sense di : 0 clauses\n", " Sense dp : 0 clauses\n", " Sense i. : 0 clauses\n", " Sense k. : 0 clauses\n", " Sense l. : 0 clauses\n", " Sense n. 
: 0 clauses\n", " All senses : 22 clauses\n", " \n" ] } ], "source": [ "if not SCRIPT:\n", " utils.caption(0, \"\\tReporting flowchart application\")\n", " ntot = 0\n", " for (lab, n) in sorted(nnotes.items(), key=lambda x: x[0]):\n", " ntot += n\n", " print(\"{:<10} notes: {}\".format(lab, n))\n", " print(\"{:<10} notes: {}\".format(\"Total\", ntot))\n", "\n", " for lex in [\"\"] + sorted(senses):\n", " print(\"All lexemes\" if lex == \"\" else lex)\n", " src_lab = (\n", " outcome_lab\n", " if lex == \"\"\n", " else outcome_lab_l.get(lex, collections.defaultdict(lambda: 0))\n", " )\n", " tot = 0\n", " for x in senseLabels:\n", " n = src_lab[x]\n", " tot += n\n", " print(\" Sense {:<7}: {:>5} clauses\".format(x, n))\n", " print(\" All senses : {:>5} clauses\".format(tot))\n", " print(\" \")" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[49]:" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "lines_to_next_cell": 2, "tags": [] }, "outputs": [], "source": [ "def show_decision(\n", " verbs=None, labels=None, books=None\n", "): # show all clauses that have a verb in verbs and a sense label in labels\n", " results = []\n", " for verb in decisions:\n", " if verbs is not None and verb not in verbs:\n", " continue\n", " for label in decisions[verb]:\n", " if labels is not None and label not in labels:\n", " continue\n", " for (c, stxt) in sorted(decisions[verb][label].items()):\n", " book = T.sectionFromNode(L.u(c, \"book\")[0])[0]\n", " if books is not None and book not in books:\n", " continue\n", " sentence_words = L.d(L.u(c, \"sentence\")[0], \"word\")\n", " results.append(\n", " \"{:<7} {:<12} {:<5} {:<2} {}\\n\\t{}\\n\\t{}\\n\".format(\n", " c,\n", " \"{} {}: {}\".format(*T.sectionFromNode(c)),\n", " verb,\n", " label,\n", " stxt,\n", " T.text(sentence_words, fmt=\"text-trans-plain\"),\n", " \" \".join(getGloss(w) for w in sentence_words),\n", " ).replace(\"<\", \"<\")\n", " )\n", " print(\"\\n\".join(sorted(results)))" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "In[50]:" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "468348 Isaiah 3: 7 FJM n. n.\n", "\tL> TFJMNJ QYJN <M00 \n", "\tnot put chief people\n", "\n", "468512 Isaiah 5: 20 FJM l. l.\n", "\tHWJ H>MRJM LR< WLVWB FMJM XCK L>WR W>WR LXCK FMJM MR LMTWQ WMTWQ LMR00_S \n", "\talas the say to the evil and to the good put darkness to light and light to darkness put bitter to sweet and sweet to bitter\n", "\n", "468514 Isaiah 5: 20 FJM l. l.\n", "\tHWJ H>MRJM LR< WLVWB FMJM XCK L>WR W>WR LXCK FMJM MR LMTWQ WMTWQ LMR00_S \n", "\talas the say to the evil and to the good put darkness to light and light to darkness put bitter to sweet and sweet to bitter\n", "\n", "468912 Isaiah 10: 6 FJM n. n.\n", "\tW<L&<M <BRTJ >YWNW LCLL CLL WLBZ BZ WLFJMW MRMS KXMR XWYWT00 \n", "\tand upon people anger command to plunder plunder and to spoil spoiling and to put trampled land as clay outside\n", "\n", "469117 Isaiah 13: 9 FJM l. l.\n", "\tHNH JWM&JHWH B> >KZRJ W<BRH WXRWN >P LFWM H>RY LCMH \n", "\tbehold day YHWH come cruel and anger and anger nose to put the earth to destruction\n", "\n", "469219 Isaiah 14: 17 FJM k. k.\n", "\tFM TBL KMDBR \n", "\tput world as the desert\n", "\n", "469240 Isaiah 14: 23 FJM l. l.\n", "\tWFMTJH LMWRC QPD W>GMJ&MJM \n", "\tand put to possession hedgehog and reedy pool water\n", "\n", "469603 Isaiah 21: 4 FJM l. 
l.\n", "\t>T NCP XCQJ FM LJ LXRDH00 \n", "\t<object marker> breeze desire put to to trembling\n", "\n", "469804 Isaiah 23: 13 FJM l. l.\n", "\tFMH LMPLH00 \n", "\tput to decay\n", "\n", "469924 Isaiah 25: 2 FJM l. l.\n", "\tKJ FMT M<JR LGL QRJH BYWRH LMPLH >RMWN ZRJM M<JR \n", "\tthat put from town to the wave, heap town fortified to decay dwelling tower strange from town\n", "\n", "470077 Isaiah 27: 9 FJM k. k.\n", "\tBFWMW05 KL&>BNJ MZBX K>BNJ&GR MNPYWT L>&JQMW >CRJM WXMNJM00 \n", "\tin put whole stone altar as stone chalk shatter not arise asherah and incense-stand\n", "\n", "470164 Isaiah 28: 15 FJM n. n.\n", "\tKJ FMNW KZB MXSNW \n", "\tthat put lie refuge\n", "\n", "470171 Isaiah 28: 17 FJM l. l.\n", "\tWFMTJ MCPV LQW WYDQH LMCQLT \n", "\tand put justice to line and justice to leveller\n", "\n", "470211 Isaiah 28: 25 FJM n. n.\n", "\tWFM XVH FWRH WF<RH NSMN WKSMT GBLTW00 \n", "\tand put wheat <animal> and barley <uncertain> and spelt boundary\n", "\n", "471035 Isaiah 37: 29 FJM dp dp\n", "\tWFMTJ XXJ B>PK WMTGJ BFPTJK \n", "\tand put thorn in nose and bridle in lip\n", "\n", "471406 Isaiah 41: 15 FJM l. l.\n", "\tHNH FMTJK LMWRG XRWY XDC B<L PJPJWT \n", "\tbehold put to threshing-sledge threshing instrument new lord, baal double-edged\n", "\n", "471409 Isaiah 41: 15 FJM k. k.\n", "\tWGB<WT KMY TFJM00 \n", "\tand hill as the chaff put\n", "\n", "471424 Isaiah 41: 18 FJM l. l.\n", "\t>FJM MDBR L>GM&MJM W>RY YJH LMWY>J MJM00 \n", "\tput desert to reedy pool water and earth dry country to issue water\n", "\n", "471427 Isaiah 41: 19 FJM dp dp\n", "\t>FJM B<RBH BRWC TDHR WT>CWR JXDW00 \n", "\tput in the desert juniper box tree and cypress together\n", "\n", "471430 Isaiah 41: 20 FJM -- --\n", "\tWJFJMW \n", "\tand put\n", "\n", "471444 Isaiah 41: 22 FJM d- d-\n", "\tWNFJMH LBNW \n", "\tand put heart\n", "\n", "471501 Isaiah 42: 4 FJM dp dp\n", "\t<D&JFJM B>RY MCPV \n", "\tunto put in the earth justice\n", "\n", "471534 Isaiah 42: 12 FJM l. l.\n", "\tJFJMW LJHWH KBWD \n", "\tput to YHWH weight\n", "\n", "471549 Isaiah 42: 15 FJM l. l.\n", "\tWFMTJ NHRWT L>JJM \n", "\tand put stream to the coast, island\n", "\n", "471555 Isaiah 42: 16 FJM l. l.\n", "\t>FJM MXCK LPNJHM L>WR WM<QCJM LMJCWR \n", "\tput dark place to face to the light and rugged country to fairness\n", "\n", "471605 Isaiah 42: 25 FJM -c -c\n", "\tWL>&JFJM <L&LB00_P \n", "\tand not put upon heart\n", "\n", "471699 Isaiah 43: 19 FJM dp dp\n", "\t>P >FJM BMDBR DRK BJCMWN NHRWT00 \n", "\teven put in the desert way in wilderness stream\n", "\n", "471761 Isaiah 44: 7 FJM d- d-\n", "\tWJ<RKH LJ MFWMJ <M&<WLM \n", "\tand arrange to from put people eternity\n", "\n", "472133 Isaiah 47: 6 FJM di di\n", "\tL>&FMT LHM RXMJM \n", "\tnot put to compassion\n", "\n", "472137 Isaiah 47: 7 FJM dc dc\n", "\t<D L>&FMT >LH <L&LBK \n", "\tunto not put these upon heart\n", "\n", "472307 Isaiah 49: 2 FJM k. k.\n", "\tWJFM PJ KXRB XDH \n", "\tand put mouth as dagger sharp\n", "\n", "472309 Isaiah 49: 2 FJM l. l.\n", "\tWJFJMNJ LXY BRWR \n", "\tand put to arrow purge\n", "\n", "472361 Isaiah 49: 11 FJM l. l.\n", "\tWFMTJ KL&HRJ LDRK \n", "\tand put whole mountain to the way\n", "\n", "472449 Isaiah 50: 2 FJM n. n.\n", "\t>FJM NHRWT MDBR \n", "\tput stream desert\n", "\n", "472453 Isaiah 50: 3 FJM n. n.\n", "\tWFQ >FJM KSWTM00_S \n", "\tand sack put covering\n", "\n", "472468 Isaiah 50: 7 FJM k. k.\n", "\t<L&KN FMTJ PNJ KXLMJC \n", "\tupon thus put face as the flint\n", "\n", "472510 Isaiah 51: 3 FJM k. 
k.\n", "\tWJFM MDBRH K<DN W<RBTH KGN&JHWH \n", "\tand put desert as Eden and desert as garden YHWH\n", "\n", "472551 Isaiah 51: 10 FJM n. n.\n", "\tHLW> >T&HJ> HMXRBT JM MJ THWM RBH HFMH M<MQJ&JM DRK L<BR G>WLJM00 \n", "\t<interrogative> not you she the be dry sea water primeval ocean much the put depths sea way to pass redeem\n", "\n", "472581 Isaiah 51: 16 FJM dp dp\n", "\tW>FJM DBRJ BPJK \n", "\tand put word in mouth\n", "\n", "472617 Isaiah 51: 23 FJM dp dp\n", "\tWFMTJH BJD&MWGJK >CR&>MRW LNPCK \n", "\tand put in hand grieve <relative> say to soul\n", "\n", "472621 Isaiah 51: 23 FJM k. k.\n", "\tWTFJMJ K>RY GWK WKXWY L<BRJM00_S \n", "\tand put as the earth back and as the outside to the pass\n", "\n", "472743 Isaiah 53: 10 FJM d- d-\n", "\t>M&TFJM >CM NPCW \n", "\tif put guilt soul\n", "\n", "472808 Isaiah 54: 12 FJM n. n.\n", "\tWFMTJ KDKD CMCTJK WC<RJK L>BNJ >QDX WKL&GBWLK L>BNJ&XPY00 \n", "\tand put ruby sun and gate to stone beryl and whole boundary to stone pleasure\n", "\n", "472970 Isaiah 57: 1 FJM -c -c\n", "\tW>JN >JC FM <L&LB \n", "\tand <NEG> man put upon heart\n", "\n", "472992 Isaiah 57: 7 FJM dp dp\n", "\t<L HR&GBH WNF> FMT MCKBK \n", "\tupon mountain high and lift put couch\n", "\n", "472995 Isaiah 57: 8 FJM dp dp\n", "\tW>XR HDLT WHMZWZH FMT ZKRWNK \n", "\tand after the door and the door-post put remembrance\n", "\n", "473015 Isaiah 57: 11 FJM -c -c\n", "\tL>&FMT <L&LBK \n", "\tnot put upon heart\n", "\n", "473247 Isaiah 59: 21 FJM -p -p\n", "\tRWXJ >CR <LJK WDBRJ >CR&FMTJ BPJK L>&JMWCW MPJK WMPJ ZR<K WMPJ ZR< ZR<K M<TH W<D&<WLM00_S \n", "\twind <relative> upon and word <relative> put in mouth not depart from mouth and from mouth seed and from mouth seed seed from now and unto eternity\n", "\n", "473307 Isaiah 60: 15 FJM l. l.\n", "\tTXT HJWTK <ZWBH WFNW>H W>JN <WBR WFMTJK LG>WN <WLM MFWF DWR WDWR00 \n", "\tunder part be leave and hate and <NEG> pass and put to height eternity joy generation and generation\n", "\n", "473317 Isaiah 60: 17 FJM n. n.\n", "\tWFMTJ PQDTK CLWM WNGFJK YDQH00 \n", "\tand put commission peace and drive justice\n", "\n", "473349 Isaiah 61: 3 FJM -- --\n", "\tCLXNJ LXBC LNCBRJ&LB LQR> LCBWJM DRWR WL>SWRJM PQX_QWX00 LQR> CNT&RYWN LJHWH WJWM NQM L>LHJNW LNXM KL&>BLJM00 LFWM05 L>BLJ YJWN LTT LHM P>R TXT >PR CMN FFWN TXT >BL M<VH THLH TXT RWX KHH \n", "\tsend to saddle to break heart to call to take captive release and to bind opening to call year pleasure to YHWH and day vengeance to god(s) to repent, console whole mourning to put to mourning Zion to give to headdress under part dust oil rejoicing under part mourning rites wrap praise under part wind dim\n", "\n", "473421 Isaiah 62: 7 FJM n. 
n.\n", "\tHMZKRJM >T&JHWH >L&DMJ LKM00 W>L&TTNW DMJ LW <D&JKWNN W<D&JFJM >T&JRWCLM THLH B>RY00 \n", "\tthe remember <object marker> YHWH not rest to and not give rest to unto be firm and unto put <object marker> Jerusalem praise in the earth\n", "\n", "473500 Isaiah 63: 11 FJM dp dp\n", "\t>JH HFM BQRBW >T&RWX QDCW00 MWLJK LJMJN MCH ZRW< TP>RTW BWQ< MJM MPNJHM L<FWT LW CM <WLM00 MWLJKM BTHMWT \n", "\twhere the put in interior <object marker> wind holiness walk to right-hand side Moses arm splendour split water from face to make to name eternity walk in the primeval ocean\n", "\n", "473804 Isaiah 66: 19 FJM dp dp\n", "\tWFMTJ BHM >WT \n", "\tand put in sign\n", "\n" ] } ], "source": [ "show_decision(verbs={\"FJM\"}, books={\"Isaiah\"})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In[ ]:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.2" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }