{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "# Quads\n", "\n", "When simple signs get stacked we get composite signs.\n", "Here we call them *quads*.\n", "There are several ways to compose quads from sub-quads: there is always\n", "an *operator* involved.\n", "And a composition can again be subjected to another composition.\n", "And again ..." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2018-05-17T12:50:04.767420Z", "start_time": "2018-05-17T12:50:04.747098Z" } }, "outputs": [], "source": [ "from tf.app import use" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2018-05-17T12:50:15.565059Z", "start_time": "2018-05-17T12:50:14.736217Z" } }, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/Nino-cunei/uruk/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/Nino-cunei/uruk/tf/1.0" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " Text-Fabric: Text-Fabric API 11.3.0, Nino-cunei/uruk/app v3, Search Reference
\n", " Data: Nino-cunei - uruk 1.0, Character table, Feature docs
\n", "
Node types\n", "
Name | # of nodes | # slots/node | % coverage
--- | --- | --- | ---
tablet | 6364 | 22.01 | 100
face | 9456 | 14.10 | 95
column | 14023 | 9.34 | 93
line | 35842 | 3.61 | 92
case | 9651 | 3.46 | 24
cluster | 32753 | 1.03 | 24
quad | 3794 | 2.05 | 6
comment | 11090 | 1.00 | 8
sign | 140094 | 1.00 | 100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Uruk IV/III: Proto-cuneiform tablets
catalogId (str): identifier of tablet in catalog (http://www.flutopedia.com/tablets.htm)
crossref (str)
damage (int): indicates damage of signs or quads; corresponds to #-flag in transcription
depth (int)
excavation (str): excavation number of tablet
fragment (str): level between tablet and face
fullNumber (str): the combination of face type and column number on columns
grapheme (str): name of a grapheme (glyph)
identifier (str): additional information pertaining to the name of a face
modifier (str): indicates modification of a sign; corresponds to sign@letter in transcription. If the grapheme is a repeat, the modification applies to the whole repeat.
modifierFirst (str): indicates the order between modifiers and variants on the same object; if 1, modifiers come before variants
modifierInner (str): indicates modification of a sign within a repeat; corresponds to sign@letter in transcription
name (str): name of tablet
number (str): number of a column or line or case
otype (str)
period (str): period that characterises the tablet corpus
prime (int): indicates the presence/multiplicity of a prime (single quote)
remarkable (int): corresponds to ! flag in transcription
repeat (int): number indicating the number of repeats of a grapheme, especially in numerals; -1 comes from repeat N in transcription
srcLn (str): transcribed line
srcLnNum (int): line number in transcription file
terminal (str)
text (str): text of comment nodes
type (str): type of a face; type of a comment; type of a cluster; type of a sign
uncertain (int): corresponds to ?-flag in transcription
variant (str): allograph for a sign, corresponds to ~x in transcription
variantOuter (str): allograph for a quad, corresponds to ~x in transcription
written (str): corresponds to !(xxx) flag in transcription
comments (none): links comment nodes to their targets
op (str): operator connecting left to right operand in a quad
oslots (none)
sub (none): connects line or case with sub-cases, quad with sub-quads; clusters with sub-clusters
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Text-Fabric API: names N F E L T S C TF directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/Nino-cunei/uruk/sources/cdli/images" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Found 2095 ideograph linearts
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Found 2724 tablet linearts
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Found 5495 tablet photos
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A = use(\"Nino-cunei/uruk\", hoist=globals())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need our example tablet (again).\n", "It is particularly relevant to this chapter in our tutorial:\n", "it contains the most deeply nested quad in the whole corpus." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2018-05-17T12:50:58.749695Z", "start_time": "2018-05-17T12:50:58.692028Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s 1 result\n" ] }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

result 1" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

tablet:148166 P005381
MSVO 3, 70uruk-iiicatalogId=P005381
comment:178162
atf: lang qpc
face:156932 obverse
column:190362 1
line:254173 1
case:167736 1a
106585 2(N14)
106586 SZE~a
106587 SAL
106588 TUR3~a
106589 NUN~a
case:167737 1b
106590 3(N19)
quad:143013
106591 GISZ
.
106592 TE
line:254174 2
106593 1(N14)
106594 NAR
106595 NUN~a
106596 SIG7
line:254175 3
106597 2(N04)#
106598 PIRIG~b1
106599 SIG7
106600 URI3~a
106601 NUN~a
column:190363 2
line:254176 1
106602 3(N04)
quad:143014
106603 GISZ
.
106604 TE
106605 GAR
quad:143015
106606 SZU2
.
quad:143016
quad:143017
106607 HI
+
106608 1(N57)
+
quad:143018
106609 HI
+
106610 1(N57)
106611 GI4~a
line:254177 2
106612 GU7
106613 AZ
106614 SI4~f
face:156933 reverse
column:190364 1
line:254178 1
106615 3(N14)
106616 SZE~a
line:254179 2
106617 3(N19)
106618 5(N04)
line:254180 3
106619 GU7
column:190365 2
line:254181 1
106620 AZ
106621 SI4~f
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "pNum = \"P005381\"\n", "query = \"\"\"\n", "tablet catalogId=P005381\n", "\"\"\"\n", "results = A.search(query)\n", "A.lineart(results[0][0], width=200)\n", "A.show(results, withNodes=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The components of quads are either sub-quads or signs.\n", "Sub-quads are also quads in TF, and they are always a composition.\n", "Whenever a member of a sub-quad is no longer a composition, it is a *sign*.\n", "\n", "Let's try to unravel the structure of the biggest quad in this tablet." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Find the quad\n", "\n", "First we need to get the node of this quad. Above we have seen the source code of the tablet in which\n", "it occurs; from that we can pick the node of the case it is in:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:08:35.891516Z", "start_time": "2018-05-09T17:08:35.876347Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['1. 3(N04) , |GISZ.TE| GAR |SZU2.((HI+1(N57))+(HI+1(N57)))| GI4~a ']\n" ] }, { "data": { "text/html": [ "
line:254176 1
106602 3(N04)
quad:143014
106603 GISZ
.
106604 TE
106605 GAR
quad:143015
106606 SZU2
.
quad:143016
quad:143017
106607 HI
+
106608 1(N57)
+
quad:143018
106609 HI
+
106610 1(N57)
106611 GI4~a
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "case = A.nodeFromCase((\"P005381\", \"obverse:2\", \"1\"))\n", "print(A.getSource(case))\n", "A.pretty(case, withNodes=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can easily read off the node number of this big quad.\n", "\n", "But we can also do it programmatically.\n", "\n", "In order to identify our super-quad, we list all quad nodes that are part of this case.\n", "For every quad we list the node numbers of the signs contained in it.\n", "\n", "In order to know what signs are contained in any given node, we use the feature `oslots`.\n", "Like the feature `otype`, this is a standard feature that is always available in a TF dataset.\n", "\n", "Unlike `otype`, `oslots` is an *edge* feature: there is an edge between every node and every slot contained in it.\n", "\n", "Whereas you use `F` to do stuff with node features, you use `E` to do business with edge features.\n", "\n", "And whereas you use `F.feature.v(node)` to get the feature value of a node, you use\n", "`E.oslots.s(node)` to get the slots to which there is an `oslots` edge from `node`." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:08:48.149978Z", "start_time": "2018-05-09T17:08:48.138955Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "143014 array('I', [106603, 106604])\n", "143015 array('I', [106606, 106607, 106608, 106609, 106610])\n", "143016 array('I', [106607, 106608, 106609, 106610])\n", "143017 array('I', [106607, 106608])\n", "143018 array('I', [106609, 106610])\n" ] } ], "source": [ "for node in L.d(case, otype=\"quad\"):\n", "    print(f\"{node:>6} {E.oslots.s(node)}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We see what the biggest quad is.\n", "We could have been a bit more friendly to ourselves by showing the actual graphemes in the quads." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:08:51.321478Z", "start_time": "2018-05-09T17:08:51.312988Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "143014 GISZ TE\n", "143015 SZU2 HI N57 HI N57\n", "143016 HI N57 HI N57\n", "143017 HI N57\n", "143018 HI N57\n" ] } ], "source": [ "for node in L.d(case, otype=\"quad\"):\n", "    print(f'{node:>6} {\" \".join(F.grapheme.v(s) for s in E.oslots.s(node))}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So let us get the node of the biggest quad." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:08:54.151816Z", "start_time": "2018-05-09T17:08:54.137401Z" } }, "outputs": [ { "data": { "text/plain": [ "143015" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bigQuad = sorted(\n", "    (quad for quad in L.d(case, otype=\"quad\")), key=lambda q: -len(E.oslots.s(q))\n", ")[0]\n", "bigQuad" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lo and behold, it is precisely the big quad.\n", "\n", "This is what we are talking about:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:08:57.374493Z", "start_time": "2018-05-09T17:08:57.365895Z" } }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.lineart(bigQuad)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quad structure\n", "\n", "Now we are going to retrieve its components by following *edges*.\n", "\n", "When we converted the data to Text-Fabric, we made\n", "*edges* from quad nodes to the nodes of their component quads and signs.\n", "\n", "We also made edges between sibling quads and signs.\n", "\n", "We can distinguish between kinds of edges by means of edge features.\n", "\n", "The edges that go down in a structure have a feature `sub`.\n", "\n", "In order to follow the `sub` edges from a node, you use\n", "\n", "`E.sub.f(node)`.\n", "\n", "This will give you a list of nodes that can be reached *from* `node` by following\n", "a `sub` edge.\n", "\n", "Edges can be travelled in the opposite direction as well:\n", "\n", "`E.sub.t(node)`.\n", "\n", "This will give you the nodes from which there is a `sub` edge *to* `node`." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:03.095590Z", "start_time": "2018-05-09T17:09:03.090000Z" } }, "outputs": [ { "data": { "text/plain": [ "(106606, 143016)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E.sub.f(bigQuad)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or, more friendly:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:10.907109Z", "start_time": "2018-05-09T17:09:10.896234Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "106606 SZU2\n", "143016 HI N57 HI N57\n" ] } ], "source": [ "for node in E.sub.f(bigQuad):\n", "    print(f'{node:>6} {\" \".join(F.grapheme.v(s) for s in E.oslots.s(node))}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us unravel the whole structure by means of a function:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:16.960087Z", "start_time": "2018-05-09T17:09:16.949864Z" } }, "outputs": [ { "data": { "text/plain": [ "'<SZU2, <<HI, N57>, <HI, N57>>>'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def unravelQuad(quad):\n", "    if F.otype.v(quad) == \"sign\":\n", "        return F.grapheme.v(quad)\n", "    subQuads = E.sub.f(quad)\n", "    unraveledSubQuads = [unravelQuad(subQuad) for subQuad in subQuads]\n", "    return f'<{\", \".join(unraveledSubQuads)}>'\n", "\n", "\n", "unravelQuad(bigQuad)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Operators\n", "\n", "Where have the operators gone?\n", "\n", "They are present as a feature `op` of edges between sibling quads and signs." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:18.823345Z", "start_time": "2018-05-09T17:09:18.815974Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "106606 . 143016\n" ] } ], "source": [ "for child in E.sub.f(bigQuad):\n", "    for (right, op) in E.op.f(child):\n", "        print(child, op, right)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that whereas `E.sub.f` yields a list of nodes,\n", "`E.op.f` yields a list of pairs `(node, op-value)`,\n", "because the `op` edges carry a value.\n", "\n", "The best way to know this is to consult the\n", "[Feature Doc](https://github.com/Nino-cunei/uruk/blob/master/docs/transcription.md).\n", "This link is always present below the cell where you called `use` for the first time." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Can we try to adapt the unravel function above to get the operators?\n", "\n", "Yes:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:20.704428Z", "start_time": "2018-05-09T17:09:20.690647Z" } }, "outputs": [ { "data": { "text/plain": [ "'<SZU2 . <<HI + N57> + <HI + N57>>>'" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def unravelQuad(quad):\n", "    if F.otype.v(quad) == \"sign\":\n", "        return F.grapheme.v(quad)\n", "    subQuads = E.sub.f(quad)\n", "    result = \"<\"\n", "    for sq in subQuads:\n", "        for (rq, operator) in E.op.f(sq):\n", "            leftRep = unravelQuad(sq)\n", "            rightRep = unravelQuad(rq)\n", "            result += f\"{leftRep} {operator} {rightRep}\"\n", "    result += \">\"\n", "    return result\n", "\n", "\n", "unravelQuad(bigQuad)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This technique is employed fully in the function `A.atfFromQuad()`:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:23.064045Z", "start_time": 
"2018-05-09T17:09:23.055904Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|SZU2.((HI+1(N57))+(HI+1(N57)))|\n" ] } ], "source": [ "print(A.atfFromQuad(bigQuad))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have tested the function `A.atfFromQuad()` on all quads in the corpus, and it regenerates the exact ATF transliterations for them, except for two cases where the ATF has unnecessary brackets. See [checks](http://nbviewer.jupyter.org/github/Nino-cunei/uruk/blob/master/programs/checks.ipynb#Quads)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Next\n", "\n", "[jumps](jumps.ipynb)\n", "\n", "*Leap to the next level ...*\n", "\n", "All chapters:\n", "[start](start.ipynb)\n", "[imagery](imagery.ipynb)\n", "[steps](steps.ipynb)\n", "[search](search.ipynb)\n", "[calc](calc.ipynb)\n", "[signs](signs.ipynb)\n", "**quads**\n", "[jumps](jumps.ipynb)\n", "[cases](cases.ipynb)\n", "\n", "---\n", "\n", "CC-BY Dirk Roorda" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.0" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "607px", "left": "0px", "right": "983px", "top": "110px", "width": "297px" }, "toc_section_display": "block", "toc_window_display": false }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }