{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "# Jumps\n", "\n", "Things are not only embedded in each other; they can also *point* to each other.\n", "The mechanism for that is the *edge*: an edge is a link between two *nodes*.\n", "Like nodes, edges may carry feature values.\n", "\n", "We learn how to deal with structure in a quantitative way." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2018-05-11T09:56:05.473296Z", "start_time": "2018-05-11T09:56:05.418258Z" } }, "outputs": [], "source": [ "import collections\n", "from IPython.display import Markdown, display\n", "from tf.app import use" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2018-05-11T09:56:07.605466Z", "start_time": "2018-05-11T09:56:06.236008Z" } }, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/Nino-cunei/uruk/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/Nino-cunei/uruk/tf/1.0" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " TF: TF API 12.2.1, Nino-cunei/uruk/app v3, Search Reference
\n", " Data: Nino-cunei - uruk 1.0, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name | # of nodes | # slots / node | % coverage
tablet | 6364 | 22.01 | 100
face | 9456 | 14.10 | 95
column | 14023 | 9.34 | 93
line | 35842 | 3.61 | 92
case | 9651 | 3.46 | 24
cluster | 32753 | 1.03 | 24
quad | 3794 | 2.05 | 6
comment | 11090 | 1.00 | 8
sign | 140094 | 1.00 | 100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Uruk IV/III: Proto-cuneiform tablets \n", "
\n", "\n", "
\n", "
\n", "catalogId\n", "
\n", "
str
\n", "\n", " identifier of tablet in catalog (http://www.flutopedia.com/tablets.htm)\n", "\n", "
\n", "\n", "
\n", "
\n", "crossref\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "damage\n", "
\n", "
int
\n", "\n", " indicates damage of signs or quads,corresponds to #-flag in transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "depth\n", "
\n", "
int
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "excavation\n", "
\n", "
str
\n", "\n", " excavation number of tablet\n", "\n", "
\n", "\n", "
\n", "
\n", "fragment\n", "
\n", "
str
\n", "\n", " level between tablet and face\n", "\n", "
\n", "\n", "
\n", "
\n", "fullNumber\n", "
\n", "
str
\n", "\n", " the combination of face type and column number on columns\n", "\n", "
\n", "\n", "
\n", "
\n", "grapheme\n", "
\n", "
str
\n", "\n", " name of a grapheme (glyph)\n", "\n", "
\n", "\n", "
\n", "
\n", "identifier\n", "
\n", "
str
\n", "\n", " additional information pertaining to the name of a face\n", "\n", "
\n", "\n", "
\n", "
\n", "modifier\n", "
\n", "
str
\n", "\n", " indicates modification of a sign; corresponds to sign@letter in transcription. If the grapheme is a repeat, the modification applies to the whole repeat.\n", "\n", "
\n", "\n", "
\n", "
\n", "modifierFirst\n", "
\n", "
str
\n", "\n", " indicates the order between modifiers and variants on the same object; if 1, modifiers come before variants\n", "\n", "
\n", "\n", "
\n", "
\n", "modifierInner\n", "
\n", "
str
\n", "\n", " indicates modification of a sign within a repeat; corresponds to sign@letter in transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "name\n", "
\n", "
str
\n", "\n", " name of tablet\n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
str
\n", "\n", " number of a column or line or case\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "period\n", "
\n", "
str
\n", "\n", " period that characterises the tablet corpus\n", "\n", "
\n", "\n", "
\n", "
\n", "prime\n", "
\n", "
int
\n", "\n", " indicates the presence/multiplicity of a prime (single quote)\n", "\n", "
\n", "\n", "
\n", "
\n", "remarkable\n", "
\n", "
int
\n", "\n", " corresponds to ! flag in transcription \n", "\n", "
\n", "\n", "
\n", "
\n", "repeat\n", "
\n", "
int
\n", "\n", " number indicating the number of repeats of a grapheme,especially in numerals; -1 comes from repeat N in transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "srcLn\n", "
\n", "
str
\n", "\n", " transcribed line\n", "\n", "
\n", "\n", "
\n", "
\n", "srcLnNum\n", "
\n", "
int
\n", "\n", " line number in transcription file\n", "\n", "
\n", "\n", "
\n", "
\n", "terminal\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "text\n", "
\n", "
str
\n", "\n", " text of comment nodes\n", "\n", "
\n", "\n", "
\n", "
\n", "type\n", "
\n", "
str
\n", "\n", " type of a face; type of a comment; type of a cluster;type of a sign\n", "\n", "
\n", "\n", "
\n", "
\n", "uncertain\n", "
\n", "
int
\n", "\n", " corresponds to ?-flag in transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "variant\n", "
\n", "
str
\n", "\n", " allograph for a sign, corresponds to ~x in transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "variantOuter\n", "
\n", "
str
\n", "\n", " allograph for a quad, corresponds to ~x in transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "written\n", "
\n", "
str
\n", "\n", " corresponds to !(xxx) flag in transcription\n", "\n", "
\n", "\n", "
\n", "
\n", "comments\n", "
\n", "
none
\n", "\n", " links comment nodes to their targets\n", "\n", "
\n", "\n", "
\n", "
\n", "op\n", "
\n", "
str
\n", "\n", " operator connecting left to right operand in a quad\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "sub\n", "
\n", "
none
\n", "\n", " connects line or case with sub-cases, quad with sub-quads; clusters with sub-clusters\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: Nino-cunei/uruk
  3. appPath: /Users/me/text-fabric-data/github/Nino-cunei/uruk/app
  4. commit: 7da6cb7cd9dffb12aff5e35639078029727a90e7
  5. css:.contnr.cluster {
    flex-flow: row wrap;
    border: 0;
    }
    .meta .features {
    background-color: #ffeedd;
    }
    .lbl.clusterb,.lbl.clustere {
    padding: 0.5em 0.1em 0.1em 0.1em;
    margin: 0.8em 0.1em 0.1em 0.1em;
    color: #888844;
    font-size: x-small;
    }
    .lbl.clusterb {
    border-left: 0.3em solid #cccc99;
    border-right: 0;
    border-top: 0;
    border-bottom: 0;
    border-radius: 1rem;
    }
    .lbl.clustere {
    border-left: 0;
    border-right: 0.3em solid #cccc99;
    border-top: 0;
    border-bottom: 0;
    border-radius: 1rem;
    }
    .op {
    padding: 0.5em 0.1em 0.1em 0.1em;
    margin: 0.8em 0.1em 0.1em 0.1em;
    font-family: monospace;
    font-size: x-large;
    font-weight: bold;
    }
    .period {
    font-family: monospace;
    font-size: medium;
    font-weight: bold;
    color: #0000bb;
    }
    .excavation {
    font-family: monospace;
    font-size: medium;
    font-style: italic;
    color: #779900;
    }
  6. dataDisplay:
    • browseContentPretty: True
    • browseNavLevel: 1
    • showVerseInTuple: True
  7. docs:
    • docPage: about
    • featureBase: {docBase}/transcription{docExt}
    • featurePage: ''
  8. interfaceDefaults:
    • lineNumbers: 0
    • showGraphics: True
    • standardFeatures: True
  9. isCompatible: True
  10. local: local
  11. localDir: /Users/me/text-fabric-data/github/Nino-cunei/uruk/_temp
  12. provenanceSpec:
    • corpus: Uruk IV/III: Proto-cuneiform tablets
    • doi: 10.5281/zenodo.1193841
    • graphicsRelative: sources/cdli/images
    • org: Nino-cunei
    • relative: /tf
    • repo: uruk
    • version: 1.0
    • webBase: https://cdli.ucla.edu
    • webHint: to CDLI main page for this tablet
    • webUrl:{webBase}/search/search_results.php?SearchMode=Text&ObjectID=<1>
  13. release: no value
  14. typeDisplay:
    • case:
      • children:
        • cluster
        • comment
        • quad
        • sign
      • flow: hor
      • label: {number}{prime}
      • level: 2
      • lineNumber: srcLnNum
      • stretch: 0
      • template: {number}{prime}
      • transform: {prime: prime}
      • wrap: 0
    • cluster:
      • children:
        • cluster
        • quad
        • sign
      • label: {type}
      • stretch: 0
      • template: {type}
      • transform: {type: ctype}
    • column:
      • children:
        • comment
        • line
      • flow: ver
      • isBig: True
      • label: {number}{prime}
      • level: 3
      • lineNumber: srcLnNum
      • transform: {prime: prime}
    • comment:
      • base: True
      • featuresBare: text
      • label: {type}
      • lineNumber: srcLnNum
    • face:
      • children:
        • column
        • comment
      • featuresBare: identifier fragment
      • flow: hor
      • isBig: True
      • label: {type}
      • lineNumber: srcLnNum
      • stretch: 0
      • template: {type}
      • wrap: 0
    • line:
      • children:
        • case
        • cluster
        • comment
        • quad
        • sign
      • flow: hor
      • label: {number}
      • level: 2
      • lineNumber: srcLnNum
      • stretch: 0
      • transform: {prime: prime}
      • wrap: 0
    • quad:
      • children:
        • cluster
        • quad
        • sign
      • graphics: True
      • stretch: 0
    • sign:
      • base: True
      • graphics: True
      • label: {atf}
      • transform: {atf: atf}
    • tablet:
      • children:
        • comment
        • face
      • condense: True
      • featuresBare: name period excavation
      • flow: ver
      • isBig: True
      • lineNumber: srcLnNum
      • stretch: 0
      • wrap: 0
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/Nino-cunei/uruk/sources/cdli/images" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Found 2095 ideograph linearts
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Found 2724 tablet linearts
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Found 5495 tablet photos
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A = use(\"Nino-cunei/uruk\", hoist=globals())" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "## Measuring depth\n", "\n", "Numbered lines in the transliterations indicate a hierarchy of cases within lines.\n", "How deep can cases go?\n", "We explore the distribution of cases with respect to their depth.\n", "\n", "We need a function that computes the depth of a case.\n", "We program that function in such a way that it also works for *quads* (seen before)\n", "and for *clusters* (which we will see later).\n", "\n", "The idea of this function is:\n", "* if a structure does not have sub-structures, its depth is a base value of 1 or 0;\n", "  * it is 1 if the lowest-level parts of the structure have a different name,\n", "    such as quads versus signs;\n", "  * it is 0 if the lowest-level parts of the structure have the same name,\n", "    such as cases in lines;\n", "* otherwise, the depth of a structure is 1 more than the maximum of the depths of its sub-structures.\n", "\n", "The base value is the `ground` argument of the function.\n", "\n", "How do we find the sub-structures of a structure?\n", "By following *edges* with a `sub` feature, as we have seen in\n", "[quads](quads.ipynb)." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:45.106637Z", "start_time": "2018-05-09T17:09:45.098797Z" } }, "outputs": [], "source": [ "def depthStructure(node, nodeType, ground):\n", "    \"\"\"Depth of node, counting nested sub-structures of type nodeType.\n", "\n", "    ground is the depth of a structure without such sub-structures.\n", "    \"\"\"\n", "    # follow the sub edges, keeping only targets of the requested type\n", "    subDepths = [\n", "        depthStructure(subNode, nodeType, ground)\n", "        for subNode in E.sub.f(node)\n", "        if F.otype.v(subNode) == nodeType\n", "    ]\n", "    if len(subDepths) == 0:\n", "        return ground\n", "    else:\n", "        return max(subDepths) + 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example: cases\n", "\n", "We call up our example tablet and do a few basic checks on cases.\n", "\n", "Note that there is also a feature **depth**, which gives the depth at which a case is found.\n", "That is different from the depth a case has, which is what we compute here." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:09:54.958477Z", "start_time": "2018-05-09T17:09:54.923144Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s 1 result\n" ] }, { "data": { "text/html": [ "

result 1" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

tablet:148166@85111 P005381
MSVO 3, 70uruk-iiicatalogId=P005381
comment:178162@85112
atf: lang qpc
face:156932@85114 obverse
column:190362@85115 1
line:254173None 1
case:167736@85116 1a
106585 2(N14)
106586 SZE~a
106587 SAL
106588 TUR3~a
106589 NUN~a
case:167737@85117 1b
106590 3(N19)
quad:143013
106591 GISZ
.
106592 TE
line:254174@85118 2
106593 1(N14)
106594 NAR
106595 NUN~a
106596 SIG7
line:254175@85119 3
106597 2(N04)#
106598 PIRIG~b1
106599 SIG7
106600 URI3~a
106601 NUN~a
column:190363@85120 2
line:254176@85121 1
106602 3(N04)
quad:143014
106603 GISZ
.
106604 TE
106605 GAR
quad:143015
106606 SZU2
.
quad:143016
quad:143017
106607 HI
+
106608 1(N57)
+
quad:143018
106609 HI
+
106610 1(N57)
106611 GI4~a
line:254177@85122 2
106612 GU7
106613 AZ
106614 SI4~f
face:156933@85123 reverse
column:190364@85124 1
line:254178@85125 1
106615 3(N14)
106616 SZE~a
line:254179@85126 2
106617 3(N19)
106618 5(N04)
line:254180@85127 3
106619 GU7
column:190365@85128 2
line:254181@85129 1
106620 AZ
106621 SI4~f
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "pNum = \"P005381\"\n", "query = \"\"\"\n", "tablet catalogId=P005381\n", "\"\"\"\n", "results = A.search(query)\n", "A.show(results, withNodes=True, lineNumbers=True, showGraphics=False)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:10:15.641727Z", "start_time": "2018-05-09T17:10:15.626677Z" } }, "outputs": [ { "data": { "text/html": [ "
line 1
case 1a
2(N14)
SZE~a
SAL
TUR3~a
NUN~a
case 1b
3(N19)
quad
GISZ
.
TE
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "1" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "line1 = T.nodeFromSection((pNum, \"obverse:1\", \"1\"))\n", "A.pretty(line1, showGraphics=False)\n", "depthStructure(line1, \"case\", 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That makes sense, since line 1 is divided into one level of sub-cases: 1a and 1b." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:10:24.631432Z", "start_time": "2018-05-09T17:10:24.624461Z" } }, "outputs": [ { "data": { "text/plain": [ "(167736, 167737)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L.d(line1, otype=\"case\")" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:10:31.271774Z", "start_time": "2018-05-09T17:10:31.260357Z" } }, "outputs": [ { "data": { "text/html": [ "
line 2
1(N14)
NAR
NUN~a
SIG7
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "line2 = T.nodeFromSection((pNum, \"obverse:1\", \"2\"))\n", "A.pretty(line2, showGraphics=False)\n", "depthStructure(line2, \"case\", 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Indeed, line 2 is not divided into sub-cases." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:10:42.249820Z", "start_time": "2018-05-09T17:10:42.243508Z" } }, "outputs": [ { "data": { "text/plain": [ "()" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L.d(line2, otype=\"case\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Counting by depth\n", "\n", "For a variety of structures we'll find out how deep they go,\n", "and how depth is distributed in the corpus." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cases\n", "\n", "We are going to collect all lines and cases in buckets according to their depths." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:10:51.419830Z", "start_time": "2018-05-09T17:10:51.269320Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "   24 cases or lines with depth 4\n", "   66 cases or lines with depth 3\n", " 1024 cases or lines with depth 2\n", " 3247 cases or lines with depth 1\n", "41132 cases or lines with depth 0\n" ] } ], "source": [ "# bucket every line and every case by its depth\n", "caseDepths = collections.defaultdict(list)\n", "\n", "for n in F.otype.s(\"line\"):\n", "    caseDepths[depthStructure(n, \"case\", 0)].append(n)\n", "for n in F.otype.s(\"case\"):\n", "    caseDepths[depthStructure(n, \"case\", 0)].append(n)\n", "\n", "# deepest first; ties broken by bucket size\n", "caseDepthsSorted = sorted(\n", "    caseDepths.items(),\n", "    key=lambda x: (-x[0], -len(x[1])),\n", ")\n", "\n", "for (depth, casesOrLines) in caseDepthsSorted:\n", "    print(f\"{len(casesOrLines):>5} cases or lines with depth {depth}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll have some fun with this. We find two of the deepest cases, one on\n", "a face that is as small as possible, one on a face that is as big as possible.\n", "\n", "So we restrict ourselves to `caseDepths[4]`.\n", "\n", "For all of these cases we find the face they are on, and the number of quads on that face." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:10:55.880402Z", "start_time": "2018-05-09T17:10:55.852795Z" } }, "outputs": [ { "data": { "text/plain": [ "[(253501, 16),\n", " (232985, 18),\n", " (248868, 23),\n", " (255246, 32),\n", " (241089, 37),\n", " (247955, 38),\n", " (250963, 38),\n", " (231788, 41),\n", " (231789, 41),\n", " (245488, 45),\n", " (242207, 48),\n", " (253727, 48),\n", " (241171, 52),\n", " (255664, 53),\n", " (249501, 59),\n", " (251109, 63),\n", " (255650, 94),\n", " (242646, 112),\n", " (242647, 112),\n", " (248316, 112),\n", " (256051, 295),\n", " (256058, 295),\n", " (256061, 295),\n", " (256062, 295)]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "deepCases = caseDepths[4]\n", "candidates = []\n", "\n", "for case in deepCases:\n", " face = L.u(case, otype=\"face\")[0]\n", " size = len(A.getOuterQuads(face))\n", " candidates.append((case, size))\n", "\n", "sortedCandidates = sorted(candidates, key=lambda x: (x[1], x[0]))\n", "sortedCandidates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can do better than this!" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
nplinesign
1P006428 obverse:2:21a4(N14) 3(N01) [...] [...] 1b1b11b1A2(N14) 3(N01) BA 1b1B1b1B1AN 3(N57) 1b1B2EN~a PA~a ERIN 1b2[...] 2(N01) GI [...] X
2P006428 obverse:3:11a1a11a1A1a1A1[...] 5(N01) [...] UDU~a 1a1A2[...] 7(N01) MASZ2 1a1B4(N14) 1(N01) DUR~b 1a21a2A[...] [...] 1a2B2(N14) 3(N01) UDU~a GI 1b|LAL2~axNIM~b2| [...] X
3P006428 obverse:3:32a2a15(N01) SU~a PAP~a 2a2UNUG~a RAD~a 2b2b12b1A2b1A11(N01) SZUR2~a KU3~a E2~a 2b1A2[1(N01)] [...] 2b1BUR~a 2b23(N01) TUR BAR SUHUR
4P006428 obverse:3:73a5(N01) [...] SZA3~a1 TUR 3b3b13b1A[...] [SAL] 3b1B3b1B1[...] 3b23b2A5(N01) KUR~a 3b2B3b2B1X [...] 3b2B2X [...] X
5P006428 obverse:5:11a[...] 5(N14) 6(N01) GAR 1b1b1[...] X [...] 1b21(N34) 4(N14) 8(N01) SZE~a GAR 1c1c11c1a[...] 1(N14) [...] 1c1b[...] 1(N34) [...] 1(N14) GUG2~a 1c21c2a4(N14) 4(N01) [...] TUR |U4.2(N08)| 1c2b1c2b11(N01) [...] 1c2b21(N14) 7(N01) TUR 1c2b32(N14) 6(N01) SUR ...
6P4487012a6(N01) EN~a 2b2b12b1A2(N01) NUN~a ZATU687 EN~a EN~a TUR 2b1B2b1B1(EN~a# PAP~a#)a 2b1B2(3(N57) GAN2)a 2b22b2A4(N01) EN~a X KI ZATU687 AN 2b2B2b2B1(EN~a |SZU2.E2~b|)a 2b2B2(BU~a SZU)a 2b2B3(SAL BU~a)a 2b2B4(EN~a HI KASZ~c)a
7P4487011a1a11a1A4(N01) SZE~a 1a1B1a1B11(N01) UD5~a 1a1B23(N01) MASZ2 1a24(N34) 4(N14) 2(N01) DUB~a BA UDU~a 1b1b12(N34) 2(N14) 4(N01) DARA4~c2 1b21(N34) 5(N14) 6(N01) MASZ2 UD5~a 1b32(N14) 2(N01) SZE3 UDU~a 1c1(N34) 2(N14) 1(N01) |U4x1(N57)| BAR
8P448701 obverse:1:11a5(N01) SAL 1b1b11b1A4(N01) SAL 1b1B1b1B1(NAB DI |BU~a+DU6~a|)a 1b1B2(ZI~a#? AN)a 1b1B3(ANSZE~e 7(N57) DUR2 DU)a 1b1B4(LAL3~a#? GAR IG~b)a 1b21b2A1(N01) SZA3~a1 TUR 1b2B(TU~b)a 4(N41)
9P448701 obverse:1:12a3(N01) KUR~a 2b2b12b1A1(N01) KUR~a 2b1B(NA~a NIR~a)a 2b22b2A2(N01) SZA3~a1 TUR 2b2B2b2B1(GI6 KISZIK~a# URI3~a)a 2b2B2([...])a 4(N41)
10P448701 obverse:1:23a2(N34) 4(N14) [...] X [...] 3b3b13b1A[...] ZAG~a X SUHUR [...] 3b1B3b1B12(N34) 2(N14) 4(N01) SUHUR [...] 3b1B22(N14) 4(N01) SUHUR [...] 3b1B34(N01) SUHUR [...] 3b1B4[...] 3b21(N14) |HI.SUHUR| [...] X
11P448701 obverse:2:11a5(N01) SUM~a GA2~a1 PAP~a E2~b EN~a ISZ~a 1b1b13(N01) GA2~a1 GAL~a 1b21b2a1b2a11(N01) GAL~a 1b2a21(N01) GA2~a1 TUR 1b2b1b2b11(N01) |U4x3(N57)| 1b2b21(N01) X 1cPIRIG~b1 ISZ~a X
12P448701 obverse:2:11a1a17(N20) 5(N05) 1(N42~a) 1(N25) 1a29(N18) 2(N03) 2(N40) 1(N24~a) 1b1b11b1A2(N14) 1(N01) SZEN~b GAL~a 1b1B1b1B11(N20) 1(N42~a) 1(N25) 1b1B21(N18) 1(N40) 1(N24~a) 1b21b2A2(N34) 4(N14) SZEN~b TUR 1b2B1b2B13(N20) 1(N05) 1(N42~a) 1b2B23(N18) 1(N03) 1(N40) 1b31b3A6(N34) 1(N14) 8(N01) SZEN~c@t 1b3B1b3B13(N20) 1(N05) 2(N42~a) 1b3B24(N18) 5(N03) 2(N40) 1b41b4A1(N14) DUG~a 1b4B1b4B12(N05) 2(N42~a) 1(N57) SZE~a 1b4B21(N03) 3(N40) X
13P4487011a1(N01) 1(N30~a) 1(N24) SZE~a 1b1b12(N39~a) HI@g~a 1b23(N39~a) 1(N24) 1(N30~a) SZE~a 1c1c11c1a5(N01) GAR GAL~a 1c1b1(N42~a) HI@g~a 1c21(N27) NE~a ZATU714 GUG2~a 1c31(N29~a) |ZATU714xHI@g~a| MU 1c41c4a2(N01) U4 GAR 1c4b1(N27) 1c51c5a1c5a18(N01) GAR U2~a 1c5a2|SZU2.E2~b| 1c61c6a8(N01) DU8~c 1c6b1(N27) 1c71c7a1(N14) 5(N01) SZE~a GAR 1c7b1(N42~a) 1(N25)
14P4487011a5(N01) GA~a DUB~a 1b1b11b1A1b1A11(N01) BA AB~a AN MUSZ3~a 1b1A21(N01) SAL BA PIRIG~b1 1b1BZATU751~a DUR2 3(N57) BU~a 1b23(N01) BA [...]
15P448702 obverse:1:22a3(N01) LAL2~a NE~a 1(N57) 3(N57) 2b2b12b1A1(N01) UD5~a 2b1BPAP~a 2b1CLAL2~a NE~a 2b22b2A2(N01) APIN~a 3(N57) GIR3@g~c A BU~a PAP~a NAM2 2b2B2b2B1ZATU836 BU~a SZE~a 2b2B2U8 LAGAB~a 2b2CSZIM~a N
16P448702 obverse:1:22a1(N14) 8(N01) BA KI 2b2b12b1A1(N14) 3(N01) DA~a PA~a |DU8~cxUDU~a| 2b1B2b1B12(N01) U8 [...] 2b1B21(N01) [...] 2b1B34(N01) UDUNITA~a [...] 2b1B46(N01) MASZ2 2b22b2A5(N01) GURUSZDA 2b2B2b2B12(N01) U8 [...] 2b2B21(N01) SZE3 UDUNITA~a 2b2B32(N01) X [...] NUN~a
17P448703 obverse:1:33a1(N01) AB~a AB2 3b3b13b1A3b1A11(N01) DUG~c 3b1A21(N01) [...] 3b1BNAB DI |E2~ax1(N57)@t| BAPPIR~a GIBIL 3b21(N01) ZATU732 GI6 KAR2~a NUN~a 3b33b3A3(N01) |SILA3~axGARA2~a| 3b3B3b3B12(N57) SZA |U4x2(N01)| SZU 3b3B21(N57) |GIxSZE3| BAD 1(N08)
18P471695 obverse:1:11a3(N14) 7(N01) UR5~a SAL 1b1b11b1a2(N14) GI6 AMA~a 1b1b1b1b01NIN NAB DI 1b1b02SAL 1(N02) PAP~a 1b1b03BU~a U4 SI4~a 1b1b04EN~a U4 1b1b05[...] X 1b1b06NAR 1b1b07EN~a U4 1b1b08ZATU628~a KI 1b1b09[...] 1b1b10[...] 1b1b11SAL SAL 1b1b12SZEG9 1b1b13[...] 1b1b14[...] 1b1b15[...] 1b1b16X GI4~a 1b1b17[...] 1b1b18[...] 1b1b19[...] 1b1b20[...] 1b21b2a1(N14) 7(N01) X 1b2b1b2b01[...] 1b2b02EN~a [...] 1b2b03[...] 1b2b04[...] 1b2b05NAGA~a 1b2b06[...] ZATU694~c 1b2b07[...] 1b2b08AN NIN DUB~a 1b2b09NAGA~a HI 1b2b10[...] 1b2b11[...] 1b2b12X NUN~a [...] 1b2b13HI@g~a [...] 1b2b14[...] 1b2b15[...] 1b2b16[...] 1b2b17[...] EN~a
19P471695 obverse:1:12a3(N01) SAL TUR 2b2b12b1a2(N01) [...] 2b1b2b1b1AMA~a GI6 2b1b2X [...] 2b22b2a1(N01) X 2b2b2b2b1MA AN AMA~a EN~a
20P471695 obverse:1:13a6(N01) EN~a IB~a 3b3b11(N01) SU~a PAP~a 3b23b2A5(N01) KUR~a 3b2B3b2B12(N01) EN~a SZE3 KUR~a 3b2B23(N01) DUB~a KUR~a EN~a
21P000014 obverse:1:22a1(N01) DUG~c 2b2b12b1A[...] |DUG~cx1(N57)| 2b1B2b1B1[...] GAN~c SZU 2b1B2SZITA~a1 ADAB UTUL~a 2b22b2A2(N01) SILA3~a GARA2~a 2b2B2b2B11(N57) SZA SZU 2b2B21(N57) ZATU659 PAP~a SUKKAL 2b34(N01) |SILA3~axGA~a| SZA 2cSUG5 LA2 IB~a 1(N01)
22P000014 obverse:1:22a2a13(N01) E2~a PA~a 3(N57) |(UDU~axTAR)~a| KU6~a 2a22a2A2a2A11(N01) E2~a ALAN~b NUN~a TAK4~a 2a2A21(N01) ZATU651@g TAK4~a ALAN~b 2a2BX KA~a X HI X 2a31(N01) SANGA~a GA~a AB~a 2a41(N01) GAL~a ZATU687 AB~a 2a51(N01) GA~a ARARMA2~a 2bGESZTU~b SZITA~a1 ZATU686~a 1(N01)
23P000014 obverse:1:21a[...] [...] DUG~c UNUG~a 1b1b11b1A[...] |SILA3~axGARA2~a| 1b1B1b1B1[...] [...] X 1b1B21(N57) DILMUN 1b21b2A4(N01) |SILA3~axGA~a| 1b2B1b2B11(N57) EN~a SZE3 TUR NUN~a 1b2B21(N57) X AN MUSZ3~a EN~a X 1b2B31(N57) SIG2~b 1b2B41(N57) E2~a GU 1b32(N01) BA SILA3~a KASZ~b 1(N01)
24P000014 obverse:1:22a1(N01) KU6~a 2b2b12b1A6(N01) |SILA3~axGARA2~a| 2b1B2b1B11(N57) EN~a SAG 2b1B21(N57) HI E2~a DILMUN NUN~a 2b1B31(N57) NAMESZDA 2b1B41(N57) GESZTU~a DIM~a 2b1B51(N57) SZA SZU 2b1B61(N57) GI BAD 2b21(N01) |SILA3~axGA~a| |SIxSZE3| EN~a NUN~a 2b35(N14) BA SILA3~a KASZ~b 1(N01)
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.table(sortedCandidates)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also assemble relevant information for this table by hand\n", "and put it in a markdown table." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:10:58.694319Z", "start_time": "2018-05-09T17:10:58.679683Z" } }, "outputs": [ { "data": { "text/markdown": [ "case type | case number | tablet | face | size\n", "------ | ---- | ---- | ---- | ----\n", "line | 1 | P005294 | obverse | 16\n", "line | 1 | P218054 | reverse | 18\n", "line | 2 | P006092 | obverse | 23\n", "line | 3 | P002694 | reverse | 32\n", "line | 1 | P325754 | reverse | 37\n", "line | 2 | P006036 | obverse | 38\n", "line | 1 | P006295 | reverse | 38\n", "line | 1 | P004735 | obverse | 41\n", "line | 2 | P004735 | obverse | 41\n", "line | 3 | P002856 | obverse | 45\n", "line | 1 | P411608 | obverse | 48\n", "line | 1 | P005322 | reverse | 48\n", "line | 1 | P325234 | reverse | 52\n", "line | 1 | P003531 | obverse | 53\n", "line | 2 | P006160 | obverse | 59\n", "line | 2 | P006307 | reverse | 63\n", "line | 3 | P003529 | obverse | 94\n", "line | 1 | P387752 | obverse | 112\n", "line | 2 | P387752 | obverse | 112\n", "line | 3 | P006056 | reverse | 112\n", "line | 2 | P003808 | obverse | 295\n", "line | 2 | P003808 | obverse | 295\n", "line | 1 | P003808 | obverse | 295\n", "line | 2 | P003808 | obverse | 295\n" ], "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "markdown = \"\"\"\n", "case type | case number | tablet | face | size\n", "------ | ---- | ---- | ---- | ----\n", "\"\"\".strip()\n", "markdown += \"\\n\"\n", "\n", "bigCase = sortedCandidates[-1][0]\n", "smallCase = sortedCandidates[0][0]\n", "\n", "for (case, size) in sortedCandidates:\n", " caseType = F.otype.v(case)\n", " caseNum = 
F.number.v(case)\n", " face = L.u(case, otype=\"face\")[0]\n", " tablet = L.u(case, otype=\"tablet\")[0]\n", " markdown += f\"\"\"\n", "{caseType} | {caseNum} | {A.cdli(tablet, asString=True)} | {F.type.v(face)} | {size}\n", "\"\"\".strip()\n", " markdown += \"\\n\"\n", "\n", "Markdown(markdown)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not surprisingly, the deepest structures are all lines:\n", "every case is enclosed by a line, and that line is at least one level deeper than the case.\n", "\n", "You can click on the P-numbers to view these tablets on CDLI.\n", "\n", "Finally, we show the source lines that contain these deep cases." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:04.397693Z", "start_time": "2018-05-09T17:11:04.372102Z" } }, "outputs": [ { "data": { "text/html": [ "
line 1
case 1a
4(N14)#
3(N01)
cluster ?
...
cluster ?
cluster ?
...
cluster ?
case 1b
case 1b1
case 1b1A
2(N14)
3(N01)
BA
case 1b1B
case 1b1B1
AN
3(N57)
case 1b1B2
EN~a
PA~a
ERIN
case 1b2
cluster ?
...
cluster ?
2(N01)#
GI#
cluster ?
...
cluster ?
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
line 2
case 2a
1(N01)
KU6~a
case 2b
case 2b1
case 2b1A
6(N01)
quad
SILA3~a
x
GARA2~a
case 2b1B
case 2b1B1
1(N57)
EN~a#
SAG#
case 2b1B2
1(N57)
HI
E2~a
DILMUN
NUN~a
case 2b1B3
1(N57)
NAMESZDA
case 2b1B4
1(N57)
GESZTU~a?
DIM~a
case 2b1B5
1(N57)
SZA
SZU
case 2b1B6
1(N57)
GI
BAD
case 2b2
1(N01)
quad
SILA3~a
x
GA~a
quad
SI
x
SZE3
EN~a#
NUN~a#
case 2b3
5(N14)
BA
SILA3~a
KASZ~b
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.pretty(smallCase)\n", "A.pretty(bigCase)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With a bit of coding we can get another display:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:22.150309Z", "start_time": "2018-05-09T17:11:22.129140Z" } }, "outputs": [ { "data": { "text/markdown": [ "\n", "**P005294 obverse:1 line 1**\n", "\n", "```\n", "@obverse \n", "@column 1 \n", "1.a. 4(N14)# 3(N01) [...] , [...] \n", "1.b1A. 2(N14) 3(N01) , BA \n", "1.b1B1. , AN 3(N57) \n", "1.b1B2. , EN~a PA~a ERIN \n", "1.b2. [...] 2(N01)# , GI# [...] \n", "```\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "\n", "\n", "---\n", "\n", "**P003808 obverse:6 line 2**\n", "\n", "```\n", "2.a. 1(N01) , KU6~a \n", "2.b1A. 6(N01) , |SILA3~axGARA2~a| \n", "2.b1B1. 1(N57) , EN~a# SAG# \n", "2.b1B2. 1(N57) , HI E2~a DILMUN NUN~a \n", "2.b1B3. 1(N57) , NAMESZDA \n", "2.b1B4. 1(N57) , GESZTU~a? DIM~a \n", "2.b1B5. 1(N57) , SZA SZU \n", "2.b1B6. 1(N57) , GI BAD \n", "2.b2. 1(N01) , |SILA3~axGA~a| |SIxSZE3| EN~a# NUN~a# \n", "2.b3. 
5(N14) , BA SILA3~a KASZ~b \n", "```\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "(smallPnum, smallColumn, smallCaseNum) = A.caseFromNode(smallCase)\n", "(bigPnum, bigColumn, bigCaseNum) = A.caseFromNode(bigCase)\n", "\n", "smallLineStr = \"\\n\".join(A.getSource(smallCase))\n", "bigLineStr = \"\\n\".join(A.getSource(bigCase))\n", "\n", "display(\n", " Markdown(\n", " f\"\"\"\n", "**{smallPnum} {smallColumn} line {smallCaseNum}**\n", "\n", "```\n", "{smallLineStr}\n", "```\n", "\"\"\"\n", " )\n", ")\n", "A.lineart(smallPnum, width=200)\n", "\n", "display(\n", " Markdown(\n", " f\"\"\"\n", "\n", "---\n", "\n", "**{bigPnum} {bigColumn} line {bigCaseNum}**\n", "\n", "```\n", "{bigLineStr}\n", "```\n", "\"\"\"\n", " )\n", ")\n", "A.photo(bigPnum, width=400)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Quads\n", "\n", "We just want to see how deep quads can get." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:40.178204Z", "start_time": "2018-05-09T17:11:40.138829Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1 quads with depth 3\n", " 167 quads with depth 2\n", " 3626 quads with depth 1\n" ] } ], "source": [ "quadDepths = collections.defaultdict(list)\n", "\n", "for quad in F.otype.s(\"quad\"):\n", " quadDepths[depthStructure(quad, \"quad\", 1)].append(quad)\n", "\n", "quadDepthsSorted = sorted(\n", " quadDepths.items(),\n", " key=lambda x: (-x[0], -len(x[1])),\n", ")\n", "\n", "for (depth, quads) in quadDepthsSorted:\n", " print(f\"{len(quads):>5} quads with depth {depth}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lo and behold! There is just one quad of depth 3 and it is on our leading\n", "example tablet.\n", "\n", "We have studied it already in [quads](quads.ipynb)." 
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:42.703918Z", "start_time": "2018-05-09T17:11:42.692020Z" } }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "P005381" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bigQuad = quadDepths[3][0]\n", "tablet = L.u(bigQuad, otype=\"tablet\")[0]\n", "A.lineart(bigQuad)\n", "A.cdli(tablet)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Clusters\n", "\n", "Clusters are groups of consecutive quads between brackets.\n", "\n", "Clusters can be nested.\n", "As with quads, we find the members of a cluster by following `sub` edges." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Depths in clusters\n", "\n", "We use the same logic as for quads to get the hang of cluster depths." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:46.847477Z", "start_time": "2018-05-09T17:11:46.667189Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 106 clusters with depth 2\n", "32647 clusters with depth 1\n" ] } ], "source": [ "clusterDepths = collections.defaultdict(list)\n", "\n", "for cl in F.otype.s(\"cluster\"):\n", " clusterDepths[depthStructure(cl, \"cluster\", 1)].append(cl)\n", "\n", "clusterDepthsSorted = sorted(\n", " clusterDepths.items(),\n", " key=lambda x: (-x[0], -len(x[1])),\n", ")\n", "\n", "for (depth, cls) in clusterDepthsSorted:\n", " print(f\"{len(cls):>5} clusters with depth {depth}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not much going on here: nesting is rare and never goes deeper than one level.\n", "Let's pick a nested cluster." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:50.378678Z", "start_time": "2018-05-09T17:11:50.361319Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(IDIGNA [...] ...)a\n" ] }, { "data": { "text/html": [ "
cluster:194488 =
133 IDIGNA
cluster:194489 ?
134 ...
cluster:194489 ?
cluster:194488 =
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "P471695" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "nestedCluster = clusterDepths[2][0]\n", "tablet = L.u(nestedCluster, otype=\"tablet\")[0]\n", "quads = A.getOuterQuads(nestedCluster)\n", "print(A.atfFromCluster(nestedCluster))\n", "A.pretty(nestedCluster, withNodes=True)\n", "A.lineart(quads[0], height=150)\n", "A.cdli(tablet)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Kinds of clusters\n", "\n", "In our corpus we encounter several types of brackets:\n", "\n", "* `( )a` for proper names\n", "* `[ ]` for uncertainty\n", "* `< >` for supplied material.\n", "\n", "The next thing is to get an overview of the distribution of these kinds." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:53.566923Z", "start_time": "2018-05-09T17:11:53.494899Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "32116 x a uncertain-cluster\n", " 636 x a properName-cluster\n", " 1 x a supplied-cluster\n" ] } ], "source": [ "clusterTypeDistribution = collections.Counter()\n", "\n", "for cluster in F.otype.s(\"cluster\"):\n", " typ = F.type.v(cluster)\n", " clusterTypeDistribution[typ] += 1\n", "\n", "for (typ, amount) in sorted(\n", " clusterTypeDistribution.items(),\n", " key=lambda x: (-x[1], x[0]),\n", "):\n", " print(f\"{amount:>5} x a {typ:>8}-cluster\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The conversion to TF has transformed `[...]` to a cluster of one sign with grapheme `…`.\n", "These are trivial clusters and we want to exclude them from further analysis, so we redo the counting.\n", "\n", "First we make a sequence of all non-trivial clusters:" ] }, { "cell_type": "code", "execution_count": 
21, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:56.303587Z", "start_time": "2018-05-09T17:11:56.161362Z" } }, "outputs": [ { "data": { "text/plain": [ "3384" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "realClusters = [\n", " c\n", " for c in F.otype.s(\"cluster\")\n", " if (\n", " F.type.v(c) != \"uncertain\"\n", " or len(E.oslots.s(c)) > 1\n", " or F.grapheme.v(E.oslots.s(c)[0]) != \"…\"\n", " )\n", "]\n", "len(realClusters)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we redo the same analysis, but we start with the filtered cluster sequence." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:11:58.311255Z", "start_time": "2018-05-09T17:11:58.295802Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 2747 x a uncertain-cluster\n", " 636 x a properName-cluster\n", " 1 x a supplied-cluster\n" ] } ], "source": [ "clusterTypeDistribution = collections.Counter()\n", "\n", "for cluster in realClusters:\n", " typ = F.type.v(cluster)\n", " clusterTypeDistribution[typ] += 1\n", "\n", "for (typ, amount) in sorted(\n", " clusterTypeDistribution.items(),\n", " key=lambda x: (-x[1], x[0]),\n", "):\n", " print(f\"{amount:>5} x a {typ:>8}-cluster\")" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "#### Lengths of clusters\n", "\n", "How long are clusters in general?\n", "There are two possible ways to measure the length of a cluster:\n", "\n", "* the number of signs it occupies;\n", "* the number of top-level members it has (quads or signs).\n", "\n", "By now, the pattern to answer questions like this is becoming familiar.\n", "\n", "We express the logic in a function that takes the way of measuring\n", "as a parameter.\n", "In that way, we can easily provide a cluster-length distribution based\n", "on measurements in signs and in top-level members." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:00.424680Z", "start_time": "2018-05-09T17:12:00.418853Z" } }, "outputs": [], "source": [ "def computeDistribution(nodes, measure):\n", " distribution = collections.Counter()\n", "\n", " for node in nodes:\n", " m = measure(node)\n", " distribution[m] += 1\n", "\n", " for (m, amount) in sorted(\n", " distribution.items(),\n", " key=lambda x: (-x[1], x[0]),\n", " ):\n", " print(f\"{amount:>5} x a measure of {m:>8}\")" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:01.619428Z", "start_time": "2018-05-09T17:12:01.612356Z" } }, "outputs": [], "source": [ "def lengthInSigns(node):\n", " return len(L.d(node, otype=\"sign\"))\n", "\n", "\n", "def lengthInMembers(node):\n", " return len(E.sub.f(node))" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2018-03-06T16:09:42.265262Z", "start_time": "2018-03-06T16:09:42.259021Z" } }, "source": [ "Now we can show the length distributions of clusters by just calling `computeDistribution()`:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:03.752085Z", "start_time": "2018-05-09T17:12:03.722970Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 2691 x a measure of 1\n", " 433 x a measure of 2\n", " 205 x a measure of 3\n", " 41 x a measure of 4\n", " 9 x a measure of 5\n", " 3 x a measure of 6\n", " 2 x a measure of 7\n" ] } ], "source": [ "computeDistribution(realClusters, lengthInSigns)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:04.700024Z", "start_time": "2018-05-09T17:12:04.671528Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 2678 x a measure of 1\n", " 452 x a measure of 2\n", " 194 x a measure of 3\n", " 44 x a measure of 4\n", " 11 x a 
measure of 5\n", " 4 x a measure of 6\n", " 1 x a measure of 7\n" ] } ], "source": [ "computeDistribution(realClusters, lengthInMembers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course, we want to see the longest cluster." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:07.581594Z", "start_time": "2018-05-09T17:12:07.461364Z" } }, "outputs": [ { "data": { "text/html": [ "
cluster =
SAG
ERIM~a
TAK4~a
NI~a
MUSZ3~a
UR2
DUR2
cluster =
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "longestCluster = [c for c in F.otype.s(\"cluster\") if lengthInMembers(c) == 7][0]\n", "A.pretty(longestCluster)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Lengths of quads\n", "\n", "If you look closely at the code for these functions, there is nothing in it that\n", "is specific to clusters.\n", "\n", "The measures are in terms of the totally generic `L.d()` function, which collects the signs inside a node, and the fairly generic\n", "`sub` edges, which are also defined for quads.\n", "\n", "So, in one go, we can obtain a length distribution of quads.\n", "\n", "Note that quads can also be sub-quads." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:10.508050Z", "start_time": "2018-05-09T17:12:10.474472Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 3611 x a measure of 2\n", " 175 x a measure of 3\n", " 7 x a measure of 4\n", " 1 x a measure of 5\n" ] } ], "source": [ "computeDistribution(F.otype.s(\"quad\"), lengthInSigns)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:11.344789Z", "start_time": "2018-05-09T17:12:11.319617Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 3778 x a measure of 2\n", " 16 x a measure of 3\n" ] } ], "source": [ "computeDistribution(F.otype.s(\"quad\"), lengthInMembers)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "ExecuteTime": { "end_time": "2018-05-09T17:12:12.244251Z", "start_time": "2018-05-09T17:12:12.207774Z" } }, "outputs": [ { "data": { "text/html": [ "
quad
SZU2
.
quad
quad
HI
+
1(N57)
+
quad
HI
+
1(N57)
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "longestQuad = [q for q in F.otype.s(\"quad\") if lengthInSigns(q) == 5][0]\n", "A.pretty(longestQuad)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Next\n", "\n", "[cases](cases.ipynb)\n", "\n", "*In* case *you are serious ...*\n", "\n", "Try the\n", "[primers](http://nbviewer.jupyter.org/github/Nino-cunei/primers/tree/master/)\n", "for introductions into digital cuneiform research.\n", "\n", "All chapters:\n", "[start](start.ipynb)\n", "[imagery](imagery.ipynb)\n", "[steps](steps.ipynb)\n", "[search](search.ipynb)\n", "[calc](calc.ipynb)\n", "[signs](signs.ipynb)\n", "[quads](quads.ipynb)\n", "**jumps**\n", "[cases](cases.ipynb)\n", "\n", "---\n", "\n", "CC-BY Dirk Roorda" ] } ], "metadata": { "jupytext": { "encoding": "# -*- coding: utf-8 -*-" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.0" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "607px", "left": "0px", "right": "983px", "top": "110px", "width": "297px" }, "toc_section_display": "block", "toc_window_display": false }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }