{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting started\n", "\n", "It is assumed that you have read\n", "[start](start.ipynb)\n", "and followed the installation instructions there." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Corpus\n", "\n", "This is\n", "\n", "* `oldbabylonian` Old Babylonian Letters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# First acquaintance\n", "\n", "We just want to grasp what the corpus is about and how we can find our way in the data.\n", "\n", "Open a terminal or command prompt and say one of the following\n", "\n", "```text-fabric oldbabylonian```\n", "\n", "Wait and see a lot happening before your browser starts up and shows you an interface on the corpus:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Text-Fabric needs an app to deal with the corpus-specific things.\n", "It downloads/finds/caches the latest version of the **app**:\n", "\n", "```\n", "Using TF-app in /Users/dirk/text-fabric-data/annotation/app-oldbabylonian/code:\n", "\trv0.2=#4bb2530bfb94dc93601f8b3df7722cb0e5df7a43 (latest release)\n", "```\n", "\n", "It downloads/finds/caches the latest version of the **data**:\n", "\n", "```\n", "Using data in /Users/dirk/text-fabric-data/Nino-cunei/oldbabylonian/tf/1.0.4:\n", "\trv1.4=#43c36d148794e3feeb3dd39e105ce6a4df79c467 (latest release)\n", "```\n", "\n", "The data is preprocessed in order to speed up typical Text-Fabric operations.\n", "The result is cached on your computer.\n", "Preprocessing costs time. 
Next time you use this corpus on this machine, startup is much quicker.\n", "\n", "```\n", "TF setup done.\n", "```\n", "\n", "Then the app goes on to act as a local webserver serving the corpus that has just been downloaded,\n", "and it opens your browser for you and loads the corpus page.\n", "\n", "```\n", " * Running on http://localhost:8106/ (Press CTRL+C to quit)\n", "Opening oldbabylonian in browser\n", "Listening at port 18986\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Help!\n", "\n", "Indeed, that is what you need. Click the vertical `Help` tab.\n", "\n", "From there, click around a little bit. Don't read closely, just note the kinds of information that are presented to you.\n", "\n", "Later on, it will make more sense!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Browsing\n", "\n", "First we browse our data. Click the browse button.\n", "\n", "\n", "\n", "and then, in the table of *documents* (tablets), click on `obverse`\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you're looking at one side of a tablet: the marks in an ASCII transcription.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now click the *Options* tab and select the `layout-orig-unicode` format to see the same tablet in cuneiform signs.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can click a triangle to see how a line is broken down:\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Searching\n", "\n", "See that line, which starts with the word `um-ma` and whose last word ends in the sign `ma`?\n", "\n", "That is a pattern. 
Let's search for it.\n", "\n", "Enter this query in the search pad and press the search icon above it.\n", "\n", "```\n", "line\n", "  =: word\n", "    =: sign reading=um\n", "    <: sign reading=ma\n", "    :=\n", "  < sign reading=ma\n", "  :=\n", "```\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In English:\n", "\n", "find all `line`s that contain a `word` and a `sign` where:\n", "\n", "* `=:` the `word` starts where the `line` starts\n", "* the `word` contains a `sign` and a `sign` where:\n", "  * `=:` the first `sign` starts where the `word` starts\n", "  * `<:` the second `sign` follows the first `sign` immediately\n", "  * `:=` the second `sign` ends where the `word` ends\n", "* `<` the `sign` comes after the `word`\n", "* `:=` the `sign` ends where the `line` ends\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can expand results by clicking the triangle.\n", "\n", "You can see the result in context by clicking the browse icon.\n", "\n", "You can go back to the result list by clicking the results icon.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Computing\n", "\n", "We see that this line comes at the start of a tablet.\n", "\n", "In fact, this pattern corresponds to a heading of a letter.\n", "\n", "Question: of all 1274 results, how many are the first line, the second line, the third line, etc.?\n", "\n", "*This is a typical question where you want to leave the search mode and enter computing mode*.\n", "\n", "Let's do that!\n", "\n", "If you have followed the installation instructions, you are nearly set.\n", "\n", "Open your terminal and say\n", "\n", "``` sh\n", "jupyter notebook\n", "```\n", "\n", "Your browser starts up and presents you with a local computing environment where you can run Python programs.\n", "\n", "You see cells like the one below, where you can type programming statements and execute them by pressing `Shift Enter`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we load the Text-Fabric module, as follows:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from tf.app import use" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we load the TF-app for the corpus `oldbabylonian` and that app loads the corpus data.\n", "\n", "We give a name to the result of all that loading: `A`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/Nino-cunei/oldbabylonian/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/Nino-cunei/oldbabylonian/tf/1.0.6" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " TF: TF API 12.5.4, Nino-cunei/oldbabylonian/app v3, Search Reference
\n", " Data: Nino-cunei - oldbabylonian 1.0.6, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name | # of nodes | # slots / node | % coverage
document | 1285 | 158.15 | 100
face | 2834 | 71.71 | 100
line | 27375 | 7.42 | 100
word | 76505 | 2.64 | 100
cluster | 23449 | 1.78 | 21
sign | 203219 | 1.00 | 100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Old Babylonian Letters 1900-1600: Cuneiform tablets\n", "
\n", "\n", "
\n", "
\n", "ARK\n", "
\n", "
str
\n", "\n", " persistent identifier of type ARK from metadata field \"UCLA Library ARK\"\n", "\n", "
\n", "\n", "
\n", "
\n", "after\n", "
\n", "
str
\n", "\n", " what comes after a sign or word (- or space)\n", "\n", "
\n", "\n", "
\n", "
\n", "afterr\n", "
\n", "
str
\n", "\n", " what comes after a sign or word (- or space); between adjacent signs a ␣ is inserted\n", "\n", "
\n", "\n", "
\n", "
\n", "afteru\n", "
\n", "
str
\n", "\n", " what comes after a sign when represented as unicode (space)\n", "\n", "
\n", "\n", "
\n", "
\n", "atf\n", "
\n", "
str
\n", "\n", " full atf of a sign (without cluster chars) or word (including cluster chars)\n", "\n", "
\n", "\n", "
\n", "
\n", "atfpost\n", "
\n", "
str
\n", "\n", " atf of cluster closings at sign\n", "\n", "
\n", "\n", "
\n", "
\n", "atfpre\n", "
\n", "
str
\n", "\n", " atf of cluster openings at sign\n", "\n", "
\n", "\n", "
\n", "
\n", "author\n", "
\n", "
str
\n", "\n", " author from metadata field \"Author(s)\"\n", "\n", "
\n", "\n", "
\n", "
\n", "col\n", "
\n", "
int
\n", "\n", " ATF column number\n", "\n", "
\n", "\n", "
\n", "
\n", "collated\n", "
\n", "
int
\n", "\n", " whether a sign is collated (*)\n", "\n", "
\n", "\n", "
\n", "
\n", "collection\n", "
\n", "
str
\n", "\n", " collection of a document\n", "\n", "
\n", "\n", "
\n", "
\n", "comment\n", "
\n", "
str
\n", "\n", " $ comment to line or inline comment to slot ($ and $)\n", "\n", "
\n", "\n", "
\n", "
\n", "damage\n", "
\n", "
int
\n", "\n", " whether a sign is damaged\n", "\n", "
\n", "\n", "
\n", "
\n", "det\n", "
\n", "
int
\n", "\n", " whether a sign is a determinative gloss - between braces { }\n", "\n", "
\n", "\n", "
\n", "
\n", "docnote\n", "
\n", "
str
\n", "\n", " additional remarks in the document identification\n", "\n", "
\n", "\n", "
\n", "
\n", "docnumber\n", "
\n", "
str
\n", "\n", " number of a document within a collection-volume\n", "\n", "
\n", "\n", "
\n", "
\n", "excavation\n", "
\n", "
str
\n", "\n", " excavation number from metadata field \"Excavation no.\"\n", "\n", "
\n", "\n", "
\n", "
\n", "excised\n", "
\n", "
int
\n", "\n", " whether a sign is excised - between double angle brackets << >>\n", "\n", "
\n", "\n", "
\n", "
\n", "face\n", "
\n", "
str
\n", "\n", " full name of a face including the enclosing object\n", "\n", "
\n", "\n", "
\n", "
\n", "flags\n", "
\n", "
str
\n", "\n", " sequence of flags after a sign\n", "\n", "
\n", "\n", "
\n", "
\n", "fraction\n", "
\n", "
str
\n", "\n", " fraction of a numeral\n", "\n", "
\n", "\n", "
\n", "
\n", "genre\n", "
\n", "
str
\n", "\n", " genre from metadata field \"Genre\"\n", "\n", "
\n", "\n", "
\n", "
\n", "grapheme\n", "
\n", "
str
\n", "\n", " grapheme of a sign\n", "\n", "
\n", "\n", "
\n", "
\n", "graphemer\n", "
\n", "
str
\n", "\n", " grapheme of a sign using non-ascii characters\n", "\n", "
\n", "\n", "
\n", "
\n", "graphemeu\n", "
\n", "
str
\n", "\n", " grapheme of a sign using cuneiform unicode characters\n", "\n", "
\n", "\n", "
\n", "
\n", "lang\n", "
\n", "
str
\n", "\n", " language of a document\n", "\n", "
\n", "\n", "
\n", "
\n", "langalt\n", "
\n", "
int
\n", "\n", " 1 if a sign is in the alternate language (i.e. Sumerian) - between underscores _ _\n", "\n", "
\n", "\n", "
\n", "
\n", "ln\n", "
\n", "
int
\n", "\n", " ATF line number of a numbered line, without prime\n", "\n", "
\n", "\n", "
\n", "
\n", "lnc\n", "
\n", "
str
\n", "\n", " ATF line identification of a comment line ($)\n", "\n", "
\n", "\n", "
\n", "
\n", "lnno\n", "
\n", "
str
\n", "\n", " ATF line number, may be $ or #, with prime; column number prepended\n", "\n", "
\n", "\n", "
\n", "
\n", "material\n", "
\n", "
str
\n", "\n", " material indication from metadata field \"Material\"\n", "\n", "
\n", "\n", "
\n", "
\n", "missing\n", "
\n", "
int
\n", "\n", " whether a sign is missing - between square brackets [ ]\n", "\n", "
\n", "\n", "
\n", "
\n", "museumcode\n", "
\n", "
str
\n", "\n", " museum code from metadata field \"Museum no.\"\n", "\n", "
\n", "\n", "
\n", "
\n", "museumname\n", "
\n", "
str
\n", "\n", " museum name from metadata field \"Collection\"\n", "\n", "
\n", "\n", "
\n", "
\n", "object\n", "
\n", "
str
\n", "\n", " name of an object of a document\n", "\n", "
\n", "\n", "
\n", "
\n", "operator\n", "
\n", "
str
\n", "\n", " the ! or x in a !() or x() construction\n", "\n", "
\n", "\n", "
\n", "
\n", "operatorr\n", "
\n", "
str
\n", "\n", " the ! or x in a !() or x() construction, represented as =, ␣\n", "\n", "
\n", "\n", "
\n", "
\n", "operatoru\n", "
\n", "
str
\n", "\n", " the ! or x in a !() or x() construction, represented as =, ␣\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "period\n", "
\n", "
str
\n", "\n", " period indication from metadata field \"Period\"\n", "\n", "
\n", "\n", "
\n", "
\n", "pnumber\n", "
\n", "
str
\n", "\n", " P number of a document\n", "\n", "
\n", "\n", "
\n", "
\n", "primecol\n", "
\n", "
int
\n", "\n", " whether a prime is present on a column number\n", "\n", "
\n", "\n", "
\n", "
\n", "primeln\n", "
\n", "
int
\n", "\n", " whether a prime is present on a line number\n", "\n", "
\n", "\n", "
\n", "
\n", "pubdate\n", "
\n", "
str
\n", "\n", " publication date from metadata field \"Publication date\"\n", "\n", "
\n", "\n", "
\n", "
\n", "question\n", "
\n", "
int
\n", "\n", " whether a sign has the question flag (?)\n", "\n", "
\n", "\n", "
\n", "
\n", "reading\n", "
\n", "
str
\n", "\n", " reading of a sign\n", "\n", "
\n", "\n", "
\n", "
\n", "readingr\n", "
\n", "
str
\n", "\n", " reading of a sign using non-ascii characters\n", "\n", "
\n", "\n", "
\n", "
\n", "readingu\n", "
\n", "
str
\n", "\n", " reading of a sign using cuneiform unicode characters\n", "\n", "
\n", "\n", "
\n", "
\n", "remarkable\n", "
\n", "
int
\n", "\n", " whether a sign is remarkable (!)\n", "\n", "
\n", "\n", "
\n", "
\n", "remarks\n", "
\n", "
str
\n", "\n", " # comment to line\n", "\n", "
\n", "\n", "
\n", "
\n", "repeat\n", "
\n", "
int
\n", "\n", " repeat of a numeral; the value n (unknown) is represented as -1\n", "\n", "
\n", "\n", "
\n", "
\n", "srcLn\n", "
\n", "
str
\n", "\n", " full line in source file\n", "\n", "
\n", "\n", "
\n", "
\n", "srcLnNum\n", "
\n", "
int
\n", "\n", " line number in source file\n", "\n", "
\n", "\n", "
\n", "
\n", "srcfile\n", "
\n", "
str
\n", "\n", " source file name of a document\n", "\n", "
\n", "\n", "
\n", "
\n", "subgenre\n", "
\n", "
str
\n", "\n", " genre from metadata field \"Sub-genre\"\n", "\n", "
\n", "\n", "
\n", "
\n", "supplied\n", "
\n", "
int
\n", "\n", " whether a sign is supplied - between angle brackets < >\n", "\n", "
\n", "\n", "
\n", "
\n", "sym\n", "
\n", "
str
\n", "\n", " essential part of a sign or of a word\n", "\n", "
\n", "\n", "
\n", "
\n", "symr\n", "
\n", "
str
\n", "\n", " essential part of a sign or of a word using non-ascii characters\n", "\n", "
\n", "\n", "
\n", "
\n", "symu\n", "
\n", "
str
\n", "\n", " essential part of a sign or of a word using cuneiform unicode characters\n", "\n", "
\n", "\n", "
\n", "
\n", "trans\n", "
\n", "
int
\n", "\n", " whether a line has a translation\n", "\n", "
\n", "\n", "
\n", "
\n", "transcriber\n", "
\n", "
str
\n", "\n", " person who did the encoding into ATF from metadata field \"ATF source\"\n", "\n", "
\n", "\n", "
\n", "
\n", "translation@ll\n", "
\n", "
str
\n", "\n", " translation of line in language en = English\n", "\n", "
\n", "\n", "
\n", "
\n", "type\n", "
\n", "
str
\n", "\n", " name of a type of cluster or kind of sign\n", "\n", "
\n", "\n", "
\n", "
\n", "uncertain\n", "
\n", "
int
\n", "\n", " whether a sign is uncertain - between brackets ( )\n", "\n", "
\n", "\n", "
\n", "
\n", "volume\n", "
\n", "
int
\n", "\n", " volume of a document within a collection\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: Nino-cunei/oldbabylonian
  3. appPath:/Users/me/text-fabric-data/github/Nino-cunei/oldbabylonian/app
  4. commit: g00c996ce164f4a1dbb6c6c39aee06075d1f70a82
  5. css:.pnum {
    font-family: sans-serif;
    font-size: small;
    font-weight: bold;
    color: #444444;
    }
    .op {
    padding: 0.5em 0.1em 0.1em 0.1em;
    margin: 0.8em 0.1em 0.1em 0.1em;
    font-family: monospace;
    font-size: x-large;
    font-weight: bold;
    }
    .period {
    font-family: monospace;
    font-size: medium;
    font-weight: bold;
    color: #0000bb;
    }
    .comment {
    color: #7777dd;
    font-family: monospace;
    font-size: small;
    }
    .operator {
    color: #ff77ff;
    font-size: large;
    }
    /* LANGUAGE: superscript and subscript */

    /* cluster */
    .det {
    vertical-align: super;
    }
    /* cluster */
    .langalt {
    vertical-align: sub;
    }
    /* REDACTIONAL: line over or under */

    /* flag */
    .collated {
    font-weight: bold;
    text-decoration: underline;
    }
    /* cluster */
    .excised {
    color: #dd0000;
    text-decoration: line-through;
    }
    /* cluster */
    .supplied {
    color: #0000ff;
    text-decoration: overline;
    }
    /* flag */
    .remarkable {
    font-weight: bold;
    text-decoration: overline;
    }

    /* UNSURE: italic*/

    /* cluster */
    .uncertain {
    font-style: italic
    }
    /* flag */
    .question {
    font-weight: bold;
    font-style: italic
    }

    /* BROKEN: text-shadow */

    /* cluster */
    .missing {
    color: #999999;
    text-shadow: #bbbbbb 1px 1px;
    }
    /* flag */
    .damage {
    font-weight: bold;
    color: #999999;
    text-shadow: #bbbbbb 1px 1px;
    }
    .empty {
    color: #ff0000;
    }

  6. dataDisplay:
    • showVerseInTuple: True
    • textFormats:
      • layout-orig-rich:
        • method: layoutRich
        • style: trans
      • layout-orig-unicode:
        • method: layoutUnicode
        • style: orig
      • text-orig-full: {style: source}
      • text-orig-plain: {style: trans}
      • text-orig-rich: {style: trans}
      • text-orig-unicode: {style: orig}
  7. docs:
    • charText: mapping from readings to UNICODE
    • charUrl:https://nbviewer.jupyter.org/github/Nino-cunei/tfFromAtf/blob/master/programs/mapReadings.ipynb
    • docPage: about
    • featureBase:https://github.com/Nino-cunei/tfFromAtf/blob/master/docs/transcription{docExt}
    • featurePage: ''
  8. interfaceDefaults: {lineNumbers: 0}
  9. isCompatible: True
  10. local: local
  11. localDir:/Users/me/text-fabric-data/github/Nino-cunei/oldbabylonian/_temp
  12. provenanceSpec:
    • corpus: Old Babylonian Letters 1900-1600: Cuneiform tablets
    • doi: 10.5281/zenodo.2579207
    • org: Nino-cunei
    • relative: /tf
    • repo: oldbabylonian
    • version: 1.0.6
    • webBase: https://cdli.ucla.edu
    • webHint: Show this document on CDLI
    • webUrl:{webBase}/search/search_results.php?SearchMode=Text&ObjectID=<1>
  13. release: v1.6
  14. typeDisplay:
    • cluster:
      • label: {type}
      • stretch: 0
    • document:
      • featuresBare: collection volume docnumber docnote
      • lineNumber: srcLnNum
    • face:
      • featuresBare: object
      • lineNumber: srcLnNum
    • line:
      • features: remarks translation@en
      • lineNumber: srcLnNum
    • sign:
    • \n", " features:\n", " collated remarkable question damage det uncertain missing excised supplied langalt comment remarks repeat fraction operator grapheme\n", "
    • word:
      • base: True
      • label: True
      • wrap: 0
  15. writing: akk
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A = use('Nino-cunei/oldbabylonian', hoist=globals())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some bits are familiar from above, when you ran the `text-fabric` command in the terminal.\n", "\n", "Other bits are links to the documentation; they point to the same places as the links in the Text-Fabric browser.\n", "\n", "You see a list of all the data features that have been loaded.\n", "\n", "And a list of references to the API documentation, which tells you how you can use this data in your program statements." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Searching (revisited)\n", "\n", "We do the same search again, but now inside our program.\n", "\n", "That means that we can capture the results in a list for further processing. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.29s 1274 results\n" ] } ], "source": [ "results = A.search('''\n", "line\n", "  =: word\n", "    =: sign reading=um\n", "    <: sign reading=ma\n", "    :=\n", "  < sign reading=ma\n", "  :=\n", "''')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In less than a second, we have all the results!\n", "\n", "Let's look at the first one:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(230790, 258166, 11, 12, 20)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each result is a tuple of node numbers, one for each of the following:\n", "\n", "1. line\n", "1. word\n", "1. sign\n", "1. sign\n", "1. 
sign\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is the second one:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(230826, 258317, 359, 360, 366)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results[1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And here is the last one:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(258128, 334552, 202886, 202887, 202894)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results[-1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we want to find out something for each result line: which line number does it have among the lines on the same tablet face?\n", "\n", "Click the link `Feature docs` above, and read a bit under **Node type line**.\n", "\n", "There you see that the feature `ln` is of particular interest to us.\n", "\n", "First we get the line number of result 1000:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "252681\n", "3\n" ] } ], "source": [ "# results are 0-indexed, so result 1000 sits at index 999; element 0 of a result is its line node\n", "node = results[999][0]\n", "print(node)\n", "# F.ln.v(node) gives the value of the feature ln (the ATF line number) for that node\n", "lineNumber = F.ln.v(node)\n", "print(lineNumber)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we collect the set of all line numbers that our result lines have:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 31}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{F.ln.v(result[0]) for result in results}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What we really want to know is how the result lines are distributed over the line numbers."
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "import collections" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Counter({3: 834, 2: 110, 4: 102, 6: 42, 7: 37, 5: 33, 8: 31, 9: 16, 10: 13, 1: 11, 12: 9, 11: 8, 13: 8, 16: 5, 15: 4, 14: 4, 20: 2, 17: 2, 31: 1, 19: 1, 21: 1})\n" ] } ], "source": [ "# count how many result lines carry each line number\n", "distribution = collections.Counter()\n", "\n", "for result in results:\n", "    lineNumber = F.ln.v(result[0])\n", "    distribution[lineNumber] += 1\n", "\n", "print(distribution)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A clear majority of the results (834 of 1274) are on line 3.\n", "\n", "Let's make the output a bit more friendly:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "line  1 is home to  11 results\n", "line  2 is home to 110 results\n", "line  3 is home to 834 results\n", "line  4 is home to 102 results\n", "line  5 is home to  33 results\n", "line  6 is home to  42 results\n", "line  7 is home to  37 results\n", "line  8 is home to  31 results\n", "line  9 is home to  16 results\n", "line 10 is home to  13 results\n", "line 11 is home to   8 results\n", "line 12 is home to   9 results\n", "line 13 is home to   8 results\n", "line 14 is home to   4 results\n", "line 15 is home to   4 results\n", "line 16 is home to   5 results\n", "line 17 is home to   2 results\n", "line 19 is home to   1 results\n", "line 20 is home to   2 results\n", "line 21 is home to   1 results\n", "line 31 is home to   1 results\n" ] } ], "source": [ "for (lineNumber, amount) in sorted(distribution.items()):\n", "    print(f'line {lineNumber:>2} is home to {amount:>3} results')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now inspect more closely what is going on, for example where results appear late in the tablet, after line 16:" ] }, { "cell_type": "code", "execution_count": 13, 
"metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.28s 7 results\n" ] } ], "source": [ "results16 = A.search('''\n", "line ln>16\n", "  =: word\n", "    =: sign reading=um\n", "    <: sign reading=ma\n", "    :=\n", "  < sign reading=ma\n", "  :=\n", "''')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can show them here too:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
n | p | line | word | sign | sign | sign
1 | P365130 obverse:20 | um-ma a-ma-na-nu-um-ma | um-ma | um- | ma | ma
2 | P479269 obverse:20 | um-ma szu-ma | um-ma | um- | ma | ma
3 | P479269 obverse:31 | um-ma szu-ma | um-ma | um- | ma | ma
4 | P387306 obverse:19 | um-ma at-ta-a-ma | um-ma | um- | ma | ma
5 | P387324 obverse:17 | um-ma at!-ta-ma# | um-ma | um- | ma | ma#
6 | P372422 obverse:17 | um-ma _sag-geme2_-ma | um-ma | um- | ma | ma
7 | P372422 obverse:21 | um-ma szu-u2-ma | um-ma | um- | ma | ma
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A.table(results16)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But at this point it might be easier to take the new query back to the Text-Fabric browser and query it there:\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.4" } }, "nbformat": 4, "nbformat_minor": 4 }