"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A = use(\"Nino-cunei/oldbabylonian\", hoist=globals())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We pick an example face with which we illustrate many ways to represent cuneiform text."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# exampleFace = ('P509373', 'obverse')\n",
"# exampleFace = ('P292990', 'obverse')\n",
"exampleFace = (\"P292987\", \"reverse\")\n",
"f = T.nodeFromSection(exampleFace)\n",
"lines = L.d(f, otype=\"line\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Raw text\n",
"\n",
"The most basic way is to show the source material for each line, which is in the feature `srcLn`.\n",
"\n",
"This feature has been filled by mere copying the numbered lines from the CDLI ATF sources."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1. nu-uk!(AZ)-ta-la-al-li-mu\n",
"2. dumu ku3#-{d}nanna\n",
"3. i-he#-[er]-ri#\n",
"4. la-ma qa2-as-su2\n",
"5. isz#-ku-nu\n",
"6. at#-ta hi-i-ri\n",
"7. u3 a-na\n",
"8. i-szar-ku-bi\n"
]
}
],
"source": [
"for ln in lines:\n",
" print(F.srcLn.v(ln))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Text formats\n",
"\n",
"The TF API supports *text formats*. Text formats make selections and apply templates and styles based\n",
"on the analysed features of the text. For example: a text-format may ignore flags or clusters, or\n",
"format numerals in special ways.\n",
"\n",
"Text formats are not baked into TF, but they are defined in the feature `otext` of the corpus.\n",
"\n",
"Moreover, for this corpus a TF app has been build that defines additional text-formats.\n",
"\n",
"Whereas the formats defined in `otext` are strictly plain text formats, the formats\n",
"defined in the app are able to use typographic styles to shape the text, such as bold, italic, colours, etc.\n",
"\n",
"Here is the list of all formats."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'text-orig-full': 'sign',\n",
" 'text-orig-plain': 'sign',\n",
" 'text-orig-rich': 'sign',\n",
" 'text-orig-unicode': 'sign',\n",
" 'layout-orig-rich': 'sign',\n",
" 'layout-orig-unicode': 'sign'}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"T.formats"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plain text formats\n",
"\n",
"The formats whose names start with `text-` are the plain text formats.\n",
"\n",
"### `text-orig-full`\n",
"\n",
"This format is really close to the ATF. It contains all original information.\n",
"\n",
"This is the default format. We do not have to specify it."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"239995 nu-uk!(AZ)-ta-la-al-li-mu\n"
]
},
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for ln in lines:\n",
" print(ln, T.text(ln))\n",
" A.plain(ln)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `plain()` function focuses on the *contents*, and instead of the line number, it gives a full specification\n",
"of the location, linked to the online source on CDLI.\n",
"\n",
"But we can omit the locations:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for ln in lines:\n",
" A.plain(ln, fmt=\"text-orig-plain\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `text-orig-rich`\n",
"\n",
"This format is a bit prettier: instead of the strict ASCII encoding used by the CDLI archive, it uses\n",
"characters with diacritics.\n",
"\n",
"There is no flag/cluster information in this representation."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for ln in lines:\n",
" A.plain(ln, fmt=\"text-orig-rich\")"
]
},
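{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the contrast with the ASCII encoding concrete, here is a minimal sketch of the kind of substitution involved.\n",
"It is not the code that produces `text-orig-rich`: the helpers `to_rich` and `line_to_rich` and the small mapping table\n",
"are hypothetical, and they assume the common ATF conventions `sz` → `š`, `s,` → `ṣ`, `t,` → `ṭ`, `h` → `ḫ`,\n",
"with index digits at the end of a reading written as subscripts."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"\n",
"# Hypothetical mapping, for illustration only (not taken from the corpus).\n",
"ASCII_TO_DIACRITIC = {\"sz\": \"š\", \"s,\": \"ṣ\", \"t,\": \"ṭ\", \"h\": \"ḫ\"}\n",
"SUBSCRIPTS = str.maketrans(\"0123456789\", \"₀₁₂₃₄₅₆₇₈₉\")\n",
"\n",
"\n",
"def to_rich(reading):\n",
"    \"\"\"Turn one ASCII reading such as 'qa2' or 'isz' into its diacritic form.\"\"\"\n",
"    # Index digits at the end of a reading become subscripts.\n",
"    reading = re.sub(r\"\\d+$\", lambda m: m.group().translate(SUBSCRIPTS), reading)\n",
"    # Substitute each ASCII code by its diacritic character.\n",
"    for ascii_code, char in ASCII_TO_DIACRITIC.items():\n",
"        reading = reading.replace(ascii_code, char)\n",
"    return reading\n",
"\n",
"\n",
"def line_to_rich(text):\n",
"    \"\"\"Apply to_rich to every lowercase reading in a transliterated line.\"\"\"\n",
"    return re.sub(r\"[a-z,]+\\d*\", lambda m: to_rich(m.group()), text)\n",
"\n",
"\n",
"print(line_to_rich(\"la-ma qa2-as-su2\"))\n",
"print(line_to_rich(\"at-ta hi-i-ri\"))"
]
},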
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `text-orig-unicode`\n",
"\n",
"This format uses the Cuneiform Unicode characters.\n",
"\n",
"Numerals with repeats are represented by placing that many copies of the character in question.\n",
"\n",
"Readings that could not be found in the\n",
"[mapping](https://github.com/Nino-cunei/oldbabylonian/blob/master/sources/writing/GeneratedSignList.json)\n",
"we use, appear in latin characters.\n",
"\n",
"There is no flag/cluster information in this representation."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for ln in lines:\n",
" A.plain(ln, fmt=\"layout-orig-unicode\")"
]
},
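{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two rules above (repeating the character for numerals, falling back to Latin for unmapped readings) can be sketched as follows.\n",
"This is an illustration under assumptions, not the corpus code: the helper `to_unicode` and the one-entry table `SIGNS`\n",
"are hypothetical stand-ins for the real sign list, and numerals are assumed to have the ATF shape `3(disz)`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"import unicodedata\n",
"\n",
"# Toy mapping from reading to cuneiform character; hypothetical, NOT the corpus sign list.\n",
"SIGNS = {\"disz\": unicodedata.lookup(\"CUNEIFORM SIGN DISH\")}\n",
"\n",
"\n",
"def to_unicode(reading):\n",
"    \"\"\"Render one reading: numerals repeat their sign, unmapped readings stay Latin.\"\"\"\n",
"    numeral = re.fullmatch(r\"(\\d+)\\((\\w+)\\)\", reading)\n",
"    if numeral:\n",
"        repeat, grapheme = int(numeral.group(1)), numeral.group(2)\n",
"        # Place as many copies of the character as the repeat says.\n",
"        return SIGNS[grapheme] * repeat if grapheme in SIGNS else reading\n",
"    # Readings without a mapping are returned unchanged, i.e. in Latin characters.\n",
"    return SIGNS.get(reading, reading)\n",
"\n",
"\n",
"print(to_unicode(\"3(disz)\"))\n",
"print(to_unicode(\"szu\"))"
]
},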
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is the text of the face in each of the plain text formats, i.e. no additional HTML formatting is applied."
]
},
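{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of how such an overview can be produced.\n",
"The helper name `plain_text_renderings` is ours; it only assumes that `T.formats` holds the format names\n",
"and that `T.text()` accepts a list of nodes together with a `fmt` argument.\n",
"The loop at the end runs only when `T` and `lines` from the cells above are available."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def plain_text_renderings(T, nodes):\n",
"    \"\"\"Return {format name: text} for every format whose name starts with 'text-'.\"\"\"\n",
"    return {\n",
"        fmt: T.text(nodes, fmt=fmt)\n",
"        for fmt in sorted(T.formats)\n",
"        if fmt.startswith(\"text-\")\n",
"    }\n",
"\n",
"\n",
"# Only runs when the corpus has been loaded in the cells above.\n",
"if \"T\" in globals() and \"lines\" in globals():\n",
"    for fmt, text in plain_text_renderings(T, lines).items():\n",
"        print(f\"{fmt}:\\n\\t{text}\")"
]
},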
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pretty\n",
"\n",
"The ultimate of graphical display is by means of the `pretty()` function.\n",
"\n",
"This display is less useful for reading, but instead optimized for showing all information that you might\n",
"wish for.\n",
"\n",
"It shows a base representation according to a text format of your choice\n",
"(here we choose `layout-orig-rich`), and it shows the values\n",
"of a standard set of features."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'_{d}suen_-i-[din-nam]'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"w = F.otype.s(\"word\")[1]\n",
"F.atf.v(w)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.pretty(w, fmt=\"layout-orig-unicode\", baseTypes=\"sign\", withNodes=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Later on, in the [search](search.ipynb) tutorial we see that `pretty()` can also display other features,\n",
"even features that you or other people have created and added later.\n",
"\n",
"Here we call for the feature `atf`, which shows the original ATF for the sign in question\n",
"excluding the bracketing characters.\n",
"\n",
"Consult the\n",
"[feature documentation](https://github.com/Nino-cunei/oldbabylonian/blob/master/docs//transcription.md)\n",
"to see what information is stored in all the features.\n",
"\n",
"We show it with node numbers, but you could leave them out in an obvious way."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"A.pretty(f, extraFeatures=\"atf\", fmt=\"layout-orig-rich\", withNodes=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We do not see much, because the default condense type is `line`, and a `document` is bigger than that.\n",
"Objects bigger than de condense type will be abbreviated to a label that indicates their identity,\n",
"not their contents.\n",
"\n",
"But we can override this by adding `full=True`.\n",
"\n",
"See also the documentation on [`pretty`](https://annotation.github.io/text-fabric/tf/advanced/display.html#tf.advanced.display.pretty)."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"