{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Paragraphs\n",
"\n",
"Paragraph numbers are stored in an extra annotation package called `para`.\n",
"That packages makes available the feature `pargr`.\n",
"\n",
"In this notebook we show how you could use that feature, together with features from an other annotation package, `lexicon`, of which we use the `gloss` feature."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
" 0.00s This is LAF-Fabric 4.8.2\n",
"API reference: http://laf-fabric.readthedocs.org/en/latest/texts/API-reference.html\n",
"Feature doc: https://shebanq.ancient-data.org/static/docs/featuredoc/texts/welcome.html\n",
"\n",
" 3m 01s END\n"
]
}
],
"source": [
"import sys, os\n",
"import collections\n",
"\n",
"import laf\n",
"from laf.fabric import LafFabric\n",
"from etcbc.preprocess import prepare\n",
"fabric = LafFabric()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Loading the feature data"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
" 0.00s LOADING API: please wait ... \n",
" 0.01s DETAIL: COMPILING m: etcbc4b: UP TO DATE\n",
" 0.01s USING main: etcbc4b DATA COMPILED AT: 2015-11-02T15-08-56\n",
" 0.01s DETAIL: COMPILING a: lexicon: UP TO DATE\n",
" 0.01s USING annox: lexicon DATA COMPILED AT: 2016-07-08T14-32-54\n",
" 0.01s DETAIL: COMPILING a: para: UP TO DATE\n",
" 0.02s USING annox: para DATA COMPILED AT: 2016-07-08T14-38-37\n",
" 0.03s DETAIL: load main: G.node_anchor_min\n",
" 0.12s DETAIL: load main: G.node_anchor_max\n",
" 0.19s DETAIL: load main: G.node_sort\n",
" 0.25s DETAIL: load main: G.node_sort_inv\n",
" 0.74s DETAIL: load main: G.edges_from\n",
" 0.82s DETAIL: load main: G.edges_to\n",
" 0.90s DETAIL: load main: F.etcbc4_db_otype [node] \n",
" 1.72s DETAIL: load annox lexicon: F.etcbc4_lex_gloss [node] \n",
" 2.00s DETAIL: load annox para: F.etcbc4_px_pargr [node] \n",
" 2.05s LOGFILE=/Users/dirk/laf/laf-fabric-output/etcbc4b/paragraphs/__log__paragraphs.txt\n",
" 2.10s INFO: LOADING PREPARED data: please wait ... \n",
" 2.10s prep prep: G.node_sort\n",
" 2.17s prep prep: G.node_sort_inv\n",
" 3.04s prep prep: L.node_up\n",
" 6.59s prep prep: L.node_down\n",
" 13s prep prep: V.verses\n",
" 13s prep prep: V.books_la\n",
" 13s ETCBC reference: http://laf-fabric.readthedocs.org/en/latest/texts/ETCBC-reference.html\n",
" 15s INFO: LOADED PREPARED data\n",
" 15s INFO: DATA LOADED FROM SOURCE etcbc4b AND ANNOX lexicon, para FOR TASK paragraphs AT 2016-09-22T10-17-59\n"
]
}
],
"source": [
"version = '4b'\n",
"API = fabric.load('etcbc{}'.format(version), 'lexicon,para', 'paragraphs', {\n",
" \"xmlids\": {\"node\": False, \"edge\": False},\n",
" \"features\": ('''\n",
" otype \n",
" gloss\n",
" pargr\n",
" ''',\n",
" '''\n",
" '''),\n",
" \"prepare\": prepare,\n",
" \"primary\": False,\n",
"}, verbose='DETAIL')\n",
"exec(fabric.localnames.format(var='fabric'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Result\n",
"\n",
"We collect all clause atoms, and produce the paragraph numbers and glosses of them, but only the first 100 of them."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 1 in beginning create god(s)