{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from tf.fabric import Fabric\n", "from tf.convert.walker import CV\n", "import cProfile, pstats, io\n", "from pstats import SortKey" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "TF_PATH = '_temp/tf'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Make test set" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "TF = Fabric(locations=TF_PATH, silent=True)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Importing data from walking through the source ...\n", " | 0.00s Preparing metadata... \n", " | 0.00s No structure nodes will be set up\n", " | SECTION TYPES: chunk\n", " | SECTION FEATURES: num\n", " | STRUCTURE TYPES: \n", " | STRUCTURE FEATURES: \n", " | TEXT FEATURES:\n", " | | text-orig-full cat, num\n", " | 0.01s OK\n", " | 0.00s Following director... \n", " | 1.43s \"edge\" actions: 0\n", " | 1.44s \"feature\" actions: 500000\n", " | 1.44s \"node\" actions: 100000\n", " | 1.44s \"resume\" actions: 0\n", " | 1.44s \"slot\" actions: 400000\n", " | 1.44s \"terminate\" actions: 100001\n", " | 100000 x \"chunk\" node \n", " | 400000 x \"slot\" node = slot type\n", " | 500000 nodes of all types\n", " | 1.51s OK\n", " | 0.00s checking for nodes and edges ... \n", " | 0.00s OK\n", " | 0.00s checking features ... 
\n", " | 0.11s OK\n", " | 0.00s reordering nodes ...\n", " | 0.09s Sorting 100000 nodes of type \"chunk\"\n", " | 0.23s Max node = 500000\n", " | 0.24s OK\n", " | 0.00s reassigning feature values ...\n", " | | 0.00s node feature \"cat\" with 400000 nodes\n", " | | 0.09s node feature \"num\" with 500000 nodes\n", " | 0.30s OK\n", " 0.00s Exporting 3 node and 1 edge and 1 config features to _temp/tf:\n", " 0.00s VALIDATING oslots feature\n", " 0.07s VALIDATING oslots feature\n", " 0.07s maxSlot= 400000\n", " 0.07s maxNode= 500000\n", " 0.08s OK: oslots is valid\n", " | 0.56s T cat to _temp/tf\n", " | 0.69s T num to _temp/tf\n", " | 0.17s T otype to _temp/tf\n", " | 0.24s T oslots to _temp/tf\n", " | 0.00s M otext to _temp/tf\n", " 1.75s Exported 3 node features and 1 edge features and 1 config features to _temp/tf\n" ] } ], "source": [ "slotType = 'slot'\n", "generic = {\n", " 'name': 'test set for query strategy testing',\n", " 'compiler': 'Dirk Roorda',\n", "}\n", "otext = {\n", " 'fmt:text-orig-full': '{num}{cat} ',\n", " 'sectionTypes': 'chunk',\n", " 'sectionFeatures': 'num',\n", "}\n", "intFeatures = {\n", " 'num',\n", "}\n", "featureMeta = {\n", " 'num': {\n", " 'description': 'node number',\n", " },\n", " 'cat': {\n", " 'description': 'category: m f n',\n", " },\n", "}\n", "\n", "nSlots = 400000\n", "chunkSize = 4\n", "cats = ['m', 'f', 'n']\n", "\n", "def director(cv):\n", " c = None\n", " for n in range(nSlots):\n", " if n % chunkSize == 0:\n", " cv.terminate(c)\n", " c = cv.node('chunk')\n", " cv.feature(c, num=n // chunkSize)\n", " s = cv.slot()\n", " cv.feature(s, num=n, cat=cats[n % 3])\n", " cv.terminate(c)\n", " \n", "cv = CV(TF)\n", "\n", "good = cv.walk(\n", " director,\n", " slotType,\n", " otext=otext,\n", " generic=generic,\n", " intFeatures=intFeatures,\n", " featureMeta=featureMeta,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Load test set" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": 
[], "source": [ "TF = Fabric(locations=TF_PATH, silent='deep')\n", "api = TF.loadAll()\n", "docs = api.makeAvailableIn(globals())\n", "silentOff()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Main test 1\n", "\n", "This query template consists of a `chunk` and its first and last slots,\n", "and an independent slot that is constrained to lie between them." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "chunk\n", " =: a:slot\n", " < c:slot\n", " :=\n", "\n", "s:slot\n", "\n", "a < s\n", "s < c\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we run it with a few old strategies.\n", "The strategies are not really documented, except for \n", "comments in the code,\n", "because they are an implementation detail.\n", "In case you're interested, click the strategy names to go to the code:\n", "\n", "* [`small_choice_first`](https://github.com/annotation/text-fabric/blob/85db305f357466d4735edc7aea4cdfaae6ef6774/tf/search/stitch.py#L152-L219)\n", "* [`small_choice_multi`](https://github.com/annotation/text-fabric/blob/85db305f357466d4735edc7aea4cdfaae6ef6774/tf/search/stitch.py#L222-L347)\n", "* [`by_yarn_size`](https://github.com/annotation/text-fabric/blob/85db305f357466d4735edc7aea4cdfaae6ef6774/tf/search/stitch.py#L350-L425)\n", "\n", "The third one, `by_yarn_size`, behaves virtually identically for the kind of queries we are testing here.\n", "So we concentrate on the first two.\n", "\n", "When we run the experiments, we do these steps:\n", "\n", "* study\n", "* show plan\n", "* fetch 10 results under a profiler and collect statistics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice first" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 4 objects ...\n", " 0.20s 
Constraining search space with 7 relations ...\n", " 0.54s \t2 edges thinned\n", " 0.54s Setting up retrieval plan with strategy small_choice_first ...\n", " 0.56s Ready to deliver results from 700000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_first')" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 4 objects and 7 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 100000 choices\n", "node 2-slot 100000 choices\n", "node 3-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk := 2-slot 1.0 choices (thinned)\n", "edge 0-chunk [[ 2-slot 0 choices\n", "edge 0-chunk [[ 1-slot 1.0 choices\n", "edge 1-slot =: 0-chunk 0 choices\n", "edge 1-slot < 2-slot 0 choices\n", "edge 1-slot < 3-slot 200000.0 choices\n", "edge 3-slot < 2-slot 0 choices\n", " 2.28s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 =: a:slot\n", " 3 R2 < c:slot\n", " 4 :=\n", " 5 \n", " 6 R3 s:slot\n", " 7 \n", " 8 a < s\n", " 9 s < c\n", "10 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 6400243 function calls (4800149 primitive calls) in 1.832 seconds\n", "\n", " Ordered by: cumulative time\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 0.000 0.000 1.832 0.916 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3230(run_code)\n", " 2 0.000 0.000 1.832 
0.916 {built-in method builtins.exec}\n", " 1 0.000 0.000 1.832 1.832 :3()\n", " 1 0.000 0.000 1.832 1.832 /Users/dirk/github/annotation/text-fabric/tf/search/search.py:151(fetch)\n", " 1 0.000 0.000 1.832 1.832 /Users/dirk/github/annotation/text-fabric/tf/search/searchexe.py:89(fetch)\n", " 11 0.000 0.000 1.832 0.167 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:683(deliver)\n", "1600105/11 1.503 0.000 1.832 0.167 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:690(stitchOn)\n", " 3199998 0.237 0.000 0.237 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:73()\n", " 1600027 0.091 0.000 0.091 0.000 {built-in method builtins.len}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codeop.py:132(__call__)\n", " 2 0.000 0.000 0.000 0.000 {built-in method builtins.compile}\n", " 50 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:693()\n", " 5 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:383()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:142(__call__)\n", " 10 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:303()\n", " 1 0.000 0.000 0.000 0.000 :4()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/utils/ipstruct.py:125(__getattr__)\n", " 10 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}\n", " 5 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:355()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:1258(user_global_ns)\n", " 2 0.000 0.000 0.000 0.000 
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:207(pre_run_code_hook)\n", " 1 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:684()\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", "\n", "\n", "\n" ] } ], "source": [ "pr = cProfile.Profile()\n", "pr.enable()\n", "results = S.fetch(limit=10)\n", "pr.disable()\n", "s = io.StringIO()\n", "sortby = SortKey.CUMULATIVE\n", "ps = pstats.Stats(pr, stream=s).sort_stats(sortby)\n", "ps.print_stats()\n", "print(s.getvalue())" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(400001, 1, 4, 2)\n", "(400001, 1, 4, 3)\n", "(400002, 5, 8, 6)\n", "(400002, 5, 8, 7)\n", "(400003, 9, 12, 10)\n", "(400003, 9, 12, 11)\n", "(400004, 13, 16, 14)\n", "(400004, 13, 16, 15)\n", "(400005, 17, 20, 18)\n", "(400005, 17, 20, 19)\n" ] } ], "source": [ "print('\\n'.join(str(r) for r in results))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 1 up to 50 ...\n", " | 0.00s 1\n", " | 0.00s 2\n", " | 0.29s 3\n", " | 0.30s 4\n", " | 0.57s 5\n", " | 0.57s 6\n", " | 0.84s 7\n", " | 0.84s 8\n", " | 1.11s 9\n", " | 1.11s 10\n", " | 1.38s 11\n", " | 1.38s 12\n", " | 1.66s 13\n", " | 1.66s 14\n", " | 1.93s 15\n", " | 1.93s 16\n", " | 2.20s 17\n", " | 2.20s 18\n", " | 2.48s 19\n", " | 2.48s 20\n", " | 2.76s 21\n", " | 2.76s 22\n", " | 3.13s 23\n", " | 3.13s 24\n", " | 3.45s 25\n", " | 3.45s 26\n", " | 3.75s 27\n", " | 3.75s 28\n", " | 4.02s 29\n", " | 4.02s 30\n", " | 4.30s 31\n", " | 4.30s 32\n", " | 4.65s 33\n", " | 4.65s 34\n", " | 4.99s 35\n", " | 4.99s 36\n", " | 5.28s 37\n", " | 5.28s 38\n", " | 5.56s 39\n", " | 5.56s 40\n", " | 5.83s 41\n", " | 5.83s 42\n", " | 6.11s 43\n", " | 6.11s 44\n", " | 6.39s 45\n", " | 6.39s 46\n", " | 
6.66s 47\n", " | 6.66s 48\n", " | 7.03s 49\n", " | 7.04s 50\n", " 7.04s Done: 50 results\n" ] } ], "source": [ "S.count(progress=1, limit=50)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice multi" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 4 objects ...\n", " 0.19s Constraining search space with 7 relations ...\n", " 0.55s \t2 edges thinned\n", " 0.55s Setting up retrieval plan with strategy small_choice_multi ...\n", " 0.57s Ready to deliver results from 700000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_multi')" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 4 objects and 6 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 100000 choices\n", "node 2-slot 100000 choices\n", "node 3-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk := 2-slot 1.0 choices (thinned)\n", "edge 0-chunk [[ 2-slot 0 choices\n", "edge 0-chunk [[ 1-slot 1.0 choices\n", "edge 1-slot =: 0-chunk 0 choices\n", "edge 1-slot < 2-slot 0 choices\n", "edge 1,2-slot <,> 3-slot 20000.0 choices\n", " 2.89s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 =: a:slot\n", " 3 R2 < c:slot\n", " 4 :=\n", " 5 \n", " 6 R3 s:slot\n", " 7 \n", " 8 a < s\n", " 9 s < c\n", "10 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"Observe how two `< >` constraints have been taken together.\n", "They will be tested in one pass." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 3200285 function calls (3200175 primitive calls) in 1.218 seconds\n", "\n", " Ordered by: cumulative time\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 0.000 0.000 1.218 0.609 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3230(run_code)\n", " 2 0.000 0.000 1.218 0.609 {built-in method builtins.exec}\n", " 1 0.000 0.000 1.218 1.218 :3()\n", " 1 0.000 0.000 1.218 1.218 /Users/dirk/github/annotation/text-fabric/tf/search/search.py:151(fetch)\n", " 1 0.000 0.000 1.218 1.218 /Users/dirk/github/annotation/text-fabric/tf/search/searchexe.py:89(fetch)\n", " 11 0.000 0.000 1.218 0.111 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:683(deliver)\n", " 121/11 0.981 0.008 1.218 0.111 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:690(stitchOn)\n", " 1600024 0.120 0.000 0.120 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:73()\n", " 1599974 0.117 0.000 0.117 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:83()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codeop.py:132(__call__)\n", " 2 0.000 0.000 0.000 0.000 {built-in method builtins.compile}\n", " 50 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:693()\n", " 5 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:383()\n", " 53 0.000 0.000 0.000 0.000 {built-in method builtins.len}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:142(__call__)\n", " 5 0.000 0.000 0.000 0.000 
/Users/dirk/github/annotation/text-fabric/tf/search/relations.py:355()\n", " 1 0.000 0.000 0.000 0.000 :4()\n", " 10 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}\n", " 10 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:303()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:1258(user_global_ns)\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/utils/ipstruct.py:125(__getattr__)\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:207(pre_run_code_hook)\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", " 1 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:684()\n", "\n", "\n", "\n" ] } ], "source": [ "pr = cProfile.Profile()\n", "pr.enable()\n", "results = S.fetch(limit=10)\n", "pr.disable()\n", "s = io.StringIO()\n", "sortby = SortKey.CUMULATIVE\n", "ps = pstats.Stats(pr, stream=s).sort_stats(sortby)\n", "ps.print_stats()\n", "print(s.getvalue())" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(400001, 1, 4, 2)\n", "(400001, 1, 4, 3)\n", "(400002, 5, 8, 6)\n", "(400002, 5, 8, 7)\n", "(400003, 9, 12, 10)\n", "(400003, 9, 12, 11)\n", "(400004, 13, 16, 14)\n", "(400004, 13, 16, 15)\n", "(400005, 17, 20, 18)\n", "(400005, 17, 20, 19)\n" ] } ], "source": [ "print('\\n'.join(str(r) for r in results))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 1 up to 50 ...\n", " | 0.00s 1\n", " | 0.00s 2\n", " | 0.24s 3\n", " | 0.24s 4\n", " | 0.46s 5\n", " | 0.46s 6\n", " | 0.68s 7\n", " | 0.68s 8\n", " | 0.90s 9\n", " 
| 0.90s 10\n", " | 1.12s 11\n", " | 1.12s 12\n", " | 1.34s 13\n", " | 1.34s 14\n", " | 1.56s 15\n", " | 1.56s 16\n", " | 1.78s 17\n", " | 1.78s 18\n", " | 2.00s 19\n", " | 2.00s 20\n", " | 2.22s 21\n", " | 2.22s 22\n", " | 2.44s 23\n", " | 2.44s 24\n", " | 2.67s 25\n", " | 2.67s 26\n", " | 2.97s 27\n", " | 2.97s 28\n", " | 3.24s 29\n", " | 3.24s 30\n", " | 3.50s 31\n", " | 3.50s 32\n", " | 3.73s 33\n", " | 3.73s 34\n", " | 3.95s 35\n", " | 3.95s 36\n", " | 4.17s 37\n", " | 4.17s 38\n", " | 4.39s 39\n", " | 4.40s 40\n", " | 4.62s 41\n", " | 4.62s 42\n", " | 4.93s 43\n", " | 4.93s 44\n", " | 5.20s 45\n", " | 5.20s 46\n", " | 5.47s 47\n", " | 5.47s 48\n", " | 5.69s 49\n", " | 5.69s 50\n", " 5.69s Done: 50 results\n" ] } ], "source": [ "S.count(progress=1, limit=50)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Observations:\n", "\n", "`small_choice_multi` performs better.\n", "\n", "It makes only 50% of the function calls that `small_choice_first` makes (3,200,285 versus 6,400,243): it cuts out nearly all calls to `stitchOn()`, which is a recursive\n", "function that generates new candidates.\n", "\n", "If you look at the primitive calls only, the gain is about 33% (3,200,175 versus 4,800,149).\n", "\n", "If you look at the time spent in the `stitchOn()` calls, then you see that `small_choice_first` spends about 50% more time in them than\n", "`small_choice_multi`.\n", "\n", "**N.B.** \n", "\n", "In `small_choice_first` 1,600,000 calls to `stitchOn()` take 1.5 seconds.\n", "\n", "In `small_choice_multi` 121 calls to `stitchOn()` take 1.0 seconds.\n", "\n", "That is remarkable. In order to compute the multi-edge, a lot of time per call is needed.\n", "But the net result is positive.\n", "\n", "There is a price: the most time-consuming bit is this line:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Main test 2\n", "\n", "We leave something out of the query: the `a < c` constraint between the first and the last slot of the chunk." 
] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "chunk\n", " =: a:slot\n", " c:slot\n", " :=\n", "\n", "s:slot\n", "\n", "a < s\n", "s < c\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It should not make a difference to the outcome that we omit the `a < c` condition, since all chunks have a length greater than 1,\n", "so the first slot of a chunk is always before the last one (and not identical with it)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice first" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 4 objects ...\n", " 0.18s Constraining search space with 6 relations ...\n", " 0.52s \t2 edges thinned\n", " 0.52s Setting up retrieval plan with strategy small_choice_first ...\n", " 0.54s Ready to deliver results from 700000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_first')" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 4 objects and 6 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 100000 choices\n", "node 2-slot 100000 choices\n", "node 3-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk [[ 2-slot 1.0 choices\n", "edge 2-slot := 0-chunk 0 choices\n", "edge 0-chunk [[ 1-slot 1.0 choices\n", "edge 1-slot =: 0-chunk 0 choices\n", "edge 2-slot > 3-slot 200000.0 choices\n", "edge 3-slot > 1-slot 0 choices\n", " 
1.36s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 =: a:slot\n", " 3 R2 c:slot\n", " 4 :=\n", " 5 \n", " 6 R3 s:slot\n", " 7 \n", " 8 a < s\n", " 9 s < c\n", "10 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1600461 function calls (1600301 primitive calls) in 0.334 seconds\n", "\n", " Ordered by: cumulative time\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 0.000 0.000 0.334 0.167 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3230(run_code)\n", " 2 0.000 0.000 0.334 0.167 {built-in method builtins.exec}\n", " 1 0.000 0.000 0.334 0.334 :3()\n", " 1 0.000 0.000 0.334 0.334 /Users/dirk/github/annotation/text-fabric/tf/search/search.py:151(fetch)\n", " 1 0.000 0.000 0.334 0.334 /Users/dirk/github/annotation/text-fabric/tf/search/searchexe.py:89(fetch)\n", " 11 0.000 0.000 0.334 0.030 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:683(deliver)\n", " 171/11 0.228 0.001 0.334 0.030 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:690(stitchOn)\n", " 1600074 0.106 0.000 0.106 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:82()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codeop.py:132(__call__)\n", " 2 0.000 0.000 0.000 0.000 {built-in method builtins.compile}\n", " 103 0.000 0.000 0.000 0.000 {built-in method builtins.len}\n", " 50 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:693()\n", " 10 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:301()\n", " 5 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:353()\n", " 2 0.000 0.000 0.000 0.000 
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:142(__call__)\n", " 1 0.000 0.000 0.000 0.000 :4()\n", " 10 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/utils/ipstruct.py:125(__getattr__)\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:1258(user_global_ns)\n", " 5 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:378()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:207(pre_run_code_hook)\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", " 1 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:684()\n", "\n", "\n", "\n" ] } ], "source": [ "pr = cProfile.Profile()\n", "pr.enable()\n", "results = S.fetch(limit=10)\n", "pr.disable()\n", "s = io.StringIO()\n", "sortby = SortKey.CUMULATIVE\n", "ps = pstats.Stats(pr, stream=s).sort_stats(sortby)\n", "ps.print_stats()\n", "print(s.getvalue())" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(400001, 1, 4, 2)\n", "(400001, 1, 4, 3)\n", "(400002, 5, 8, 6)\n", "(400002, 5, 8, 7)\n", "(400003, 9, 12, 10)\n", "(400003, 9, 12, 11)\n", "(400004, 13, 16, 14)\n", "(400004, 13, 16, 15)\n", "(400005, 17, 20, 18)\n", "(400005, 17, 20, 19)\n" ] } ], "source": [ "print('\\n'.join(str(r) for r in results))" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 1 up to 50 ...\n", " | 0.00s 1\n", " | 0.00s 2\n", " | 0.06s 3\n", " | 0.06s 4\n", " | 0.11s 5\n", " | 0.12s 6\n", " | 
0.16s 7\n", " | 0.16s 8\n", " | 0.21s 9\n", " | 0.21s 10\n", " | 0.25s 11\n", " | 0.25s 12\n", " | 0.30s 13\n", " | 0.30s 14\n", " | 0.34s 15\n", " | 0.34s 16\n", " | 0.39s 17\n", " | 0.39s 18\n", " | 0.43s 19\n", " | 0.43s 20\n", " | 0.47s 21\n", " | 0.47s 22\n", " | 0.52s 23\n", " | 0.52s 24\n", " | 0.56s 25\n", " | 0.56s 26\n", " | 0.61s 27\n", " | 0.61s 28\n", " | 0.65s 29\n", " | 0.65s 30\n", " | 0.69s 31\n", " | 0.69s 32\n", " | 0.74s 33\n", " | 0.74s 34\n", " | 0.78s 35\n", " | 0.78s 36\n", " | 0.83s 37\n", " | 0.83s 38\n", " | 0.87s 39\n", " | 0.87s 40\n", " | 0.92s 41\n", " | 0.92s 42\n", " | 0.97s 43\n", " | 0.98s 44\n", " | 1.03s 45\n", " | 1.04s 46\n", " | 1.10s 47\n", " | 1.10s 48\n", " | 1.15s 49\n", " | 1.15s 50\n", " 1.16s Done: 50 results\n" ] } ], "source": [ "S.count(progress=1, limit=50)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice multi" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 4 objects ...\n", " 0.19s Constraining search space with 6 relations ...\n", " 0.53s \t2 edges thinned\n", " 0.53s Setting up retrieval plan with strategy small_choice_multi ...\n", " 0.54s Ready to deliver results from 700000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_multi')" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 4 objects and 5 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 100000 choices\n", "node 2-slot 100000 choices\n", "node 3-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", 
"Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk [[ 2-slot 1.0 choices\n", "edge 2-slot := 0-chunk 0 choices\n", "edge 0-chunk [[ 1-slot 1.0 choices\n", "edge 1-slot =: 0-chunk 0 choices\n", "edge 2,1-slot >,< 3-slot 20000.0 choices\n", " 1.39s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 =: a:slot\n", " 3 R2 c:slot\n", " 4 :=\n", " 5 \n", " 6 R3 s:slot\n", " 7 \n", " 8 a < s\n", " 9 s < c\n", "10 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Observe how two `< >` constraints have been taken together.\n", "They will be tested in one pass." ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1600341 function calls (1600246 primitive calls) in 0.744 seconds\n", "\n", " Ordered by: cumulative time\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 0.000 0.000 0.744 0.372 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3230(run_code)\n", " 2 0.000 0.000 0.744 0.372 {built-in method builtins.exec}\n", " 1 0.000 0.000 0.744 0.744 :3()\n", " 1 0.000 0.000 0.744 0.744 /Users/dirk/github/annotation/text-fabric/tf/search/search.py:151(fetch)\n", " 1 0.000 0.000 0.744 0.744 /Users/dirk/github/annotation/text-fabric/tf/search/searchexe.py:89(fetch)\n", " 11 0.000 0.000 0.744 0.068 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:683(deliver)\n", " 106/11 0.634 0.006 0.744 0.068 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:690(stitchOn)\n", " 1600019 0.110 0.000 0.110 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:82()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codeop.py:132(__call__)\n", " 2 0.000 0.000 
0.000 0.000 {built-in method builtins.compile}\n", " 50 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:693()\n", " 55 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:72()\n", " 48 0.000 0.000 0.000 0.000 {built-in method builtins.len}\n", " 10 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:301()\n", " 5 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:353()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:142(__call__)\n", " 1 0.000 0.000 0.000 0.000 :4()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/utils/ipstruct.py:125(__getattr__)\n", " 10 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:1258(user_global_ns)\n", " 5 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:378()\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:207(pre_run_code_hook)\n", " 1 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:684()\n", "\n", "\n", "\n" ] } ], "source": [ "pr = cProfile.Profile()\n", "pr.enable()\n", "results = S.fetch(limit=10)\n", "pr.disable()\n", "s = io.StringIO()\n", "sortby = SortKey.CUMULATIVE\n", "ps = pstats.Stats(pr, stream=s).sort_stats(sortby)\n", "ps.print_stats()\n", "print(s.getvalue())" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(400001, 1, 4, 2)\n", "(400001, 1, 4, 
3)\n", "(400002, 5, 8, 6)\n", "(400002, 5, 8, 7)\n", "(400003, 9, 12, 10)\n", "(400003, 9, 12, 11)\n", "(400004, 13, 16, 14)\n", "(400004, 13, 16, 15)\n", "(400005, 17, 20, 18)\n", "(400005, 17, 20, 19)\n" ] } ], "source": [ "print('\\n'.join(str(r) for r in results))" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 1 up to 50 ...\n", " | 0.00s 1\n", " | 0.00s 2\n", " | 0.17s 3\n", " | 0.17s 4\n", " | 0.32s 5\n", " | 0.32s 6\n", " | 0.47s 7\n", " | 0.47s 8\n", " | 0.62s 9\n", " | 0.62s 10\n", " | 0.76s 11\n", " | 0.76s 12\n", " | 0.91s 13\n", " | 0.91s 14\n", " | 1.05s 15\n", " | 1.05s 16\n", " | 1.20s 17\n", " | 1.20s 18\n", " | 1.35s 19\n", " | 1.35s 20\n", " | 1.49s 21\n", " | 1.49s 22\n", " | 1.64s 23\n", " | 1.64s 24\n", " | 1.83s 25\n", " | 1.83s 26\n", " | 2.02s 27\n", " | 2.02s 28\n", " | 2.19s 29\n", " | 2.19s 30\n", " | 2.36s 31\n", " | 2.36s 32\n", " | 2.53s 33\n", " | 2.53s 34\n", " | 2.69s 35\n", " | 2.69s 36\n", " | 2.84s 37\n", " | 2.84s 38\n", " | 2.98s 39\n", " | 2.98s 40\n", " | 3.13s 41\n", " | 3.13s 42\n", " | 3.28s 43\n", " | 3.28s 44\n", " | 3.43s 45\n", " | 3.43s 46\n", " | 3.58s 47\n", " | 3.58s 48\n", " | 3.78s 49\n", " | 3.78s 50\n", " 3.78s Done: 50 results\n" ] } ], "source": [ "S.count(progress=1, limit=50)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Main test 3\n", "\n", "We add something to the query: two extra slots `b` and `d` inside the chunk, with `s` now constrained between `b` and `d`." ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "chunk\n", " =: a:slot\n", " < b:slot\n", " < d:slot\n", " c:slot\n", " :=\n", "\n", "s:slot\n", "\n", "b < s\n", "s < d\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It becomes more difficult to constrain `s` within the chunk.\n", "\n", "This is a heavy query." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice first" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 6 objects ...\n", " 0.29s Constraining search space with 10 relations ...\n", " 0.96s \t2 edges thinned\n", " 0.96s Setting up retrieval plan with strategy small_choice_first ...\n", " 0.99s Ready to deliver results from 1500000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_first')" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 6 objects and 10 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 100000 choices\n", "node 2-slot 400000 choices\n", "node 3-slot 400000 choices\n", "node 4-slot 100000 choices\n", "node 5-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk := 4-slot 1.0 choices (thinned)\n", "edge 4-slot ]] 0-chunk 0 choices\n", "edge 0-chunk =: 1-slot 1.0 choices (thinned)\n", "edge 1-slot ]] 0-chunk 0 choices\n", "edge 0-chunk [[ 3-slot 4.0 choices\n", "edge 0-chunk [[ 2-slot 4.0 choices\n", "edge 2-slot > 1-slot 0 choices\n", "edge 2-slot < 3-slot 0 choices\n", "edge 2-slot < 5-slot 200000.0 choices\n", "edge 5-slot < 3-slot 0 choices\n", " 2.81s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 =: a:slot\n", " 3 R2 < b:slot\n", " 4 R3 < d:slot\n", " 5 R4 c:slot\n", " 6 :=\n", " 7 \n", " 8 R5 s:slot\n", " 9 \n", "10 b < s\n", "11 s < 
d\n", "12 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 44799866 function calls (33599883 primitive calls) in 12.853 seconds\n", "\n", " Ordered by: cumulative time\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 0.000 0.000 12.853 6.426 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3230(run_code)\n", " 2 0.000 0.000 12.853 6.426 {built-in method builtins.exec}\n", " 1 0.000 0.000 12.853 12.853 :3()\n", " 1 0.000 0.000 12.853 12.853 /Users/dirk/github/annotation/text-fabric/tf/search/search.py:151(fetch)\n", " 1 0.000 0.000 12.853 12.853 /Users/dirk/github/annotation/text-fabric/tf/search/searchexe.py:89(fetch)\n", " 11 0.000 0.000 12.853 1.168 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:683(deliver)\n", "11199994/11 10.608 0.000 12.853 1.168 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:690(stitchOn)\n", " 22399625 1.628 0.000 1.628 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:72()\n", " 11199886 0.618 0.000 0.618 0.000 {built-in method builtins.len}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codeop.py:132(__call__)\n", " 2 0.000 0.000 0.000 0.000 {built-in method builtins.compile}\n", " 158 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:82()\n", " 50 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:301()\n", " 10 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:380()\n", " 70 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:693()\n", " 20 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:287()\n", " 10 0.000 0.000 0.000 
0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:355()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:142(__call__)\n", " 1 0.000 0.000 0.000 0.000 :4()\n", " 10 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/utils/ipstruct.py:125(__getattr__)\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:207(pre_run_code_hook)\n", " 1 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:684()\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:1258(user_global_ns)\n", "\n", "\n", "\n" ] } ], "source": [ "pr = cProfile.Profile()\n", "pr.enable()\n", "results = S.fetch(limit=10)\n", "pr.disable()\n", "s = io.StringIO()\n", "sortby = SortKey.CUMULATIVE\n", "ps = pstats.Stats(pr, stream=s).sort_stats(sortby)\n", "ps.print_stats()\n", "print(s.getvalue())" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(400001, 1, 2, 4, 4, 3)\n", "(400002, 5, 6, 8, 8, 7)\n", "(400003, 9, 10, 12, 12, 11)\n", "(400004, 13, 14, 16, 16, 15)\n", "(400005, 17, 18, 20, 20, 19)\n", "(400006, 21, 22, 24, 24, 23)\n", "(400007, 25, 26, 28, 28, 27)\n", "(400008, 29, 30, 32, 32, 31)\n", "(400009, 33, 34, 36, 36, 35)\n", "(400010, 37, 38, 40, 40, 39)\n" ] } ], "source": [ "print('\\n'.join(str(r) for r in results))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice multi" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 6 objects ...\n", " 0.29s Constraining search space with 10 relations ...\n", " 0.96s \t2 edges thinned\n", " 0.97s Setting up retrieval plan with strategy small_choice_multi ...\n", " 1.00s Ready to deliver results from 1500000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_multi')" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 6 objects and 9 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 100000 choices\n", "node 2-slot 400000 choices\n", "node 3-slot 400000 choices\n", "node 4-slot 100000 choices\n", "node 5-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk := 4-slot 1.0 choices (thinned)\n", "edge 4-slot ]] 0-chunk 0 choices\n", "edge 0-chunk =: 1-slot 1.0 choices (thinned)\n", "edge 1-slot ]] 0-chunk 0 choices\n", "edge 0-chunk [[ 3-slot 4.0 choices\n", "edge 0-chunk [[ 2-slot 4.0 choices\n", "edge 2-slot > 1-slot 0 choices\n", "edge 2-slot < 3-slot 0 choices\n", "edge 2,3-slot <,> 5-slot 20000.0 choices\n", " 2.05s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 =: a:slot\n", " 3 R2 < b:slot\n", " 4 R3 < d:slot\n", " 5 R4 c:slot\n", " 6 :=\n", " 7 \n", " 8 R5 s:slot\n", " 9 \n", "10 b < s\n", "11 s < d\n", "12 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 22400920 function calls (22400415 
primitive calls) in 8.301 seconds\n", "\n", " Ordered by: cumulative time\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 0.000 0.000 8.301 4.150 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3230(run_code)\n", " 2 0.000 0.000 8.301 4.150 {built-in method builtins.exec}\n", " 1 0.000 0.000 8.301 8.301 :3()\n", " 1 0.000 0.000 8.301 8.301 /Users/dirk/github/annotation/text-fabric/tf/search/search.py:151(fetch)\n", " 1 0.000 0.000 8.301 8.301 /Users/dirk/github/annotation/text-fabric/tf/search/searchexe.py:89(fetch)\n", " 11 0.000 0.000 8.301 0.755 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:683(deliver)\n", " 516/11 6.658 0.013 8.301 0.755 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:690(stitchOn)\n", " 11200157 0.833 0.000 0.833 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:72()\n", " 11199626 0.809 0.000 0.809 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:82()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codeop.py:132(__call__)\n", " 2 0.000 0.000 0.000 0.000 {built-in method builtins.compile}\n", " 418 0.000 0.000 0.000 0.000 {built-in method builtins.len}\n", " 50 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:301()\n", " 70 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:693()\n", " 10 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:380()\n", " 20 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:287()\n", " 10 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/relations.py:355()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:142(__call__)\n", " 1 0.000 0.000 
0.000 0.000 :4()\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/utils/ipstruct.py:125(__getattr__)\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py:1258(user_global_ns)\n", " 10 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", " 2 0.000 0.000 0.000 0.000 /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/hooks.py:207(pre_run_code_hook)\n", " 1 0.000 0.000 0.000 0.000 /Users/dirk/github/annotation/text-fabric/tf/search/stitch.py:684()\n", "\n", "\n", "\n" ] } ], "source": [ "pr = cProfile.Profile()\n", "pr.enable()\n", "results = S.fetch(limit=10)\n", "pr.disable()\n", "s = io.StringIO()\n", "sortby = SortKey.CUMULATIVE\n", "ps = pstats.Stats(pr, stream=s).sort_stats(sortby)\n", "ps.print_stats()\n", "print(s.getvalue())" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(400001, 1, 2, 4, 4, 3)\n", "(400002, 5, 6, 8, 8, 7)\n", "(400003, 9, 10, 12, 12, 11)\n", "(400004, 13, 14, 16, 16, 15)\n", "(400005, 17, 18, 20, 20, 19)\n", "(400006, 21, 22, 24, 24, 23)\n", "(400007, 25, 26, 28, 28, 27)\n", "(400008, 29, 30, 32, 32, 31)\n", "(400009, 33, 34, 36, 36, 35)\n", "(400010, 37, 38, 40, 40, 39)\n" ] } ], "source": [ "print('\\n'.join(str(r) for r in results))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Observation\n", "\n", "Here is a query where the time spent in `stitchOn()` itself overtakes the time spent in the `all()` call.\n", "\n", "So these strategies really are a mixed bag.\n", "\n", "For now, I turn on the `small_choice_multi` strategy because it makes really long queries a bit more bearable and does not\n", "make much of a difference for shorter queries."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Main test 4\n", "\n", "A quite different query." ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "chunk\n", ".num. slot\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice first" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 2 objects ...\n", " 0.08s Constraining search space with 1 relations ...\n", " 0.10s \t0 edges thinned\n", " 0.10s Setting up retrieval plan with strategy small_choice_first ...\n", " 0.13s Ready to deliver results from 500000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_first')" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 2 objects and 1 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk .num. 1-slot 0.0 choices\n", " 3.24s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 .num. 
slot\n", " 3 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 1 up to 10 ...\n", " | 0.00s 1\n", " | 0.23s 2\n", " | 0.45s 3\n", " | 0.66s 4\n", " | 0.87s 5\n", " | 1.09s 6\n", " | 1.30s 7\n", " | 1.51s 8\n", " | 1.72s 9\n", " | 1.93s 10\n", " 1.94s Done: 10 results\n" ] } ], "source": [ "S.count(progress=1, limit=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice multi" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 2 objects ...\n", " 0.08s Constraining search space with 1 relations ...\n", " 0.10s \t0 edges thinned\n", " 0.10s Setting up retrieval plan with strategy small_choice_multi ...\n", " 0.12s Ready to deliver results from 500000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_multi')" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 2 objects and 1 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk .num. 1-slot 0.0 choices\n", " 1.94s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 .num. 
slot\n", " 3 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 1 up to 10 ...\n", " | 0.00s 1\n", " | 0.23s 2\n", " | 0.44s 3\n", " | 0.66s 4\n", " | 0.87s 5\n", " | 1.08s 6\n", " | 1.29s 7\n", " | 1.50s 8\n", " | 1.71s 9\n", " | 1.92s 10\n", " 1.92s Done: 10 results\n" ] } ], "source": [ "S.count(progress=1, limit=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: by yarn size" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.03s Setting up search space for 2 objects ...\n", " 0.11s Constraining search space with 1 relations ...\n", " 0.13s \t0 edges thinned\n", " 0.13s Setting up retrieval plan with strategy by_yarn_size ...\n", " 0.15s Ready to deliver results from 500000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='by_yarn_size')" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 2 objects and 1 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk .num. 1-slot 0.0 choices\n", " 1.17s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 chunk\n", " 2 R1 .num. 
slot\n", " 3 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 1 up to 10 ...\n", " | 0.00s 1\n", " | 0.24s 2\n", " | 0.45s 3\n", " | 0.66s 4\n", " | 0.88s 5\n", " | 1.09s 6\n", " | 1.31s 7\n", " | 1.53s 8\n", " | 1.74s 9\n", " | 1.96s 10\n", " 1.96s Done: 10 results\n" ] } ], "source": [ "S.count(progress=1, limit=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Main test 5\n", "\n", "Yet another feature comparison query." ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "a:chunk\n", " n:slot\n", "< b:chunk\n", " m:slot\n", "\n", "n .cat. m\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice first" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 4 objects ...\n", " 0.14s Constraining search space with 4 relations ...\n", " 0.50s \t0 edges thinned\n", " 0.50s Setting up retrieval plan with strategy small_choice_first ...\n", " 0.54s Ready to deliver results from 1000000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_first')" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 4 objects and 4 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 400000 choices\n", "node 2-chunk 100000 choices\n", "node 3-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are 
computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk [[ 1-slot 4.0 choices\n", "edge 0-chunk < 2-chunk 50000.0 choices\n", "edge 2-chunk [[ 3-slot 4.0 choices\n", "edge 3-slot .cat. 1-slot 0 choices\n", " 1.49s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 a:chunk\n", " 2 R1 n:slot\n", " 3 R2 < b:chunk\n", " 4 R3 m:slot\n", " 5 \n", " 6 n .cat. m\n", " 7 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 100000 up to 1000000 ...\n", " | 0.50s 100000\n", " | 0.99s 200000\n", " | 1.47s 300000\n", " | 1.95s 400000\n", " | 2.43s 500000\n", " | 2.92s 600000\n", " | 3.45s 700000\n", " | 4.04s 800000\n", " | 4.53s 900000\n", " | 5.18s 1000000\n", " 5.18s Done: 1000000 results\n" ] } ], "source": [ "S.count(progress=100000, limit=1000000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: small choice multi" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 4 objects ...\n", " 0.14s Constraining search space with 4 relations ...\n", " 0.50s \t0 edges thinned\n", " 0.50s Setting up retrieval plan with strategy small_choice_multi ...\n", " 0.54s Ready to deliver results from 1000000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='small_choice_multi')" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search with 4 objects and 4 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 400000 choices\n", "node 
2-chunk 100000 choices\n", "node 3-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk [[ 1-slot 4.0 choices\n", "edge 0-chunk < 2-chunk 50000.0 choices\n", "edge 2-chunk [[ 3-slot 4.0 choices\n", "edge 1-slot .cat. 3-slot 0 choices\n", " 1.38s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 a:chunk\n", " 2 R1 n:slot\n", " 3 R2 < b:chunk\n", " 4 R3 m:slot\n", " 5 \n", " 6 n .cat. m\n", " 7 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 100000 up to 1000000 ...\n", " | 0.51s 100000\n", " | 1.00s 200000\n", " | 1.49s 300000\n", " | 1.97s 400000\n", " | 2.46s 500000\n", " | 2.94s 600000\n", " | 3.43s 700000\n", " | 3.92s 800000\n", " | 4.44s 900000\n", " | 4.93s 1000000\n", " 4.93s Done: 1000000 results\n" ] } ], "source": [ "S.count(progress=100000, limit=1000000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strategy: by yarn size" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Checking search template ...\n", " 0.00s Setting up search space for 4 objects ...\n", " 0.13s Constraining search space with 4 relations ...\n", " 0.50s \t0 edges thinned\n", " 0.50s Setting up retrieval plan with strategy by_yarn_size ...\n", " 0.55s Ready to deliver results from 1000000 nodes\n", "Iterate over S.fetch() to get the results\n", "See S.showPlan() to interpret the results\n" ] } ], "source": [ "S.study(query, strategy='by_yarn_size')" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search 
with 4 objects and 4 relations\n", "Results are instantiations of the following objects:\n", "node 0-chunk 100000 choices\n", "node 1-slot 400000 choices\n", "node 2-chunk 100000 choices\n", "node 3-slot 400000 choices\n", "Performance parameters:\n", "\tyarnRatio = 1.25\n", "\ttryLimitFrom = 40\n", "\ttryLimitTo = 40\n", "Instantiations are computed along the following relations:\n", "node 0-chunk 100000 choices\n", "edge 0-chunk [[ 1-slot 4.0 choices\n", "edge 0-chunk < 2-chunk 50000.0 choices\n", "edge 2-chunk [[ 3-slot 4.0 choices\n", "edge 1-slot .cat. 3-slot 0 choices\n", " 5.14s The results are connected to the original search template as follows:\n", " 0 \n", " 1 R0 a:chunk\n", " 2 R1 n:slot\n", " 3 R2 < b:chunk\n", " 4 R3 m:slot\n", " 5 \n", " 6 n .cat. m\n", " 7 \n" ] } ], "source": [ "S.showPlan(details=True)" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.00s Counting results per 100000 up to 1000000 ...\n", " | 0.51s 100000\n", " | 0.99s 200000\n", " | 1.48s 300000\n", " | 1.97s 400000\n", " | 2.46s 500000\n", " | 2.94s 600000\n", " | 3.43s 700000\n", " | 3.91s 800000\n", " | 4.40s 900000\n", " | 4.88s 1000000\n", " 4.88s Done: 1000000 results\n" ] } ], "source": [ "S.count(progress=100000, limit=1000000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Leftovers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Test use of shallow" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "chunk\n", " slot num=1\n", " < slot\n", "'''" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(400001, 2, 3), (400001, 2, 4)]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(S.search(query))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "[400001]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(S.search(query, shallow=True))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(400001, 2)]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(S.search(query, shallow=2))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "slot\n", "<: slot\n", "< slot\n", "<: slot\n", "< slot\n", "<: slot\n", "'''" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\tconnecting to online GitHub repo annotation/app-bhsa ... connected\n", "Using TF-app in /Users/dirk/text-fabric-data/annotation/app-bhsa/code:\n", "\t#d3cf8f0c2ab5d690a0fda14ea31c33da5c5c8483 (latest commit)\n", "\tconnecting to online GitHub repo etcbc/bhsa ... connected\n", "Using data in /Users/dirk/text-fabric-data/etcbc/bhsa/tf/c:\n", "\trv1.6 (latest release)\n", "\tconnecting to online GitHub repo etcbc/phono ... connected\n", "Using data in /Users/dirk/text-fabric-data/etcbc/phono/tf/c:\n", "\tr1.2 (latest release)\n", "\tconnecting to online GitHub repo etcbc/parallels ... connected\n", "Using data in /Users/dirk/text-fabric-data/etcbc/parallels/tf/c:\n", "\tr1.2 (latest release)\n", "\tconnecting to online GitHub repo cmerwich/bh-reference-system ... connected\n", "\tdownloading https://github.com/cmerwich/bh-reference-system/releases/download/v1.0/tf-c.zip ... \n", "\tunzipping ... 
\n", "\tsaving data\n", "Using data in /Users/dirk/text-fabric-data/cmerwich/bh-reference-system/tf/c:\n", "\trv1.0=#b9852739f705ab1e1bf53a60bbd68f16b4e20d90 (latest release)\n", " | 0.00s No structure info in otext, the structure part of the T-API cannot be used\n" ] }, { "data": { "text/html": [ "Documentation: BHSA Character table Feature docs bhsa API Text-Fabric API 7.8.3 Search Reference
Loaded features:\n", "

BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis: book book@ll chapter code det freq_lex function g_cons g_cons_utf8 g_lex g_lex_utf8 g_word g_word_utf8 gloss gn label language lex lex_utf8 ls nametype nu number otype pdp prs_gn prs_nu prs_ps ps qere qere_trailer qere_trailer_utf8 qere_utf8 rank_lex rela sp st trailer trailer_utf8 txt typ verse voc_lex voc_lex_utf8 vs vt mother oslots

cmerwich/bh-reference-system/tf: pgn_prde pgn_prps pgn_prs pgn_verb pgn_verb_prs

Parallel Passages: crossref

Phonetic Transcriptions: phono phono_trailer

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from tf.app import use\n", "A = use('bhsa', mod='cmerwich/bh-reference-system/tf')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 2 }