{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "\n", "# Phrases in versions of the BHSA\n", "\n", "In [version mappings](versionMappings.ipynb)\n", "we have constructed edge features that map the nodes from one version of the data to the next.\n", "In this notebook we use those edges to study what happened to the feature `function`\n", "of `phrase` nodes.\n", "\n", "# Overview\n", "\n", "We explore:\n", "* how the values of the `function` feature have changed;\n", "* to what degree phrase boundaries have changed.\n", "\n", "# Discussion\n", "The feature `function` was called `phrase_function` in version `3`.\n", "\n", "## Phrase boundaries\n", "In order to see whether phrase boundaries have changed, we follow the `omap@` edges from\n", "phrases in one version to their counterparts in the next version.\n", "\n", "We make use of the dissimilarity values that are attached to such edges.\n", "If there is no value, or the value is `0`, we have a match without a boundary change.\n", "Any other dissimilarity value implies that the boundaries have changed.\n", "\n", "# Results\n", "For the sake of presentation,\n", "we start with the result cells, but **they should be run after the other cells**.\n", "The computation starts [here](#Start)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Changes in `function` values" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 13s Phrase function change from version 3 to 2017 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
3\\2017AdjuCmplConjEPPrExsSExstFrntIntSIntjLocaModSModiNCoSNCopNegaObjcPrAdPrcSPreCPreOPreSPredPtcOQuesRelaSubjSuppTimeVoct
Adju54381421518171692171841114921773201
Cmpl186220059112511168271105131160383
Conj1015133064101715110531023197329101
ExsS7
Exst2290137
Frnt8775519121455
IntS1611
Intj11171199659355121
IrpC1183
IrpO13120542
IrpP4114243
IrpS112332184
Loca18802199249177118
ModS2522
Modi61801128242122711401411836121254210
NegS50
Nega217131244154234573139212
Objc3281218715153513240318248
PreC80805132621971163251356783414167614
PreO9111625348234136619
PreS15619212
Pred23311420642756155819
PtSp11
PtcO1587141
Ques11932122197971925
Rela13591321111148294
Subj2516265519101112382175321115121238936
Supp651101573140
Time1559911472281213627451
Unkn3992797911890185493607340162081029221661761726055555841413424363135610507321219775
Voct126324821
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 15s Phrase function change from version 2016 to 2017 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
2016\\2017AdjuCmplConjEPPrExsSExstFrntIntSIntjLocaModSModiNCoSNCopNegaObjcPrAdPrcSPreCPreOPreSPredPtcOQuesRelaSubjSuppTimeVoct
Adju9508122525426
Cmpl163000241311
Conj146135331
EPPr21
ExsS14
Exst143
Frnt11119119
IntS251
Intj1621
Loca22621
ModS35
Modi373832216
NCoS101
NCop595
Nega6047
Objc2652226271197
PrAd242
PrcS8
PreC64812119333112
PreO54021
PreS886
Pred57069
PtcO162
Ques11203
Rela116327
Subj11351931313190711
Supp178
Time613850
Voct121605
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 16s Phrase function change from version 4b to 2016 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
4b\\2016AdjuCmplConjEPPrExsSExstFrntIntSIntjLocaModSModiNCoSNCopNegaObjcPrAdPrcSPreCPreOPreSPredPtcOQuesRelaSubjSuppTimeVoct
Adju9477311511111161821
Cmpl3929921168124124117
Conj146124121
EPPr9
ExsS14
Exst143
Frnt108725
IntS251
Intj1621
Loca35326131424
ModS35
Modi13980131
NCoS101
NCop5941
Nega1116040
Objc7241043222596214160
PrAd22351
PrcS8
PreC15511111103193271125
PreO1540430
PreS185510
Pred11570684
PtcO162
Ques1204
Rela363251
Subj3111117111411141318111
Supp29176
Time1193835
Voct221607
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 17s Phrase function change from version 4 to 4b #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
4\\4bAdjuCmplConjEPPrExsSExstFrntIntSIntjLocaModSModiNCoSNCopNegaObjcPrAdPrcSPreCPreOPreSPredPtcOQuesRelaSubjSuppTimeVoct
Adju80619413710206115582655181186317
Cmpl77276069210865710531158623
Conj4439459361710611011913744271
EPPr4
ExsS14
Exst1431
Frnt15100712553
IntS250
Intj162413
Loca718243343542
ModS351
Modi391961435261124132341915
NCoS101
NCop132258723
Nega61414603912241
Objc1535201513206722226142603
PrAd117912
PrcS8
PreC36417115435121755019147728
PreO14155434761112
PreS17771
Pred11114457042913
PtcO11614
Ques1594241811561831
Rela3162317136239121
Subj1512351839150579141828763416
Supp6049223180
Time2181140251163489
Unkn13912357651132351672120431318401231102957433779
Voct221171504
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 18s Phrase function change from version 3 to 4 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
3\\4AdjuCmplConjEPPrExsSExstFrntIntSIntjLocaModSModiNCoSNCopNegaObjcPrAdPrcSPreCPreOPreSPredPtcOQuesRelaSubjSuppTimeUnknVoct
Adju6067741561031431965143152
Cmpl9022418125146179262131171362
Conj8727335406815436252318392271
ExsS7
Exst290139
Frnt821785812225
IntS1611
Intj12171199165935511
IrpC1183
IrpO1113012
IrpP24113973
IrpS1322188
Loca1217221196113114
ModS2621
Modi35627282418256714013515224146
NegS50
Nega10610124433424253313611
Objc224414825155881015219327
PreC48501031821471149111379781721108713
PreO91112546314112676
PreS763022
Pred14111122024275618994
PtSp11
PtcO14190111
Ques221212188521
Rela1356921814903
Subj19104401781189311432322121407528
Supp1544134264
Time15521111922623828071
Unkn2726558912119455021472402385891022164177252432253756108367369140176092980311416692
Voct11822118835
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "for (v, w) in reversed(phraseMapping):  # noqa F821\n", "    caption(1, \"Phrase function change from version {} to {}\".format(v, w))  # noqa F821\n", "    featureDiff(v, w, \"FUNCTION\")  # noqa F821" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Boundary statistics" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 30s Phrase boundary change from version 3 to 2017 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
dissimilaritynumber of phrases
0251551
129
226
322
413
510
65
76
81
913
103
114
124
131
143
151
161
17
18
192
201
21
22
231
24
25
261
27
281
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 30s Phrase boundary change from version 2016 to 2017 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
dissimilaritynumber of phrases
0253073
129
226
322
413
510
65
76
81
913
103
114
124
131
143
151
161
17
18
192
201
21
22
231
24
25
261
27
281
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 30s Phrase boundary change from version 4b to 2016 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
dissimilaritynumber of phrases
0252881
1128
282
365
426
516
611
714
811
95
103
112
121
131
14
15
161
17
181
19
201
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 30s Phrase boundary change from version 4 to 4b #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
dissimilaritynumber of phrases
0250751
1750
2745
3618
4372
5305
6188
7141
8123
977
1067
1164
1243
1341
1427
1522
1615
1717
1820
1915
2011
219
223
234
245
252
26
275
282
295
302
313
321
332
34
351
361
371
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "##############################################################################################\n", "# #\n", "# 1m 30s Phrase boundary change from version 3 to 4 #\n", "# #\n", "##############################################################################################\n", "\n" ] }, { "data": { "text/html": [ "
dissimilaritynumber of phrases
0250346
12837
21164
3788
4457
5287
6166
7127
886
961
1066
1133
1239
1319
1422
1516
1616
1710
1810
197
204
212
22
233
241
25
26
27
28
29
301
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "for (v, w) in reversed(phraseMapping):  # noqa F821\n", "    caption(1, \"Phrase boundary change from version {} to {}\".format(v, w))  # noqa F821\n", "    showStats(v, w)  # noqa F821" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Start\n", "Start the program here." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import os  # noqa 402\n", "import collections  # noqa 402\n", "from functools import reduce  # noqa 402\n", "from utils import caption  # noqa 402\n", "from tf.fabric import Fabric  # noqa 402\n", "\n", "from IPython.display import HTML, display  # noqa 402\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We specify our versions and the subtle differences between them as far as they are relevant." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "REPO = os.path.expanduser(\"~/github/etcbc/bhsa\")\n", "baseDir = \"{}/tf\".format(REPO)\n", "tempDir = \"{}/_temp\".format(REPO)\n", "\n", "versions = \"\"\"\n", "    3\n", "    4\n", "    4b\n", "    2016\n", "    2017\n", "\"\"\".strip().split()\n", "\n", "versionInfoSpec = {\n", "    \"\": dict(\n", "        OCC=\"g_word\",\n", "        LEX=\"lex\",\n", "        FUNCTION=\"function\",\n", "    ),\n", "    \"3\": dict(\n", "        OCC=\"text_plain\",\n", "        LEX=\"lexeme\",\n", "        FUNCTION=\"phrase_function\",\n", "    ),\n", "}\n", "\n", "versionInfo = {}\n", "\n", "defaults = versionInfoSpec[\"\"].items()\n", "\n", "for (i, v) in enumerate(versions):\n", "    versionInfo.setdefault(v, {})[\"OMAP\"] = (\n", "        \"\" if i == 0 else \"omap@{}-{}\".format(versions[i - 1], v)\n", "    )\n", "    versionInfo[v].update(versionInfoSpec.get(\"\", {}))\n", "    versionInfo[v].update(versionInfoSpec.get(v, {}))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load all versions in one go, with the version mapping feature if present." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 0.00s Version -> 3 <- loading ... .\n", "..............................................................................................\n", "This is Text-Fabric 3.0.9\n", "Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "118 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.12s B lexeme from /Users/dirk/github/etcbc/bhsa/tf/3\n", " | 0.22s B text_plain from /Users/dirk/github/etcbc/bhsa/tf/3\n", " | 0.08s B phrase_function from /Users/dirk/github/etcbc/bhsa/tf/3\n", " | 0.00s Feature overview: 115 for nodes; 2 for edges; 1 configs; 7 computed\n", " 4.99s All features loaded/computed - for details use loadLog()\n", "..............................................................................................\n", ". 5.00s Version -> 4 <- loading ... 
.\n", "..............................................................................................\n", "This is Text-Fabric 3.0.9\n", "Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "104 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.14s B g_word from /Users/dirk/github/etcbc/bhsa/tf/4\n", " | 0.12s B lex from /Users/dirk/github/etcbc/bhsa/tf/4\n", " | 0.07s B function from /Users/dirk/github/etcbc/bhsa/tf/4\n", " | 6.25s T omap@3-4 from /Users/dirk/github/etcbc/bhsa/tf/4\n", " | 0.00s Feature overview: 98 for nodes; 5 for edges; 1 configs; 7 computed\n", " 12s All features loaded/computed - for details use loadLog()\n", "..............................................................................................\n", ". 17s Version -> 4b <- loading ... .\n", "..............................................................................................\n", "This is Text-Fabric 3.0.9\n", "Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "103 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.16s B g_word from /Users/dirk/github/etcbc/bhsa/tf/4b\n", " | 0.14s B lex from /Users/dirk/github/etcbc/bhsa/tf/4b\n", " | 0.07s B function from /Users/dirk/github/etcbc/bhsa/tf/4b\n", " | 6.33s T omap@4-4b from /Users/dirk/github/etcbc/bhsa/tf/4b\n", " | 0.00s Feature overview: 97 for nodes; 5 for edges; 1 configs; 7 computed\n", " 12s All features loaded/computed - for details use loadLog()\n", "..............................................................................................\n", ". 29s Version -> 2016 <- loading ... 
.\n", "..............................................................................................\n", "This is Text-Fabric 3.0.9\n", "Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "108 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.15s B g_word from /Users/dirk/github/etcbc/bhsa/tf/2016\n", " | 0.12s B lex from /Users/dirk/github/etcbc/bhsa/tf/2016\n", " | 0.08s B function from /Users/dirk/github/etcbc/bhsa/tf/2016\n", " | 6.56s T omap@4b-2016 from /Users/dirk/github/etcbc/bhsa/tf/2016\n", " | 0.00s Feature overview: 102 for nodes; 5 for edges; 1 configs; 7 computed\n", " 12s All features loaded/computed - for details use loadLog()\n", "..............................................................................................\n", ". 41s Version -> 2017 <- loading ... .\n", "..............................................................................................\n", "This is Text-Fabric 3.0.9\n", "Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api\n", "Tutorial : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb\n", "Example data : https://github.com/Dans-labs/text-fabric-data\n", "\n", "114 features found and 0 ignored\n", " 0.00s loading features ...\n", " | 0.48s B g_word from /Users/dirk/github/etcbc/bhsa/tf/2017\n", " | 0.16s B lex from /Users/dirk/github/etcbc/bhsa/tf/2017\n", " | 0.10s B function from /Users/dirk/github/etcbc/bhsa/tf/2017\n", " | 6.50s T omap@2016-2017 from /Users/dirk/github/etcbc/bhsa/tf/2017\n", " | 0.00s Feature overview: 108 for nodes; 5 for edges; 1 configs; 7 computed\n", " 13s All features loaded/computed - for details use loadLog()\n" ] } ], "source": [ "TF = {}\n", "api = {}\n", "for (i, v) in enumerate(versions):\n", "    for (param, value) in versionInfo[v].items():\n", "        globals()[param] = value\n", "    caption(4, \"Version -> {} <- loading ...\".format(v))\n", "    TF[v] = Fabric(locations=\"{}/{}\".format(baseDir, v), modules=[\"\"])\n", "    api[v] = TF[v].load(\" \".join((OCC, LEX, FUNCTION, OMAP)))  # noqa F821" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "# Utility function: tables in your cells" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def tableText(table):\n", "    return display(\n", "        HTML(\n", "            \"<table>{}</table>\".format(\n", "                \"\".join(\n", "                    \"<tr><td>{}</td></tr>\".format(\"</td><td>\".join(str(_) for _ in row))\n", "                    for row in table\n", "                )\n", "            )\n", "        )\n", "    )" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Get counterparts\n", "\n", "Here is a function that gets the counterparts of phrases between versions, together with their dissimilarities.\n", "\n", "`phraseMapping` is keyed by a (source version, target version) pair,\n", "then by node in the source version, and then\n", "the value is a tuple of (target node, dissimilarity) pairs.\n", "\n", "Source nodes for which the mapping yields no counterpart are left out." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "phraseMapping = collections.OrderedDict()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def getPhrases(v, w):\n", "    V = api[v]\n", "    W = api[w]\n", "    mapVW = \"omap@{}-{}\".format(v, w)\n", "    vKey = (v, w)\n", "\n", "    phraseMapping[vKey] = {}\n", "    phrases = phraseMapping[vKey]\n", "\n", "    for n in V.F.otype.s(\"phrase\"):\n", "        # ms holds (target node, dissimilarity) pairs\n", "        ms = W.Es(mapVW).f(n)\n", "        if ms is not None:\n", "            phrases[n] = ms" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We also want to see the evolution in one big leap, so we construct a mapping from the first version to the last,\n", "just by composing the individual `omap@`s into a stride.\n", "\n", "Picking a phrase, and following it through the versions might lead to multiple counterparts.\n", "When that happens, we choose the one with the lowest dissimilarity, and ignore the rest.\n", "The dissimilarities in the composed mapping are those of the last step only." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true, "lines_to_end_of_cell_marker": 2 }, "outputs": [], "source": [ "def composeMap(curMap, newStep):\n", "    resultMap = {}\n", "    for (n, ms) in curMap.items():\n", "        # a missing dissimilarity counts as 0, as in showStats\n", "        theM = (\n", "            ms[0][0] if len(ms) == 1 else sorted(ms, key=lambda x: (x[1] or 0, x[0]))[0][0]\n", "        )\n", "        resultMap[n] = newStep[theM]\n", "    return resultMap\n", "\n", "\n", "def getFirstLastMapping():\n", "    if len(versions) <= 2:\n", "        return {}\n", "    curMap = phraseMapping[(versions[0], versions[1])]\n", "\n", "    for i in range(2, len(versions)):\n", "        caption(0, \"mapping from {} to {}\".format(versions[0], versions[i]))\n", "        curMap = composeMap(curMap, phraseMapping[(versions[i - 1], versions[i])])\n", "    phraseMapping[(versions[0], versions[-1])] = curMap" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 2 }, "source": [ "# Table of boundary changes" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true, "lines_to_next_cell": 2 }, "outputs": [], "source": [ "def showStats(v, w):\n", "    vKey = (v, w)\n", "    phrases = phraseMapping[vKey]\n", "    dists = {}\n", "    for (n, ms) in phrases.items():\n", "        for (m, dis) in ms:\n", "            dists.setdefault(dis or 0, set()).add(m)\n", "    stats = collections.Counter()\n", "    for (dis, ms) in dists.items():\n", "        stats[dis] = len(ms)\n", "    table = []\n", "    table.append([\"dissimilarity\", \"number of phrases\"])\n", "    for dis in range(0, max(stats) + 1):\n", "        table.append([dis, stats.get(dis, \"\")])\n", "    tableText(table)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "lines_to_next_cell": 2 }, "source": [ "# Table of old and new values\n", "We visualize the changes in the values of the `function` feature\n", "by generating a matrix with old values in the row headers\n", "and new values in the column headers. Each matrix cell holds the number of times\n", "that the old value has changed into the new value." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def featureDiff(v, w, feat):\n", "    V = api[v]\n", "    W = api[w]\n", "    vKey = (v, w)\n", "    vFeat = versionInfo[v][feat]\n", "    wFeat = versionInfo[w][feat]\n", "    phrases = phraseMapping[vKey]\n", "\n", "    combis = {}\n", "    for (n, ms) in phrases.items():\n", "        vVal = V.Fs(vFeat).v(n)\n", "        for (m, dis) in ms:\n", "            wVal = W.Fs(wFeat).v(m)\n", "            combis.setdefault(vVal, collections.Counter())[wVal] += 1\n", "    vValues = sorted(combis.keys())\n", "    wValues = sorted(reduce(set.union, [set(combis[v]) for v in vValues], set()))\n", "    table = []\n", "    table.append([\"{}\\\\{}\".format(v, w)] + wValues)\n", "    for v in vValues:\n", "        table.append([v] + [str(combis[v].get(w, \"\")) for w in wValues])\n", "    tableText(table)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Collect\n", "We collect all data in a big data structure." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "..............................................................................................\n", ". 
55s Collecting data .\n", "..............................................................................................\n", "| 55s \t3 => 4 \n", "| 57s \t4 => 4b \n", "| 58s \t4b => 2016\n", "| 1m 00s \t2016 => 2017\n", "| 1m 02s \t3 => 2017\n", "| 1m 02s mapping from 3 to 4b\n", "| 1m 02s mapping from 3 to 2016\n", "| 1m 02s mapping from 3 to 2017\n", "| 1m 02s Done\n" ] } ], "source": [ "caption(4, \"Collecting data\")\n", "for (i, w) in enumerate(versions):\n", "    if i == 0:\n", "        continue\n", "    v = versions[i - 1]\n", "    caption(0, \"\\t{:<4} => {:<4}\".format(v, w))\n", "    getPhrases(v, w)\n", "\n", "caption(0, \"\\t{:<4} => {:<4}\".format(versions[0], versions[-1]))\n", "getFirstLastMapping()\n", "caption(0, \"Done\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }