{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "To get started: consult [start](start.ipynb)\n", "\n", "---\n", "\n", "# Named Entities\n", "\n", "A research group at VU University Amsterdam (Piek Vossen, Sophie Arnoult)\n", "has applied a Named Entity Recognition (NER) algorithm to this corpus and\n", "delivered the results as Text-Fabric features in\n", "[cltl/voc-missives](https://github.com/cltl/voc-missives).\n", "\n", "We can use these shared features. They are in `export/tf`, and they have been produced\n", "against version `1.0` of the corpus data.\n", "\n", "See [entityProto](entityProto.ipynb) for an exploration of these entities.\n", "\n", "Based on that exploration, we have created `ent` nodes for entity occurrences and `entity` nodes that group the `ent`\n", "nodes sharing the same entity id and entity kind." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2018-05-24T10:06:39.818664Z", "start_time": "2018-05-24T10:06:39.796588Z" } }, "outputs": [], "source": [ "from tf.app import use" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/CLARIAH/wp6-missieven/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Status: latest release online v1.1e versus v1.0 locally" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "downloading app, main data and requested additions ..."
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Status: latest release online v1.1e versus v1.1 locally" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "downloading app, main data and requested additions ..." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/CLARIAH/wp6-missieven/ner" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " | 1.22s T otype from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 12s T oslots from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.53s T transn from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 11s T punc from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.99s T n from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 6.26s T punco from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 4.51s T puncr from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.41s T puncn from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T title from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 5.42s T transr from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 7.79s T transo from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 14s T trans from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | | 0.21s C __levels__ from otype, oslots, otext\n", " | | 33s C __order__ from otype, oslots, __levels__\n", " | | 1.24s C __rank__ from otype, __order__\n", " | | 28s C __levUp__ from otype, oslots, __rank__\n", " | | 5.25s 
C __levDown__ from otype, __levUp__, __rank__\n", " | | 2.71s C __characters__ from otext\n", " | | 11s C __boundary__ from otype, oslots, __rank__\n", " | | 1.27s C __sections__ from otype, oslots, otext, __levUp__, __levels__, n, n, n\n", " | | 0.51s C __structure__ from otype, oslots, otext, __rank__, __levUp__, n, title, n\n", " | 0.00s T author from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T authorFull from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.05s T col from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T day from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.05s T eid from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.03s T eoccs from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T isden from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.01s T isemph from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.04s T isfolio from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.35s T isnote from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T isnum from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 5.37s T isorig from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.01s T isq from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.03s T isref from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 3.77s T isremark from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T isspecial from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T issub from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.02s T issuper from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T isund from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.04s T kind from 
~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.02s T mark from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T month from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.05s T note from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T page from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T place from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T rawdate from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.07s T row from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T seq from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T status from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T vol from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.13s T weblink from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.02s T x from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n", " | 0.00s T year from ~/text-fabric-data/github/CLARIAH/wp6-missieven/tf/1.0e\n" ] }, { "data": { "text/html": [ "\n", " TF: TF API 12.3.4, CLARIAH/wp6-missieven/app v3, Search Reference
\n", " Data: CLARIAH - wp6-missieven 1.0e, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name# of nodes# slots / node% coverage
volume14426954.79100
letter6079847.39100
page11215532.98100
table491137.911
para34773100.7959
remark2411097.4939
head60731.120
note1247616.884
line52691811.34100
row83508.101
entity46596.260
folio78992.630
cell323022.091
ent177561.640
subhead18641.420
word59773671.00100
\n", " Sets: no custom sets
\n", " Features:
\n", "
General Missives Dutch East India Company 1600-1800\n", "
\n", "\n", "
\n", "
\n", "author\n", "
\n", "
str
\n", "\n", " authors of the letter, surnames only\n", "\n", "
\n", "\n", "
\n", "
\n", "authorFull\n", "
\n", "
str
\n", "\n", " authors of the letter, full names\n", "\n", "
\n", "\n", "
\n", "
\n", "col\n", "
\n", "
int
\n", "\n", " column number of a column in a row in a table\n", "\n", "
\n", "\n", "
\n", "
\n", "day\n", "
\n", "
int
\n", "\n", " day part of the date of the letter\n", "\n", "
\n", "\n", "
\n", "
\n", "eid\n", "
\n", "
str
\n", "\n", " entity identifier based on the string value of the occurrence\n", "\n", "
\n", "\n", "
\n", "
\n", "isden\n", "
\n", "
int
\n", "\n", " whether a word is the denominator in fraction, e.g. 4 in 1/4\n", "\n", "
\n", "\n", "
\n", "
\n", "isemph\n", "
\n", "
str
\n", "\n", " whether a word is emphasized by typography\n", "\n", "
\n", "\n", "
\n", "
\n", "isfolio\n", "
\n", "
int
\n", "\n", " a folio reference\n", "\n", "
\n", "\n", "
\n", "
\n", "isnote\n", "
\n", "
int
\n", "\n", " whether a word belongs to footnote text\n", "\n", "
\n", "\n", "
\n", "
\n", "isnum\n", "
\n", "
int
\n", "\n", " whether a word is the numerator in fraction, e.g. 1 in 1/4\n", "\n", "
\n", "\n", "
\n", "
\n", "isorig\n", "
\n", "
int
\n", "\n", " whether a word belongs to original text\n", "\n", "
\n", "\n", "
\n", "
\n", "isq\n", "
\n", "
int
\n", "\n", " whether a word is a numerical fraction, e.g. 1/4\n", "\n", "
\n", "\n", "
\n", "
\n", "isref\n", "
\n", "
int
\n", "\n", " whether a word belongs to the text of reference\n", "\n", "
\n", "\n", "
\n", "
\n", "isremark\n", "
\n", "
int
\n", "\n", " whether a word belongs to the text of editorial remarks\n", "\n", "
\n", "\n", "
\n", "
\n", "isspecial\n", "
\n", "
int
\n", "\n", " whether a word has special typography possibly with OCR mistakes as well\n", "\n", "
\n", "\n", "
\n", "
\n", "issub\n", "
\n", "
int
\n", "\n", " whether a word has subscript typography possibly indicating the denominator of a fraction\n", "\n", "
\n", "\n", "
\n", "
\n", "issuper\n", "
\n", "
int
\n", "\n", " whether a word has superscript typography possibly indicating the numerator of a fraction\n", "\n", "
\n", "\n", "
\n", "
\n", "isund\n", "
\n", "
str
\n", "\n", " whether a word is underlined by typography\n", "\n", "
\n", "\n", "
\n", "
\n", "kind\n", "
\n", "
str
\n", "\n", " entity kind\n", "\n", "
\n", "\n", "
\n", "
\n", "mark\n", "
\n", "
int
\n", "\n", " footnote mark (not necessarily the same as shown on the printed page)\n", "\n", "
\n", "\n", "
\n", "
\n", "month\n", "
\n", "
int
\n", "\n", " month part of the date of the letter\n", "\n", "
\n", "\n", "
\n", "
\n", "n\n", "
\n", "
int
\n", "\n", " number of a volume, letter, page, para, line, table\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "page\n", "
\n", "
str
\n", "\n", " number of the first page of this letter in this volume\n", "\n", "
\n", "\n", "
\n", "
\n", "place\n", "
\n", "
str
\n", "\n", " place from where the letter was sent\n", "\n", "
\n", "\n", "
\n", "
\n", "punc\n", "
\n", "
str
\n", "\n", " punctuation and/or whitespace following a word, up to the next word\n", "\n", "
\n", "\n", "
\n", "
\n", "puncn\n", "
\n", "
str
\n", "\n", " punctuation and/or whitespace following a word, up to the next word, footnote text only\n", "\n", "
\n", "\n", "
\n", "
\n", "punco\n", "
\n", "
str
\n", "\n", " punctuation and/or whitespace following a word, up to the next word, original text only\n", "\n", "
\n", "\n", "
\n", "
\n", "puncr\n", "
\n", "
str
\n", "\n", " punctuation and/or whitespace following a word, up to the next word, remark text only\n", "\n", "
\n", "\n", "
\n", "
\n", "rawdate\n", "
\n", "
str
\n", "\n", " the date the letter was sent\n", "\n", "
\n", "\n", "
\n", "
\n", "row\n", "
\n", "
int
\n", "\n", " row number of a row of column in a table\n", "\n", "
\n", "\n", "
\n", "
\n", "seq\n", "
\n", "
str
\n", "\n", " sequence number of this letter among the letters of the same author in this volume\n", "\n", "
\n", "\n", "
\n", "
\n", "status\n", "
\n", "
str
\n", "\n", " status of the letter, e.g. secret, copy\n", "\n", "
\n", "\n", "
\n", "
\n", "title\n", "
\n", "
str
\n", "\n", " title of the letter\n", "\n", "
\n", "\n", "
\n", "
\n", "trans\n", "
\n", "
str
\n", "\n", " transcription of a word\n", "\n", "
\n", "\n", "
\n", "
\n", "transn\n", "
\n", "
str
\n", "\n", " transcription of a word, only for footnote text\n", "\n", "
\n", "\n", "
\n", "
\n", "transo\n", "
\n", "
str
\n", "\n", " transcription of a word, only for original text\n", "\n", "
\n", "\n", "
\n", "
\n", "transr\n", "
\n", "
str
\n", "\n", " transcription of a word, only for remark text\n", "\n", "
\n", "\n", "
\n", "
\n", "vol\n", "
\n", "
int
\n", "\n", " volume number\n", "\n", "
\n", "\n", "
\n", "
\n", "weblink\n", "
\n", "
str
\n", "\n", " the page-specific part of web links for page nodes\n", "\n", "
\n", "\n", "
\n", "
\n", "x\n", "
\n", "
int
\n", "\n", " column offset of a column in a row in a table\n", "\n", "
\n", "\n", "
\n", "
\n", "year\n", "
\n", "
int
\n", "\n", " year part of the date of the letter\n", "\n", "
\n", "\n", "
\n", "
\n", "eoccs\n", "
\n", "
none
\n", "\n", " entity occurrences\n", "\n", "
\n", "\n", "
\n", "
\n", "note\n", "
\n", "
none
\n", "\n", " edge between a word and the footnotes associated with it\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: CLARIAH/wp6-missieven
  3. appPath: /Users/me/text-fabric-data/github/CLARIAH/wp6-missieven/app
  4. commit: g61b0cb1b6bb6e9c4549a53aa5db557ffe37c1946
  5. css:.remark {
    font-size: large;
    font-style: italic;
    }
    .folio {
    font-size: small;
    color: #668866;
    }
    .fmark:after {
    font-size: small;
    font-weight: bold;
    vertical-align: super;
    color: #ddaa22;
    }
    .note {
    vertical-align: super;
    font-size: small;
    color: #774400;
    }
    .ref {
    font-size: small;
    font-weight: bold;
    color: #666688;
    }
    .emph {
    font-style: italic;
    }
    .und {
    text-decoration: underline;
    }
    .q {
    color: #777777;
    font-weight: bold;
    }
    .num {
    font-size: small;
    vertical-align: super;
    }
    .den {
    font-size: small;
    vertical-align: sub;
    }
    .sub {
    vertical-align: sub;
    }
    .super {
    vertical-align: super;
    }
    .special {
    font-family: monospace;
    font-weight: bold;
    color: #886666;
    }
  6. dataDisplay:
  7. \n", " textFormats:\n", "
    • layout-full: {method: layoutFull}
    • layout-nonorig: {method: layoutNonOrig}
    • layout-nonotes: {method: layoutNoNotes}
    • layout-noremarks: {method: layoutNoRemarks}
    • layout-notes: {method: layoutNotes}
    • layout-orig: {method: layoutOrig}
    • layout-remarks: {method: layoutRemarks}
    \n", "
  8. docs:
    • docPage: about
    • featureBase:https://github.com/{org}/{repo}/blob/master/docs/transcription{docExt}
    • featurePage: ''
  9. interfaceDefaults: {}
  10. isCompatible: True
  11. local: local
  12. localDir:/Users/me/text-fabric-data/github/CLARIAH/wp6-missieven/_temp
  13. provenanceSpec:
    • corpus: General Missives Dutch East India Company 1600-1800
    • doi: 10.5281/zenodo.4011801
    • extraData: ner
    • org: CLARIAH
    • relative: /tf
    • repo: wp6-missieven
    • version: 1.0e
    • webBase:http://resources.huygens.knaw.nl/retroboeken/generalemissiven
    • webFeature: weblink
    • webHint: Show this document on Huygens
    • webOffset:
    • \n", " 2:\n", "
      • 1: 23
      • 10: 11
      • 11: 11
      • 12: 11
      • 13: 11
      • 2: 13
      • 3: 13
      • 4: 15
      • 5: 15
      • 6: 15
      • 7: 13
      • 8: 11
      • 9: 13
      \n", "
    • webUrl: {webBase}/#page=<2>&source=<1>
  14. release: v1.1
  15. typeDisplay: {}
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A = use(\"CLARIAH/wp6-missieven\", checkout=\"latest\", hoist=globals())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following snippet shows how the `entity` and `ent` nodes hang together." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "entity 6656750 is PER pieter.both having 20 occs\n", "ent 6638994 is PER pieter.both 1 3:1\n", "ent 6639002 is PER pieter.both 1 3:1\n", "ent 6639005 is PER pieter.both 1 3:1\n", "ent 6639023 is PER pieter.both 1 7:1\n", "ent 6639045 is PER pieter.both 1 8:1\n", "ent 6639063 is PER pieter.both 1 16:1\n", "ent 6639067 is PER pieter.both 1 16:1\n", "ent 6639073 is PER pieter.both 1 17:1\n", "ent 6639084 is PER pieter.both 1 18:1\n", "ent 6639086 is PER pieter.both 1 19:1\n", "ent 6639105 is PER pieter.both 1 20:1\n", "ent 6639109 is PER pieter.both 1 20:1\n", "ent 6639115 is PER pieter.both 1 20:1\n", "ent 6639116 is PER pieter.both 1 21:1\n", "ent 6639136 is PER pieter.both 1 27:1\n", "ent 6639138 is PER pieter.both 1 27:1\n", "ent 6639141 is PER pieter.both 1 29:1\n", "ent 6639169 is PER pieter.both 1 33:1\n", "ent 6639183 is PER pieter.both 1 37:1\n", "ent 6639217 is PER pieter.both 1 39:1\n" ] } ], "source": [ "firstEntity = F.otype.s(\"entity\")[0]\n", "entityOccurrences = E.eoccs.f(firstEntity)\n", "\n", "print(f\"entity {firstEntity} is {F.kind.v(firstEntity)} {F.eid.v(firstEntity)} having {len(entityOccurrences)} occs\")\n", "\n", "for eo in entityOccurrences:\n", " print(f\"ent {eo} is {F.kind.v(eo)} {F.eid.v(eo)} {A.sectionStrFromNode(eo)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we show the\n", "[NER API](https://annotation.github.io/text-fabric/tf/browser/ner/annotate.html)\n", "as built in into Text-Fabric." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "NE = A.makeNer()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "74 lines\n" ] } ], "source": [ "results = NE.filterContent(eVals=(\"japan\", \"LOC\"))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
3 462:11 760 matrozen; mogelijk is de 2s.oraveland SHP 1’s Oraveland, 2met carg. van f. 244844. 13. 15 uit 1japan LOC 74Japan 1
3 462:13 f. 948809. 10. 5 uit 1japan LOC 74Japan, 1bestemd voor 1india LOC 32India 1enz.; 1perak LOC 4Perak 1leverde 738 bahar tin; het op
3 653:6 naar 1japan LOC 74Japan 1gevaren met lading van f 51070, vnl. huiden »
3 653:8 De 2 joncken, die de Coning verleden saison nae 1japan LOC 74Japan 1hadde gehad, waeren
3 653:17 gegevens 1japan LOC 74Japan, 1vgl. Beschrijvinge II, I, p. 435-437, deels letterlijk gelijk; uit het
3 653:21 luyden beyde, 1china LOC 79China 1namentlijck en 1japan LOC 74Japan, 1onderdanig sijn of immers den 1japander LOCderiv 1Japander 1
3 813:5 Te vergelijken mei Daghregisters 1672 , p. 28-33bericht uit 1japan LOC 74Japan 1over
3 813:23 Daar stonden 2 joncken uyt 1japan LOC 74Japan 1naa 1tonquin LOC 2Tonquin 1te gaan, die ’t koopergelt
3 813:24 ontrent 20 ten hondert dierder in 1japan LOC 74Japan 1hadden ingekoght als de 2comp.e ORG 5Comp. e 2en
3 834:5 Vgl. voor de inhoud Daghregisters 1672 p. 348-349gegevens 1japan LOC 74Japan , 1vgl.
3 893:7 gegevens over 1japan LOC 74Japan, 1vrijwel letterlijk gelijk afgedrukt Beschrijvinge II, II, p. 448;
3 918:15 jonk uit 1japan LOC 74Japan 1met o. a. 1800 kisten koper, één uit 1siam LOC 43Siam 1met o. a. sapanhout »
4 100:22 De 1tayoanse LOCderiv 1Tayoanse 11chinesen LOCderiv 15Chinesen - 1in dat rijck NI. 1japan LOC 74Japan. 1
4 100:26 Grotere cargasoenen voor 1japan LOC 74Japan 1overwogen »
4 238:7 vrees, dal, evenals in 1japan LOC 74Japan, 1in 1china LOC 79China 1de ivaren tegen taxatie aan de gouverneurs zullen
4 241:19 1japan LOC 74Japan 1voor 1coromandel LOC 41Coromandel 1beschikbaar zal zijn, terwijl ook 1bengalen LOC 72Bengalen 1zal kunnen toekomen;
4 309:7 uit 1japan LOC 74Japan 1aangebracht , werden opgekocht d 18 rsd. het picol; de taxatie der goederen te
4 423:17 heeft het laten kruisen van zijn vaartuigen weer gestaakt ; vgl. voor 1japan LOC 74Japan 1Beschrijvinge 11, 1, p. 467-468betreurd, dat er niet meer 1bengaalse LOCderiv 4Bengaalse 1zijde beschikbaar was; men
4 477:24 en 1rome LOC 1Rome 1bestemde 1siamese LOCderiv 2Siamese 1gezanten te 1bantam LOC 133Bantam 1zien zeer tegen de reis op; voor 1japan LOC 74Japan 1
4 782:16 waren de gewraackte ofte uytgeschooten zijde nae 1japan LOC 74Japan 1te senden, alwaar
4 782:19 1comp ORG 86Comp. 1e in hunnen handel op 1japan LOC 74Japan 1hinderlijck en nadeeligh souden konnen wesen
Showing only the first 20 lines of all 74 ones.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "NE.showContent(results, start=20)" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "---\n", "\n", "# Contents\n", "\n", "* **[start](start.ipynb)** start computing with this corpus\n", "* **[search](search.ipynb)** turbo charge your hand-coding with search templates\n", "* **[compute](compute.ipynb)** sink down a level and compute it yourself\n", "* **[exportExcel](exportExcel.ipynb)** make tailor-made spreadsheets out of your results\n", "* **[annotate](annotate.ipynb)** export text, annotate with BRAT, import annotations\n", "* **[share](share.ipynb)** draw in other people's data and let them use yours\n", "* **entities** use results of third-party NER (named entity recognition)\n", "* **[porting](porting.ipynb)** port features made against an older version to a newer version\n", "* **[volumes](volumes.ipynb)** work with selected volumes only\n", "\n", "CC-BY Dirk Roorda" ] } ], "metadata": { "jupytext": { "encoding": "# -*- coding: utf-8 -*-" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }