{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font size=6>\n",
    "    <b>Text_Extensions_for_Pandas_Overview.ipynb:</b>\n",
    "    <p>Overview of the basic functionality and usage of Text Extensions for Pandas.</p>\n",
    "</font>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Text Extensions for Pandas\n",
    "\n",
    "[Text Extensions for Pandas](https://github.com/CODAIT/text-extensions-for-pandas) is a library that provides natural language processing support for Pandas DataFrames. It includes [Pandas](https://pandas.pydata.org) extension arrays that help with natural language processing, and integrates with other popular NLP libraries to provide a workflow centered around the easy to use and powerful Pandas [DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).\n",
    "\n",
    "This notebook gives an overview of the basic functionality of Text Extensions for Pandas, and serves as a jumping off point to more in-depth examples of specific functionality. See the following notebooks that use Text Extensions for Pandas for data analysis, NLP, and model training:\n",
    "\n",
    "- [Analyze_Model_Outputs](./Analyze_Model_Outputs.ipynb) - analyze the outputs of a NLP model on a target corpus\n",
    "- [Analyze_Text](./Analyze_Text.ipynb) - usage with the IBM Watson cloud API\n",
    "- [Integrate_NLP_Libraries](./Integrate_NLP_Libraries.ipynb) - integration with SpaCy and IBM Watson\n",
    "- [Model_Training_with_BERT](./Model_Training_with_BERT.ipynb) - model training for NER with BERT tokenization and embeddings\n",
    "- [Understand_Tables](./Understand_Tables.ipynb) - integration with IBM Watson Discovery for understanding of tables in PDFs and documents\n",
    "\n",
    "API reference can be found at https://text-extensions-for-pandas.readthedocs.io/en/latest/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Environment Setup\n",
    "\n",
    "This notebook requires a Python 3.6 or later environment with NumPy, and Pandas. \n",
    "\n",
    "The notebook also requires the  `text_extensions_for_pandas` library. You can satisfy this dependency in two ways:\n",
    "\n",
    "* Run `pip install text_extensions_for_pandas` before running this notebook. This command adds the library to your Python environment.\n",
    "* Run this notebook out of your local copy of the Text Extensions for Pandas project's [source tree](https://github.com/CODAIT/text-extensions-for-pandas). In this case, the notebook will use the version of Text Extensions for Pandas in your local source tree **if the package is not installed in your Python environment**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import regex\n",
    "import sys\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# And of course we need the text_extensions_for_pandas library itself.\n",
    "try:\n",
    "    import text_extensions_for_pandas as tp\n",
    "except ModuleNotFoundError as e:\n",
    "    # If we're running from within the project source tree and the parent Python\n",
    "    # environment doesn't have the text_extensions_for_pandas package, use the\n",
    "    # version in the local source tree.\n",
    "    if not os.getcwd().endswith(\"notebooks\"):\n",
    "        raise e\n",
    "    if \"..\" not in sys.path:\n",
    "        sys.path.insert(0, \"..\")\n",
    "    import text_extensions_for_pandas as tp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pandas Extension Arrays\n",
    "\n",
    "Text Extensions for Pandas provides several Pandas extension arrays on which much of the functionality is built on top of. This section will introduce and show basic usage of these extension arrays."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SpanArray\n",
    "\n",
    "A `SpanArray` represents a column of character-based spans over a single target text. It is backed by 2 child arrays of integers that are the begin and end offsets of each span item from the target text. Spans can use any offset within the target text and can also overlap with each other. A `SpanArray` can efficiently represent the tokenized result of text because each token is not copied, only offsets are stored. Equality of spans is determined by the text and offset values, so each token will be unique within the text.\n",
    "\n",
    "The `SpanArray` is a Pandas extension type, so it can be wrapped as a series and included in a DataFrame to make use of standard Pandas functionality. The values of a `SpanArray` are also designed to render nicely as HTML, for easy display of the span offsets, text and highlighted target text.\n",
    "\n",
    "We will show some basic operations of the `SpanArray` by tokenizing a small example piece of text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sample text input.\n",
    "text = \"\"\"\\\n",
    "In AD 932, King Arthur and his squire, Patsy, travel throughout Britain \\\n",
    "searching for men to join the Knights of the Round Table. Along the way, \\\n",
    "he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad \\\n",
    "the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir \\\n",
    "Not-Appearing-in-this-Film, along with their squires and Robin's troubadours.\\\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define a crude tokenizer to split by words, for example use only.\n",
    "def tokenize_with_offsets(text):\n",
    "    \"\"\"Return offsets of tokens from given `text`\"\"\"\n",
    "    splits = text.split(\" \")\n",
    "    begins = np.cumsum([0] + [len(s) + 1 for s in splits[:-1]])\n",
    "    ends = begins + [len(s.strip(\",.\")) for s in splits]\n",
    "    return begins, ends"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<style class=\"span-array-css\">\n",
       "            .span-array {\n",
       "    --thead-background-color: var(--jp-layout-color1, inherit);\n",
       "    --thead-text-color: var(--jp-ui-font-color1, inherit);\n",
       "    --tbody-background-color-1: var(--jp-layout-color1, inherit);\n",
       "    --tbody-background-color-2: var(--jp-layout-color2, inherit);\n",
       "    --tbody-background-color-hover: var(--jp-rendermime-table-row-hover-background, var(--jp-layout-color3, inherit));\n",
       "    --tbody-background-color-disabled: var(--jp-layout-color4, #ccccd1);\n",
       "    --tbody-text-color: var(--jp-ui-font-color0, inherit);\n",
       "    --tbody-text-color-disabled: var(--jp-ui-inverse-font-color0, #b3b3b9);\n",
       "    --table-font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "\n",
       "    --table-control-background: rgba(0, 0, 0, 0.2);\n",
       "    --table-control-color: var(--jp-ui-font-color0);\n",
       "    --table-control-border: 1px solid rgba(0, 0, 0, 0.8);\n",
       "    --table-control-border-radius: 0.5em;\n",
       "\n",
       "    --root-highlight: #a0c4ff;\n",
       "    --nested-highlight: #ffadad;\n",
       "    --hover-highlight: #ffd6a5;\n",
       "\n",
       "    --inverted-background-color: #0B525B;\n",
       "    --inverted-text-color: rgb(243, 243, 243);\n",
       "    --paragraph-border-color: var(--jp-layout-color2, inherit);\n",
       "\n",
       "    --fallback-font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif\n",
       "}\n",
       "\n",
       "/* Table of span offsets */\n",
       ".span-array>.document>table {\n",
       "    table-layout: auto;\n",
       "    overflow: hidden;\n",
       "    width: 100%;\n",
       "    border-collapse: collapse;\n",
       "    font-family: var(--table-font-family);\n",
       "}\n",
       "\n",
       ".span-array>.document>table thead {\n",
       "    font-variant-caps: all-petite-caps;\n",
       "}\n",
       "\n",
       ".span-array>.document>table th {\n",
       "    padding: 1em;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:last-child, .span-array>.document>table tr>th:last-child {\n",
       "    text-align: right;\n",
       "    width: 100%;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:not(tr>td:last-child), .span-array>.document>table tr>th:not(tr>th:last-child) {\n",
       "    text-align: left;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n), .span-array>.document>table tr.disabled.hover:nth-child(n) {\n",
       "    background-color: var(--tbody-background-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n)>td, .span-array>.document>table tr.disabled.hover:nth-child(n)>td {\n",
       "    color: var(--tbody-text-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.hover:not(.disabled) {\n",
       "    background: var(--jp-rendermime-table-row-hover-background);\n",
       "}\n",
       "\n",
       "/* Table control buttons */\n",
       "\n",
       ".span-array>.document>table td.sa-table-controls-container {\n",
       "    vertical-align: center;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls {\n",
       "    display: flex;\n",
       "    flex-direction: row;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button {\n",
       "    background-color: var(--table-control-background);\n",
       "    color: var(--table-control-color);\n",
       "    border: var(--table-control-border);\n",
       "    border-right: none;\n",
       "    border-radius: 0;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:first-child {\n",
       "    border-radius: var(--table-control-border-radius) 0 0 var(--table-control-border-radius);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:last-child {\n",
       "    border-radius: 0 var(--table-control-border-radius) var(--table-control-border-radius) 0;\n",
       "    border-right: var(--table-control-border);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button[data-control=\"visibility\"]:hover {\n",
       "    background-color: var(--root-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"]:hover {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.highlighted:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"] {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       "/* Styling for spans within document context */\n",
       ".span-array>.document>p {\n",
       "    border:1px solid var(--paragraph-border-color);\n",
       "    border-radius: 0.2em;\n",
       "    padding: 1em;\n",
       "    line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "    box-sizing: border-box;\n",
       "    font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "}\n",
       "\n",
       "body[data-jp-theme-light=\"false\"].span-array>.document>p {\n",
       "    border: 1px solid black;\n",
       "    background-color: var(--inverted-background-color);\n",
       "    color: var(--inverted-text-color);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    padding: 0.4em 0.4em;\n",
       "    border-radius: 0.35em;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>p .mark {\n",
       "    color: var(black);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    background-color: var(--root-highlight);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>mark {\n",
       "    background-color: var(--nested-highlight);\n",
       "    padding: 0.2em 0.4em;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.complex-set {\n",
       "    background: linear-gradient(to right, var(--root-highlight), var(--nested-highlight))\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>span.mark-tag {\n",
       "    font-weight: bolder;\n",
       "    font-size: 0.8em;\n",
       "    font-variant: small-caps;\n",
       "    font-variant-caps: all-small-caps;\n",
       "    margin-left: 8px;\n",
       "    text-transform: uppercase;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.hover, .span-array.span-array>.document>p mark>mark.hover, .span-array>.document>p mark.complex-set.hover, .span-array>.document>p mark.highlighted, .span-array>.document>p mark.complex-set.highlighted, .span-array.span-array>.document>p mark>mark.highlighted {\n",
       "    background: none;\n",
       "    background-color: var(--hover-highlight);\n",
       "}\n",
       "\n",
       "</style>\n",
       "<script>\n",
       "{\n",
       "            // Increment the version to invalidate the cached script\n",
       "const VERSION = 0.79\n",
       "const global_stylesheet = document.head.querySelector(\"style.span-array-css\")\n",
       "const local_stylesheet = document.currentScript.parentElement.querySelector(\"style.span-array-css\")\n",
       "\n",
       "if(window.SpanArray == undefined || window.SpanArray.VERSION == undefined || window.SpanArray.VERSION < VERSION) {\n",
       "\n",
       "    // Replace global SpanArray CSS with latest copy\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet != undefined) {\n",
       "            document.head.removeChild(global_stylesheet)\n",
       "        }\n",
       "        document.head.appendChild(local_stylesheet)\n",
       "    }\n",
       "\n",
       "    // Sets up the SpanArray global namespace\n",
       "    window.SpanArray = {}\n",
       "    window.SpanArray.VERSION = VERSION\n",
       "\n",
       "    window.SpanArray.TYPE_OVERLAP = 0;\n",
       "    window.SpanArray.TYPE_NESTED = 1;\n",
       "    window.SpanArray.TYPE_COMPLEX = 2;\n",
       "    window.SpanArray.TYPE_SOLO = 3;\n",
       "\n",
       "    const TYPE_OVERLAP = window.SpanArray.TYPE_OVERLAP;\n",
       "    const TYPE_NESTED = window.SpanArray.TYPE_NESTED;\n",
       "    const TYPE_COMPLEX = window.SpanArray.TYPE_COMPLEX;\n",
       "    const TYPE_SOLO = window.SpanArray.TYPE_SOLO;\n",
       "\n",
       "    function sanitize(input) {\n",
       "        let out = input.slice();\n",
       "        out = out.replace(/&/g, \"&amp;\")\n",
       "        out = out.replace(/</g, \"&lt;\")\n",
       "        out = out.replace(/>/g, \"&gt;\")\n",
       "        out = out.replace(/\\$/g, \"<span>&#36;</span>\")\n",
       "        out = out.replace(/\"/g, \"&quot;\")\n",
       "        out = out.replace(/(\\r|\\n)/g, \"<br>\")\n",
       "        return out;\n",
       "    }\n",
       "\n",
       "    /** Comparison function used to sort SpanArrays by position and length\n",
       "     *  Will sort by primarily by earliest beginning point. On a tie, will prioritize latest end point (first and largest)\n",
       "     *  Used by mark relationship algorithm.\n",
       "     */\n",
       "    function compareSpanArrays(a, b) {\n",
       "        const start_diff = a.begin - b.begin\n",
       "        if(start_diff == 0) {\n",
       "            return b.end - a.end\n",
       "        }\n",
       "        return start_diff;\n",
       "    }\n",
       "\n",
       "    /** Models an instance of a SpanArray, with document-separated spans and text\n",
       "     * NOTE: Using docs instead of documents to avoid unintentionally manipulating the global 'document' object.\n",
       "    */\n",
       "    class SpanArray {\n",
       "        constructor(docs, show_offsets, script_context) {\n",
       "            this.docs = docs\n",
       "            this.show_offsets = show_offsets\n",
       "            this.script_context = script_context\n",
       "\n",
       "            // For each doc, generate a lookup map for quick ID access\n",
       "            this.docs = this.docs.map(doc => {\n",
       "                doc.span_objects = Span.arrayFromSpanArray(doc.doc_spans)\n",
       "                doc.lookup_table = {}\n",
       "                doc.span_objects.forEach(span => {\n",
       "                    doc.lookup_table[span.id] = span\n",
       "                })\n",
       "                return doc\n",
       "            })\n",
       "        }\n",
       "\n",
       "        render() {\n",
       "            let span_array_frag = document.createDocumentFragment()\n",
       "            // For each document, create a document fragment and append to a document container\n",
       "            for(let doc_index = 0; doc_index < this.docs.length; doc_index++)\n",
       "            {\n",
       "                let doc = this.docs[doc_index]\n",
       "                let doc_container = document.createElement(\"div\")\n",
       "                // Using the data-doc-id attribute allows a selector to access a document's render by its index\n",
       "                doc_container.setAttribute(\"data-doc-id\", doc_index)\n",
       "                doc_container.classList.add(\"document\")\n",
       "                const document_fragment = getDocumentFragment(doc, this.show_offsets)\n",
       "                if(this.show_offsets) {\n",
       "                    attachDocumentEvents(document_fragment, doc, this)\n",
       "                }\n",
       "                doc_container.appendChild(document_fragment)\n",
       "                span_array_frag.appendChild(doc_container)\n",
       "            }\n",
       "            let container = this.script_context.parentElement.querySelector(\".span-array\")\n",
       "            if(container != undefined) {\n",
       "                container.innerHTML = \"\"\n",
       "                container.appendChild(span_array_frag)\n",
       "            } else {\n",
       "                console.error(\"No container found for SpanArray renderer\")\n",
       "            }\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.SpanArray = SpanArray\n",
       "\n",
       "\n",
       "    /** Models an instance of a Span and its relationship to other spans in the document */\n",
       "    class Span {\n",
       "\n",
       "        // Creates an ordered list of entries from a list of spans with struct [begin, end]\n",
       "        static arrayFromSpanArray(spanArray) {\n",
       "            let entries = []\n",
       "            let span;\n",
       "            for(let i = 0; i < spanArray.length; i++)\n",
       "            {\n",
       "                span = spanArray[i];\n",
       "                entries.push(new Span(i, span[0], span[1]))\n",
       "            }\n",
       "\n",
       "            entries = entries.sort(compareSpanArrays)\n",
       "\n",
       "            let set;\n",
       "            for(let i = 0; i < entries.length; i++) {\n",
       "                for(let j = i+1; j < entries.length && entries[j].begin < entries[i].end; j++) {\n",
       "                    if(entries[j].end <= entries[i].end) {\n",
       "                        set = {type: TYPE_NESTED, entry: entries[j]}\n",
       "                    } else {\n",
       "                        set = {type: TYPE_OVERLAP, entry: entries[j]}\n",
       "                    }\n",
       "                    entries[i].sets.push(set)\n",
       "                }\n",
       "            }\n",
       "\n",
       "            return entries\n",
       "        }\n",
       "\n",
       "        constructor(id, begin, end) {\n",
       "            this.id = id\n",
       "            this.begin = begin\n",
       "            this.end = end\n",
       "            this.sets = []\n",
       "            this.visible = true\n",
       "            this.highlighted = false\n",
       "        }\n",
       "\n",
       "        // Returns only visible sets\n",
       "        get valid_sets() {\n",
       "            let valid_sets = []\n",
       "\n",
       "            this.sets.forEach(set => {\n",
       "                if(set.entry.visible) valid_sets.push(set)\n",
       "            })\n",
       "\n",
       "            return valid_sets\n",
       "        }\n",
       "\n",
       "        // Returns true if mark should render as a compound set of spans\n",
       "        isComplex() {\n",
       "            for(let i = 0; i < this.valid_sets.length; i++) {\n",
       "                let otherMember = this.valid_sets[i].entry;\n",
       "                if(this.valid_sets[i].type == TYPE_OVERLAP && otherMember.visible) {\n",
       "                    return true;\n",
       "                } else {\n",
       "                    if(otherMember.valid_sets.length > 0 && otherMember.visible) {\n",
       "                        return true;\n",
       "                    }\n",
       "                }\n",
       "            }\n",
       "            return false;\n",
       "        }\n",
       "\n",
       "        // Gets the combined span of all connected elements\n",
       "        getSetSpan() {\n",
       "            let begin = this.begin\n",
       "            let end = this.end\n",
       "            let highest_id = this.id\n",
       "\n",
       "            this.valid_sets.forEach(set => {\n",
       "                let other = set.entry.getSetSpan()\n",
       "                if(other.begin < begin) begin = other.begin\n",
       "                if(other.end > end) end = other.end\n",
       "                if(other.highest_id > highest_id) highest_id = other.highest_id\n",
       "            })\n",
       "\n",
       "            return {begin: begin, end: end, highest_id: highest_id}\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.Span = Span\n",
       "\n",
       "    // Get the DocumentFragment for a single document \n",
       "    function getDocumentFragment(doc, show_offsets) {\n",
       "\n",
       "        const doc_text = doc.doc_text;\n",
       "        const entries = doc.span_objects;\n",
       "\n",
       "        let frag = document.createDocumentFragment()\n",
       "\n",
       "        // Render Table\n",
       "        if(show_offsets) {\n",
       "            let table = document.createElement(\"table\")\n",
       "            table.innerHTML = `\n",
       "            <thead>\n",
       "            <tr>\n",
       "                <th></th>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                ${(doc['doc_token_spans'] != undefined) ? '<th>begin token</th> <th>end token</th>' : ''}\n",
       "                <th>context</th>\n",
       "            </tr>\n",
       "            </thead>`\n",
       "            let tbody = document.createElement(\"tbody\")\n",
       "            entries.forEach(entry => {\n",
       "                let row = document.createElement(\"tr\")\n",
       "                row.setAttribute(\"data-id\", entry.id.toString())\n",
       "                if(!entry.visible)\n",
       "                {\n",
       "                    row.classList.add(\"disabled\")\n",
       "                }\n",
       "                if(entry.highlighted)\n",
       "                {\n",
       "                    row.classList.add(\"highlighted\")\n",
       "                }\n",
       "\n",
       "                // Adds the span entry to the table. doc_text is sanitized by replacing the reserved\n",
       "                // symbols by their entity name representations\n",
       "                row.innerHTML += `\n",
       "                <td>\n",
       "                    <div class='sa-table-controls'>\n",
       "                    <button data-control='visibility' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 576 512\"><path d=\"M572.52 241.4C518.29 135.59 410.93 64 288 64S57.68 135.64 3.48 241.41a32.35 32.35 0 0 0 0 29.19C57.71 376.41 165.07 448 288 448s230.32-71.64 284.52-177.41a32.35 32.35 0 0 0 0-29.19zM288 400a144 144 0 1 1 144-144 143.93 143.93 0 0 1-144 144zm0-240a95.31 95.31 0 0 0-25.31 3.79 47.85 47.85 0 0 1-66.9 66.9A95.78 95.78 0 1 0 288 160z\"/></svg></button>\n",
       "                    <button data-control='highlight' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 512 512\"><path d=\"M256 160c-52.9 0-96 43.1-96 96s43.1 96 96 96 96-43.1 96-96-43.1-96-96-96zm246.4 80.5l-94.7-47.3 33.5-100.4c4.5-13.6-8.4-26.5-21.9-21.9l-100.4 33.5-47.4-94.8c-6.4-12.8-24.6-12.8-31 0l-47.3 94.7L92.7 70.8c-13.6-4.5-26.5 8.4-21.9 21.9l33.5 100.4-94.7 47.4c-12.8 6.4-12.8 24.6 0 31l94.7 47.3-33.5 100.5c-4.5 13.6 8.4 26.5 21.9 21.9l100.4-33.5 47.3 94.7c6.4 12.8 24.6 12.8 31 0l47.3-94.7 100.4 33.5c13.6 4.5 26.5-8.4 21.9-21.9l-33.5-100.4 94.7-47.3c13-6.5 13-24.7.2-31.1zm-155.9 106c-49.9 49.9-131.1 49.9-181 0-49.9-49.9-49.9-131.1 0-181 49.9-49.9 131.1-49.9 181 0 49.9 49.9 49.9 131.1 0 181z\"/></svg></button>\n",
       "                    </div>\n",
       "                </td>\n",
       "                <td><b>${entry.id.toString()}</b></td>\n",
       "                <td>${entry.begin}</td>\n",
       "                <td>${entry.end}</td>\n",
       "                ${(doc.doc_token_spans != undefined) ? `<td>${doc.doc_token_spans[entry.id][0]}</td><td>${doc.doc_token_spans[entry.id][1]}</td>` : ''}\n",
       "                <td>${sanitize(doc_text.substring(entry.begin, entry.end))}</td>`\n",
       "\n",
       "                tbody.appendChild(row)\n",
       "            })\n",
       "            table.appendChild(tbody)\n",
       "            frag.appendChild(table)\n",
       "        }\n",
       "\n",
       "        // Render Text\n",
       "        let highlight_regions = []\n",
       "        for(let i = 0; i < entries.length; i++)\n",
       "        {\n",
       "            if(!entries[i].visible) continue\n",
       "            if(entries[i].valid_sets.length > 0)\n",
       "            {\n",
       "                let span = entries[i].getSetSpan();\n",
       "                let ids = [entries[i].id, ...entries[i].valid_sets.map(set => { return set.entry.id })]\n",
       "                if(entries[i].isComplex()) {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_COMPLEX, ids: ids})\n",
       "                } else {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_NESTED, ids: ids})\n",
       "                }\n",
       "                i = span.highest_id\n",
       "            } else {\n",
       "                highlight_regions.push({begin: entries[i].begin, end: entries[i].end, type: TYPE_SOLO, ids: [entries[i].id]})\n",
       "            }\n",
       "        }\n",
       "\n",
       "        let paragraph = document.createElement(\"p\")\n",
       "        if(highlight_regions.length == 0) {\n",
       "            paragraph.innerHTML = sanitize(doc_text)\n",
       "        } else {\n",
       "            let begin = 0\n",
       "            highlight_regions.forEach(region => {\n",
       "                paragraph.innerHTML += sanitize(doc_text.substring(begin, region.begin))\n",
       "\n",
       "                let mark = document.createElement(\"mark\")\n",
       "                // The data-ids tag is a list of comma-separated reference IDs for matching Spans \n",
       "                mark.setAttribute(\"data-ids\", \"\");\n",
       "                if (region.type != TYPE_NESTED) {\n",
       "                    region.ids.forEach(id => {\n",
       "                        mark.setAttribute(\"data-ids\", mark.getAttribute(\"data-ids\") + `#${id},`)\n",
       "                        if(doc.lookup_table[id].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    })\n",
       "                    mark.innerHTML = sanitize(doc_text.substring(region.begin, region.end))\n",
       "                } else {\n",
       "                    mark.setAttribute(\"data-ids\", `#${region.ids[0]},`)\n",
       "                    if(doc.lookup_table[region.ids[0]].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    let nested_begin = region.begin\n",
       "                    region.ids.slice(1).forEach(nested_id => {\n",
       "                        let nested_region = doc.lookup_table[nested_id]\n",
       "                        mark.innerHTML += sanitize(doc_text.substring(nested_begin, nested_region.begin))\n",
       "                        let nested_mark = document.createElement(\"mark\")\n",
       "                        nested_mark.setAttribute(\"data-ids\", `#${nested_id},`)\n",
       "                        if(nested_region.highlighted) nested_mark.classList.add(\"highlighted\")\n",
       "                        nested_mark.innerHTML = sanitize(doc_text.substring(nested_region.begin, nested_region.end))\n",
       "                        nested_begin = nested_region.end\n",
       "                        mark.appendChild(nested_mark)\n",
       "                    })\n",
       "                    mark.innerHTML += sanitize(doc_text.substring(nested_begin, region.end))\n",
       "                }\n",
       "\n",
       "                if(region.type == TYPE_COMPLEX) {\n",
       "                    let markTag = document.createElement(\"span\")\n",
       "                    markTag.textContent = \"Set\"\n",
       "                    markTag.classList.add(\"mark-tag\")\n",
       "                    mark.classList.add(\"complex-set\")\n",
       "                    mark.appendChild(markTag)\n",
       "                }\n",
       "\n",
       "                begin = region.end\n",
       "                paragraph.appendChild(mark)\n",
       "            })\n",
       "            paragraph.innerHTML += sanitize(doc_text.substring(highlight_regions[highlight_regions.length - 1].end, doc_text.length))\n",
       "        }\n",
       "\n",
       "        frag.appendChild(paragraph)\n",
       "\n",
       "        return frag\n",
       "    }\n",
       "\n",
       "    /** Attach hover and click events to a document render via event delegation */\n",
       "    function attachDocumentEvents(fragment, doc_object, source_spanarray) {\n",
       "        const doc_table_body = fragment.querySelector(\"table>tbody\")\n",
       "        const doc_text = fragment.querySelector(\"p\")\n",
       "\n",
       "        // Hover highlight events\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        // Click disable/enable events\n",
       "\n",
       "        doc_table_body.addEventListener(\"click\", (event) => {\n",
       "            const closest_control_button = event.target.closest(\"button[data-control]\")\n",
       "            if(closest_control_button == undefined) return\n",
       "\n",
       "            const closest_tr = event.target.closest(\"tr\")\n",
       "            if(closest_tr == undefined) return\n",
       "\n",
       "            const matching_span = doc_object.lookup_table[closest_tr.getAttribute(\"data-id\")]\n",
       "            if(matching_span == undefined) return\n",
       "\n",
       "            switch(closest_control_button.getAttribute(\"data-control\")) {\n",
       "                case \"visibility\":\n",
       "                    {\n",
       "                        matching_span.visible = !matching_span.visible\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "                case \"highlight\":\n",
       "                    {\n",
       "                        matching_span.highlighted = !matching_span.highlighted\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "            }\n",
       "\n",
       "\n",
       "\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"click\", (event) => {\n",
       "            const closest_mark = event.target.closest(\"mark\")\n",
       "            if(closest_mark == undefined) return\n",
       "\n",
       "            // Preprocess ID string into a list of IDs\n",
       "            const ids = closest_mark.getAttribute(\"data-ids\")\n",
       "                .split(\",\")\n",
       "                .slice(0, -1)\n",
       "                .map(id => {\n",
       "                    return id.substring(1)\n",
       "                })\n",
       "\n",
       "            // If any of the connected IDs are highlighted, we set all spans in the list to not highlighted.\n",
       "            // Inversely, we want all spans highlighted if none were previously.\n",
       "\n",
       "            const highlighted_entry = ids.find(id => {\n",
       "                return doc_object.lookup_table[id].highlighted\n",
       "            })\n",
       "\n",
       "            const is_highlighted = (highlighted_entry != undefined)\n",
       "\n",
       "            ids.forEach(id => {\n",
       "                const span = doc_object.lookup_table[id]\n",
       "                if(span != undefined) span.highlighted = !is_highlighted\n",
       "            })\n",
       "\n",
       "            source_spanarray.render()\n",
       "        })\n",
       "    }\n",
       "} else {\n",
       "    // SpanArray JS is already defined and not an outdated copy\n",
       "    // Replace global SpanArray CSS with latest copy IFF global stylesheet is undefined\n",
       "\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet == undefined) {\n",
       "            document.head.appendChild(local_stylesheet)\n",
       "        } else {\n",
       "            document.currentScript.parentElement.removeChild(local_stylesheet)\n",
       "        }\n",
       "    }       \n",
       "}\n",
       "}\n",
       "</script>\n",
       "<div class=\"span-array\">\n",
       "\n",
       "    <div class='document'>\n",
       "        <table style='\n",
       "            table-layout: auto;\n",
       "            overflow: hidden;\n",
       "            width: 100%;\n",
       "            border-collapse: collapse;\n",
       "            '>\n",
       "            <thead style='font-variant-caps: all-petite-caps;'>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "\n",
       "                <th style='text-align:right;width:100%'>context</th>\n",
       "            </tr></thead>\n",
       "            <tbody>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>0</b></td>\n",
       "            <td>0</td>\n",
       "            <td>2</td>\n",
       "\n",
       "            <td>In</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>1</b></td>\n",
       "            <td>3</td>\n",
       "            <td>5</td>\n",
       "\n",
       "            <td>AD</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>2</b></td>\n",
       "            <td>6</td>\n",
       "            <td>9</td>\n",
       "\n",
       "            <td>932</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>3</b></td>\n",
       "            <td>11</td>\n",
       "            <td>15</td>\n",
       "\n",
       "            <td>King</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>4</b></td>\n",
       "            <td>16</td>\n",
       "            <td>22</td>\n",
       "\n",
       "            <td>Arthur</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>5</b></td>\n",
       "            <td>23</td>\n",
       "            <td>26</td>\n",
       "\n",
       "            <td>and</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>6</b></td>\n",
       "            <td>27</td>\n",
       "            <td>30</td>\n",
       "\n",
       "            <td>his</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>7</b></td>\n",
       "            <td>31</td>\n",
       "            <td>37</td>\n",
       "\n",
       "            <td>squire</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>8</b></td>\n",
       "            <td>39</td>\n",
       "            <td>44</td>\n",
       "\n",
       "            <td>Patsy</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>9</b></td>\n",
       "            <td>46</td>\n",
       "            <td>52</td>\n",
       "\n",
       "            <td>travel</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>10</b></td>\n",
       "            <td>53</td>\n",
       "            <td>63</td>\n",
       "\n",
       "            <td>throughout</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>11</b></td>\n",
       "            <td>64</td>\n",
       "            <td>71</td>\n",
       "\n",
       "            <td>Britain</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>12</b></td>\n",
       "            <td>72</td>\n",
       "            <td>81</td>\n",
       "\n",
       "            <td>searching</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>13</b></td>\n",
       "            <td>82</td>\n",
       "            <td>85</td>\n",
       "\n",
       "            <td>for</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>14</b></td>\n",
       "            <td>86</td>\n",
       "            <td>89</td>\n",
       "\n",
       "            <td>men</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>15</b></td>\n",
       "            <td>90</td>\n",
       "            <td>92</td>\n",
       "\n",
       "            <td>to</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>16</b></td>\n",
       "            <td>93</td>\n",
       "            <td>97</td>\n",
       "\n",
       "            <td>join</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>17</b></td>\n",
       "            <td>98</td>\n",
       "            <td>101</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>18</b></td>\n",
       "            <td>102</td>\n",
       "            <td>109</td>\n",
       "\n",
       "            <td>Knights</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>19</b></td>\n",
       "            <td>110</td>\n",
       "            <td>112</td>\n",
       "\n",
       "            <td>of</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>20</b></td>\n",
       "            <td>113</td>\n",
       "            <td>116</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>21</b></td>\n",
       "            <td>117</td>\n",
       "            <td>122</td>\n",
       "\n",
       "            <td>Round</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>22</b></td>\n",
       "            <td>123</td>\n",
       "            <td>128</td>\n",
       "\n",
       "            <td>Table</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>23</b></td>\n",
       "            <td>130</td>\n",
       "            <td>135</td>\n",
       "\n",
       "            <td>Along</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>24</b></td>\n",
       "            <td>136</td>\n",
       "            <td>139</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>25</b></td>\n",
       "            <td>140</td>\n",
       "            <td>143</td>\n",
       "\n",
       "            <td>way</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>26</b></td>\n",
       "            <td>145</td>\n",
       "            <td>147</td>\n",
       "\n",
       "            <td>he</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>27</b></td>\n",
       "            <td>148</td>\n",
       "            <td>156</td>\n",
       "\n",
       "            <td>recruits</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>28</b></td>\n",
       "            <td>157</td>\n",
       "            <td>160</td>\n",
       "\n",
       "            <td>Sir</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>29</b></td>\n",
       "            <td>161</td>\n",
       "            <td>169</td>\n",
       "\n",
       "            <td>Bedevere</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>30</b></td>\n",
       "            <td>170</td>\n",
       "            <td>173</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>31</b></td>\n",
       "            <td>174</td>\n",
       "            <td>178</td>\n",
       "\n",
       "            <td>Wise</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>32</b></td>\n",
       "            <td>180</td>\n",
       "            <td>183</td>\n",
       "\n",
       "            <td>Sir</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>33</b></td>\n",
       "            <td>184</td>\n",
       "            <td>192</td>\n",
       "\n",
       "            <td>Lancelot</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>34</b></td>\n",
       "            <td>193</td>\n",
       "            <td>196</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>35</b></td>\n",
       "            <td>197</td>\n",
       "            <td>202</td>\n",
       "\n",
       "            <td>Brave</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>36</b></td>\n",
       "            <td>204</td>\n",
       "            <td>207</td>\n",
       "\n",
       "            <td>Sir</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>37</b></td>\n",
       "            <td>208</td>\n",
       "            <td>215</td>\n",
       "\n",
       "            <td>Galahad</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>38</b></td>\n",
       "            <td>216</td>\n",
       "            <td>219</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>39</b></td>\n",
       "            <td>220</td>\n",
       "            <td>224</td>\n",
       "\n",
       "            <td>Pure</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>40</b></td>\n",
       "            <td>226</td>\n",
       "            <td>229</td>\n",
       "\n",
       "            <td>Sir</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>41</b></td>\n",
       "            <td>230</td>\n",
       "            <td>235</td>\n",
       "\n",
       "            <td>Robin</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>42</b></td>\n",
       "            <td>236</td>\n",
       "            <td>239</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>43</b></td>\n",
       "            <td>240</td>\n",
       "            <td>274</td>\n",
       "\n",
       "            <td>Not-Quite-So-Brave-as-Sir-Lancelot</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>44</b></td>\n",
       "            <td>276</td>\n",
       "            <td>279</td>\n",
       "\n",
       "            <td>and</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>45</b></td>\n",
       "            <td>280</td>\n",
       "            <td>283</td>\n",
       "\n",
       "            <td>Sir</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>46</b></td>\n",
       "            <td>284</td>\n",
       "            <td>310</td>\n",
       "\n",
       "            <td>Not-Appearing-in-this-Film</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>47</b></td>\n",
       "            <td>312</td>\n",
       "            <td>317</td>\n",
       "\n",
       "            <td>along</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>48</b></td>\n",
       "            <td>318</td>\n",
       "            <td>322</td>\n",
       "\n",
       "            <td>with</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>49</b></td>\n",
       "            <td>323</td>\n",
       "            <td>328</td>\n",
       "\n",
       "            <td>their</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>50</b></td>\n",
       "            <td>329</td>\n",
       "            <td>336</td>\n",
       "\n",
       "            <td>squires</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>51</b></td>\n",
       "            <td>337</td>\n",
       "            <td>340</td>\n",
       "\n",
       "            <td>and</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>52</b></td>\n",
       "            <td>341</td>\n",
       "            <td>348</td>\n",
       "\n",
       "            <td>Robin&#39;s</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>53</b></td>\n",
       "            <td>349</td>\n",
       "            <td>360</td>\n",
       "\n",
       "            <td>troubadours</td>\n",
       "        </tr>\n",
       "\n",
       "            </tbody>\n",
       "        </table>\n",
       "        <p style='\n",
       "            padding: 1em;\n",
       "            line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "            '>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>In</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>AD</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>932</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>King</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Arthur</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>and</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>his</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>squire</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Patsy</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>travel</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>throughout</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Britain</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>searching</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>for</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>men</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>to</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>join</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Knights</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>of</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Round</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Table</span>\n",
       "\n",
       "            . \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Along</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>way</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>he</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>recruits</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Bedevere</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Wise</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Lancelot</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Brave</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Galahad</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Pure</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Robin</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Not-Quite-So-Brave-as-Sir-Lancelot</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>and</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Not-Appearing-in-this-Film</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>along</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>with</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>their</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>squires</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>and</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Robin&#39;s</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>troubadours</span>\n",
       "            .\n",
       "        </p>\n",
       "    </div>\n",
       "\n",
       "    <span style=\"font-size: 0.8em;color: #b3b3b3;\">Your notebook viewer does not support Javascript execution. The above rendering will not be interactive.</span>\n",
       "</div>\n",
       "<script>\n",
       "    {\n",
       "        const Span = window.SpanArray.Span\n",
       "        const script_context = document.currentScript\n",
       "        const documents = []\n",
       "\n",
       "    {\n",
       "\n",
       "    const doc_spans = [[0,2],[3,5],[6,9],[11,15],[16,22],[23,26],[27,30],[31,37],[39,44],[46,52],[53,63],[64,71],[72,81],[82,85],[86,89],[90,92],[93,97],[98,101],[102,109],[110,112],[113,116],[117,122],[123,128],[130,135],[136,139],[140,143],[145,147],[148,156],[157,160],[161,169],[170,173],[174,178],[180,183],[184,192],[193,196],[197,202],[204,207],[208,215],[216,219],[220,224],[226,229],[230,235],[236,239],[240,274],[276,279],[280,283],[284,310],[312,317],[318,322],[323,328],[329,336],[337,340],[341,348],[349,360]]\n",
       "    const doc_text = 'In AD 932, King Arthur and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir Not-Appearing-in-this-Film, along with their squires and Robin\\'s troubadours.'\n",
       "\n",
       "        documents.push({doc_text: doc_text, doc_spans: doc_spans})\n",
       "\n",
       "    }\n",
       "\n",
       "        const instance = new window.SpanArray.SpanArray(documents, true, script_context)\n",
       "        instance.render()\n",
       "    }\n",
       "</script>\n",
       "\n"
      ],
      "text/plain": [
       "<SpanArray>\n",
       "[                                    [0, 2): 'In',\n",
       "                                     [3, 5): 'AD',\n",
       "                                    [6, 9): '932',\n",
       "                                 [11, 15): 'King',\n",
       "                               [16, 22): 'Arthur',\n",
       "                                  [23, 26): 'and',\n",
       "                                  [27, 30): 'his',\n",
       "                               [31, 37): 'squire',\n",
       "                                [39, 44): 'Patsy',\n",
       "                               [46, 52): 'travel',\n",
       "                           [53, 63): 'throughout',\n",
       "                              [64, 71): 'Britain',\n",
       "                            [72, 81): 'searching',\n",
       "                                  [82, 85): 'for',\n",
       "                                  [86, 89): 'men',\n",
       "                                   [90, 92): 'to',\n",
       "                                 [93, 97): 'join',\n",
       "                                 [98, 101): 'the',\n",
       "                            [102, 109): 'Knights',\n",
       "                                 [110, 112): 'of',\n",
       "                                [113, 116): 'the',\n",
       "                              [117, 122): 'Round',\n",
       "                              [123, 128): 'Table',\n",
       "                              [130, 135): 'Along',\n",
       "                                [136, 139): 'the',\n",
       "                                [140, 143): 'way',\n",
       "                                 [145, 147): 'he',\n",
       "                           [148, 156): 'recruits',\n",
       "                                [157, 160): 'Sir',\n",
       "                           [161, 169): 'Bedevere',\n",
       "                                [170, 173): 'the',\n",
       "                               [174, 178): 'Wise',\n",
       "                                [180, 183): 'Sir',\n",
       "                           [184, 192): 'Lancelot',\n",
       "                                [193, 196): 'the',\n",
       "                              [197, 202): 'Brave',\n",
       "                                [204, 207): 'Sir',\n",
       "                            [208, 215): 'Galahad',\n",
       "                                [216, 219): 'the',\n",
       "                               [220, 224): 'Pure',\n",
       "                                [226, 229): 'Sir',\n",
       "                              [230, 235): 'Robin',\n",
       "                                [236, 239): 'the',\n",
       " [240, 274): 'Not-Quite-So-Brave-as-Sir-Lancelot',\n",
       "                                [276, 279): 'and',\n",
       "                                [280, 283): 'Sir',\n",
       "         [284, 310): 'Not-Appearing-in-this-Film',\n",
       "                              [312, 317): 'along',\n",
       "                               [318, 322): 'with',\n",
       "                              [323, 328): 'their',\n",
       "                            [329, 336): 'squires',\n",
       "                                [337, 340): 'and',\n",
       "                            [341, 348): 'Robin's',\n",
       "                        [349, 360): 'troubadours']\n",
       "Length: 54, dtype: SpanDtype"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Tokenize the text to get begin, end offsets and construct a `SpanArray`.\n",
    "begins, ends = tokenize_with_offsets(text)\n",
    "tokens = tp.SpanArray(text, begins, ends)\n",
    "\n",
    "# The array nicely renders in HTML to show offsets, text of the span,\n",
    "# and highlighted target text.\n",
    "tokens"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[240, 274): 'Not-Quite-So-Brave-as-Sir-Lancelot'"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Indexing the array with an integer will produce a `Span`, which is a single\n",
    "# element in the array.\n",
    "tok = tokens[43]\n",
    "tok"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<style class=\"span-array-css\">\n",
       "            .span-array {\n",
       "    --thead-background-color: var(--jp-layout-color1, inherit);\n",
       "    --thead-text-color: var(--jp-ui-font-color1, inherit);\n",
       "    --tbody-background-color-1: var(--jp-layout-color1, inherit);\n",
       "    --tbody-background-color-2: var(--jp-layout-color2, inherit);\n",
       "    --tbody-background-color-hover: var(--jp-rendermime-table-row-hover-background, var(--jp-layout-color3, inherit));\n",
       "    --tbody-background-color-disabled: var(--jp-layout-color4, #ccccd1);\n",
       "    --tbody-text-color: var(--jp-ui-font-color0, inherit);\n",
       "    --tbody-text-color-disabled: var(--jp-ui-inverse-font-color0, #b3b3b9);\n",
       "    --table-font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "\n",
       "    --table-control-background: rgba(0, 0, 0, 0.2);\n",
       "    --table-control-color: var(--jp-ui-font-color0);\n",
       "    --table-control-border: 1px solid rgba(0, 0, 0, 0.8);\n",
       "    --table-control-border-radius: 0.5em;\n",
       "\n",
       "    --root-highlight: #a0c4ff;\n",
       "    --nested-highlight: #ffadad;\n",
       "    --hover-highlight: #ffd6a5;\n",
       "\n",
       "    --inverted-background-color: #0B525B;\n",
       "    --inverted-text-color: rgb(243, 243, 243);\n",
       "    --paragraph-border-color: var(--jp-layout-color2, inherit);\n",
       "\n",
       "    --fallback-font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif\n",
       "}\n",
       "\n",
       "/* Table of span offsets */\n",
       ".span-array>.document>table {\n",
       "    table-layout: auto;\n",
       "    overflow: hidden;\n",
       "    width: 100%;\n",
       "    border-collapse: collapse;\n",
       "    font-family: var(--table-font-family);\n",
       "}\n",
       "\n",
       ".span-array>.document>table thead {\n",
       "    font-variant-caps: all-petite-caps;\n",
       "}\n",
       "\n",
       ".span-array>.document>table th {\n",
       "    padding: 1em;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:last-child, .span-array>.document>table tr>th:last-child {\n",
       "    text-align: right;\n",
       "    width: 100%;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:not(tr>td:last-child), .span-array>.document>table tr>th:not(tr>th:last-child) {\n",
       "    text-align: left;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n), .span-array>.document>table tr.disabled.hover:nth-child(n) {\n",
       "    background-color: var(--tbody-background-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n)>td, .span-array>.document>table tr.disabled.hover:nth-child(n)>td {\n",
       "    color: var(--tbody-text-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.hover:not(.disabled) {\n",
       "    background: var(--jp-rendermime-table-row-hover-background);\n",
       "}\n",
       "\n",
       "/* Table control buttons */\n",
       "\n",
       ".span-array>.document>table td.sa-table-controls-container {\n",
       "    vertical-align: center;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls {\n",
       "    display: flex;\n",
       "    flex-direction: row;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button {\n",
       "    background-color: var(--table-control-background);\n",
       "    color: var(--table-control-color);\n",
       "    border: var(--table-control-border);\n",
       "    border-right: none;\n",
       "    border-radius: 0;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:first-child {\n",
       "    border-radius: var(--table-control-border-radius) 0 0 var(--table-control-border-radius);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:last-child {\n",
       "    border-radius: 0 var(--table-control-border-radius) var(--table-control-border-radius) 0;\n",
       "    border-right: var(--table-control-border);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button[data-control=\"visibility\"]:hover {\n",
       "    background-color: var(--root-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"]:hover {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.highlighted:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"] {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       "/* Styling for spans within document context */\n",
       ".span-array>.document>p {\n",
       "    border:1px solid var(--paragraph-border-color);\n",
       "    border-radius: 0.2em;\n",
       "    padding: 1em;\n",
       "    line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "    box-sizing: border-box;\n",
       "    font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "}\n",
       "\n",
       "body[data-jp-theme-light=\"false\"].span-array>.document>p {\n",
       "    border: 1px solid black;\n",
       "    background-color: var(--inverted-background-color);\n",
       "    color: var(--inverted-text-color);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    padding: 0.4em 0.4em;\n",
       "    border-radius: 0.35em;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>p .mark {\n",
       "    color: var(black);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    background-color: var(--root-highlight);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>mark {\n",
       "    background-color: var(--nested-highlight);\n",
       "    padding: 0.2em 0.4em;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.complex-set {\n",
       "    background: linear-gradient(to right, var(--root-highlight), var(--nested-highlight))\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>span.mark-tag {\n",
       "    font-weight: bolder;\n",
       "    font-size: 0.8em;\n",
       "    font-variant: small-caps;\n",
       "    font-variant-caps: all-small-caps;\n",
       "    margin-left: 8px;\n",
       "    text-transform: uppercase;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.hover, .span-array.span-array>.document>p mark>mark.hover, .span-array>.document>p mark.complex-set.hover, .span-array>.document>p mark.highlighted, .span-array>.document>p mark.complex-set.highlighted, .span-array.span-array>.document>p mark>mark.highlighted {\n",
       "    background: none;\n",
       "    background-color: var(--hover-highlight);\n",
       "}\n",
       "\n",
       "</style>\n",
       "<script>\n",
       "{\n",
       "            // Increment the version to invalidate the cached script\n",
       "const VERSION = 0.79\n",
       "const global_stylesheet = document.head.querySelector(\"style.span-array-css\")\n",
       "const local_stylesheet = document.currentScript.parentElement.querySelector(\"style.span-array-css\")\n",
       "\n",
       "if(window.SpanArray == undefined || window.SpanArray.VERSION == undefined || window.SpanArray.VERSION < VERSION) {\n",
       "\n",
       "    // Replace global SpanArray CSS with latest copy\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet != undefined) {\n",
       "            document.head.removeChild(global_stylesheet)\n",
       "        }\n",
       "        document.head.appendChild(local_stylesheet)\n",
       "    }\n",
       "\n",
       "    // Sets up the SpanArray global namespace\n",
       "    window.SpanArray = {}\n",
       "    window.SpanArray.VERSION = VERSION\n",
       "\n",
       "    window.SpanArray.TYPE_OVERLAP = 0;\n",
       "    window.SpanArray.TYPE_NESTED = 1;\n",
       "    window.SpanArray.TYPE_COMPLEX = 2;\n",
       "    window.SpanArray.TYPE_SOLO = 3;\n",
       "\n",
       "    const TYPE_OVERLAP = window.SpanArray.TYPE_OVERLAP;\n",
       "    const TYPE_NESTED = window.SpanArray.TYPE_NESTED;\n",
       "    const TYPE_COMPLEX = window.SpanArray.TYPE_COMPLEX;\n",
       "    const TYPE_SOLO = window.SpanArray.TYPE_SOLO;\n",
       "\n",
       "    function sanitize(input) {\n",
       "        let out = input.slice();\n",
       "        out = out.replace(/&/g, \"&amp;\")\n",
       "        out = out.replace(/</g, \"&lt;\")\n",
       "        out = out.replace(/>/g, \"&gt;\")\n",
       "        out = out.replace(/\\$/g, \"<span>&#36;</span>\")\n",
       "        out = out.replace(/\"/g, \"&quot;\")\n",
       "        out = out.replace(/(\\r|\\n)/g, \"<br>\")\n",
       "        return out;\n",
       "    }\n",
       "\n",
       "    /** Comparison function used to sort SpanArrays by position and length\n",
       "     *  Will sort by primarily by earliest beginning point. On a tie, will prioritize latest end point (first and largest)\n",
       "     *  Used by mark relationship algorithm.\n",
       "     */\n",
       "    function compareSpanArrays(a, b) {\n",
       "        const start_diff = a.begin - b.begin\n",
       "        if(start_diff == 0) {\n",
       "            return b.end - a.end\n",
       "        }\n",
       "        return start_diff;\n",
       "    }\n",
       "\n",
       "    /** Models an instance of a SpanArray, with document-separated spans and text\n",
       "     * NOTE: Using docs instead of documents to avoid unintentionally manipulating the global 'document' object.\n",
       "    */\n",
       "    class SpanArray {\n",
       "        constructor(docs, show_offsets, script_context) {\n",
       "            this.docs = docs\n",
       "            this.show_offsets = show_offsets\n",
       "            this.script_context = script_context\n",
       "\n",
       "            // For each doc, generate a lookup map for quick ID access\n",
       "            this.docs = this.docs.map(doc => {\n",
       "                doc.span_objects = Span.arrayFromSpanArray(doc.doc_spans)\n",
       "                doc.lookup_table = {}\n",
       "                doc.span_objects.forEach(span => {\n",
       "                    doc.lookup_table[span.id] = span\n",
       "                })\n",
       "                return doc\n",
       "            })\n",
       "        }\n",
       "\n",
       "        render() {\n",
       "            let span_array_frag = document.createDocumentFragment()\n",
       "            // For each document, create a document fragment and append to a document container\n",
       "            for(let doc_index = 0; doc_index < this.docs.length; doc_index++)\n",
       "            {\n",
       "                let doc = this.docs[doc_index]\n",
       "                let doc_container = document.createElement(\"div\")\n",
       "                // Using the data-doc-id attribute allows a selector to access a document's render by its index\n",
       "                doc_container.setAttribute(\"data-doc-id\", doc_index)\n",
       "                doc_container.classList.add(\"document\")\n",
       "                const document_fragment = getDocumentFragment(doc, this.show_offsets)\n",
       "                if(this.show_offsets) {\n",
       "                    attachDocumentEvents(document_fragment, doc, this)\n",
       "                }\n",
       "                doc_container.appendChild(document_fragment)\n",
       "                span_array_frag.appendChild(doc_container)\n",
       "            }\n",
       "            let container = this.script_context.parentElement.querySelector(\".span-array\")\n",
       "            if(container != undefined) {\n",
       "                container.innerHTML = \"\"\n",
       "                container.appendChild(span_array_frag)\n",
       "            } else {\n",
       "                console.error(\"No container found for SpanArray renderer\")\n",
       "            }\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.SpanArray = SpanArray\n",
       "\n",
       "\n",
       "    /** Models an instance of a Span and its relationship to other spans in the document */\n",
       "    class Span {\n",
       "\n",
       "        // Creates an ordered list of entries from a list of spans with struct [begin, end]\n",
       "        static arrayFromSpanArray(spanArray) {\n",
       "            let entries = []\n",
       "            let span;\n",
       "            for(let i = 0; i < spanArray.length; i++)\n",
       "            {\n",
       "                span = spanArray[i];\n",
       "                entries.push(new Span(i, span[0], span[1]))\n",
       "            }\n",
       "\n",
       "            entries = entries.sort(compareSpanArrays)\n",
       "\n",
       "            let set;\n",
       "            for(let i = 0; i < entries.length; i++) {\n",
       "                for(let j = i+1; j < entries.length && entries[j].begin < entries[i].end; j++) {\n",
       "                    if(entries[j].end <= entries[i].end) {\n",
       "                        set = {type: TYPE_NESTED, entry: entries[j]}\n",
       "                    } else {\n",
       "                        set = {type: TYPE_OVERLAP, entry: entries[j]}\n",
       "                    }\n",
       "                    entries[i].sets.push(set)\n",
       "                }\n",
       "            }\n",
       "\n",
       "            return entries\n",
       "        }\n",
       "\n",
       "        constructor(id, begin, end) {\n",
       "            this.id = id\n",
       "            this.begin = begin\n",
       "            this.end = end\n",
       "            this.sets = []\n",
       "            this.visible = true\n",
       "            this.highlighted = false\n",
       "        }\n",
       "\n",
       "        // Returns only visible sets\n",
       "        get valid_sets() {\n",
       "            let valid_sets = []\n",
       "\n",
       "            this.sets.forEach(set => {\n",
       "                if(set.entry.visible) valid_sets.push(set)\n",
       "            })\n",
       "\n",
       "            return valid_sets\n",
       "        }\n",
       "\n",
       "        // Returns true if mark should render as a compound set of spans\n",
       "        isComplex() {\n",
       "            for(let i = 0; i < this.valid_sets.length; i++) {\n",
       "                let otherMember = this.valid_sets[i].entry;\n",
       "                if(this.valid_sets[i].type == TYPE_OVERLAP && otherMember.visible) {\n",
       "                    return true;\n",
       "                } else {\n",
       "                    if(otherMember.valid_sets.length > 0 && otherMember.visible) {\n",
       "                        return true;\n",
       "                    }\n",
       "                }\n",
       "            }\n",
       "            return false;\n",
       "        }\n",
       "\n",
       "        // Gets the combined span of all connected elements\n",
       "        getSetSpan() {\n",
       "            let begin = this.begin\n",
       "            let end = this.end\n",
       "            let highest_id = this.id\n",
       "\n",
       "            this.valid_sets.forEach(set => {\n",
       "                let other = set.entry.getSetSpan()\n",
       "                if(other.begin < begin) begin = other.begin\n",
       "                if(other.end > end) end = other.end\n",
       "                if(other.highest_id > highest_id) highest_id = other.highest_id\n",
       "            })\n",
       "\n",
       "            return {begin: begin, end: end, highest_id: highest_id}\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.Span = Span\n",
       "\n",
       "    // Get the DocumentFragment for a single document \n",
       "    function getDocumentFragment(doc, show_offsets) {\n",
       "\n",
       "        const doc_text = doc.doc_text;\n",
       "        const entries = doc.span_objects;\n",
       "\n",
       "        let frag = document.createDocumentFragment()\n",
       "\n",
       "        // Render Table\n",
       "        if(show_offsets) {\n",
       "            let table = document.createElement(\"table\")\n",
       "            table.innerHTML = `\n",
       "            <thead>\n",
       "            <tr>\n",
       "                <th></th>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                ${(doc['doc_token_spans'] != undefined) ? '<th>begin token</th> <th>end token</th>' : ''}\n",
       "                <th>context</th>\n",
       "            </tr>\n",
       "            </thead>`\n",
       "            let tbody = document.createElement(\"tbody\")\n",
       "            entries.forEach(entry => {\n",
       "                let row = document.createElement(\"tr\")\n",
       "                row.setAttribute(\"data-id\", entry.id.toString())\n",
       "                if(!entry.visible)\n",
       "                {\n",
       "                    row.classList.add(\"disabled\")\n",
       "                }\n",
       "                if(entry.highlighted)\n",
       "                {\n",
       "                    row.classList.add(\"highlighted\")\n",
       "                }\n",
       "\n",
       "                // Adds the span entry to the table. doc_text is sanitized by replacing the reserved\n",
       "                // symbols by their entity name representations\n",
       "                row.innerHTML += `\n",
       "                <td>\n",
       "                    <div class='sa-table-controls'>\n",
       "                    <button data-control='visibility' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 576 512\"><path d=\"M572.52 241.4C518.29 135.59 410.93 64 288 64S57.68 135.64 3.48 241.41a32.35 32.35 0 0 0 0 29.19C57.71 376.41 165.07 448 288 448s230.32-71.64 284.52-177.41a32.35 32.35 0 0 0 0-29.19zM288 400a144 144 0 1 1 144-144 143.93 143.93 0 0 1-144 144zm0-240a95.31 95.31 0 0 0-25.31 3.79 47.85 47.85 0 0 1-66.9 66.9A95.78 95.78 0 1 0 288 160z\"/></svg></button>\n",
       "                    <button data-control='highlight' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 512 512\"><path d=\"M256 160c-52.9 0-96 43.1-96 96s43.1 96 96 96 96-43.1 96-96-43.1-96-96-96zm246.4 80.5l-94.7-47.3 33.5-100.4c4.5-13.6-8.4-26.5-21.9-21.9l-100.4 33.5-47.4-94.8c-6.4-12.8-24.6-12.8-31 0l-47.3 94.7L92.7 70.8c-13.6-4.5-26.5 8.4-21.9 21.9l33.5 100.4-94.7 47.4c-12.8 6.4-12.8 24.6 0 31l94.7 47.3-33.5 100.5c-4.5 13.6 8.4 26.5 21.9 21.9l100.4-33.5 47.3 94.7c6.4 12.8 24.6 12.8 31 0l47.3-94.7 100.4 33.5c13.6 4.5 26.5-8.4 21.9-21.9l-33.5-100.4 94.7-47.3c13-6.5 13-24.7.2-31.1zm-155.9 106c-49.9 49.9-131.1 49.9-181 0-49.9-49.9-49.9-131.1 0-181 49.9-49.9 131.1-49.9 181 0 49.9 49.9 49.9 131.1 0 181z\"/></svg></button>\n",
       "                    </div>\n",
       "                </td>\n",
       "                <td><b>${entry.id.toString()}</b></td>\n",
       "                <td>${entry.begin}</td>\n",
       "                <td>${entry.end}</td>\n",
       "                ${(doc.doc_token_spans != undefined) ? `<td>${doc.doc_token_spans[entry.id][0]}</td><td>${doc.doc_token_spans[entry.id][1]}</td>` : ''}\n",
       "                <td>${sanitize(doc_text.substring(entry.begin, entry.end))}</td>`\n",
       "\n",
       "                tbody.appendChild(row)\n",
       "            })\n",
       "            table.appendChild(tbody)\n",
       "            frag.appendChild(table)\n",
       "        }\n",
       "\n",
       "        // Render Text\n",
       "        let highlight_regions = []\n",
       "        for(let i = 0; i < entries.length; i++)\n",
       "        {\n",
       "            if(!entries[i].visible) continue\n",
       "            if(entries[i].valid_sets.length > 0)\n",
       "            {\n",
       "                let span = entries[i].getSetSpan();\n",
       "                let ids = [entries[i].id, ...entries[i].valid_sets.map(set => { return set.entry.id })]\n",
       "                if(entries[i].isComplex()) {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_COMPLEX, ids: ids})\n",
       "                } else {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_NESTED, ids: ids})\n",
       "                }\n",
       "                i = span.highest_id\n",
       "            } else {\n",
       "                highlight_regions.push({begin: entries[i].begin, end: entries[i].end, type: TYPE_SOLO, ids: [entries[i].id]})\n",
       "            }\n",
       "        }\n",
       "\n",
       "        let paragraph = document.createElement(\"p\")\n",
       "        if(highlight_regions.length == 0) {\n",
       "            paragraph.innerHTML = sanitize(doc_text)\n",
       "        } else {\n",
       "            let begin = 0\n",
       "            highlight_regions.forEach(region => {\n",
       "                paragraph.innerHTML += sanitize(doc_text.substring(begin, region.begin))\n",
       "\n",
       "                let mark = document.createElement(\"mark\")\n",
       "                // The data-ids tag is a list of comma-separated reference IDs for matching Spans \n",
       "                mark.setAttribute(\"data-ids\", \"\");\n",
       "                if (region.type != TYPE_NESTED) {\n",
       "                    region.ids.forEach(id => {\n",
       "                        mark.setAttribute(\"data-ids\", mark.getAttribute(\"data-ids\") + `#${id},`)\n",
       "                        if(doc.lookup_table[id].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    })\n",
       "                    mark.innerHTML = sanitize(doc_text.substring(region.begin, region.end))\n",
       "                } else {\n",
       "                    mark.setAttribute(\"data-ids\", `#${region.ids[0]},`)\n",
       "                    if(doc.lookup_table[region.ids[0]].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    let nested_begin = region.begin\n",
       "                    region.ids.slice(1).forEach(nested_id => {\n",
       "                        let nested_region = doc.lookup_table[nested_id]\n",
       "                        mark.innerHTML += sanitize(doc_text.substring(nested_begin, nested_region.begin))\n",
       "                        let nested_mark = document.createElement(\"mark\")\n",
       "                        nested_mark.setAttribute(\"data-ids\", `#${nested_id},`)\n",
       "                        if(nested_region.highlighted) nested_mark.classList.add(\"highlighted\")\n",
       "                        nested_mark.innerHTML = sanitize(doc_text.substring(nested_region.begin, nested_region.end))\n",
       "                        nested_begin = nested_region.end\n",
       "                        mark.appendChild(nested_mark)\n",
       "                    })\n",
       "                    mark.innerHTML += sanitize(doc_text.substring(nested_begin, region.end))\n",
       "                }\n",
       "\n",
       "                if(region.type == TYPE_COMPLEX) {\n",
       "                    let markTag = document.createElement(\"span\")\n",
       "                    markTag.textContent = \"Set\"\n",
       "                    markTag.classList.add(\"mark-tag\")\n",
       "                    mark.classList.add(\"complex-set\")\n",
       "                    mark.appendChild(markTag)\n",
       "                }\n",
       "\n",
       "                begin = region.end\n",
       "                paragraph.appendChild(mark)\n",
       "            })\n",
       "            paragraph.innerHTML += sanitize(doc_text.substring(highlight_regions[highlight_regions.length - 1].end, doc_text.length))\n",
       "        }\n",
       "\n",
       "        frag.appendChild(paragraph)\n",
       "\n",
       "        return frag\n",
       "    }\n",
       "\n",
       "    /** Attach hover and click events to a document render via event delegation */\n",
       "    function attachDocumentEvents(fragment, doc_object, source_spanarray) {\n",
       "        const doc_table_body = fragment.querySelector(\"table>tbody\")\n",
       "        const doc_text = fragment.querySelector(\"p\")\n",
       "\n",
       "        // Hover highlight events\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        // Click disable/enable events\n",
       "\n",
       "        doc_table_body.addEventListener(\"click\", (event) => {\n",
       "            const closest_control_button = event.target.closest(\"button[data-control]\")\n",
       "            if(closest_control_button == undefined) return\n",
       "\n",
       "            const closest_tr = event.target.closest(\"tr\")\n",
       "            if(closest_tr == undefined) return\n",
       "\n",
       "            const matching_span = doc_object.lookup_table[closest_tr.getAttribute(\"data-id\")]\n",
       "            if(matching_span == undefined) return\n",
       "\n",
       "            switch(closest_control_button.getAttribute(\"data-control\")) {\n",
       "                case \"visibility\":\n",
       "                    {\n",
       "                        matching_span.visible = !matching_span.visible\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "                case \"highlight\":\n",
       "                    {\n",
       "                        matching_span.highlighted = !matching_span.highlighted\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "            }\n",
       "\n",
       "\n",
       "\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"click\", (event) => {\n",
       "            const closest_mark = event.target.closest(\"mark\")\n",
       "            if(closest_mark == undefined) return\n",
       "\n",
       "            // Preprocess ID string into a list of IDs\n",
       "            const ids = closest_mark.getAttribute(\"data-ids\")\n",
       "                .split(\",\")\n",
       "                .slice(0, -1)\n",
       "                .map(id => {\n",
       "                    return id.substring(1)\n",
       "                })\n",
       "\n",
       "            // If any of the connected IDs are highlighted, we set all spans in the list to not highlighted.\n",
       "            // Inversely, we want all spans highlighted if none were previously.\n",
       "\n",
       "            const highlighted_entry = ids.find(id => {\n",
       "                return doc_object.lookup_table[id].highlighted\n",
       "            })\n",
       "\n",
       "            const is_highlighted = (highlighted_entry != undefined)\n",
       "\n",
       "            ids.forEach(id => {\n",
       "                const span = doc_object.lookup_table[id]\n",
       "                if(span != undefined) span.highlighted = !is_highlighted\n",
       "            })\n",
       "\n",
       "            source_spanarray.render()\n",
       "        })\n",
       "    }\n",
       "} else {\n",
       "    // SpanArray JS is already defined and not an outdated copy\n",
       "    // Replace global SpanArray CSS with latest copy IFF global stylesheet is undefined\n",
       "\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet == undefined) {\n",
       "            document.head.appendChild(local_stylesheet)\n",
       "        } else {\n",
       "            document.currentScript.parentElement.removeChild(local_stylesheet)\n",
       "        }\n",
       "    }       \n",
       "}\n",
       "}\n",
       "</script>\n",
       "<div class=\"span-array\">\n",
       "\n",
       "    <div class='document'>\n",
       "        <table style='\n",
       "            table-layout: auto;\n",
       "            overflow: hidden;\n",
       "            width: 100%;\n",
       "            border-collapse: collapse;\n",
       "            '>\n",
       "            <thead style='font-variant-caps: all-petite-caps;'>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "\n",
       "                <th style='text-align:right;width:100%'>context</th>\n",
       "            </tr></thead>\n",
       "            <tbody>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>0</b></td>\n",
       "            <td>226</td>\n",
       "            <td>229</td>\n",
       "\n",
       "            <td>Sir</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>1</b></td>\n",
       "            <td>230</td>\n",
       "            <td>235</td>\n",
       "\n",
       "            <td>Robin</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>2</b></td>\n",
       "            <td>236</td>\n",
       "            <td>239</td>\n",
       "\n",
       "            <td>the</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>3</b></td>\n",
       "            <td>240</td>\n",
       "            <td>274</td>\n",
       "\n",
       "            <td>Not-Quite-So-Brave-as-Sir-Lancelot</td>\n",
       "        </tr>\n",
       "\n",
       "            </tbody>\n",
       "        </table>\n",
       "        <p style='\n",
       "            padding: 1em;\n",
       "            line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "            '>\n",
       "\n",
       "            In AD 932, King Arthur and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Robin</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>the</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Not-Quite-So-Brave-as-Sir-Lancelot</span>\n",
       "            , and Sir Not-Appearing-in-this-Film, along with their squires and Robin&#39;s troubadours.\n",
       "        </p>\n",
       "    </div>\n",
       "\n",
       "    <span style=\"font-size: 0.8em;color: #b3b3b3;\">Your notebook viewer does not support Javascript execution. The above rendering will not be interactive.</span>\n",
       "</div>\n",
       "<script>\n",
       "    {\n",
       "        const Span = window.SpanArray.Span\n",
       "        const script_context = document.currentScript\n",
       "        const documents = []\n",
       "\n",
       "    {\n",
       "\n",
       "    const doc_spans = [[226,229],[230,235],[236,239],[240,274]]\n",
       "    const doc_text = 'In AD 932, King Arthur and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir Not-Appearing-in-this-Film, along with their squires and Robin\\'s troubadours.'\n",
       "\n",
       "        documents.push({doc_text: doc_text, doc_spans: doc_spans})\n",
       "\n",
       "    }\n",
       "\n",
       "        const instance = new window.SpanArray.SpanArray(documents, true, script_context)\n",
       "        instance.render()\n",
       "    }\n",
       "</script>\n",
       "\n"
      ],
      "text/plain": [
       "<SpanArray>\n",
       "[                               [226, 229): 'Sir',\n",
       "                              [230, 235): 'Robin',\n",
       "                                [236, 239): 'the',\n",
       " [240, 274): 'Not-Quite-So-Brave-as-Sir-Lancelot']\n",
       "Length: 4, dtype: SpanDtype"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# It can also be indexed with a slice, producing another `SpanArray`.\n",
    "toks = tokens[40:44]\n",
    "toks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[226, 229): 'Sir',\n",
       " [230, 235): 'Robin',\n",
       " [236, 239): 'the',\n",
       " [240, 274): 'Not-Quite-So-Brave-as-Sir-Lancelot']"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Iterate over the array to get each `Span`.\n",
    "toks = [span for span in tokens[40:44]]\n",
    "toks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[226, 274): 'Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot'"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Addition of `Span`s or `SpanArray`s are supported.\n",
    "# The result is the minimum `Span` that covers both `Span`s.\n",
    "result = toks[0] + toks[-1]\n",
    "result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# You can check if one `Span` contains another.\n",
    "result.contains(toks[1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Also if two `Span`s overlap.\n",
    "a = toks[0] + toks[2]\n",
    "b = toks[2] + toks[3]\n",
    "a.overlaps(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([204, 207): 'Sir', [226, 229): 'Sir')"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Get 2 `Span`s to test equality.\n",
    "sir = tokens[36]\n",
    "other_sir = tokens[40]\n",
    "sir, other_sir"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(False, True)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Equality is determined by text and offset values, not just text.\n",
    "sir == other_sir, \\\n",
    "sir.covered_text == other_sir.covered_text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Only a `Span` from the same target text with matching offsets is equal.\n",
    "sir == tp.Span(text, 204, 207)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TokenSpanArray\n",
    "\n",
    "A `TokenSpanArray` builds on a `SpanArray` with the ability to span text as indices of a `SpanArray` instead of character based offsets. This makes it convenient to use when doing analysis on the token level. Similar to `SpanArray`, a single item in a `TokenSpanArray` is a `TokenSpan`. For an example, let's define a single `TokenSpan` using the target text from above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[11, 22): 'King Arthur'"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Single `TokenSpan` to cover \"King Arthur\" - notice we begin with the third\n",
    "# token and end at the fifth.\n",
    "tp.TokenSpan(tokens, 3, 5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<style class=\"span-array-css\">\n",
       "            .span-array {\n",
       "    --thead-background-color: var(--jp-layout-color1, inherit);\n",
       "    --thead-text-color: var(--jp-ui-font-color1, inherit);\n",
       "    --tbody-background-color-1: var(--jp-layout-color1, inherit);\n",
       "    --tbody-background-color-2: var(--jp-layout-color2, inherit);\n",
       "    --tbody-background-color-hover: var(--jp-rendermime-table-row-hover-background, var(--jp-layout-color3, inherit));\n",
       "    --tbody-background-color-disabled: var(--jp-layout-color4, #ccccd1);\n",
       "    --tbody-text-color: var(--jp-ui-font-color0, inherit);\n",
       "    --tbody-text-color-disabled: var(--jp-ui-inverse-font-color0, #b3b3b9);\n",
       "    --table-font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "\n",
       "    --table-control-background: rgba(0, 0, 0, 0.2);\n",
       "    --table-control-color: var(--jp-ui-font-color0);\n",
       "    --table-control-border: 1px solid rgba(0, 0, 0, 0.8);\n",
       "    --table-control-border-radius: 0.5em;\n",
       "\n",
       "    --root-highlight: #a0c4ff;\n",
       "    --nested-highlight: #ffadad;\n",
       "    --hover-highlight: #ffd6a5;\n",
       "\n",
       "    --inverted-background-color: #0B525B;\n",
       "    --inverted-text-color: rgb(243, 243, 243);\n",
       "    --paragraph-border-color: var(--jp-layout-color2, inherit);\n",
       "\n",
       "    --fallback-font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif\n",
       "}\n",
       "\n",
       "/* Table of span offsets */\n",
       ".span-array>.document>table {\n",
       "    table-layout: auto;\n",
       "    overflow: hidden;\n",
       "    width: 100%;\n",
       "    border-collapse: collapse;\n",
       "    font-family: var(--table-font-family);\n",
       "}\n",
       "\n",
       ".span-array>.document>table thead {\n",
       "    font-variant-caps: all-petite-caps;\n",
       "}\n",
       "\n",
       ".span-array>.document>table th {\n",
       "    padding: 1em;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:last-child, .span-array>.document>table tr>th:last-child {\n",
       "    text-align: right;\n",
       "    width: 100%;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:not(tr>td:last-child), .span-array>.document>table tr>th:not(tr>th:last-child) {\n",
       "    text-align: left;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n), .span-array>.document>table tr.disabled.hover:nth-child(n) {\n",
       "    background-color: var(--tbody-background-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n)>td, .span-array>.document>table tr.disabled.hover:nth-child(n)>td {\n",
       "    color: var(--tbody-text-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.hover:not(.disabled) {\n",
       "    background: var(--jp-rendermime-table-row-hover-background);\n",
       "}\n",
       "\n",
       "/* Table control buttons */\n",
       "\n",
       ".span-array>.document>table td.sa-table-controls-container {\n",
       "    vertical-align: center;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls {\n",
       "    display: flex;\n",
       "    flex-direction: row;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button {\n",
       "    background-color: var(--table-control-background);\n",
       "    color: var(--table-control-color);\n",
       "    border: var(--table-control-border);\n",
       "    border-right: none;\n",
       "    border-radius: 0;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:first-child {\n",
       "    border-radius: var(--table-control-border-radius) 0 0 var(--table-control-border-radius);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:last-child {\n",
       "    border-radius: 0 var(--table-control-border-radius) var(--table-control-border-radius) 0;\n",
       "    border-right: var(--table-control-border);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button[data-control=\"visibility\"]:hover {\n",
       "    background-color: var(--root-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"]:hover {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.highlighted:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"] {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       "/* Styling for spans within document context */\n",
       ".span-array>.document>p {\n",
       "    border:1px solid var(--paragraph-border-color);\n",
       "    border-radius: 0.2em;\n",
       "    padding: 1em;\n",
       "    line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "    box-sizing: border-box;\n",
       "    font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "}\n",
       "\n",
       "body[data-jp-theme-light=\"false\"].span-array>.document>p {\n",
       "    border: 1px solid black;\n",
       "    background-color: var(--inverted-background-color);\n",
       "    color: var(--inverted-text-color);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    padding: 0.4em 0.4em;\n",
       "    border-radius: 0.35em;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>p .mark {\n",
       "    color: var(black);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    background-color: var(--root-highlight);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>mark {\n",
       "    background-color: var(--nested-highlight);\n",
       "    padding: 0.2em 0.4em;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.complex-set {\n",
       "    background: linear-gradient(to right, var(--root-highlight), var(--nested-highlight))\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>span.mark-tag {\n",
       "    font-weight: bolder;\n",
       "    font-size: 0.8em;\n",
       "    font-variant: small-caps;\n",
       "    font-variant-caps: all-small-caps;\n",
       "    margin-left: 8px;\n",
       "    text-transform: uppercase;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.hover, .span-array.span-array>.document>p mark>mark.hover, .span-array>.document>p mark.complex-set.hover, .span-array>.document>p mark.highlighted, .span-array>.document>p mark.complex-set.highlighted, .span-array.span-array>.document>p mark>mark.highlighted {\n",
       "    background: none;\n",
       "    background-color: var(--hover-highlight);\n",
       "}\n",
       "\n",
       "</style>\n",
       "<script>\n",
       "{\n",
       "            // Increment the version to invalidate the cached script\n",
       "const VERSION = 0.79\n",
       "const global_stylesheet = document.head.querySelector(\"style.span-array-css\")\n",
       "const local_stylesheet = document.currentScript.parentElement.querySelector(\"style.span-array-css\")\n",
       "\n",
       "if(window.SpanArray == undefined || window.SpanArray.VERSION == undefined || window.SpanArray.VERSION < VERSION) {\n",
       "\n",
       "    // Replace global SpanArray CSS with latest copy\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet != undefined) {\n",
       "            document.head.removeChild(global_stylesheet)\n",
       "        }\n",
       "        document.head.appendChild(local_stylesheet)\n",
       "    }\n",
       "\n",
       "    // Sets up the SpanArray global namespace\n",
       "    window.SpanArray = {}\n",
       "    window.SpanArray.VERSION = VERSION\n",
       "\n",
       "    window.SpanArray.TYPE_OVERLAP = 0;\n",
       "    window.SpanArray.TYPE_NESTED = 1;\n",
       "    window.SpanArray.TYPE_COMPLEX = 2;\n",
       "    window.SpanArray.TYPE_SOLO = 3;\n",
       "\n",
       "    const TYPE_OVERLAP = window.SpanArray.TYPE_OVERLAP;\n",
       "    const TYPE_NESTED = window.SpanArray.TYPE_NESTED;\n",
       "    const TYPE_COMPLEX = window.SpanArray.TYPE_COMPLEX;\n",
       "    const TYPE_SOLO = window.SpanArray.TYPE_SOLO;\n",
       "\n",
       "    function sanitize(input) {\n",
       "        let out = input.slice();\n",
       "        out = out.replace(/&/g, \"&amp;\")\n",
       "        out = out.replace(/</g, \"&lt;\")\n",
       "        out = out.replace(/>/g, \"&gt;\")\n",
       "        out = out.replace(/\\$/g, \"<span>&#36;</span>\")\n",
       "        out = out.replace(/\"/g, \"&quot;\")\n",
       "        out = out.replace(/(\\r|\\n)/g, \"<br>\")\n",
       "        return out;\n",
       "    }\n",
       "\n",
       "    /** Comparison function used to sort SpanArrays by position and length\n",
       "     *  Will sort by primarily by earliest beginning point. On a tie, will prioritize latest end point (first and largest)\n",
       "     *  Used by mark relationship algorithm.\n",
       "     */\n",
       "    function compareSpanArrays(a, b) {\n",
       "        const start_diff = a.begin - b.begin\n",
       "        if(start_diff == 0) {\n",
       "            return b.end - a.end\n",
       "        }\n",
       "        return start_diff;\n",
       "    }\n",
       "\n",
       "    /** Models an instance of a SpanArray, with document-separated spans and text\n",
       "     * NOTE: Using docs instead of documents to avoid unintentionally manipulating the global 'document' object.\n",
       "    */\n",
       "    class SpanArray {\n",
       "        constructor(docs, show_offsets, script_context) {\n",
       "            this.docs = docs\n",
       "            this.show_offsets = show_offsets\n",
       "            this.script_context = script_context\n",
       "\n",
       "            // For each doc, generate a lookup map for quick ID access\n",
       "            this.docs = this.docs.map(doc => {\n",
       "                doc.span_objects = Span.arrayFromSpanArray(doc.doc_spans)\n",
       "                doc.lookup_table = {}\n",
       "                doc.span_objects.forEach(span => {\n",
       "                    doc.lookup_table[span.id] = span\n",
       "                })\n",
       "                return doc\n",
       "            })\n",
       "        }\n",
       "\n",
       "        render() {\n",
       "            let span_array_frag = document.createDocumentFragment()\n",
       "            // For each document, create a document fragment and append to a document container\n",
       "            for(let doc_index = 0; doc_index < this.docs.length; doc_index++)\n",
       "            {\n",
       "                let doc = this.docs[doc_index]\n",
       "                let doc_container = document.createElement(\"div\")\n",
       "                // Using the data-doc-id attribute allows a selector to access a document's render by its index\n",
       "                doc_container.setAttribute(\"data-doc-id\", doc_index)\n",
       "                doc_container.classList.add(\"document\")\n",
       "                const document_fragment = getDocumentFragment(doc, this.show_offsets)\n",
       "                if(this.show_offsets) {\n",
       "                    attachDocumentEvents(document_fragment, doc, this)\n",
       "                }\n",
       "                doc_container.appendChild(document_fragment)\n",
       "                span_array_frag.appendChild(doc_container)\n",
       "            }\n",
       "            let container = this.script_context.parentElement.querySelector(\".span-array\")\n",
       "            if(container != undefined) {\n",
       "                container.innerHTML = \"\"\n",
       "                container.appendChild(span_array_frag)\n",
       "            } else {\n",
       "                console.error(\"No container found for SpanArray renderer\")\n",
       "            }\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.SpanArray = SpanArray\n",
       "\n",
       "\n",
       "    /** Models an instance of a Span and its relationship to other spans in the document */\n",
       "    class Span {\n",
       "\n",
       "        // Creates an ordered list of entries from a list of spans with struct [begin, end]\n",
       "        static arrayFromSpanArray(spanArray) {\n",
       "            let entries = []\n",
       "            let span;\n",
       "            for(let i = 0; i < spanArray.length; i++)\n",
       "            {\n",
       "                span = spanArray[i];\n",
       "                entries.push(new Span(i, span[0], span[1]))\n",
       "            }\n",
       "\n",
       "            entries = entries.sort(compareSpanArrays)\n",
       "\n",
       "            let set;\n",
       "            for(let i = 0; i < entries.length; i++) {\n",
       "                for(let j = i+1; j < entries.length && entries[j].begin < entries[i].end; j++) {\n",
       "                    if(entries[j].end <= entries[i].end) {\n",
       "                        set = {type: TYPE_NESTED, entry: entries[j]}\n",
       "                    } else {\n",
       "                        set = {type: TYPE_OVERLAP, entry: entries[j]}\n",
       "                    }\n",
       "                    entries[i].sets.push(set)\n",
       "                }\n",
       "            }\n",
       "\n",
       "            return entries\n",
       "        }\n",
       "\n",
       "        constructor(id, begin, end) {\n",
       "            this.id = id\n",
       "            this.begin = begin\n",
       "            this.end = end\n",
       "            this.sets = []\n",
       "            this.visible = true\n",
       "            this.highlighted = false\n",
       "        }\n",
       "\n",
       "        // Returns only visible sets\n",
       "        get valid_sets() {\n",
       "            let valid_sets = []\n",
       "\n",
       "            this.sets.forEach(set => {\n",
       "                if(set.entry.visible) valid_sets.push(set)\n",
       "            })\n",
       "\n",
       "            return valid_sets\n",
       "        }\n",
       "\n",
       "        // Returns true if mark should render as a compound set of spans\n",
       "        isComplex() {\n",
       "            for(let i = 0; i < this.valid_sets.length; i++) {\n",
       "                let otherMember = this.valid_sets[i].entry;\n",
       "                if(this.valid_sets[i].type == TYPE_OVERLAP && otherMember.visible) {\n",
       "                    return true;\n",
       "                } else {\n",
       "                    if(otherMember.valid_sets.length > 0 && otherMember.visible) {\n",
       "                        return true;\n",
       "                    }\n",
       "                }\n",
       "            }\n",
       "            return false;\n",
       "        }\n",
       "\n",
       "        // Gets the combined span of all connected elements\n",
       "        getSetSpan() {\n",
       "            let begin = this.begin\n",
       "            let end = this.end\n",
       "            let highest_id = this.id\n",
       "\n",
       "            this.valid_sets.forEach(set => {\n",
       "                let other = set.entry.getSetSpan()\n",
       "                if(other.begin < begin) begin = other.begin\n",
       "                if(other.end > end) end = other.end\n",
       "                if(other.highest_id > highest_id) highest_id = other.highest_id\n",
       "            })\n",
       "\n",
       "            return {begin: begin, end: end, highest_id: highest_id}\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.Span = Span\n",
       "\n",
       "    // Get the DocumentFragment for a single document \n",
       "    function getDocumentFragment(doc, show_offsets) {\n",
       "\n",
       "        const doc_text = doc.doc_text;\n",
       "        const entries = doc.span_objects;\n",
       "\n",
       "        let frag = document.createDocumentFragment()\n",
       "\n",
       "        // Render Table\n",
       "        if(show_offsets) {\n",
       "            let table = document.createElement(\"table\")\n",
       "            table.innerHTML = `\n",
       "            <thead>\n",
       "            <tr>\n",
       "                <th></th>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                ${(doc['doc_token_spans'] != undefined) ? '<th>begin token</th> <th>end token</th>' : ''}\n",
       "                <th>context</th>\n",
       "            </tr>\n",
       "            </thead>`\n",
       "            let tbody = document.createElement(\"tbody\")\n",
       "            entries.forEach(entry => {\n",
       "                let row = document.createElement(\"tr\")\n",
       "                row.setAttribute(\"data-id\", entry.id.toString())\n",
       "                if(!entry.visible)\n",
       "                {\n",
       "                    row.classList.add(\"disabled\")\n",
       "                }\n",
       "                if(entry.highlighted)\n",
       "                {\n",
       "                    row.classList.add(\"highlighted\")\n",
       "                }\n",
       "\n",
       "                // Adds the span entry to the table. doc_text is sanitized by replacing the reserved\n",
       "                // symbols by their entity name representations\n",
       "                row.innerHTML += `\n",
       "                <td>\n",
       "                    <div class='sa-table-controls'>\n",
       "                    <button data-control='visibility' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 576 512\"><path d=\"M572.52 241.4C518.29 135.59 410.93 64 288 64S57.68 135.64 3.48 241.41a32.35 32.35 0 0 0 0 29.19C57.71 376.41 165.07 448 288 448s230.32-71.64 284.52-177.41a32.35 32.35 0 0 0 0-29.19zM288 400a144 144 0 1 1 144-144 143.93 143.93 0 0 1-144 144zm0-240a95.31 95.31 0 0 0-25.31 3.79 47.85 47.85 0 0 1-66.9 66.9A95.78 95.78 0 1 0 288 160z\"/></svg></button>\n",
       "                    <button data-control='highlight' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 512 512\"><path d=\"M256 160c-52.9 0-96 43.1-96 96s43.1 96 96 96 96-43.1 96-96-43.1-96-96-96zm246.4 80.5l-94.7-47.3 33.5-100.4c4.5-13.6-8.4-26.5-21.9-21.9l-100.4 33.5-47.4-94.8c-6.4-12.8-24.6-12.8-31 0l-47.3 94.7L92.7 70.8c-13.6-4.5-26.5 8.4-21.9 21.9l33.5 100.4-94.7 47.4c-12.8 6.4-12.8 24.6 0 31l94.7 47.3-33.5 100.5c-4.5 13.6 8.4 26.5 21.9 21.9l100.4-33.5 47.3 94.7c6.4 12.8 24.6 12.8 31 0l47.3-94.7 100.4 33.5c13.6 4.5 26.5-8.4 21.9-21.9l-33.5-100.4 94.7-47.3c13-6.5 13-24.7.2-31.1zm-155.9 106c-49.9 49.9-131.1 49.9-181 0-49.9-49.9-49.9-131.1 0-181 49.9-49.9 131.1-49.9 181 0 49.9 49.9 49.9 131.1 0 181z\"/></svg></button>\n",
       "                    </div>\n",
       "                </td>\n",
       "                <td><b>${entry.id.toString()}</b></td>\n",
       "                <td>${entry.begin}</td>\n",
       "                <td>${entry.end}</td>\n",
       "                ${(doc.doc_token_spans != undefined) ? `<td>${doc.doc_token_spans[entry.id][0]}</td><td>${doc.doc_token_spans[entry.id][1]}</td>` : ''}\n",
       "                <td>${sanitize(doc_text.substring(entry.begin, entry.end))}</td>`\n",
       "\n",
       "                tbody.appendChild(row)\n",
       "            })\n",
       "            table.appendChild(tbody)\n",
       "            frag.appendChild(table)\n",
       "        }\n",
       "\n",
       "        // Render Text\n",
       "        let highlight_regions = []\n",
       "        for(let i = 0; i < entries.length; i++)\n",
       "        {\n",
       "            if(!entries[i].visible) continue\n",
       "            if(entries[i].valid_sets.length > 0)\n",
       "            {\n",
       "                let span = entries[i].getSetSpan();\n",
       "                let ids = [entries[i].id, ...entries[i].valid_sets.map(set => { return set.entry.id })]\n",
       "                if(entries[i].isComplex()) {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_COMPLEX, ids: ids})\n",
       "                } else {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_NESTED, ids: ids})\n",
       "                }\n",
       "                i = span.highest_id\n",
       "            } else {\n",
       "                highlight_regions.push({begin: entries[i].begin, end: entries[i].end, type: TYPE_SOLO, ids: [entries[i].id]})\n",
       "            }\n",
       "        }\n",
       "\n",
       "        let paragraph = document.createElement(\"p\")\n",
       "        if(highlight_regions.length == 0) {\n",
       "            paragraph.innerHTML = sanitize(doc_text)\n",
       "        } else {\n",
       "            let begin = 0\n",
       "            highlight_regions.forEach(region => {\n",
       "                paragraph.innerHTML += sanitize(doc_text.substring(begin, region.begin))\n",
       "\n",
       "                let mark = document.createElement(\"mark\")\n",
       "                // The data-ids tag is a list of comma-separated reference IDs for matching Spans \n",
       "                mark.setAttribute(\"data-ids\", \"\");\n",
       "                if (region.type != TYPE_NESTED) {\n",
       "                    region.ids.forEach(id => {\n",
       "                        mark.setAttribute(\"data-ids\", mark.getAttribute(\"data-ids\") + `#${id},`)\n",
       "                        if(doc.lookup_table[id].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    })\n",
       "                    mark.innerHTML = sanitize(doc_text.substring(region.begin, region.end))\n",
       "                } else {\n",
       "                    mark.setAttribute(\"data-ids\", `#${region.ids[0]},`)\n",
       "                    if(doc.lookup_table[region.ids[0]].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    let nested_begin = region.begin\n",
       "                    region.ids.slice(1).forEach(nested_id => {\n",
       "                        let nested_region = doc.lookup_table[nested_id]\n",
       "                        mark.innerHTML += sanitize(doc_text.substring(nested_begin, nested_region.begin))\n",
       "                        let nested_mark = document.createElement(\"mark\")\n",
       "                        nested_mark.setAttribute(\"data-ids\", `#${nested_id},`)\n",
       "                        if(nested_region.highlighted) nested_mark.classList.add(\"highlighted\")\n",
       "                        nested_mark.innerHTML = sanitize(doc_text.substring(nested_region.begin, nested_region.end))\n",
       "                        nested_begin = nested_region.end\n",
       "                        mark.appendChild(nested_mark)\n",
       "                    })\n",
       "                    mark.innerHTML += sanitize(doc_text.substring(nested_begin, region.end))\n",
       "                }\n",
       "\n",
       "                if(region.type == TYPE_COMPLEX) {\n",
       "                    let markTag = document.createElement(\"span\")\n",
       "                    markTag.textContent = \"Set\"\n",
       "                    markTag.classList.add(\"mark-tag\")\n",
       "                    mark.classList.add(\"complex-set\")\n",
       "                    mark.appendChild(markTag)\n",
       "                }\n",
       "\n",
       "                begin = region.end\n",
       "                paragraph.appendChild(mark)\n",
       "            })\n",
       "            paragraph.innerHTML += sanitize(doc_text.substring(highlight_regions[highlight_regions.length - 1].end, doc_text.length))\n",
       "        }\n",
       "\n",
       "        frag.appendChild(paragraph)\n",
       "\n",
       "        return frag\n",
       "    }\n",
       "\n",
       "    /** Attach hover and click events to a document render via event delegation */\n",
       "    function attachDocumentEvents(fragment, doc_object, source_spanarray) {\n",
       "        const doc_table_body = fragment.querySelector(\"table>tbody\")\n",
       "        const doc_text = fragment.querySelector(\"p\")\n",
       "\n",
       "        // Hover highlight events\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        // Click disable/enable events\n",
       "\n",
       "        doc_table_body.addEventListener(\"click\", (event) => {\n",
       "            const closest_control_button = event.target.closest(\"button[data-control]\")\n",
       "            if(closest_control_button == undefined) return\n",
       "\n",
       "            const closest_tr = event.target.closest(\"tr\")\n",
       "            if(closest_tr == undefined) return\n",
       "\n",
       "            const matching_span = doc_object.lookup_table[closest_tr.getAttribute(\"data-id\")]\n",
       "            if(matching_span == undefined) return\n",
       "\n",
       "            switch(closest_control_button.getAttribute(\"data-control\")) {\n",
       "                case \"visibility\":\n",
       "                    {\n",
       "                        matching_span.visible = !matching_span.visible\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "                case \"highlight\":\n",
       "                    {\n",
       "                        matching_span.highlighted = !matching_span.highlighted\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "            }\n",
       "\n",
       "\n",
       "\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"click\", (event) => {\n",
       "            const closest_mark = event.target.closest(\"mark\")\n",
       "            if(closest_mark == undefined) return\n",
       "\n",
       "            // Preprocess ID string into a list of IDs\n",
       "            const ids = closest_mark.getAttribute(\"data-ids\")\n",
       "                .split(\",\")\n",
       "                .slice(0, -1)\n",
       "                .map(id => {\n",
       "                    return id.substring(1)\n",
       "                })\n",
       "\n",
       "            // If any of the connected IDs are highlighted, we set all spans in the list to not highlighted.\n",
       "            // Inversely, we want all spans highlighted if none were previously.\n",
       "\n",
       "            const highlighted_entry = ids.find(id => {\n",
       "                return doc_object.lookup_table[id].highlighted\n",
       "            })\n",
       "\n",
       "            const is_highlighted = (highlighted_entry != undefined)\n",
       "\n",
       "            ids.forEach(id => {\n",
       "                const span = doc_object.lookup_table[id]\n",
       "                if(span != undefined) span.highlighted = !is_highlighted\n",
       "            })\n",
       "\n",
       "            source_spanarray.render()\n",
       "        })\n",
       "    }\n",
       "} else {\n",
       "    // SpanArray JS is already defined and not an outdated copy\n",
       "    // Replace global SpanArray CSS with latest copy IFF global stylesheet is undefined\n",
       "\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet == undefined) {\n",
       "            document.head.appendChild(local_stylesheet)\n",
       "        } else {\n",
       "            document.currentScript.parentElement.removeChild(local_stylesheet)\n",
       "        }\n",
       "    }       \n",
       "}\n",
       "}\n",
       "</script>\n",
       "<div class=\"span-array\">\n",
       "\n",
       "    <div class='document'>\n",
       "        <table style='\n",
       "            table-layout: auto;\n",
       "            overflow: hidden;\n",
       "            width: 100%;\n",
       "            border-collapse: collapse;\n",
       "            '>\n",
       "            <thead style='font-variant-caps: all-petite-caps;'>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                <th>begin token</th><th>end token</th>\n",
       "                <th style='text-align:right;width:100%'>context</th>\n",
       "            </tr></thead>\n",
       "            <tbody>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>0</b></td>\n",
       "            <td>11</td>\n",
       "            <td>22</td>\n",
       "\n",
       "            <td>3</td>\n",
       "            <td>5</td>\n",
       "\n",
       "            <td>King Arthur</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>1</b></td>\n",
       "            <td>39</td>\n",
       "            <td>44</td>\n",
       "\n",
       "            <td>8</td>\n",
       "            <td>9</td>\n",
       "\n",
       "            <td>Patsy</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>2</b></td>\n",
       "            <td>157</td>\n",
       "            <td>178</td>\n",
       "\n",
       "            <td>28</td>\n",
       "            <td>32</td>\n",
       "\n",
       "            <td>Sir Bedevere the Wise</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>3</b></td>\n",
       "            <td>180</td>\n",
       "            <td>202</td>\n",
       "\n",
       "            <td>32</td>\n",
       "            <td>36</td>\n",
       "\n",
       "            <td>Sir Lancelot the Brave</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>4</b></td>\n",
       "            <td>204</td>\n",
       "            <td>224</td>\n",
       "\n",
       "            <td>36</td>\n",
       "            <td>40</td>\n",
       "\n",
       "            <td>Sir Galahad the Pure</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>5</b></td>\n",
       "            <td>226</td>\n",
       "            <td>274</td>\n",
       "\n",
       "            <td>40</td>\n",
       "            <td>44</td>\n",
       "\n",
       "            <td>Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>6</b></td>\n",
       "            <td>280</td>\n",
       "            <td>310</td>\n",
       "\n",
       "            <td>45</td>\n",
       "            <td>47</td>\n",
       "\n",
       "            <td>Sir Not-Appearing-in-this-Film</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>7</b></td>\n",
       "            <td>341</td>\n",
       "            <td>348</td>\n",
       "\n",
       "            <td>52</td>\n",
       "            <td>53</td>\n",
       "\n",
       "            <td>Robin&#39;s</td>\n",
       "        </tr>\n",
       "\n",
       "            </tbody>\n",
       "        </table>\n",
       "        <p style='\n",
       "            padding: 1em;\n",
       "            line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "            '>\n",
       "\n",
       "            In AD 932, \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>King Arthur</span>\n",
       "\n",
       "             and his squire, \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Patsy</span>\n",
       "\n",
       "            , travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir Bedevere the Wise</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir Lancelot the Brave</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir Galahad the Pure</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot</span>\n",
       "\n",
       "            , and \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Sir Not-Appearing-in-this-Film</span>\n",
       "\n",
       "            , along with their squires and \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Robin&#39;s</span>\n",
       "             troubadours.\n",
       "        </p>\n",
       "    </div>\n",
       "\n",
       "    <span style=\"font-size: 0.8em;color: #b3b3b3;\">Your notebook viewer does not support Javascript execution. The above rendering will not be interactive.</span>\n",
       "</div>\n",
       "<script>\n",
       "    {\n",
       "        const Span = window.SpanArray.Span\n",
       "        const script_context = document.currentScript\n",
       "        const documents = []\n",
       "\n",
       "    {\n",
       "\n",
       "    const doc_spans = [[11,22],[39,44],[157,178],[180,202],[204,224],[226,274],[280,310],[341,348]]\n",
       "    const doc_text = 'In AD 932, King Arthur and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir Not-Appearing-in-this-Film, along with their squires and Robin\\'s troubadours.'\n",
       "\n",
       "        const doc_token_spans = [[3,5],[8,9],[28,32],[32,36],[36,40],[40,44],[45,47],[52,53]]\n",
       "        documents.push({doc_text: doc_text, doc_spans: doc_spans, doc_token_spans: doc_token_spans})\n",
       "\n",
       "    }\n",
       "\n",
       "        const instance = new window.SpanArray.SpanArray(documents, true, script_context)\n",
       "        instance.render()\n",
       "    }\n",
       "</script>\n",
       "\n"
      ],
      "text/plain": [
       "<TokenSpanArray>\n",
       "[                                       [11, 22): 'King Arthur',\n",
       "                                              [39, 44): 'Patsy',\n",
       "                            [157, 178): 'Sir Bedevere the Wise',\n",
       "                           [180, 202): 'Sir Lancelot the Brave',\n",
       "                             [204, 224): 'Sir Galahad the Pure',\n",
       " [226, 274): 'Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot',\n",
       "                   [280, 310): 'Sir Not-Appearing-in-this-Film',\n",
       "                                          [341, 348): 'Robin's']\n",
       "Length: 8, dtype: TokenSpanDtype"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# We can also make a `TokenSpanArray` with a list of begin and end offsets of\n",
    "# measured in tokens. Here we make spans of the names within the target text.\n",
    "begin_tokens = [3, 8, 28, 32, 36, 40, 45, 52]\n",
    "end_tokens =   [5, 9, 32, 36, 40, 44, 47, 53]\n",
    "token_spans = tp.TokenSpanArray(tokens, begin_tokens, end_tokens)\n",
    "token_spans"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<style class=\"span-array-css\">\n",
       "            .span-array {\n",
       "    --thead-background-color: var(--jp-layout-color1, inherit);\n",
       "    --thead-text-color: var(--jp-ui-font-color1, inherit);\n",
       "    --tbody-background-color-1: var(--jp-layout-color1, inherit);\n",
       "    --tbody-background-color-2: var(--jp-layout-color2, inherit);\n",
       "    --tbody-background-color-hover: var(--jp-rendermime-table-row-hover-background, var(--jp-layout-color3, inherit));\n",
       "    --tbody-background-color-disabled: var(--jp-layout-color4, #ccccd1);\n",
       "    --tbody-text-color: var(--jp-ui-font-color0, inherit);\n",
       "    --tbody-text-color-disabled: var(--jp-ui-inverse-font-color0, #b3b3b9);\n",
       "    --table-font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "\n",
       "    --table-control-background: rgba(0, 0, 0, 0.2);\n",
       "    --table-control-color: var(--jp-ui-font-color0);\n",
       "    --table-control-border: 1px solid rgba(0, 0, 0, 0.8);\n",
       "    --table-control-border-radius: 0.5em;\n",
       "\n",
       "    --root-highlight: #a0c4ff;\n",
       "    --nested-highlight: #ffadad;\n",
       "    --hover-highlight: #ffd6a5;\n",
       "\n",
       "    --inverted-background-color: #0B525B;\n",
       "    --inverted-text-color: rgb(243, 243, 243);\n",
       "    --paragraph-border-color: var(--jp-layout-color2, inherit);\n",
       "\n",
       "    --fallback-font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif\n",
       "}\n",
       "\n",
       "/* Table of span offsets */\n",
       ".span-array>.document>table {\n",
       "    table-layout: auto;\n",
       "    overflow: hidden;\n",
       "    width: 100%;\n",
       "    border-collapse: collapse;\n",
       "    font-family: var(--table-font-family);\n",
       "}\n",
       "\n",
       ".span-array>.document>table thead {\n",
       "    font-variant-caps: all-petite-caps;\n",
       "}\n",
       "\n",
       ".span-array>.document>table th {\n",
       "    padding: 1em;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:last-child, .span-array>.document>table tr>th:last-child {\n",
       "    text-align: right;\n",
       "    width: 100%;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:not(tr>td:last-child), .span-array>.document>table tr>th:not(tr>th:last-child) {\n",
       "    text-align: left;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n), .span-array>.document>table tr.disabled.hover:nth-child(n) {\n",
       "    background-color: var(--tbody-background-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n)>td, .span-array>.document>table tr.disabled.hover:nth-child(n)>td {\n",
       "    color: var(--tbody-text-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.hover:not(.disabled) {\n",
       "    background: var(--jp-rendermime-table-row-hover-background);\n",
       "}\n",
       "\n",
       "/* Table control buttons */\n",
       "\n",
       ".span-array>.document>table td.sa-table-controls-container {\n",
       "    vertical-align: center;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls {\n",
       "    display: flex;\n",
       "    flex-direction: row;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button {\n",
       "    background-color: var(--table-control-background);\n",
       "    color: var(--table-control-color);\n",
       "    border: var(--table-control-border);\n",
       "    border-right: none;\n",
       "    border-radius: 0;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:first-child {\n",
       "    border-radius: var(--table-control-border-radius) 0 0 var(--table-control-border-radius);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:last-child {\n",
       "    border-radius: 0 var(--table-control-border-radius) var(--table-control-border-radius) 0;\n",
       "    border-right: var(--table-control-border);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button[data-control=\"visibility\"]:hover {\n",
       "    background-color: var(--root-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"]:hover {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.highlighted:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"] {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       "/* Styling for spans within document context */\n",
       ".span-array>.document>p {\n",
       "    border:1px solid var(--paragraph-border-color);\n",
       "    border-radius: 0.2em;\n",
       "    padding: 1em;\n",
       "    line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "    box-sizing: border-box;\n",
       "    font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "}\n",
       "\n",
       "body[data-jp-theme-light=\"false\"].span-array>.document>p {\n",
       "    border: 1px solid black;\n",
       "    background-color: var(--inverted-background-color);\n",
       "    color: var(--inverted-text-color);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    padding: 0.4em 0.4em;\n",
       "    border-radius: 0.35em;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>p .mark {\n",
       "    color: var(black);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    background-color: var(--root-highlight);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>mark {\n",
       "    background-color: var(--nested-highlight);\n",
       "    padding: 0.2em 0.4em;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.complex-set {\n",
       "    background: linear-gradient(to right, var(--root-highlight), var(--nested-highlight))\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>span.mark-tag {\n",
       "    font-weight: bolder;\n",
       "    font-size: 0.8em;\n",
       "    font-variant: small-caps;\n",
       "    font-variant-caps: all-small-caps;\n",
       "    margin-left: 8px;\n",
       "    text-transform: uppercase;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.hover, .span-array.span-array>.document>p mark>mark.hover, .span-array>.document>p mark.complex-set.hover, .span-array>.document>p mark.highlighted, .span-array>.document>p mark.complex-set.highlighted, .span-array.span-array>.document>p mark>mark.highlighted {\n",
       "    background: none;\n",
       "    background-color: var(--hover-highlight);\n",
       "}\n",
       "\n",
       "</style>\n",
       "<script>\n",
       "{\n",
       "            // Increment the version to invalidate the cached script\n",
       "const VERSION = 0.79\n",
       "const global_stylesheet = document.head.querySelector(\"style.span-array-css\")\n",
       "const local_stylesheet = document.currentScript.parentElement.querySelector(\"style.span-array-css\")\n",
       "\n",
       "if(window.SpanArray == undefined || window.SpanArray.VERSION == undefined || window.SpanArray.VERSION < VERSION) {\n",
       "\n",
       "    // Replace global SpanArray CSS with latest copy\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet != undefined) {\n",
       "            document.head.removeChild(global_stylesheet)\n",
       "        }\n",
       "        document.head.appendChild(local_stylesheet)\n",
       "    }\n",
       "\n",
       "    // Sets up the SpanArray global namespace\n",
       "    window.SpanArray = {}\n",
       "    window.SpanArray.VERSION = VERSION\n",
       "\n",
       "    window.SpanArray.TYPE_OVERLAP = 0;\n",
       "    window.SpanArray.TYPE_NESTED = 1;\n",
       "    window.SpanArray.TYPE_COMPLEX = 2;\n",
       "    window.SpanArray.TYPE_SOLO = 3;\n",
       "\n",
       "    const TYPE_OVERLAP = window.SpanArray.TYPE_OVERLAP;\n",
       "    const TYPE_NESTED = window.SpanArray.TYPE_NESTED;\n",
       "    const TYPE_COMPLEX = window.SpanArray.TYPE_COMPLEX;\n",
       "    const TYPE_SOLO = window.SpanArray.TYPE_SOLO;\n",
       "\n",
       "    function sanitize(input) {\n",
       "        let out = input.slice();\n",
       "        out = out.replace(/&/g, \"&amp;\")\n",
       "        out = out.replace(/</g, \"&lt;\")\n",
       "        out = out.replace(/>/g, \"&gt;\")\n",
       "        out = out.replace(/\\$/g, \"<span>&#36;</span>\")\n",
       "        out = out.replace(/\"/g, \"&quot;\")\n",
       "        out = out.replace(/(\\r|\\n)/g, \"<br>\")\n",
       "        return out;\n",
       "    }\n",
       "\n",
       "    /** Comparison function used to sort SpanArrays by position and length\n",
       "     *  Will sort by primarily by earliest beginning point. On a tie, will prioritize latest end point (first and largest)\n",
       "     *  Used by mark relationship algorithm.\n",
       "     */\n",
       "    function compareSpanArrays(a, b) {\n",
       "        const start_diff = a.begin - b.begin\n",
       "        if(start_diff == 0) {\n",
       "            return b.end - a.end\n",
       "        }\n",
       "        return start_diff;\n",
       "    }\n",
       "\n",
       "    /** Models an instance of a SpanArray, with document-separated spans and text\n",
       "     * NOTE: Using docs instead of documents to avoid unintentionally manipulating the global 'document' object.\n",
       "    */\n",
       "    class SpanArray {\n",
       "        constructor(docs, show_offsets, script_context) {\n",
       "            this.docs = docs\n",
       "            this.show_offsets = show_offsets\n",
       "            this.script_context = script_context\n",
       "\n",
       "            // For each doc, generate a lookup map for quick ID access\n",
       "            this.docs = this.docs.map(doc => {\n",
       "                doc.span_objects = Span.arrayFromSpanArray(doc.doc_spans)\n",
       "                doc.lookup_table = {}\n",
       "                doc.span_objects.forEach(span => {\n",
       "                    doc.lookup_table[span.id] = span\n",
       "                })\n",
       "                return doc\n",
       "            })\n",
       "        }\n",
       "\n",
       "        render() {\n",
       "            let span_array_frag = document.createDocumentFragment()\n",
       "            // For each document, create a document fragment and append to a document container\n",
       "            for(let doc_index = 0; doc_index < this.docs.length; doc_index++)\n",
       "            {\n",
       "                let doc = this.docs[doc_index]\n",
       "                let doc_container = document.createElement(\"div\")\n",
       "                // Using the data-doc-id attribute allows a selector to access a document's render by its index\n",
       "                doc_container.setAttribute(\"data-doc-id\", doc_index)\n",
       "                doc_container.classList.add(\"document\")\n",
       "                const document_fragment = getDocumentFragment(doc, this.show_offsets)\n",
       "                if(this.show_offsets) {\n",
       "                    attachDocumentEvents(document_fragment, doc, this)\n",
       "                }\n",
       "                doc_container.appendChild(document_fragment)\n",
       "                span_array_frag.appendChild(doc_container)\n",
       "            }\n",
       "            let container = this.script_context.parentElement.querySelector(\".span-array\")\n",
       "            if(container != undefined) {\n",
       "                container.innerHTML = \"\"\n",
       "                container.appendChild(span_array_frag)\n",
       "            } else {\n",
       "                console.error(\"No container found for SpanArray renderer\")\n",
       "            }\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.SpanArray = SpanArray\n",
       "\n",
       "\n",
       "    /** Models an instance of a Span and its relationship to other spans in the document */\n",
       "    class Span {\n",
       "\n",
       "        // Creates an ordered list of entries from a list of spans with struct [begin, end]\n",
       "        static arrayFromSpanArray(spanArray) {\n",
       "            let entries = []\n",
       "            let span;\n",
       "            for(let i = 0; i < spanArray.length; i++)\n",
       "            {\n",
       "                span = spanArray[i];\n",
       "                entries.push(new Span(i, span[0], span[1]))\n",
       "            }\n",
       "\n",
       "            entries = entries.sort(compareSpanArrays)\n",
       "\n",
       "            let set;\n",
       "            for(let i = 0; i < entries.length; i++) {\n",
       "                for(let j = i+1; j < entries.length && entries[j].begin < entries[i].end; j++) {\n",
       "                    if(entries[j].end <= entries[i].end) {\n",
       "                        set = {type: TYPE_NESTED, entry: entries[j]}\n",
       "                    } else {\n",
       "                        set = {type: TYPE_OVERLAP, entry: entries[j]}\n",
       "                    }\n",
       "                    entries[i].sets.push(set)\n",
       "                }\n",
       "            }\n",
       "\n",
       "            return entries\n",
       "        }\n",
       "\n",
       "        constructor(id, begin, end) {\n",
       "            this.id = id\n",
       "            this.begin = begin\n",
       "            this.end = end\n",
       "            this.sets = []\n",
       "            this.visible = true\n",
       "            this.highlighted = false\n",
       "        }\n",
       "\n",
       "        // Returns only visible sets\n",
       "        get valid_sets() {\n",
       "            let valid_sets = []\n",
       "\n",
       "            this.sets.forEach(set => {\n",
       "                if(set.entry.visible) valid_sets.push(set)\n",
       "            })\n",
       "\n",
       "            return valid_sets\n",
       "        }\n",
       "\n",
       "        // Returns true if mark should render as a compound set of spans\n",
       "        isComplex() {\n",
       "            for(let i = 0; i < this.valid_sets.length; i++) {\n",
       "                let otherMember = this.valid_sets[i].entry;\n",
       "                if(this.valid_sets[i].type == TYPE_OVERLAP && otherMember.visible) {\n",
       "                    return true;\n",
       "                } else {\n",
       "                    if(otherMember.valid_sets.length > 0 && otherMember.visible) {\n",
       "                        return true;\n",
       "                    }\n",
       "                }\n",
       "            }\n",
       "            return false;\n",
       "        }\n",
       "\n",
       "        // Gets the combined span of all connected elements\n",
       "        getSetSpan() {\n",
       "            let begin = this.begin\n",
       "            let end = this.end\n",
       "            let highest_id = this.id\n",
       "\n",
       "            this.valid_sets.forEach(set => {\n",
       "                let other = set.entry.getSetSpan()\n",
       "                if(other.begin < begin) begin = other.begin\n",
       "                if(other.end > end) end = other.end\n",
       "                if(other.highest_id > highest_id) highest_id = other.highest_id\n",
       "            })\n",
       "\n",
       "            return {begin: begin, end: end, highest_id: highest_id}\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.Span = Span\n",
       "\n",
       "    // Get the DocumentFragment for a single document \n",
       "    function getDocumentFragment(doc, show_offsets) {\n",
       "\n",
       "        const doc_text = doc.doc_text;\n",
       "        const entries = doc.span_objects;\n",
       "\n",
       "        let frag = document.createDocumentFragment()\n",
       "\n",
       "        // Render Table\n",
       "        if(show_offsets) {\n",
       "            let table = document.createElement(\"table\")\n",
       "            table.innerHTML = `\n",
       "            <thead>\n",
       "            <tr>\n",
       "                <th></th>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                ${(doc['doc_token_spans'] != undefined) ? '<th>begin token</th> <th>end token</th>' : ''}\n",
       "                <th>context</th>\n",
       "            </tr>\n",
       "            </thead>`\n",
       "            let tbody = document.createElement(\"tbody\")\n",
       "            entries.forEach(entry => {\n",
       "                let row = document.createElement(\"tr\")\n",
       "                row.setAttribute(\"data-id\", entry.id.toString())\n",
       "                if(!entry.visible)\n",
       "                {\n",
       "                    row.classList.add(\"disabled\")\n",
       "                }\n",
       "                if(entry.highlighted)\n",
       "                {\n",
       "                    row.classList.add(\"highlighted\")\n",
       "                }\n",
       "\n",
       "                // Adds the span entry to the table. doc_text is sanitized by replacing the reserved\n",
       "                // symbols by their entity name representations\n",
       "                row.innerHTML += `\n",
       "                <td>\n",
       "                    <div class='sa-table-controls'>\n",
       "                    <button data-control='visibility' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 576 512\"><path d=\"M572.52 241.4C518.29 135.59 410.93 64 288 64S57.68 135.64 3.48 241.41a32.35 32.35 0 0 0 0 29.19C57.71 376.41 165.07 448 288 448s230.32-71.64 284.52-177.41a32.35 32.35 0 0 0 0-29.19zM288 400a144 144 0 1 1 144-144 143.93 143.93 0 0 1-144 144zm0-240a95.31 95.31 0 0 0-25.31 3.79 47.85 47.85 0 0 1-66.9 66.9A95.78 95.78 0 1 0 288 160z\"/></svg></button>\n",
       "                    <button data-control='highlight' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 512 512\"><path d=\"M256 160c-52.9 0-96 43.1-96 96s43.1 96 96 96 96-43.1 96-96-43.1-96-96-96zm246.4 80.5l-94.7-47.3 33.5-100.4c4.5-13.6-8.4-26.5-21.9-21.9l-100.4 33.5-47.4-94.8c-6.4-12.8-24.6-12.8-31 0l-47.3 94.7L92.7 70.8c-13.6-4.5-26.5 8.4-21.9 21.9l33.5 100.4-94.7 47.4c-12.8 6.4-12.8 24.6 0 31l94.7 47.3-33.5 100.5c-4.5 13.6 8.4 26.5 21.9 21.9l100.4-33.5 47.3 94.7c6.4 12.8 24.6 12.8 31 0l47.3-94.7 100.4 33.5c13.6 4.5 26.5-8.4 21.9-21.9l-33.5-100.4 94.7-47.3c13-6.5 13-24.7.2-31.1zm-155.9 106c-49.9 49.9-131.1 49.9-181 0-49.9-49.9-49.9-131.1 0-181 49.9-49.9 131.1-49.9 181 0 49.9 49.9 49.9 131.1 0 181z\"/></svg></button>\n",
       "                    </div>\n",
       "                </td>\n",
       "                <td><b>${entry.id.toString()}</b></td>\n",
       "                <td>${entry.begin}</td>\n",
       "                <td>${entry.end}</td>\n",
       "                ${(doc.doc_token_spans != undefined) ? `<td>${doc.doc_token_spans[entry.id][0]}</td><td>${doc.doc_token_spans[entry.id][1]}</td>` : ''}\n",
       "                <td>${sanitize(doc_text.substring(entry.begin, entry.end))}</td>`\n",
       "\n",
       "                tbody.appendChild(row)\n",
       "            })\n",
       "            table.appendChild(tbody)\n",
       "            frag.appendChild(table)\n",
       "        }\n",
       "\n",
       "        // Render Text\n",
       "        let highlight_regions = []\n",
       "        for(let i = 0; i < entries.length; i++)\n",
       "        {\n",
       "            if(!entries[i].visible) continue\n",
       "            if(entries[i].valid_sets.length > 0)\n",
       "            {\n",
       "                let span = entries[i].getSetSpan();\n",
       "                let ids = [entries[i].id, ...entries[i].valid_sets.map(set => { return set.entry.id })]\n",
       "                if(entries[i].isComplex()) {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_COMPLEX, ids: ids})\n",
       "                } else {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_NESTED, ids: ids})\n",
       "                }\n",
       "                i = span.highest_id\n",
       "            } else {\n",
       "                highlight_regions.push({begin: entries[i].begin, end: entries[i].end, type: TYPE_SOLO, ids: [entries[i].id]})\n",
       "            }\n",
       "        }\n",
       "\n",
       "        let paragraph = document.createElement(\"p\")\n",
       "        if(highlight_regions.length == 0) {\n",
       "            paragraph.innerHTML = sanitize(doc_text)\n",
       "        } else {\n",
       "            let begin = 0\n",
       "            highlight_regions.forEach(region => {\n",
       "                paragraph.innerHTML += sanitize(doc_text.substring(begin, region.begin))\n",
       "\n",
       "                let mark = document.createElement(\"mark\")\n",
       "                // The data-ids tag is a list of comma-separated reference IDs for matching Spans \n",
       "                mark.setAttribute(\"data-ids\", \"\");\n",
       "                if (region.type != TYPE_NESTED) {\n",
       "                    region.ids.forEach(id => {\n",
       "                        mark.setAttribute(\"data-ids\", mark.getAttribute(\"data-ids\") + `#${id},`)\n",
       "                        if(doc.lookup_table[id].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    })\n",
       "                    mark.innerHTML = sanitize(doc_text.substring(region.begin, region.end))\n",
       "                } else {\n",
       "                    mark.setAttribute(\"data-ids\", `#${region.ids[0]},`)\n",
       "                    if(doc.lookup_table[region.ids[0]].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    let nested_begin = region.begin\n",
       "                    region.ids.slice(1).forEach(nested_id => {\n",
       "                        let nested_region = doc.lookup_table[nested_id]\n",
       "                        mark.innerHTML += sanitize(doc_text.substring(nested_begin, nested_region.begin))\n",
       "                        let nested_mark = document.createElement(\"mark\")\n",
       "                        nested_mark.setAttribute(\"data-ids\", `#${nested_id},`)\n",
       "                        if(nested_region.highlighted) nested_mark.classList.add(\"highlighted\")\n",
       "                        nested_mark.innerHTML = sanitize(doc_text.substring(nested_region.begin, nested_region.end))\n",
       "                        nested_begin = nested_region.end\n",
       "                        mark.appendChild(nested_mark)\n",
       "                    })\n",
       "                    mark.innerHTML += sanitize(doc_text.substring(nested_begin, region.end))\n",
       "                }\n",
       "\n",
       "                if(region.type == TYPE_COMPLEX) {\n",
       "                    let markTag = document.createElement(\"span\")\n",
       "                    markTag.textContent = \"Set\"\n",
       "                    markTag.classList.add(\"mark-tag\")\n",
       "                    mark.classList.add(\"complex-set\")\n",
       "                    mark.appendChild(markTag)\n",
       "                }\n",
       "\n",
       "                begin = region.end\n",
       "                paragraph.appendChild(mark)\n",
       "            })\n",
       "            paragraph.innerHTML += sanitize(doc_text.substring(highlight_regions[highlight_regions.length - 1].end, doc_text.length))\n",
       "        }\n",
       "\n",
       "        frag.appendChild(paragraph)\n",
       "\n",
       "        return frag\n",
       "    }\n",
       "\n",
       "    /** Attach hover and click events to a document render via event delegation */\n",
       "    function attachDocumentEvents(fragment, doc_object, source_spanarray) {\n",
       "        const doc_table_body = fragment.querySelector(\"table>tbody\")\n",
       "        const doc_text = fragment.querySelector(\"p\")\n",
       "\n",
       "        // Hover highlight events\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        // Click disable/enable events\n",
       "\n",
       "        doc_table_body.addEventListener(\"click\", (event) => {\n",
       "            const closest_control_button = event.target.closest(\"button[data-control]\")\n",
       "            if(closest_control_button == undefined) return\n",
       "\n",
       "            const closest_tr = event.target.closest(\"tr\")\n",
       "            if(closest_tr == undefined) return\n",
       "\n",
       "            const matching_span = doc_object.lookup_table[closest_tr.getAttribute(\"data-id\")]\n",
       "            if(matching_span == undefined) return\n",
       "\n",
       "            switch(closest_control_button.getAttribute(\"data-control\")) {\n",
       "                case \"visibility\":\n",
       "                    {\n",
       "                        matching_span.visible = !matching_span.visible\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "                case \"highlight\":\n",
       "                    {\n",
       "                        matching_span.highlighted = !matching_span.highlighted\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "            }\n",
       "\n",
       "\n",
       "\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"click\", (event) => {\n",
       "            const closest_mark = event.target.closest(\"mark\")\n",
       "            if(closest_mark == undefined) return\n",
       "\n",
       "            // Preprocess ID string into a list of IDs\n",
       "            const ids = closest_mark.getAttribute(\"data-ids\")\n",
       "                .split(\",\")\n",
       "                .slice(0, -1)\n",
       "                .map(id => {\n",
       "                    return id.substring(1)\n",
       "                })\n",
       "\n",
       "            // If any of the connected IDs are highlighted, we set all spans in the list to not highlighted.\n",
       "            // Inversely, we want all spans highlighted if none were previously.\n",
       "\n",
       "            const highlighted_entry = ids.find(id => {\n",
       "                return doc_object.lookup_table[id].highlighted\n",
       "            })\n",
       "\n",
       "            const is_highlighted = (highlighted_entry != undefined)\n",
       "\n",
       "            ids.forEach(id => {\n",
       "                const span = doc_object.lookup_table[id]\n",
       "                if(span != undefined) span.highlighted = !is_highlighted\n",
       "            })\n",
       "\n",
       "            source_spanarray.render()\n",
       "        })\n",
       "    }\n",
       "} else {\n",
       "    // SpanArray JS is already defined and not an outdated copy\n",
       "    // Replace global SpanArray CSS with latest copy IFF global stylesheet is undefined\n",
       "\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet == undefined) {\n",
       "            document.head.appendChild(local_stylesheet)\n",
       "        } else {\n",
       "            document.currentScript.parentElement.removeChild(local_stylesheet)\n",
       "        }\n",
       "    }       \n",
       "}\n",
       "}\n",
       "</script>\n",
       "<div class=\"span-array\">\n",
       "\n",
       "    <div class='document'>\n",
       "        <table style='\n",
       "            table-layout: auto;\n",
       "            overflow: hidden;\n",
       "            width: 100%;\n",
       "            border-collapse: collapse;\n",
       "            '>\n",
       "            <thead style='font-variant-caps: all-petite-caps;'>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "\n",
       "                <th style='text-align:right;width:100%'>context</th>\n",
       "            </tr></thead>\n",
       "            <tbody>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>0</b></td>\n",
       "            <td>0</td>\n",
       "            <td>2</td>\n",
       "\n",
       "            <td>In</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>1</b></td>\n",
       "            <td>3</td>\n",
       "            <td>5</td>\n",
       "\n",
       "            <td>AD</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>2</b></td>\n",
       "            <td>6</td>\n",
       "            <td>9</td>\n",
       "\n",
       "            <td>932</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>3</b></td>\n",
       "            <td>11</td>\n",
       "            <td>15</td>\n",
       "\n",
       "            <td>King</td>\n",
       "        </tr>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>4</b></td>\n",
       "            <td>16</td>\n",
       "            <td>22</td>\n",
       "\n",
       "            <td>Arthur</td>\n",
       "        </tr>\n",
       "\n",
       "            </tbody>\n",
       "        </table>\n",
       "        <p style='\n",
       "            padding: 1em;\n",
       "            line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "            '>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>In</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>AD</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>932</span>\n",
       "\n",
       "            , \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>King</span>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Arthur</span>\n",
       "             and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir Not-Appearing-in-this-Film, along with their squires and Robin&#39;s troubadours.\n",
       "        </p>\n",
       "    </div>\n",
       "\n",
       "    <span style=\"font-size: 0.8em;color: #b3b3b3;\">Your notebook viewer does not support Javascript execution. The above rendering will not be interactive.</span>\n",
       "</div>\n",
       "<script>\n",
       "    {\n",
       "        const Span = window.SpanArray.Span\n",
       "        const script_context = document.currentScript\n",
       "        const documents = []\n",
       "\n",
       "    {\n",
       "\n",
       "    const doc_spans = [[0,2],[3,5],[6,9],[11,15],[16,22]]\n",
       "    const doc_text = 'In AD 932, King Arthur and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir Not-Appearing-in-this-Film, along with their squires and Robin\\'s troubadours.'\n",
       "\n",
       "        documents.push({doc_text: doc_text, doc_spans: doc_spans})\n",
       "\n",
       "    }\n",
       "\n",
       "        const instance = new window.SpanArray.SpanArray(documents, true, script_context)\n",
       "        instance.render()\n",
       "    }\n",
       "</script>\n",
       "\n"
      ],
      "text/plain": [
       "<SpanArray>\n",
       "[[0, 2): 'In', [3, 5): 'AD', [6, 9): '932', [11, 15): 'King',\n",
       " [16, 22): 'Arthur']\n",
       "Length: 5, dtype: SpanDtype"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# When all the spans in a `TokenSpanArray` come from the same document, you can access\n",
    "# the tokens of that document via the `document_tokens` property:\n",
    "token_spans.document_tokens[:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<style class=\"span-array-css\">\n",
       "            .span-array {\n",
       "    --thead-background-color: var(--jp-layout-color1, inherit);\n",
       "    --thead-text-color: var(--jp-ui-font-color1, inherit);\n",
       "    --tbody-background-color-1: var(--jp-layout-color1, inherit);\n",
       "    --tbody-background-color-2: var(--jp-layout-color2, inherit);\n",
       "    --tbody-background-color-hover: var(--jp-rendermime-table-row-hover-background, var(--jp-layout-color3, inherit));\n",
       "    --tbody-background-color-disabled: var(--jp-layout-color4, #ccccd1);\n",
       "    --tbody-text-color: var(--jp-ui-font-color0, inherit);\n",
       "    --tbody-text-color-disabled: var(--jp-ui-inverse-font-color0, #b3b3b9);\n",
       "    --table-font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "\n",
       "    --table-control-background: rgba(0, 0, 0, 0.2);\n",
       "    --table-control-color: var(--jp-ui-font-color0);\n",
       "    --table-control-border: 1px solid rgba(0, 0, 0, 0.8);\n",
       "    --table-control-border-radius: 0.5em;\n",
       "\n",
       "    --root-highlight: #a0c4ff;\n",
       "    --nested-highlight: #ffadad;\n",
       "    --hover-highlight: #ffd6a5;\n",
       "\n",
       "    --inverted-background-color: #0B525B;\n",
       "    --inverted-text-color: rgb(243, 243, 243);\n",
       "    --paragraph-border-color: var(--jp-layout-color2, inherit);\n",
       "\n",
       "    --fallback-font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif\n",
       "}\n",
       "\n",
       "/* Table of span offsets */\n",
       ".span-array>.document>table {\n",
       "    table-layout: auto;\n",
       "    overflow: hidden;\n",
       "    width: 100%;\n",
       "    border-collapse: collapse;\n",
       "    font-family: var(--table-font-family);\n",
       "}\n",
       "\n",
       ".span-array>.document>table thead {\n",
       "    font-variant-caps: all-petite-caps;\n",
       "}\n",
       "\n",
       ".span-array>.document>table th {\n",
       "    padding: 1em;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:last-child, .span-array>.document>table tr>th:last-child {\n",
       "    text-align: right;\n",
       "    width: 100%;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr>td:not(tr>td:last-child), .span-array>.document>table tr>th:not(tr>th:last-child) {\n",
       "    text-align: left;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n), .span-array>.document>table tr.disabled.hover:nth-child(n) {\n",
       "    background-color: var(--tbody-background-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.disabled:nth-child(n)>td, .span-array>.document>table tr.disabled.hover:nth-child(n)>td {\n",
       "    color: var(--tbody-text-color-disabled);\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.hover:not(.disabled) {\n",
       "    background: var(--jp-rendermime-table-row-hover-background);\n",
       "}\n",
       "\n",
       "/* Table control buttons */\n",
       "\n",
       ".span-array>.document>table td.sa-table-controls-container {\n",
       "    vertical-align: center;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls {\n",
       "    display: flex;\n",
       "    flex-direction: row;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button {\n",
       "    background-color: var(--table-control-background);\n",
       "    color: var(--table-control-color);\n",
       "    border: var(--table-control-border);\n",
       "    border-right: none;\n",
       "    border-radius: 0;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:first-child {\n",
       "    border-radius: var(--table-control-border-radius) 0 0 var(--table-control-border-radius);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button:last-child {\n",
       "    border-radius: 0 var(--table-control-border-radius) var(--table-control-border-radius) 0;\n",
       "    border-right: var(--table-control-border);\n",
       "}\n",
       "\n",
       ".span-array>.document>table div.sa-table-controls button[data-control=\"visibility\"]:hover {\n",
       "    background-color: var(--root-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"]:hover {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       ".span-array>.document>table tr.highlighted:not(tr.disabled) div.sa-table-controls button[data-control=\"highlight\"] {\n",
       "    background-color: var(--hover-highlight);\n",
       "    color: black;\n",
       "}\n",
       "\n",
       "/* Styling for spans within document context */\n",
       ".span-array>.document>p {\n",
       "    border:1px solid var(--paragraph-border-color);\n",
       "    border-radius: 0.2em;\n",
       "    padding: 1em;\n",
       "    line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "    box-sizing: border-box;\n",
       "    font-family: var(--jp-content-font-family, var(--fallback-font-family, inherit));\n",
       "}\n",
       "\n",
       "body[data-jp-theme-light=\"false\"].span-array>.document>p {\n",
       "    border: 1px solid black;\n",
       "    background-color: var(--inverted-background-color);\n",
       "    color: var(--inverted-text-color);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    padding: 0.4em 0.4em;\n",
       "    border-radius: 0.35em;\n",
       "    cursor: pointer;\n",
       "}\n",
       "\n",
       ".span-array>.document>p .mark {\n",
       "    color: var(black);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark {\n",
       "    background-color: var(--root-highlight);\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>mark {\n",
       "    background-color: var(--nested-highlight);\n",
       "    padding: 0.2em 0.4em;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.complex-set {\n",
       "    background: linear-gradient(to right, var(--root-highlight), var(--nested-highlight))\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark>span.mark-tag {\n",
       "    font-weight: bolder;\n",
       "    font-size: 0.8em;\n",
       "    font-variant: small-caps;\n",
       "    font-variant-caps: all-small-caps;\n",
       "    margin-left: 8px;\n",
       "    text-transform: uppercase;\n",
       "}\n",
       "\n",
       ".span-array>.document>p mark.hover, .span-array.span-array>.document>p mark>mark.hover, .span-array>.document>p mark.complex-set.hover, .span-array>.document>p mark.highlighted, .span-array>.document>p mark.complex-set.highlighted, .span-array.span-array>.document>p mark>mark.highlighted {\n",
       "    background: none;\n",
       "    background-color: var(--hover-highlight);\n",
       "}\n",
       "\n",
       "</style>\n",
       "<script>\n",
       "{\n",
       "            // Increment the version to invalidate the cached script\n",
       "const VERSION = 0.79\n",
       "const global_stylesheet = document.head.querySelector(\"style.span-array-css\")\n",
       "const local_stylesheet = document.currentScript.parentElement.querySelector(\"style.span-array-css\")\n",
       "\n",
       "if(window.SpanArray == undefined || window.SpanArray.VERSION == undefined || window.SpanArray.VERSION < VERSION) {\n",
       "\n",
       "    // Replace global SpanArray CSS with latest copy\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet != undefined) {\n",
       "            document.head.removeChild(global_stylesheet)\n",
       "        }\n",
       "        document.head.appendChild(local_stylesheet)\n",
       "    }\n",
       "\n",
       "    // Sets up the SpanArray global namespace\n",
       "    window.SpanArray = {}\n",
       "    window.SpanArray.VERSION = VERSION\n",
       "\n",
       "    window.SpanArray.TYPE_OVERLAP = 0;\n",
       "    window.SpanArray.TYPE_NESTED = 1;\n",
       "    window.SpanArray.TYPE_COMPLEX = 2;\n",
       "    window.SpanArray.TYPE_SOLO = 3;\n",
       "\n",
       "    const TYPE_OVERLAP = window.SpanArray.TYPE_OVERLAP;\n",
       "    const TYPE_NESTED = window.SpanArray.TYPE_NESTED;\n",
       "    const TYPE_COMPLEX = window.SpanArray.TYPE_COMPLEX;\n",
       "    const TYPE_SOLO = window.SpanArray.TYPE_SOLO;\n",
       "\n",
       "    function sanitize(input) {\n",
       "        let out = input.slice();\n",
       "        out = out.replace(/&/g, \"&amp;\")\n",
       "        out = out.replace(/</g, \"&lt;\")\n",
       "        out = out.replace(/>/g, \"&gt;\")\n",
       "        out = out.replace(/\\$/g, \"<span>&#36;</span>\")\n",
       "        out = out.replace(/\"/g, \"&quot;\")\n",
       "        out = out.replace(/(\\r|\\n)/g, \"<br>\")\n",
       "        return out;\n",
       "    }\n",
       "\n",
       "    /** Comparison function used to sort SpanArrays by position and length\n",
       "     *  Will sort by primarily by earliest beginning point. On a tie, will prioritize latest end point (first and largest)\n",
       "     *  Used by mark relationship algorithm.\n",
       "     */\n",
       "    function compareSpanArrays(a, b) {\n",
       "        const start_diff = a.begin - b.begin\n",
       "        if(start_diff == 0) {\n",
       "            return b.end - a.end\n",
       "        }\n",
       "        return start_diff;\n",
       "    }\n",
       "\n",
       "    /** Models an instance of a SpanArray, with document-separated spans and text\n",
       "     * NOTE: Using docs instead of documents to avoid unintentionally manipulating the global 'document' object.\n",
       "    */\n",
       "    class SpanArray {\n",
       "        constructor(docs, show_offsets, script_context) {\n",
       "            this.docs = docs\n",
       "            this.show_offsets = show_offsets\n",
       "            this.script_context = script_context\n",
       "\n",
       "            // For each doc, generate a lookup map for quick ID access\n",
       "            this.docs = this.docs.map(doc => {\n",
       "                doc.span_objects = Span.arrayFromSpanArray(doc.doc_spans)\n",
       "                doc.lookup_table = {}\n",
       "                doc.span_objects.forEach(span => {\n",
       "                    doc.lookup_table[span.id] = span\n",
       "                })\n",
       "                return doc\n",
       "            })\n",
       "        }\n",
       "\n",
       "        render() {\n",
       "            let span_array_frag = document.createDocumentFragment()\n",
       "            // For each document, create a document fragment and append to a document container\n",
       "            for(let doc_index = 0; doc_index < this.docs.length; doc_index++)\n",
       "            {\n",
       "                let doc = this.docs[doc_index]\n",
       "                let doc_container = document.createElement(\"div\")\n",
       "                // Using the data-doc-id attribute allows a selector to access a document's render by its index\n",
       "                doc_container.setAttribute(\"data-doc-id\", doc_index)\n",
       "                doc_container.classList.add(\"document\")\n",
       "                const document_fragment = getDocumentFragment(doc, this.show_offsets)\n",
       "                if(this.show_offsets) {\n",
       "                    attachDocumentEvents(document_fragment, doc, this)\n",
       "                }\n",
       "                doc_container.appendChild(document_fragment)\n",
       "                span_array_frag.appendChild(doc_container)\n",
       "            }\n",
       "            let container = this.script_context.parentElement.querySelector(\".span-array\")\n",
       "            if(container != undefined) {\n",
       "                container.innerHTML = \"\"\n",
       "                container.appendChild(span_array_frag)\n",
       "            } else {\n",
       "                console.error(\"No container found for SpanArray renderer\")\n",
       "            }\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.SpanArray = SpanArray\n",
       "\n",
       "\n",
       "    /** Models an instance of a Span and its relationship to other spans in the document */\n",
       "    class Span {\n",
       "\n",
       "        // Creates an ordered list of entries from a list of spans with struct [begin, end]\n",
       "        static arrayFromSpanArray(spanArray) {\n",
       "            let entries = []\n",
       "            let span;\n",
       "            for(let i = 0; i < spanArray.length; i++)\n",
       "            {\n",
       "                span = spanArray[i];\n",
       "                entries.push(new Span(i, span[0], span[1]))\n",
       "            }\n",
       "\n",
       "            entries = entries.sort(compareSpanArrays)\n",
       "\n",
       "            let set;\n",
       "            for(let i = 0; i < entries.length; i++) {\n",
       "                for(let j = i+1; j < entries.length && entries[j].begin < entries[i].end; j++) {\n",
       "                    if(entries[j].end <= entries[i].end) {\n",
       "                        set = {type: TYPE_NESTED, entry: entries[j]}\n",
       "                    } else {\n",
       "                        set = {type: TYPE_OVERLAP, entry: entries[j]}\n",
       "                    }\n",
       "                    entries[i].sets.push(set)\n",
       "                }\n",
       "            }\n",
       "\n",
       "            return entries\n",
       "        }\n",
       "\n",
       "        constructor(id, begin, end) {\n",
       "            this.id = id\n",
       "            this.begin = begin\n",
       "            this.end = end\n",
       "            this.sets = []\n",
       "            this.visible = true\n",
       "            this.highlighted = false\n",
       "        }\n",
       "\n",
       "        // Returns only visible sets\n",
       "        get valid_sets() {\n",
       "            let valid_sets = []\n",
       "\n",
       "            this.sets.forEach(set => {\n",
       "                if(set.entry.visible) valid_sets.push(set)\n",
       "            })\n",
       "\n",
       "            return valid_sets\n",
       "        }\n",
       "\n",
       "        // Returns true if mark should render as a compound set of spans\n",
       "        isComplex() {\n",
       "            for(let i = 0; i < this.valid_sets.length; i++) {\n",
       "                let otherMember = this.valid_sets[i].entry;\n",
       "                if(this.valid_sets[i].type == TYPE_OVERLAP && otherMember.visible) {\n",
       "                    return true;\n",
       "                } else {\n",
       "                    if(otherMember.valid_sets.length > 0 && otherMember.visible) {\n",
       "                        return true;\n",
       "                    }\n",
       "                }\n",
       "            }\n",
       "            return false;\n",
       "        }\n",
       "\n",
       "        // Gets the combined span of all connected elements\n",
       "        getSetSpan() {\n",
       "            let begin = this.begin\n",
       "            let end = this.end\n",
       "            let highest_id = this.id\n",
       "\n",
       "            this.valid_sets.forEach(set => {\n",
       "                let other = set.entry.getSetSpan()\n",
       "                if(other.begin < begin) begin = other.begin\n",
       "                if(other.end > end) end = other.end\n",
       "                if(other.highest_id > highest_id) highest_id = other.highest_id\n",
       "            })\n",
       "\n",
       "            return {begin: begin, end: end, highest_id: highest_id}\n",
       "        }\n",
       "    }\n",
       "\n",
       "    window.SpanArray.Span = Span\n",
       "\n",
       "    // Get the DocumentFragment for a single document \n",
       "    function getDocumentFragment(doc, show_offsets) {\n",
       "\n",
       "        const doc_text = doc.doc_text;\n",
       "        const entries = doc.span_objects;\n",
       "\n",
       "        let frag = document.createDocumentFragment()\n",
       "\n",
       "        // Render Table\n",
       "        if(show_offsets) {\n",
       "            let table = document.createElement(\"table\")\n",
       "            table.innerHTML = `\n",
       "            <thead>\n",
       "            <tr>\n",
       "                <th></th>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                ${(doc['doc_token_spans'] != undefined) ? '<th>begin token</th> <th>end token</th>' : ''}\n",
       "                <th>context</th>\n",
       "            </tr>\n",
       "            </thead>`\n",
       "            let tbody = document.createElement(\"tbody\")\n",
       "            entries.forEach(entry => {\n",
       "                let row = document.createElement(\"tr\")\n",
       "                row.setAttribute(\"data-id\", entry.id.toString())\n",
       "                if(!entry.visible)\n",
       "                {\n",
       "                    row.classList.add(\"disabled\")\n",
       "                }\n",
       "                if(entry.highlighted)\n",
       "                {\n",
       "                    row.classList.add(\"highlighted\")\n",
       "                }\n",
       "\n",
       "                // Adds the span entry to the table. doc_text is sanitized by replacing the reserved\n",
       "                // symbols by their entity name representations\n",
       "                row.innerHTML += `\n",
       "                <td>\n",
       "                    <div class='sa-table-controls'>\n",
       "                    <button data-control='visibility' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 576 512\"><path d=\"M572.52 241.4C518.29 135.59 410.93 64 288 64S57.68 135.64 3.48 241.41a32.35 32.35 0 0 0 0 29.19C57.71 376.41 165.07 448 288 448s230.32-71.64 284.52-177.41a32.35 32.35 0 0 0 0-29.19zM288 400a144 144 0 1 1 144-144 143.93 143.93 0 0 1-144 144zm0-240a95.31 95.31 0 0 0-25.31 3.79 47.85 47.85 0 0 1-66.9 66.9A95.78 95.78 0 1 0 288 160z\"/></svg></button>\n",
       "                    <button data-control='highlight' style='width:1em'><svg style='display:block;margin:0.2em auto;' xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 512 512\"><path d=\"M256 160c-52.9 0-96 43.1-96 96s43.1 96 96 96 96-43.1 96-96-43.1-96-96-96zm246.4 80.5l-94.7-47.3 33.5-100.4c4.5-13.6-8.4-26.5-21.9-21.9l-100.4 33.5-47.4-94.8c-6.4-12.8-24.6-12.8-31 0l-47.3 94.7L92.7 70.8c-13.6-4.5-26.5 8.4-21.9 21.9l33.5 100.4-94.7 47.4c-12.8 6.4-12.8 24.6 0 31l94.7 47.3-33.5 100.5c-4.5 13.6 8.4 26.5 21.9 21.9l100.4-33.5 47.3 94.7c6.4 12.8 24.6 12.8 31 0l47.3-94.7 100.4 33.5c13.6 4.5 26.5-8.4 21.9-21.9l-33.5-100.4 94.7-47.3c13-6.5 13-24.7.2-31.1zm-155.9 106c-49.9 49.9-131.1 49.9-181 0-49.9-49.9-49.9-131.1 0-181 49.9-49.9 131.1-49.9 181 0 49.9 49.9 49.9 131.1 0 181z\"/></svg></button>\n",
       "                    </div>\n",
       "                </td>\n",
       "                <td><b>${entry.id.toString()}</b></td>\n",
       "                <td>${entry.begin}</td>\n",
       "                <td>${entry.end}</td>\n",
       "                ${(doc.doc_token_spans != undefined) ? `<td>${doc.doc_token_spans[entry.id][0]}</td><td>${doc.doc_token_spans[entry.id][1]}</td>` : ''}\n",
       "                <td>${sanitize(doc_text.substring(entry.begin, entry.end))}</td>`\n",
       "\n",
       "                tbody.appendChild(row)\n",
       "            })\n",
       "            table.appendChild(tbody)\n",
       "            frag.appendChild(table)\n",
       "        }\n",
       "\n",
       "        // Render Text\n",
       "        let highlight_regions = []\n",
       "        for(let i = 0; i < entries.length; i++)\n",
       "        {\n",
       "            if(!entries[i].visible) continue\n",
       "            if(entries[i].valid_sets.length > 0)\n",
       "            {\n",
       "                let span = entries[i].getSetSpan();\n",
       "                let ids = [entries[i].id, ...entries[i].valid_sets.map(set => { return set.entry.id })]\n",
       "                if(entries[i].isComplex()) {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_COMPLEX, ids: ids})\n",
       "                } else {\n",
       "                    highlight_regions.push({begin: span.begin, end: span.end, type: TYPE_NESTED, ids: ids})\n",
       "                }\n",
       "                i = span.highest_id\n",
       "            } else {\n",
       "                highlight_regions.push({begin: entries[i].begin, end: entries[i].end, type: TYPE_SOLO, ids: [entries[i].id]})\n",
       "            }\n",
       "        }\n",
       "\n",
       "        let paragraph = document.createElement(\"p\")\n",
       "        if(highlight_regions.length == 0) {\n",
       "            paragraph.innerHTML = sanitize(doc_text)\n",
       "        } else {\n",
       "            let begin = 0\n",
       "            highlight_regions.forEach(region => {\n",
       "                paragraph.innerHTML += sanitize(doc_text.substring(begin, region.begin))\n",
       "\n",
       "                let mark = document.createElement(\"mark\")\n",
       "                // The data-ids tag is a list of comma-separated reference IDs for matching Spans \n",
       "                mark.setAttribute(\"data-ids\", \"\");\n",
       "                if (region.type != TYPE_NESTED) {\n",
       "                    region.ids.forEach(id => {\n",
       "                        mark.setAttribute(\"data-ids\", mark.getAttribute(\"data-ids\") + `#${id},`)\n",
       "                        if(doc.lookup_table[id].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    })\n",
       "                    mark.innerHTML = sanitize(doc_text.substring(region.begin, region.end))\n",
       "                } else {\n",
       "                    mark.setAttribute(\"data-ids\", `#${region.ids[0]},`)\n",
       "                    if(doc.lookup_table[region.ids[0]].highlighted) mark.classList.add(\"highlighted\")\n",
       "                    let nested_begin = region.begin\n",
       "                    region.ids.slice(1).forEach(nested_id => {\n",
       "                        let nested_region = doc.lookup_table[nested_id]\n",
       "                        mark.innerHTML += sanitize(doc_text.substring(nested_begin, nested_region.begin))\n",
       "                        let nested_mark = document.createElement(\"mark\")\n",
       "                        nested_mark.setAttribute(\"data-ids\", `#${nested_id},`)\n",
       "                        if(nested_region.highlighted) nested_mark.classList.add(\"highlighted\")\n",
       "                        nested_mark.innerHTML = sanitize(doc_text.substring(nested_region.begin, nested_region.end))\n",
       "                        nested_begin = nested_region.end\n",
       "                        mark.appendChild(nested_mark)\n",
       "                    })\n",
       "                    mark.innerHTML += sanitize(doc_text.substring(nested_begin, region.end))\n",
       "                }\n",
       "\n",
       "                if(region.type == TYPE_COMPLEX) {\n",
       "                    let markTag = document.createElement(\"span\")\n",
       "                    markTag.textContent = \"Set\"\n",
       "                    markTag.classList.add(\"mark-tag\")\n",
       "                    mark.classList.add(\"complex-set\")\n",
       "                    mark.appendChild(markTag)\n",
       "                }\n",
       "\n",
       "                begin = region.end\n",
       "                paragraph.appendChild(mark)\n",
       "            })\n",
       "            paragraph.innerHTML += sanitize(doc_text.substring(highlight_regions[highlight_regions.length - 1].end, doc_text.length))\n",
       "        }\n",
       "\n",
       "        frag.appendChild(paragraph)\n",
       "\n",
       "        return frag\n",
       "    }\n",
       "\n",
       "    /** Attach hover and click events to a document render via event delegation */\n",
       "    function attachDocumentEvents(fragment, doc_object, source_spanarray) {\n",
       "        const doc_table_body = fragment.querySelector(\"table>tbody\")\n",
       "        const doc_text = fragment.querySelector(\"p\")\n",
       "\n",
       "        // Hover highlight events\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_table_body.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"TR\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const span_id = event.target.getAttribute(\"data-id\")\n",
       "                const marks = doc_text.querySelectorAll(\"mark[data-ids]\")\n",
       "                Array.from(marks)\n",
       "                    .filter(mark => {\n",
       "                        return mark.getAttribute(\"data-ids\").includes(`#${span_id},`)\n",
       "                    })\n",
       "                    .forEach(related_mark => {\n",
       "                        related_mark.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerenter\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.add(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.add(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"pointerleave\", (event) => {\n",
       "            if(event.target.nodeName == \"MARK\") {\n",
       "                event.target.classList.remove(\"hover\")\n",
       "                const ids = event.target.getAttribute(\"data-ids\").split(\",\").slice(0, -1)\n",
       "                Array.from(ids)\n",
       "                    .map(id_tag => {\n",
       "                        return id_tag.substring(1)\n",
       "                    })\n",
       "                    .forEach(id => {\n",
       "                        const entry = doc_table_body.querySelector(`tr[data-id=\"${id}\"]`)\n",
       "                        entry.classList.remove(\"hover\")\n",
       "                    })\n",
       "            }\n",
       "        }, true)\n",
       "\n",
       "        // Click disable/enable events\n",
       "\n",
       "        doc_table_body.addEventListener(\"click\", (event) => {\n",
       "            const closest_control_button = event.target.closest(\"button[data-control]\")\n",
       "            if(closest_control_button == undefined) return\n",
       "\n",
       "            const closest_tr = event.target.closest(\"tr\")\n",
       "            if(closest_tr == undefined) return\n",
       "\n",
       "            const matching_span = doc_object.lookup_table[closest_tr.getAttribute(\"data-id\")]\n",
       "            if(matching_span == undefined) return\n",
       "\n",
       "            switch(closest_control_button.getAttribute(\"data-control\")) {\n",
       "                case \"visibility\":\n",
       "                    {\n",
       "                        matching_span.visible = !matching_span.visible\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "                case \"highlight\":\n",
       "                    {\n",
       "                        matching_span.highlighted = !matching_span.highlighted\n",
       "                        source_spanarray.render()\n",
       "                    }\n",
       "                    break;\n",
       "            }\n",
       "\n",
       "\n",
       "\n",
       "        }, true)\n",
       "\n",
       "        doc_text.addEventListener(\"click\", (event) => {\n",
       "            const closest_mark = event.target.closest(\"mark\")\n",
       "            if(closest_mark == undefined) return\n",
       "\n",
       "            // Preprocess ID string into a list of IDs\n",
       "            const ids = closest_mark.getAttribute(\"data-ids\")\n",
       "                .split(\",\")\n",
       "                .slice(0, -1)\n",
       "                .map(id => {\n",
       "                    return id.substring(1)\n",
       "                })\n",
       "\n",
       "            // If any of the connected IDs are highlighted, we set all spans in the list to not highlighted.\n",
       "            // Inversely, we want all spans highlighted if none were previously.\n",
       "\n",
       "            const highlighted_entry = ids.find(id => {\n",
       "                return doc_object.lookup_table[id].highlighted\n",
       "            })\n",
       "\n",
       "            const is_highlighted = (highlighted_entry != undefined)\n",
       "\n",
       "            ids.forEach(id => {\n",
       "                const span = doc_object.lookup_table[id]\n",
       "                if(span != undefined) span.highlighted = !is_highlighted\n",
       "            })\n",
       "\n",
       "            source_spanarray.render()\n",
       "        })\n",
       "    }\n",
       "} else {\n",
       "    // SpanArray JS is already defined and not an outdated copy\n",
       "    // Replace global SpanArray CSS with latest copy IFF global stylesheet is undefined\n",
       "\n",
       "    if(local_stylesheet != undefined) {\n",
       "        if(global_stylesheet == undefined) {\n",
       "            document.head.appendChild(local_stylesheet)\n",
       "        } else {\n",
       "            document.currentScript.parentElement.removeChild(local_stylesheet)\n",
       "        }\n",
       "    }       \n",
       "}\n",
       "}\n",
       "</script>\n",
       "<div class=\"span-array\">\n",
       "\n",
       "    <div class='document'>\n",
       "        <table style='\n",
       "            table-layout: auto;\n",
       "            overflow: hidden;\n",
       "            width: 100%;\n",
       "            border-collapse: collapse;\n",
       "            '>\n",
       "            <thead style='font-variant-caps: all-petite-caps;'>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                <th>begin token</th><th>end token</th>\n",
       "                <th style='text-align:right;width:100%'>context</th>\n",
       "            </tr></thead>\n",
       "            <tbody>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>0</b></td>\n",
       "            <td>11</td>\n",
       "            <td>22</td>\n",
       "\n",
       "            <td>3</td>\n",
       "            <td>5</td>\n",
       "\n",
       "            <td>King Arthur</td>\n",
       "        </tr>\n",
       "\n",
       "            </tbody>\n",
       "        </table>\n",
       "        <p style='\n",
       "            padding: 1em;\n",
       "            line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "            '>\n",
       "\n",
       "            In AD 932, \n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>King Arthur</span>\n",
       "             and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir Not-Appearing-in-this-Film, along with their squires and Robin&#39;s troubadours.\n",
       "        </p>\n",
       "    </div>\n",
       "\n",
       "    <div class='document'>\n",
       "        <table style='\n",
       "            table-layout: auto;\n",
       "            overflow: hidden;\n",
       "            width: 100%;\n",
       "            border-collapse: collapse;\n",
       "            '>\n",
       "            <thead style='font-variant-caps: all-petite-caps;'>\n",
       "                <th></th>\n",
       "                <th>begin</th>\n",
       "                <th>end</th>\n",
       "                <th>begin token</th><th>end token</th>\n",
       "                <th style='text-align:right;width:100%'>context</th>\n",
       "            </tr></thead>\n",
       "            <tbody>\n",
       "\n",
       "        <tr>\n",
       "            <td><b>0</b></td>\n",
       "            <td>0</td>\n",
       "            <td>15</td>\n",
       "\n",
       "            <td>0</td>\n",
       "            <td>2</td>\n",
       "\n",
       "            <td>Second document</td>\n",
       "        </tr>\n",
       "\n",
       "            </tbody>\n",
       "        </table>\n",
       "        <p style='\n",
       "            padding: 1em;\n",
       "            line-height: calc(var(--jp-content-line-height, 1.6) * 1.6);\n",
       "            '>\n",
       "\n",
       "\n",
       "\n",
       "                <span class='mark btn-primary' style='padding:0.4em;border-radius:0.35em;background-color: #a0c4ff;color:black;'>Second document</span>\n",
       "\n",
       "        </p>\n",
       "    </div>\n",
       "\n",
       "    <span style=\"font-size: 0.8em;color: #b3b3b3;\">Your notebook viewer does not support Javascript execution. The above rendering will not be interactive.</span>\n",
       "</div>\n",
       "<script>\n",
       "    {\n",
       "        const Span = window.SpanArray.Span\n",
       "        const script_context = document.currentScript\n",
       "        const documents = []\n",
       "\n",
       "    {\n",
       "\n",
       "    const doc_spans = [[11,22]]\n",
       "    const doc_text = 'In AD 932, King Arthur and his squire, Patsy, travel throughout Britain searching for men to join the Knights of the Round Table. Along the way, he recruits Sir Bedevere the Wise, Sir Lancelot the Brave, Sir Galahad the Pure, Sir Robin the Not-Quite-So-Brave-as-Sir-Lancelot, and Sir Not-Appearing-in-this-Film, along with their squires and Robin\\'s troubadours.'\n",
       "\n",
       "        const doc_token_spans = [[3,5]]\n",
       "        documents.push({doc_text: doc_text, doc_spans: doc_spans, doc_token_spans: doc_token_spans})\n",
       "\n",
       "    }\n",
       "\n",
       "    {\n",
       "\n",
       "    const doc_spans = [[0,15]]\n",
       "    const doc_text = 'Second document'\n",
       "\n",
       "        const doc_token_spans = [[0,2]]\n",
       "        documents.push({doc_text: doc_text, doc_spans: doc_spans, doc_token_spans: doc_token_spans})\n",
       "\n",
       "    }\n",
       "\n",
       "        const instance = new window.SpanArray.SpanArray(documents, true, script_context)\n",
       "        instance.render()\n",
       "    }\n",
       "</script>\n",
       "\n"
      ],
      "text/plain": [
       "<TokenSpanArray>\n",
       "[[11, 22): 'King Arthur', [0, 15): 'Second document']\n",
       "Length: 2, dtype: TokenSpanDtype"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Both SpanArrays and TokenSpanArrays can contain spans from multiple documents.\n",
    "tokens_2 = tp.SpanArray(\"Second document\", [0, 7], [6, 15])\n",
    "token_spans_2 = tp.TokenSpanArray(tokens_2, [0], [2])\n",
    "\n",
    "two_doc_series = pd.concat([pd.Series(token_spans[0:1]), pd.Series(token_spans_2)])\n",
    "two_doc_series.array"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the HTML representation now contains the annotated text of two documents. We can use the `tokens` property to view view the two sets of tokens backing the two spans in this array:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([<SpanArray>\n",
       "       [                                    [0, 2): 'In',\n",
       "                                            [3, 5): 'AD',\n",
       "                                           [6, 9): '932',\n",
       "                                        [11, 15): 'King',\n",
       "                                      [16, 22): 'Arthur',\n",
       "                                         [23, 26): 'and',\n",
       "                                         [27, 30): 'his',\n",
       "                                      [31, 37): 'squire',\n",
       "                                       [39, 44): 'Patsy',\n",
       "                                      [46, 52): 'travel',\n",
       "                                  [53, 63): 'throughout',\n",
       "                                     [64, 71): 'Britain',\n",
       "                                   [72, 81): 'searching',\n",
       "                                         [82, 85): 'for',\n",
       "                                         [86, 89): 'men',\n",
       "                                          [90, 92): 'to',\n",
       "                                        [93, 97): 'join',\n",
       "                                        [98, 101): 'the',\n",
       "                                   [102, 109): 'Knights',\n",
       "                                        [110, 112): 'of',\n",
       "                                       [113, 116): 'the',\n",
       "                                     [117, 122): 'Round',\n",
       "                                     [123, 128): 'Table',\n",
       "                                     [130, 135): 'Along',\n",
       "                                       [136, 139): 'the',\n",
       "                                       [140, 143): 'way',\n",
       "                                        [145, 147): 'he',\n",
       "                                  [148, 156): 'recruits',\n",
       "                                       [157, 160): 'Sir',\n",
       "                                  [161, 169): 'Bedevere',\n",
       "                                       [170, 173): 'the',\n",
       "                                      [174, 178): 'Wise',\n",
       "                                       [180, 183): 'Sir',\n",
       "                                  [184, 192): 'Lancelot',\n",
       "                                       [193, 196): 'the',\n",
       "                                     [197, 202): 'Brave',\n",
       "                                       [204, 207): 'Sir',\n",
       "                                   [208, 215): 'Galahad',\n",
       "                                       [216, 219): 'the',\n",
       "                                      [220, 224): 'Pure',\n",
       "                                       [226, 229): 'Sir',\n",
       "                                     [230, 235): 'Robin',\n",
       "                                       [236, 239): 'the',\n",
       "        [240, 274): 'Not-Quite-So-Brave-as-Sir-Lancelot',\n",
       "                                       [276, 279): 'and',\n",
       "                                       [280, 283): 'Sir',\n",
       "                [284, 310): 'Not-Appearing-in-this-Film',\n",
       "                                     [312, 317): 'along',\n",
       "                                      [318, 322): 'with',\n",
       "                                     [323, 328): 'their',\n",
       "                                   [329, 336): 'squires',\n",
       "                                       [337, 340): 'and',\n",
       "                                   [341, 348): 'Robin's',\n",
       "                               [349, 360): 'troubadours']\n",
       "       Length: 54, dtype: SpanDtype                      ,\n",
       "       <SpanArray>\n",
       "       [[0, 6): 'Second', [7, 15): 'document']\n",
       "       Length: 2, dtype: SpanDtype            ], dtype=object)"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "two_doc_series.array.tokens"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Spanner\n",
    "\n",
    "The `spanner` module of Text Extensions for Pandas provides span-specific operations\n",
    "for Pandas DataFrames, based on the Document Spanners formalism, also known as\n",
    "spanner algebra.\n",
    "\n",
    "Spanner algebra is an extension of relational algebra with additional operations\n",
    "to cover NLP applications. See the paper [\"Document Spanners: A Formal Approach to\n",
    "Information Extraction\"](\n",
    "https://researcher.watson.ibm.com/researcher/files/us-fagin/jacm15.pdf) by Fagin et al.\n",
    "for more information.\n",
    "\n",
    "The available operations in `spanner` include: `consolidate()` to eliminate overlap in a span column, extract matching tokens with `extract_dict()` for dictionary matching or `extract_regex_tok()` for regular expression matching, joining series of spans with `adjacent_join()`, `contain_join()`, or `overlap_join()`, and projection on spans with `lemmatize()`.\n",
    "\n",
    "Here we will show how to extract tokens matching regular expressions and then join the results to a DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>match</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[157, 169): 'Sir Bedevere'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[180, 192): 'Sir Lancelot'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[204, 215): 'Sir Galahad'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>[226, 235): 'Sir Robin'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>[280, 310): 'Sir Not-Appearing-in-this-Film'</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                          match\n",
       "0                    [157, 169): 'Sir Bedevere'\n",
       "1                    [180, 192): 'Sir Lancelot'\n",
       "2                     [204, 215): 'Sir Galahad'\n",
       "3                       [226, 235): 'Sir Robin'\n",
       "4  [280, 310): 'Sir Not-Appearing-in-this-Film'"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Extract tokens using a regular expression, here we find all the knights.\n",
    "knights = tp.spanner.extract_regex_tok(tokens, regex.compile(r\"Sir.\\S+\"), max_len=2)\n",
    "knights"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>match</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[323, 328): 'their'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[98, 109): 'the Knights'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[113, 122): 'the Round'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[136, 143): 'the way'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>[170, 178): 'the Wise'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>[193, 202): 'the Brave'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>[216, 224): 'the Pure'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>[236, 274): 'the Not-Quite-So-Brave-as-Sir-Lan...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                               match\n",
       "0                                [323, 328): 'their'\n",
       "0                           [98, 109): 'the Knights'\n",
       "1                            [113, 122): 'the Round'\n",
       "2                              [136, 143): 'the way'\n",
       "3                             [170, 178): 'the Wise'\n",
       "4                            [193, 202): 'the Brave'\n",
       "5                             [216, 224): 'the Pure'\n",
       "6  [236, 274): 'the Not-Quite-So-Brave-as-Sir-Lan..."
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Try to find all knight's virtues, not as easy and end up with other spans. \n",
    "virtues = tp.spanner.extract_regex_tok(tokens, regex.compile(r\"the.\\S+\"), max_len=2)\n",
    "virtues"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>knight</th>\n",
       "      <th>virtue</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[157, 169): 'Sir Bedevere'</td>\n",
       "      <td>[170, 178): 'the Wise'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[180, 192): 'Sir Lancelot'</td>\n",
       "      <td>[193, 202): 'the Brave'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[204, 215): 'Sir Galahad'</td>\n",
       "      <td>[216, 224): 'the Pure'</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>[226, 235): 'Sir Robin'</td>\n",
       "      <td>[236, 274): 'the Not-Quite-So-Brave-as-Sir-Lan...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                       knight  \\\n",
       "0  [157, 169): 'Sir Bedevere'   \n",
       "1  [180, 192): 'Sir Lancelot'   \n",
       "2   [204, 215): 'Sir Galahad'   \n",
       "3     [226, 235): 'Sir Robin'   \n",
       "\n",
       "                                              virtue  \n",
       "0                             [170, 178): 'the Wise'  \n",
       "1                            [193, 202): 'the Brave'  \n",
       "2                             [216, 224): 'the Pure'  \n",
       "3  [236, 274): 'the Not-Quite-So-Brave-as-Sir-Lan...  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Calling `tp.spanner.adjacent_join()` will join two span columns, where a pair\n",
    "# of spans match if they are adjacent in the text.\n",
    "\n",
    "# Now, easily join the 2 results and match each knight to their virtue.\n",
    "tp.spanner.adjacent_join(knights[\"match\"], virtues[\"match\"], first_name=\"knight\", second_name=\"virtue\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TensorArray\n",
    "\n",
    "A `TensorArray` represents an array of [tensors](https://en.wikipedia.org/wiki/Tensor#As_multidimensional_arrays) where each element is an N-dimensional tensor of the same shape. If there are M tensor elements in the array, then the entire `TensorArray` will have a shape of M x N, where the outer dimension is the number of elements. Backing the `TensorArray` is a [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) with shape M x N. Tensors, or numpy.ndarrays, are often used as feature vectors for machine learning model training and inference results. In Text Extensions for Pandas, they are used to store BERT embeddings from `io.bert.add_embeddings()` that can then be used to train a NLU model.\n",
    "\n",
    "`TensorArray`s can be constructed with zero copy from a single `numpy.ndarray` or with a sequence of elements of similar shape. Conversion of a `TensorArray` to a `numpy.ndarray` can be done with zero copy by calling `TensorArray.to_numpy()` or using the provided numpy array interface, e.g. `numpy.asarray(TensorArray(...))`. The `TensorArray` is a Pandas extension type of type `TensorDtype` and can be wrapped in a `pandas.Series` or used as a column in a `pandas.DataFrame` and used in standard Pandas operations. A `NULL` or missing value in the `TensorArray` is represented as a N-dimensional `numpy.ndarray` where all items are `numpy.nan`. Standard arithmetic and comparison operations are supported and delegated to the backing `numpy.ndarray`. Taking a slice or multiple item selection will produce another `TensorArray`, while a single element selection will produce a `TensorElement` that also wraps a view of the `numpy.ndarray`, with similar operator support."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[0, 1],\n",
       "        [2, 3],\n",
       "        [4, 5],\n",
       "        [6, 7],\n",
       "        [8, 9]]),\n",
       " <text_extensions_for_pandas.array.tensor.TensorDtype at 0x7fa6a854c6a0>)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Construct from a numpy.ndarray.\n",
    "arr = tp.TensorArray(np.arange(10).reshape(5, 2))\n",
    "arr, arr.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    [0, 1]\n",
       "1    [2, 3]\n",
       "2    [4, 5]\n",
       "3    [6, 7]\n",
       "4    [8, 9]\n",
       "dtype: TensorDtype"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Wrap in a Pandas Series.\n",
    "s = pd.Series(arr)\n",
    "s"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[0, 1],\n",
       "        [2, 3],\n",
       "        [4, 5],\n",
       "        [6, 7],\n",
       "        [8, 9]]),\n",
       " dtype('int64'))"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert back to numpy using the provided array interface.\n",
    "np_arr = np.asarray(s)\n",
    "np_arr, np_arr.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    [ False,  False]\n",
       "1    [ False,  False]\n",
       "2    [ False,   True]\n",
       "3    [  True,   True]\n",
       "4    [  True,   True]\n",
       "dtype: TensorDtype"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Apply operations on the Series, result is another Series of type TensorDtype.\n",
    "thresh = s > 4\n",
    "thresh"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([False, False, False,  True,  True]),\n",
       " text_extensions_for_pandas.array.tensor.TensorArray)"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create a boolean selection mask. Use `.array` to get the Series as\n",
    "# a `TensorArray` which can be used directly on numpy operations and\n",
    "# returns another `TensorArray`\n",
    "mask = np.all(thresh.array, axis=1)\n",
    "mask, type(mask)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3    [6, 7]\n",
       "4    [8, 9]\n",
       "dtype: TensorDtype"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Apply Pandas selection on the Series of TensorDtype by converting\n",
    "# the mask to a numpy boolean array.\n",
    "s[mask.to_numpy()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time</th>\n",
       "      <th>features</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2018-01-01 00:00:00</td>\n",
       "      <td>[0, 1]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2018-01-01 01:00:00</td>\n",
       "      <td>[2, 3]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2018-01-01 02:00:00</td>\n",
       "      <td>[4, 5]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2018-01-01 03:00:00</td>\n",
       "      <td>[6, 7]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2018-01-01 04:00:00</td>\n",
       "      <td>[8, 9]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                 time features\n",
       "0 2018-01-01 00:00:00   [0, 1]\n",
       "1 2018-01-01 01:00:00   [2, 3]\n",
       "2 2018-01-01 02:00:00   [4, 5]\n",
       "3 2018-01-01 03:00:00   [6, 7]\n",
       "4 2018-01-01 04:00:00   [8, 9]"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# TensorArray can also be added to a Pandas DataFrame.\n",
    "df = pd.DataFrame({\"time\": pd.date_range('2018-01-01', periods=5, freq='H'), \"features\": arr})\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time</th>\n",
       "      <th>features</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2018-01-01 04:00:00</td>\n",
       "      <td>[8, 9]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2018-01-01 03:00:00</td>\n",
       "      <td>[6, 7]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2018-01-01 02:00:00</td>\n",
       "      <td>[4, 5]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2018-01-01 01:00:00</td>\n",
       "      <td>[2, 3]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2018-01-01 00:00:00</td>\n",
       "      <td>[0, 1]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                 time features\n",
       "4 2018-01-01 04:00:00   [8, 9]\n",
       "3 2018-01-01 03:00:00   [6, 7]\n",
       "2 2018-01-01 02:00:00   [4, 5]\n",
       "1 2018-01-01 01:00:00   [2, 3]\n",
       "0 2018-01-01 00:00:00   [0, 1]"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# TensorArray supports many of the standard DataFrame operations.\n",
    "df.sort_values(by=\"time\", ascending=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Saving Pandas Extension Arrays to Disk\n",
    "\n",
    "Pandas supports several built-in I/O formats, but currently the only supported format for saving DataFrames with Text Extensions for Pandas arrays to disk is with [Feather](https://arrow.apache.org/docs/python/feather.html) files. Text Extensions for Pandas arrays can also be converted to Apache Arrow format, see https://arrow.apache.org/docs/python/pandas.html#dataframes for more information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Dummy function to create some features.\n",
    "def hasher(span, num_features=4):\n",
    "    arr = np.zeros(num_features, dtype=\"int8\")\n",
    "    arr[hash(span.covered_text) % 4] = 1\n",
    "    return arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(54, 4)"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create our feature vector.\n",
    "features = tp.TensorArray([hasher(span) for span in tokens])\n",
    "features.to_numpy().shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>span</th>\n",
       "      <th>features</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[0, 2): 'In'</td>\n",
       "      <td>[0, 0, 0, 1]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[3, 5): 'AD'</td>\n",
       "      <td>[1, 0, 0, 0]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[6, 9): '932'</td>\n",
       "      <td>[0, 0, 1, 0]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>[11, 15): 'King'</td>\n",
       "      <td>[0, 1, 0, 0]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>[16, 22): 'Arthur'</td>\n",
       "      <td>[1, 0, 0, 0]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                 span      features\n",
       "0        [0, 2): 'In'  [0, 0, 0, 1]\n",
       "1        [3, 5): 'AD'  [1, 0, 0, 0]\n",
       "2       [6, 9): '932'  [0, 0, 1, 0]\n",
       "3    [11, 15): 'King'  [0, 1, 0, 0]\n",
       "4  [16, 22): 'Arthur'  [1, 0, 0, 0]"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Add tokens and features to a DataFrame.\n",
    "df = pd.DataFrame({\"span\": tokens, \"features\": features})\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save DataFrame to a feather file.\n",
    "# Feather is a lightweight, fast binary columnar format, with basic\n",
    "# compression and support built into Pandas.\n",
    "df.to_feather(\"outputs/tp_overview.feather\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>span</th>\n",
       "      <th>features</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[0, 2): 'In'</td>\n",
       "      <td>[0, 0, 0, 1]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[3, 5): 'AD'</td>\n",
       "      <td>[1, 0, 0, 0]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[6, 9): '932'</td>\n",
       "      <td>[0, 0, 1, 0]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>[11, 15): 'King'</td>\n",
       "      <td>[0, 1, 0, 0]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>[16, 22): 'Arthur'</td>\n",
       "      <td>[1, 0, 0, 0]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                 span      features\n",
       "0        [0, 2): 'In'  [0, 0, 0, 1]\n",
       "1        [3, 5): 'AD'  [1, 0, 0, 0]\n",
       "2       [6, 9): '932'  [0, 0, 1, 0]\n",
       "3    [11, 15): 'King'  [0, 1, 0, 0]\n",
       "4  [16, 22): 'Arthur'  [1, 0, 0, 0]"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Read the file back into a new DataFrame.\n",
    "\n",
    "df_load = pd.read_feather(\"outputs/tp_overview.feather\")\n",
    "df_load.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## NLP Library Input/Output Integration\n",
    "\n",
    "Text Extensions for Pandas also provides integration with other NLP libraries and datasets. It takes care of processing the inputs and outputs using Pandas DataFrame as a standard data structure and automatically producing the above extension arrays where applicable. Below is an overview of what each module provides along with more notebooks with example usage.\n",
    "\n",
    "### Watson\n",
    "\n",
    "The `io.watson` sub-package provides functions to process and help analyze responses the IBM Waton Cloud service APIs.\n",
    "\n",
    "In the module `io.watson.nlu` you can use Watson Natural Language Understanding to analyze text and then process the response into Pandas DataFrames containing `SpanArray`s for tokens, sentences and relations. See [getting started on Watson NLU](https://cloud.ibm.com/docs/natural-language-understanding?topic=natural-language-understanding-getting-started) for setting up the Watson NLU Cloud Service, and the notebook [Analyze_Text](./Analyze_Text.ipynb) for in-depth examples of using the `io.watson.nlu` module.\n",
    "\n",
    "In the module `io.watson.table` you can use Watson Discovery to extract and analyze tables within documents and web pages, and then process the response into Pandas DataFrames that make it easy to reconstruct and work with the extracted tables. See [Waston Discovery Installation](https://cloud.ibm.com/docs/discovery-data?topic=discovery-data-install) and [IBM Cloud Pak for Data](https://www.ibm.com/products/cloud-pak-for-data) for getting started with Watson Discovery, and the notebook [Understand_Tables](./Understand_Tables.ipynb) for an in-depth example of using the `watson.table` module.\n",
    "\n",
    "### SpaCy\n",
    "\n",
    "The `io.spacy` module contains functions to integrate with the popular NLP library [SpaCy](https://spacy.io/). This allows you to use a [SpaCy tokenizer](https://spacy.io/usage/spacy-101#annotations-token) on text and return the tokens as a `SpanArray` in a Pandas DataFrame with `io.spacy.make_tokens()` or with additional token features with `io.spacy.make_tokens_and_features()`. See the notebook [Integrate_NLP_Libraries](./Integrate_NLP_Libraries.ipynb) for more examples with the `io.spacy` module.\n",
    "\n",
    "### BERT\n",
    "\n",
    "The BERT model is originally from the paper [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805) by Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. The model is pre-trained with masked language modeling and next sentence prediction objectives, which make it effective for masked token prediction and NLU.\n",
    "\n",
    "Text Extension for Pandas integrates with the [Huggingface Transformers](https://huggingface.co/transformers/index.html) library to process the result of BERT tokenization into a Pandas DataFrame with tokens as a`SpanArray` column and compute BERT embbeddings that can also be added to a DataFrame as a `TensorArray`. The embeddings can be used for model training in your NLP application. See the notebook [Model_Training_with_BERT](./Model_Training_with_BERT.ipynb) for an example of tokenizing text with BERT and computing embeddings for model training/scoring.\n",
    "\n",
    "### CoNLL\n",
    "\n",
    "[CoNLL](https://www.conll.org/), the SIGNLL Conference on Computational Natural Language Learning, is an annual academic conference for natural language processing researchers. Each year's conference features a competition involving a challenging NLP task. The task for the 2003 competition involved identifying mentions of [named entities](https://en.wikipedia.org/wiki/Named-entity_recognition) in English and German news articles from the late 1990's. The corpus for this 2003 competition is one of the most widely-used benchmarks for the performance of named entity recognition models.\n",
    "\n",
    "Text Extensions for Pandas contains the module `io.conll` that can help work with an analyze the CoNLL-2003 corpus. The provided functions can help convert between the [IOB2 format](https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)) used in the corpus, and `SpanArray` with entity type for easier analysis. See the notebooks [Analyze_Model_Outputs](./Analyze_Model_Outputs.ipynb) for an in-depth analysis of the corpus and the 2003 competition results, and [Model_Training_with_BERT](./Model_Training_with_BERT.ipynb) for using the corpus to train a named entity recognition (NER) model."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.17"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}