"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Import necessary libraries\n",
"from tf.app import use\n",
"from collections import Counter\n",
"import pandas as pd # for easy data manipulation (optional, for frequency tables)\n",
"from bokeh.io import output_file, show\n",
"from bokeh.plotting import figure\n",
"from bokeh.models import ColumnDataSource, Select, CustomJS, HoverTool\n",
"from wordcloud import WordCloud\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Load BHSA with BHSaddons features\n",
"# This will load the BHSA data (version 2021) and the additional parashot markers.\n",
"A = use(\"etcbc/BHSA\", version=\"2021\", mod=\"tonyjurg/BHSaddons/tf/\", hoist=globals())"
]
},
{
"cell_type": "markdown",
"id": "14a071cc-2ef7-43bf-8cca-1fe4ecf0bdb3",
"metadata": {},
"source": [
"# 3 - Performing the queries \n",
"##### [Back to ToC](#TOC)\n",
"\n",
"In this step, set `parasha_num` to the portion we want to analyze. The number corresponds to the traditional sequence of weekly Torah readings (1 = Bereshit, 2 = Noach, ..., 54 = V'Zot HaBerakhah). The code below finds all verses belonging to that parasha, then gathers all the words in those verses for analysis."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c4e60413-58a6-49e7-84c5-87074aa83634",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Selected Parasha 14: Va’era (וָאֵרָא)\n",
"Parasha range: Exodus 6:2 through Exodus 9:35\n"
]
}
],
"source": [
"# Select the parasha by its number (1 to 54)\n",
"parasha_num = 14 # <-- Change this number to select a different parasha\n",
"\n",
"# Find all verse nodes that belong to the chosen parasha\n",
"verses_in_parasha = [v for v in F.otype.s(\"verse\") if F.parashanum.v(v) == parasha_num]\n",
"\n",
"if verses_in_parasha: \n",
" # Identify the range of verses (from first to last)\n",
" first_verse = verses_in_parasha[0]\n",
" last_verse = verses_in_parasha[-1]\n",
" # Get the parasha name in Hebrew and transliteration\n",
" parasha_name_trans = F.parashatrans.v(first_verse) \n",
" parasha_name_hebrew = F.parashahebr.v(first_verse) \n",
" print(f\"Selected Parasha {parasha_num}: {parasha_name_trans} ({parasha_name_hebrew})\")\n",
" start_ref = T.sectionFromNode(first_verse) # (Book, chapter, verse)\n",
" end_ref = T.sectionFromNode(last_verse)\n",
" print(f\"Parasha range: {start_ref[0]} {start_ref[1]}:{start_ref[2]} through {end_ref[0]} {end_ref[1]}:{end_ref[2]}\")\n",
"else:\n",
" print(\"No verses found for the given parasha number. Make sure the number is 1-54.\")"
]
},
{
"cell_type": "markdown",
"id": "9b57b19a-d503-4601-ac55-d0bda63e3182",
"metadata": {},
"source": [
"Now we gather all word tokens in these verses, and also collect clause and phrase units for later structural analysis:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "296762ab-7007-4529-b939-25d562bb32cd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total verses: 121\n",
"Total words: 2512\n",
"Total clauses: 461\n",
"Total phrases: 1437\n"
]
}
],
"source": [
"# Gather all word nodes in the selected parasha\n",
"words_in_parasha = []\n",
"for v in verses_in_parasha:\n",
" words_in_parasha += L.d(v, \"word\") # all word objects descending from the verse\n",
"\n",
"# Also gather all clauses and phrases in the parasha (for later analysis)\n",
"clauses_in_parasha = []\n",
"phrases_in_parasha = []\n",
"for v in verses_in_parasha:\n",
" clauses_in_parasha += L.d(v, \"clause\")\n",
" phrases_in_parasha += L.d(v, \"phrase\")\n",
"\n",
"print(f\"Total verses: {len(verses_in_parasha)}\")\n",
"print(f\"Total words: {len(words_in_parasha)}\")\n",
"print(f\"Total clauses: {len(clauses_in_parasha)}\")\n",
"print(f\"Total phrases: {len(phrases_in_parasha)}\")"
]
},
{
"cell_type": "markdown",
"id": "114f397a-05cb-4ca1-b3fe-e2ec85c5d02a",
"metadata": {},
"source": [
"This will output the counts of verses, words, clauses, and phrases in the parasha. We will use `words_in_parasha` for lexical statistics and `clauses_in_parasha`/`phrases_in_parasha` for discourse and syntax analysis. \n",
"\n",
"*(Note: \"word\", \"clause\", and \"phrase\" are Text-Fabric object types in BHSA. A word is the smallest textual unit (generally corresponding to a lexical item, including prefixes if attached). \"Clause\" here refers to a clause or clause atom in the syntactic hierarchy, and \"phrase\" to a phrase or subphrase. The BHSA annotations allow us to traverse these levels easily via the `L` (locality) API, which relates each node to the nodes it contains or is contained in.)*"
]
},
{
"cell_type": "markdown",
"id": "6dfa2b06-4328-4d72-98c5-c1940dd42dec",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": []
},
"source": [
"# 4 - Display the results \n",
"##### [Back to ToC](#TOC)\n",
"\n",
"## 4.1 - Verbal forms distribution in the parasha \n",
"\n",
"Biblical Hebrew verbs appear in different conjugations / forms such as *qatal* (perfect), *yiqtol* (imperfect), *wayyiqtol* (the narrative past form with prefixed *waw*), *imperative*, *infinitive*, *participle*, etc. We’ll quantify how often each form occurs in this parasha.\n",
"\n",
"The BHSA feature `vt` (verbal tense) classifies verb *words* by form ([Vt - BHSA](https://etcbc.github.io/bhsa/features/vt/#)). Possible values include `\"perf\"` (perfect/qatal), `\"impf\"` (imperfect/yiqtol), `\"wayq\"` (wayyiqtol), `\"impv\"` (imperative), `\"infa\"/\"infc\"` (infinitive absolute/construct), `\"ptca\"/\"ptcp\"` (participle active/passive), or `\"NA\"` for words that are not verbs. \n",
"\n",
"Let's count the verb instances by `vt` value:"
]
},
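{
"cell_type": "markdown",
"id": "sketch-vt-labels",
"metadata": {},
"source": [
"Before counting on the real data, it can help to see how the `vt` codes listed above map to readable names. The following is only a minimal sketch on made-up input: the `VT_LABELS` dictionary, the `label_counts` helper and the `sample` values are hypothetical illustrations, not part of BHSA or Text-Fabric.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"# Readable labels for the vt codes listed above (the wording is our own)\n",
"VT_LABELS = {\n",
"    'perf': 'qatal (perfect)',\n",
"    'impf': 'yiqtol (imperfect)',\n",
"    'wayq': 'wayyiqtol',\n",
"    'impv': 'imperative',\n",
"    'infa': 'infinitive absolute',\n",
"    'infc': 'infinitive construct',\n",
"    'ptca': 'participle (active)',\n",
"    'ptcp': 'participle (passive)',\n",
"}\n",
"\n",
"def label_counts(counts):\n",
"    # Return (label, count) pairs, most frequent first, skipping 'NA'\n",
"    return [\n",
"        (VT_LABELS.get(code, code), n)\n",
"        for code, n in counts.most_common()\n",
"        if code != 'NA'\n",
"    ]\n",
"\n",
"# Hypothetical vt values, standing in for F.vt.v(w) on a handful of words\n",
"sample = Counter(['wayq', 'wayq', 'perf', 'NA', 'impf', 'wayq'])\n",
"for label, n in label_counts(sample):\n",
"    print(f'{label}: {n}')\n",
"```\n",
"\n",
"Unknown codes fall back to the raw code itself, so the helper should not hide values that this list does not cover."
]
},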
{
"cell_type": "code",
"execution_count": 4,
"id": "a5eedd98-828f-4f7f-8a3b-d89ea0cf1371",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Verb Form Distribution:\n",
" wayq: 118\n",
" perf: 112\n",
" impf: 52\n",
" impv: 40\n",
" infc: 39\n",
" ptca: 24\n",
" infa: 2\n",
" ptcp: 1\n"
]
}
],
"source": [
"# Filter the words to only verbs and count each verb form (vt value)\n",
"verb_words = [w for w in words_in_parasha if F.sp.v(w) == \"verb\"] # sp = part of speech\n",
"verb_form_counts = Counter(F.vt.v(w) for w in verb_words)\n",
"\n",
"# Pretty-print the counts of each verb form\n",
"print(\"Verb Form Distribution:\")\n",
"for form, count in verb_form_counts.most_common():\n",
" if form == \"NA\":\n",
" continue # skip \"NA\" (if any, non-verbs)\n",
" print(f\" {form}: {count}\")"
]
},
{
"cell_type": "markdown",
"id": "cdc03ae2-7397-4e02-8498-761b9cf5d328",
"metadata": {},
"source": [
"This will list the verb forms present and their frequencies. Now, we create a bar chart to visualize this distribution. Additionally, we want to make it interactive: allow filtering by narrative vs. direct speech contexts.\n",
"To enable this, we use the clause-level `domain` feature (text type) to separate verbs in direct speech (domain = `Q`) from verbs in narrative or discursive text (domain = `N` or `D`). We'll prepare counts for:\n",
"- *All* occurrences (default),\n",
"- *Narrative* (everything outside direct speech, i.e. domain != `Q`, which includes discursive clauses),\n",
"- *Direct Speech* (domain = Q).\n",
"\n",
"Then, using a Bokeh `Select` widget, the user can switch the view between All/Narrative/Direct speech."
]
},
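{
"cell_type": "markdown",
"id": "sketch-domain-split",
"metadata": {},
"source": [
"As a quick illustration of the three-way split described above, here is a minimal sketch on made-up data. The `(vt, domain)` pairs in `toy_verbs` are hypothetical stand-ins for what `F.vt.v(w)` and the `domain` of the enclosing clause would return; no Text-Fabric data is needed to run it.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"# Hypothetical (verb form, clause domain) pairs; real values would come from BHSA\n",
"toy_verbs = [\n",
"    ('wayq', 'N'), ('wayq', 'N'), ('perf', 'N'),\n",
"    ('impf', 'Q'), ('impv', 'Q'), ('perf', 'Q'), ('perf', 'D'),\n",
"]\n",
"\n",
"count_all = Counter(vt for vt, _ in toy_verbs)\n",
"# 'Narrative' here means everything outside direct speech (domain != 'Q')\n",
"count_narr = Counter(vt for vt, dom in toy_verbs if dom != 'Q')\n",
"count_direct = Counter(vt for vt, dom in toy_verbs if dom == 'Q')\n",
"\n",
"# The two subsets partition the verbs, so their counts add up to the total\n",
"assert count_narr + count_direct == count_all\n",
"print(count_all['perf'], count_narr['perf'], count_direct['perf'])\n",
"```\n",
"\n",
"Because every verb is either inside a quotation or not, the two filtered counters should always sum to the unfiltered one, which is a cheap sanity check to run on the real counts as well."
]
},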
{
"cell_type": "code",
"execution_count": 5,
"id": "70a50830-4c9a-4a4a-a334-229302803365",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
" \n",
" \n",
"
\n",
"
Loading BokehJS ...\n",
"
\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"'use strict';\n",
"(function(root) {\n",
" function now() {\n",
" return new Date();\n",
" }\n",
"\n",
" const force = true;\n",
"\n",
" if (typeof root._bokeh_onload_callbacks === \"undefined\" || force === true) {\n",
" root._bokeh_onload_callbacks = [];\n",
" root._bokeh_is_loading = undefined;\n",
" }\n",
"\n",
"const JS_MIME_TYPE = 'application/javascript';\n",
" const HTML_MIME_TYPE = 'text/html';\n",
" const EXEC_MIME_TYPE = 'application/vnd.bokehjs_exec.v0+json';\n",
" const CLASS_NAME = 'output_bokeh rendered_html';\n",
"\n",
" /**\n",
" * Render data to the DOM node\n",
" */\n",
" function render(props, node) {\n",
" const script = document.createElement(\"script\");\n",
" node.appendChild(script);\n",
" }\n",
"\n",
" /**\n",
" * Handle when an output is cleared or removed\n",
" */\n",
" function handleClearOutput(event, handle) {\n",
" function drop(id) {\n",
" const view = Bokeh.index.get_by_id(id)\n",
" if (view != null) {\n",
" view.model.document.clear()\n",
" Bokeh.index.delete(view)\n",
" }\n",
" }\n",
"\n",
" const cell = handle.cell;\n",
"\n",
" const id = cell.output_area._bokeh_element_id;\n",
" const server_id = cell.output_area._bokeh_server_id;\n",
"\n",
" // Clean up Bokeh references\n",
" if (id != null) {\n",
" drop(id)\n",
" }\n",
"\n",
" if (server_id !== undefined) {\n",
" // Clean up Bokeh references\n",
" const cmd_clean = \"from bokeh.io.state import curstate; print(curstate().uuid_to_server['\" + server_id + \"'].get_sessions()[0].document.roots[0]._id)\";\n",
" cell.notebook.kernel.execute(cmd_clean, {\n",
" iopub: {\n",
" output: function(msg) {\n",
" const id = msg.content.text.trim()\n",
" drop(id)\n",
" }\n",
" }\n",
" });\n",
" // Destroy server and session\n",
" const cmd_destroy = \"import bokeh.io.notebook as ion; ion.destroy_server('\" + server_id + \"')\";\n",
" cell.notebook.kernel.execute(cmd_destroy);\n",
" }\n",
" }\n",
"\n",
" /**\n",
" * Handle when a new output is added\n",
" */\n",
" function handleAddOutput(event, handle) {\n",
" const output_area = handle.output_area;\n",
" const output = handle.output;\n",
"\n",
" // limit handleAddOutput to display_data with EXEC_MIME_TYPE content only\n",
" if ((output.output_type != \"display_data\") || (!Object.prototype.hasOwnProperty.call(output.data, EXEC_MIME_TYPE))) {\n",
" return\n",
" }\n",
"\n",
" const toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n",
"\n",
" if (output.metadata[EXEC_MIME_TYPE][\"id\"] !== undefined) {\n",
" toinsert[toinsert.length - 1].firstChild.textContent = output.data[JS_MIME_TYPE];\n",
" // store reference to embed id on output_area\n",
" output_area._bokeh_element_id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n",
" }\n",
" if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n",
" const bk_div = document.createElement(\"div\");\n",
" bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n",
" const script_attrs = bk_div.children[0].attributes;\n",
" for (let i = 0; i < script_attrs.length; i++) {\n",
" toinsert[toinsert.length - 1].firstChild.setAttribute(script_attrs[i].name, script_attrs[i].value);\n",
" toinsert[toinsert.length - 1].firstChild.textContent = bk_div.children[0].textContent\n",
" }\n",
" // store reference to server id on output_area\n",
" output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n",
" }\n",
" }\n",
"\n",
" function register_renderer(events, OutputArea) {\n",
"\n",
" function append_mime(data, metadata, element) {\n",
" // create a DOM node to render to\n",
" const toinsert = this.create_output_subarea(\n",
" metadata,\n",
" CLASS_NAME,\n",
" EXEC_MIME_TYPE\n",
" );\n",
" this.keyboard_manager.register_events(toinsert);\n",
" // Render to node\n",
" const props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n",
" render(props, toinsert[toinsert.length - 1]);\n",
" element.append(toinsert);\n",
" return toinsert\n",
" }\n",
"\n",
" /* Handle when an output is cleared or removed */\n",
" events.on('clear_output.CodeCell', handleClearOutput);\n",
" events.on('delete.Cell', handleClearOutput);\n",
"\n",
" /* Handle when a new output is added */\n",
" events.on('output_added.OutputArea', handleAddOutput);\n",
"\n",
" /**\n",
" * Register the mime type and append_mime function with output_area\n",
" */\n",
" OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n",
" /* Is output safe? */\n",
" safe: true,\n",
" /* Index of renderer in `output_area.display_order` */\n",
" index: 0\n",
" });\n",
" }\n",
"\n",
" // register the mime type if in Jupyter Notebook environment and previously unregistered\n",
" if (root.Jupyter !== undefined) {\n",
" const events = require('base/js/events');\n",
" const OutputArea = require('notebook/js/outputarea').OutputArea;\n",
"\n",
" if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n",
" register_renderer(events, OutputArea);\n",
" }\n",
" }\n",
" if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n",
" root._bokeh_timeout = Date.now() + 5000;\n",
" root._bokeh_failed_load = false;\n",
" }\n",
"\n",
" const NB_LOAD_WARNING = {'data': {'text/html':\n",
" \"\\n\"+\n",
" \"
\\n\"+\n",
" \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n",
" \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n",
" \"
\\n\"+\n",
" \"
\\n\"+\n",
" \"- re-rerun `output_notebook()` to attempt to load from CDN again, or
\\n\"+\n",
" \"- use INLINE resources instead, as so:
\\n\"+\n",
" \"
\\n\"+\n",
" \"
\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"\\n\"+\n",
" \"
\"}};\n",
"\n",
" function display_loaded(error = null) {\n",
" const el = document.getElementById(\"b1152251-c25d-4cc9-b1d0-bc7cccd65b60\");\n",
" if (el != null) {\n",
" const html = (() => {\n",
" if (typeof root.Bokeh === \"undefined\") {\n",
" if (error == null) {\n",
" return \"BokehJS is loading ...\";\n",
" } else {\n",
" return \"BokehJS failed to load.\";\n",
" }\n",
" } else {\n",
" const prefix = `BokehJS ${root.Bokeh.version}`;\n",
" if (error == null) {\n",
" return `${prefix} successfully loaded.`;\n",
" } else {\n",
" return `${prefix} encountered errors while loading and may not function as expected.`;\n",
" }\n",
" }\n",
" })();\n",
" el.innerHTML = html;\n",
"\n",
" if (error != null) {\n",
" const wrapper = document.createElement(\"div\");\n",
" wrapper.style.overflow = \"auto\";\n",
" wrapper.style.height = \"5em\";\n",
" wrapper.style.resize = \"vertical\";\n",
" const content = document.createElement(\"div\");\n",
" content.style.fontFamily = \"monospace\";\n",
" content.style.whiteSpace = \"pre-wrap\";\n",
" content.style.backgroundColor = \"rgb(255, 221, 221)\";\n",
" content.textContent = error.stack ?? error.toString();\n",
" wrapper.append(content);\n",
" el.append(wrapper);\n",
" }\n",
" } else if (Date.now() < root._bokeh_timeout) {\n",
" setTimeout(() => display_loaded(error), 100);\n",
" }\n",
" }\n",
"\n",
" function run_callbacks() {\n",
" try {\n",
" root._bokeh_onload_callbacks.forEach(function(callback) {\n",
" if (callback != null)\n",
" callback();\n",
" });\n",
" } finally {\n",
" delete root._bokeh_onload_callbacks\n",
" }\n",
" console.debug(\"Bokeh: all callbacks have finished\");\n",
" }\n",
"\n",
" function load_libs(css_urls, js_urls, callback) {\n",
" if (css_urls == null) css_urls = [];\n",
" if (js_urls == null) js_urls = [];\n",
"\n",
" root._bokeh_onload_callbacks.push(callback);\n",
" if (root._bokeh_is_loading > 0) {\n",
" console.debug(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n",
" return null;\n",
" }\n",
" if (js_urls == null || js_urls.length === 0) {\n",
" run_callbacks();\n",
" return null;\n",
" }\n",
" console.debug(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n",
" root._bokeh_is_loading = css_urls.length + js_urls.length;\n",
"\n",
" function on_load() {\n",
" root._bokeh_is_loading--;\n",
" if (root._bokeh_is_loading === 0) {\n",
" console.debug(\"Bokeh: all BokehJS libraries/stylesheets loaded\");\n",
" run_callbacks()\n",
" }\n",
" }\n",
"\n",
" function on_error(url) {\n",
" console.error(\"failed to load \" + url);\n",
" }\n",
"\n",
" for (let i = 0; i < css_urls.length; i++) {\n",
" const url = css_urls[i];\n",
" const element = document.createElement(\"link\");\n",
" element.onload = on_load;\n",
" element.onerror = on_error.bind(null, url);\n",
" element.rel = \"stylesheet\";\n",
" element.type = \"text/css\";\n",
" element.href = url;\n",
" console.debug(\"Bokeh: injecting link tag for BokehJS stylesheet: \", url);\n",
" document.body.appendChild(element);\n",
" }\n",
"\n",
" for (let i = 0; i < js_urls.length; i++) {\n",
" const url = js_urls[i];\n",
" const element = document.createElement('script');\n",
" element.onload = on_load;\n",
" element.onerror = on_error.bind(null, url);\n",
" element.async = false;\n",
" element.src = url;\n",
" console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n",
" document.head.appendChild(element);\n",
" }\n",
" };\n",
"\n",
" function inject_raw_css(css) {\n",
" const element = document.createElement(\"style\");\n",
" element.appendChild(document.createTextNode(css));\n",
" document.body.appendChild(element);\n",
" }\n",
"\n",
" const js_urls = [\"https://cdn.bokeh.org/bokeh/release/bokeh-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-gl-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-widgets-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-tables-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-mathjax-3.6.0.min.js\"];\n",
" const css_urls = [];\n",
"\n",
" const inline_js = [ function(Bokeh) {\n",
" Bokeh.set_log_level(\"info\");\n",
" },\n",
"function(Bokeh) {\n",
" }\n",
" ];\n",
"\n",
" function run_inline_js() {\n",
" if (root.Bokeh !== undefined || force === true) {\n",
" try {\n",
" for (let i = 0; i < inline_js.length; i++) {\n",
" inline_js[i].call(root, root.Bokeh);\n",
" }\n",
"\n",
" } catch (error) {display_loaded(error);throw error;\n",
" }if (force === true) {\n",
" display_loaded();\n",
" }} else if (Date.now() < root._bokeh_timeout) {\n",
" setTimeout(run_inline_js, 100);\n",
" } else if (!root._bokeh_failed_load) {\n",
" console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n",
" root._bokeh_failed_load = true;\n",
" } else if (force !== true) {\n",
" const cell = $(document.getElementById(\"b1152251-c25d-4cc9-b1d0-bc7cccd65b60\")).parents('.cell').data().cell;\n",
" cell.output_area.append_execute_result(NB_LOAD_WARNING)\n",
" }\n",
" }\n",
"\n",
" if (root._bokeh_is_loading === 0) {\n",
" console.debug(\"Bokeh: BokehJS loaded, going straight to plotting\");\n",
" run_inline_js();\n",
" } else {\n",
" load_libs(css_urls, js_urls, function() {\n",
" console.debug(\"Bokeh: BokehJS plotting callback run at\", now());\n",
" run_inline_js();\n",
" });\n",
" }\n",
"}(window));"
],
"application/vnd.bokehjs_load.v0+json": "'use strict';\n(function(root) {\n function now() {\n return new Date();\n }\n\n const force = true;\n\n if (typeof root._bokeh_onload_callbacks === \"undefined\" || force === true) {\n root._bokeh_onload_callbacks = [];\n root._bokeh_is_loading = undefined;\n }\n\n\n if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n root._bokeh_timeout = Date.now() + 5000;\n root._bokeh_failed_load = false;\n }\n\n const NB_LOAD_WARNING = {'data': {'text/html':\n \"\\n\"+\n \"
\\n\"+\n \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n \"
\\n\"+\n \"
\\n\"+\n \"- re-rerun `output_notebook()` to attempt to load from CDN again, or
\\n\"+\n \"- use INLINE resources instead, as so:
\\n\"+\n \"
\\n\"+\n \"
\\n\"+\n \"from bokeh.resources import INLINE\\n\"+\n \"output_notebook(resources=INLINE)\\n\"+\n \"\\n\"+\n \"
\"}};\n\n function display_loaded(error = null) {\n const el = document.getElementById(\"b1152251-c25d-4cc9-b1d0-bc7cccd65b60\");\n if (el != null) {\n const html = (() => {\n if (typeof root.Bokeh === \"undefined\") {\n if (error == null) {\n return \"BokehJS is loading ...\";\n } else {\n return \"BokehJS failed to load.\";\n }\n } else {\n const prefix = `BokehJS ${root.Bokeh.version}`;\n if (error == null) {\n return `${prefix} successfully loaded.`;\n } else {\n return `${prefix} encountered errors while loading and may not function as expected.`;\n }\n }\n })();\n el.innerHTML = html;\n\n if (error != null) {\n const wrapper = document.createElement(\"div\");\n wrapper.style.overflow = \"auto\";\n wrapper.style.height = \"5em\";\n wrapper.style.resize = \"vertical\";\n const content = document.createElement(\"div\");\n content.style.fontFamily = \"monospace\";\n content.style.whiteSpace = \"pre-wrap\";\n content.style.backgroundColor = \"rgb(255, 221, 221)\";\n content.textContent = error.stack ?? 
error.toString();\n wrapper.append(content);\n el.append(wrapper);\n }\n } else if (Date.now() < root._bokeh_timeout) {\n setTimeout(() => display_loaded(error), 100);\n }\n }\n\n function run_callbacks() {\n try {\n root._bokeh_onload_callbacks.forEach(function(callback) {\n if (callback != null)\n callback();\n });\n } finally {\n delete root._bokeh_onload_callbacks\n }\n console.debug(\"Bokeh: all callbacks have finished\");\n }\n\n function load_libs(css_urls, js_urls, callback) {\n if (css_urls == null) css_urls = [];\n if (js_urls == null) js_urls = [];\n\n root._bokeh_onload_callbacks.push(callback);\n if (root._bokeh_is_loading > 0) {\n console.debug(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n return null;\n }\n if (js_urls == null || js_urls.length === 0) {\n run_callbacks();\n return null;\n }\n console.debug(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n root._bokeh_is_loading = css_urls.length + js_urls.length;\n\n function on_load() {\n root._bokeh_is_loading--;\n if (root._bokeh_is_loading === 0) {\n console.debug(\"Bokeh: all BokehJS libraries/stylesheets loaded\");\n run_callbacks()\n }\n }\n\n function on_error(url) {\n console.error(\"failed to load \" + url);\n }\n\n for (let i = 0; i < css_urls.length; i++) {\n const url = css_urls[i];\n const element = document.createElement(\"link\");\n element.onload = on_load;\n element.onerror = on_error.bind(null, url);\n element.rel = \"stylesheet\";\n element.type = \"text/css\";\n element.href = url;\n console.debug(\"Bokeh: injecting link tag for BokehJS stylesheet: \", url);\n document.body.appendChild(element);\n }\n\n for (let i = 0; i < js_urls.length; i++) {\n const url = js_urls[i];\n const element = document.createElement('script');\n element.onload = on_load;\n element.onerror = on_error.bind(null, url);\n element.async = false;\n element.src = url;\n console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n 
document.head.appendChild(element);\n }\n };\n\n function inject_raw_css(css) {\n const element = document.createElement(\"style\");\n element.appendChild(document.createTextNode(css));\n document.body.appendChild(element);\n }\n\n const js_urls = [\"https://cdn.bokeh.org/bokeh/release/bokeh-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-gl-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-widgets-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-tables-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-mathjax-3.6.0.min.js\"];\n const css_urls = [];\n\n const inline_js = [ function(Bokeh) {\n Bokeh.set_log_level(\"info\");\n },\nfunction(Bokeh) {\n }\n ];\n\n function run_inline_js() {\n if (root.Bokeh !== undefined || force === true) {\n try {\n for (let i = 0; i < inline_js.length; i++) {\n inline_js[i].call(root, root.Bokeh);\n }\n\n } catch (error) {display_loaded(error);throw error;\n }if (force === true) {\n display_loaded();\n }} else if (Date.now() < root._bokeh_timeout) {\n setTimeout(run_inline_js, 100);\n } else if (!root._bokeh_failed_load) {\n console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n root._bokeh_failed_load = true;\n } else if (force !== true) {\n const cell = $(document.getElementById(\"b1152251-c25d-4cc9-b1d0-bc7cccd65b60\")).parents('.cell').data().cell;\n cell.output_area.append_execute_result(NB_LOAD_WARNING)\n }\n }\n\n if (root._bokeh_is_loading === 0) {\n console.debug(\"Bokeh: BokehJS loaded, going straight to plotting\");\n run_inline_js();\n } else {\n load_libs(css_urls, js_urls, function() {\n console.debug(\"Bokeh: BokehJS plotting callback run at\", now());\n run_inline_js();\n });\n }\n}(window));"
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"(function(root) {\n",
" function embed_document(root) {\n",
" const docs_json = {\"c07cdd87-71d5-47af-a19e-af1fce5d8961\":{\"version\":\"3.6.0\",\"title\":\"Bokeh Application\",\"roots\":[{\"type\":\"object\",\"name\":\"Column\",\"id\":\"p1052\",\"attributes\":{\"children\":[{\"type\":\"object\",\"name\":\"Figure\",\"id\":\"p1005\",\"attributes\":{\"width\":500,\"height\":300,\"x_range\":{\"type\":\"object\",\"name\":\"FactorRange\",\"id\":\"p1015\",\"attributes\":{\"factors\":[\"impf\",\"impv\",\"infa\",\"infc\",\"perf\",\"ptca\",\"ptcp\",\"wayq\"]}},\"y_range\":{\"type\":\"object\",\"name\":\"Range1d\",\"id\":\"p1004\",\"attributes\":{\"end\":129.8,\"bounds\":[0,129.8]}},\"x_scale\":{\"type\":\"object\",\"name\":\"CategoricalScale\",\"id\":\"p1016\"},\"y_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p1017\"},\"title\":{\"type\":\"object\",\"name\":\"Title\",\"id\":\"p1008\",\"attributes\":{\"text\":\"Parasha #14: Va\\u2019era - Verb form distribution\"}},\"renderers\":[{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p1047\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p1001\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p1002\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p1003\"},\"data\":{\"type\":\"map\",\"entries\":[[\"form\",[\"impf\",\"impv\",\"infa\",\"infc\",\"perf\",\"ptca\",\"ptcp\",\"wayq\"]],[\"count_all\",[52,40,2,39,112,24,1,118]],[\"count_narr\",[1,0,1,16,51,6,0,118]],[\"count_direct\",[51,40,1,23,61,18,1,0]],[\"count\",[52,40,2,39,112,24,1,118]]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p1048\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p1049\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1044\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\"},\"width\":{\"type\":\"value\",\"value\":0.8},\"top\":{\"type\":\"field\",\"field\":
\"count\"},\"line_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"fill_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"hatch_color\":{\"type\":\"value\",\"value\":\"#718dbf\"}}},\"nonselection_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1045\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\"},\"width\":{\"type\":\"value\",\"value\":0.8},\"top\":{\"type\":\"field\",\"field\":\"count\"},\"line_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.1},\"fill_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.1},\"hatch_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.1}}},\"muted_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1046\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\"},\"width\":{\"type\":\"value\",\"value\":0.8},\"top\":{\"type\":\"field\",\"field\":\"count\"},\"line_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.2},\"fill_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.2},\"hatch_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.2}}}}}],\"toolbar\":{\"type\":\"object\",\"name\":\"Toolbar\",\"id\":\"p1014\",\"attributes\":{\"tools\":[{\"type\":\"object\",\"name\":\"PanTool\",\"id\":\"p1028\"},{\"type\":\"object\",\"name\":\"WheelZoomTool\",\"id\":\"p1029\",\"attributes\":{\"renderers\":\"auto\"}},{\"type\":\"object\",\"name\":\"BoxZoomTool\",\"id\":\"p1030\",\"attributes\":{\"overlay\":{\"type\":\"object\",\"name\":\"BoxAnnotation\",\"id\":\"p1031\",\"attributes\":{\"syncable\":false,\"line_color\":\"black\",\"line_alpha\":1.0,\"line_width\":2,\"line_dash\":[4,4],\"fill_color\":\"lightgrey\",\"fill_alpha\":0.5,\"level\":\"overlay\",\"visible\":false,\"left\":{\"type\":\"number\",\"value\":\"nan\"},\"right\":{\"type
\":\"number\",\"value\":\"nan\"},\"top\":{\"type\":\"number\",\"value\":\"nan\"},\"bottom\":{\"type\":\"number\",\"value\":\"nan\"},\"left_units\":\"canvas\",\"right_units\":\"canvas\",\"top_units\":\"canvas\",\"bottom_units\":\"canvas\",\"handles\":{\"type\":\"object\",\"name\":\"BoxInteractionHandles\",\"id\":\"p1037\",\"attributes\":{\"all\":{\"type\":\"object\",\"name\":\"AreaVisuals\",\"id\":\"p1036\",\"attributes\":{\"fill_color\":\"white\",\"hover_fill_color\":\"lightgray\"}}}}}}}},{\"type\":\"object\",\"name\":\"SaveTool\",\"id\":\"p1038\"},{\"type\":\"object\",\"name\":\"ResetTool\",\"id\":\"p1039\"},{\"type\":\"object\",\"name\":\"HelpTool\",\"id\":\"p1040\"}],\"active_drag\":null,\"active_scroll\":null}},\"left\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p1023\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p1024\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p1025\"},\"axis_label\":\"Frequency\",\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1026\"}}}],\"below\":[{\"type\":\"object\",\"name\":\"CategoricalAxis\",\"id\":\"p1018\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"CategoricalTicker\",\"id\":\"p1019\"},\"formatter\":{\"type\":\"object\",\"name\":\"CategoricalTickFormatter\",\"id\":\"p1020\"},\"axis_label\":\"Verb 
Form\",\"major_label_orientation\":1.0,\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1021\"}}}],\"center\":[{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1022\",\"attributes\":{\"axis\":{\"id\":\"p1018\"}}},{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1027\",\"attributes\":{\"dimension\":1,\"axis\":{\"id\":\"p1023\"}}}]}},{\"type\":\"object\",\"name\":\"Select\",\"id\":\"p1050\",\"attributes\":{\"js_property_callbacks\":{\"type\":\"map\",\"entries\":[[\"change:value\",[{\"type\":\"object\",\"name\":\"CustomJS\",\"id\":\"p1051\",\"attributes\":{\"args\":{\"type\":\"map\",\"entries\":[[\"source\",{\"id\":\"p1001\"}]]},\"code\":\"\\n const data = source.data;\\n const filter = cb_obj.value;\\n if (filter === 'Narrative only') {\\n data['count'] = data['count_narr'];\\n } else if (filter === 'Direct speech only') {\\n data['count'] = data['count_direct'];\\n } else {\\n data['count'] = data['count_all'];\\n }\\n source.change.emit();\\n\"}}]]]},\"title\":\"Filter by Text Type:\",\"options\":[\"All\",\"Narrative only\",\"Direct speech only\"],\"value\":\"All\"}}]}}]}};\n",
" const render_items = [{\"docid\":\"c07cdd87-71d5-47af-a19e-af1fce5d8961\",\"roots\":{\"p1052\":\"c357f96d-d7ac-4662-9fb5-49467b7ec1cd\"},\"root_ids\":[\"p1052\"]}];\n",
" void root.Bokeh.embed.embed_items_notebook(docs_json, render_items);\n",
" }\n",
" if (root.Bokeh !== undefined) {\n",
" embed_document(root);\n",
" } else {\n",
" let attempts = 0;\n",
" const timer = setInterval(function(root) {\n",
" if (root.Bokeh !== undefined) {\n",
" clearInterval(timer);\n",
" embed_document(root);\n",
" } else {\n",
" attempts++;\n",
" if (attempts > 100) {\n",
" clearInterval(timer);\n",
" console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n",
" }\n",
" }\n",
" }, 10, root)\n",
" }\n",
"})(window);"
],
"application/vnd.bokehjs_exec.v0+json": ""
},
"metadata": {
"application/vnd.bokehjs_exec.v0+json": {
"id": "p1052"
}
},
"output_type": "display_data"
}
],
"source": [
"from bokeh.layouts import column\n",
"from bokeh.models import ColumnDataSource, CustomJS, Select, Range1d\n",
"from bokeh.plotting import figure, output_file, show\n",
"from collections import Counter\n",
"from bokeh.io import output_notebook\n",
"\n",
"# Prepare the Bokeh output to display in the notebook\n",
"output_notebook()\n",
"\n",
"# Prepare data for interactive bar chart\n",
"verb_form_counts_all = Counter(F.vt.v(w) for w in verb_words)\n",
"verb_form_counts_narr = Counter(F.vt.v(w) for w in verb_words if F.domain.v(L.u(w, \"clause\")[0]) != \"Q\")\n",
"verb_form_counts_direct = Counter(F.vt.v(w) for w in verb_words if F.domain.v(L.u(w, \"clause\")[0]) == \"Q\")\n",
"\n",
"# Define the categories (verb forms) to plot (excluding \"NA\")\n",
"verb_forms = [vf for vf in verb_form_counts_all.keys() if vf != \"NA\"]\n",
"verb_forms.sort() # sort alphabetically\n",
"\n",
"# Create data source for Bokeh\n",
"data = {\n",
" 'form': verb_forms,\n",
" 'count_all': [verb_form_counts_all.get(vf, 0) for vf in verb_forms],\n",
" 'count_narr': [verb_form_counts_narr.get(vf, 0) for vf in verb_forms],\n",
" 'count_direct': [verb_form_counts_direct.get(vf, 0) for vf in verb_forms],\n",
" # Use 'count' as the currently selected counts (start with all by default)\n",
" 'count': [verb_form_counts_all.get(vf, 0) for vf in verb_forms]\n",
"}\n",
"source = ColumnDataSource(data=data)\n",
"\n",
"# Compute maximum count across all categories and add padding (10%)\n",
"max_count = max(max(data['count_all']), max(data['count_narr']), max(data['count_direct']))\n",
"y_end = max_count * 1.1\n",
"\n",
"# Create a Bokeh bar chart with a fixed y_range\n",
"p = figure(x_range=verb_forms, height=300, width=500,\n",
" title=f\"Parasha #{parasha_num}: {parasha_name_trans} - Verb form distribution\",\n",
" x_axis_label=\"Verb Form\", y_axis_label=\"Frequency\",\n",
" toolbar_location='right',\n",
" y_range=Range1d(start=0, end=y_end))\n",
"\n",
"p.vbar(x='form', top='count', width=0.8, source=source, color=\"#718dbf\")\n",
"\n",
"# Lock the y_range so it cannot be changed by interactions\n",
"p.y_range.bounds = (0, y_end)\n",
"\n",
"# Deactivate any active drag or scroll tools (if the toolbar exists)\n",
"if p.toolbar:\n",
" p.toolbar.active_drag = None\n",
" p.toolbar.active_scroll = None\n",
"\n",
"# Configure x-axis labels for better readability (rotate if needed)\n",
"p.xaxis.major_label_orientation = 1.0\n",
"\n",
"# Add a dropdown to filter by context (All / Narrative / Direct)\n",
"select = Select(title=\"Filter by Text Type:\", value=\"All\",\n",
" options=[\"All\", \"Narrative only\", \"Direct speech only\"])\n",
"\n",
"# JavaScript callback to update the bar heights based on selection\n",
"callback_code = \"\"\"\n",
" const data = source.data;\n",
" const filter = cb_obj.value;\n",
" if (filter === 'Narrative only') {\n",
" data['count'] = data['count_narr'];\n",
" } else if (filter === 'Direct speech only') {\n",
" data['count'] = data['count_direct'];\n",
" } else {\n",
" data['count'] = data['count_all'];\n",
" }\n",
" source.change.emit();\n",
"\"\"\"\n",
"select.js_on_change('value', CustomJS(args={'source': source}, code=callback_code))\n",
"\n",
"# Combine the plot and dropdown in a layout and show\n",
"layout = column(p, select)\n",
"show(layout)"
]
},
{
"cell_type": "markdown",
"id": "6e4cddc7-7833-43ed-9de1-b6b9cae60697",
"metadata": {},
"source": [
"Now also make a static image:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "810df0c9-5898-4187-b168-87df531ecd3d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" \n",
" \n",
"
\n",
"
Loading BokehJS ...\n",
"
\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"'use strict';\n",
"(function(root) {\n",
" function now() {\n",
" return new Date();\n",
" }\n",
"\n",
" const force = true;\n",
"\n",
" if (typeof root._bokeh_onload_callbacks === \"undefined\" || force === true) {\n",
" root._bokeh_onload_callbacks = [];\n",
" root._bokeh_is_loading = undefined;\n",
" }\n",
"\n",
"const JS_MIME_TYPE = 'application/javascript';\n",
" const HTML_MIME_TYPE = 'text/html';\n",
" const EXEC_MIME_TYPE = 'application/vnd.bokehjs_exec.v0+json';\n",
" const CLASS_NAME = 'output_bokeh rendered_html';\n",
"\n",
" /**\n",
" * Render data to the DOM node\n",
" */\n",
" function render(props, node) {\n",
" const script = document.createElement(\"script\");\n",
" node.appendChild(script);\n",
" }\n",
"\n",
" /**\n",
" * Handle when an output is cleared or removed\n",
" */\n",
" function handleClearOutput(event, handle) {\n",
" function drop(id) {\n",
" const view = Bokeh.index.get_by_id(id)\n",
" if (view != null) {\n",
" view.model.document.clear()\n",
" Bokeh.index.delete(view)\n",
" }\n",
" }\n",
"\n",
" const cell = handle.cell;\n",
"\n",
" const id = cell.output_area._bokeh_element_id;\n",
" const server_id = cell.output_area._bokeh_server_id;\n",
"\n",
" // Clean up Bokeh references\n",
" if (id != null) {\n",
" drop(id)\n",
" }\n",
"\n",
" if (server_id !== undefined) {\n",
" // Clean up Bokeh references\n",
" const cmd_clean = \"from bokeh.io.state import curstate; print(curstate().uuid_to_server['\" + server_id + \"'].get_sessions()[0].document.roots[0]._id)\";\n",
" cell.notebook.kernel.execute(cmd_clean, {\n",
" iopub: {\n",
" output: function(msg) {\n",
" const id = msg.content.text.trim()\n",
" drop(id)\n",
" }\n",
" }\n",
" });\n",
" // Destroy server and session\n",
" const cmd_destroy = \"import bokeh.io.notebook as ion; ion.destroy_server('\" + server_id + \"')\";\n",
" cell.notebook.kernel.execute(cmd_destroy);\n",
" }\n",
" }\n",
"\n",
" /**\n",
" * Handle when a new output is added\n",
" */\n",
" function handleAddOutput(event, handle) {\n",
" const output_area = handle.output_area;\n",
" const output = handle.output;\n",
"\n",
" // limit handleAddOutput to display_data with EXEC_MIME_TYPE content only\n",
" if ((output.output_type != \"display_data\") || (!Object.prototype.hasOwnProperty.call(output.data, EXEC_MIME_TYPE))) {\n",
" return\n",
" }\n",
"\n",
" const toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n",
"\n",
" if (output.metadata[EXEC_MIME_TYPE][\"id\"] !== undefined) {\n",
" toinsert[toinsert.length - 1].firstChild.textContent = output.data[JS_MIME_TYPE];\n",
" // store reference to embed id on output_area\n",
" output_area._bokeh_element_id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n",
" }\n",
" if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n",
" const bk_div = document.createElement(\"div\");\n",
" bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n",
" const script_attrs = bk_div.children[0].attributes;\n",
" for (let i = 0; i < script_attrs.length; i++) {\n",
" toinsert[toinsert.length - 1].firstChild.setAttribute(script_attrs[i].name, script_attrs[i].value);\n",
" toinsert[toinsert.length - 1].firstChild.textContent = bk_div.children[0].textContent\n",
" }\n",
" // store reference to server id on output_area\n",
" output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n",
" }\n",
" }\n",
"\n",
" function register_renderer(events, OutputArea) {\n",
"\n",
" function append_mime(data, metadata, element) {\n",
" // create a DOM node to render to\n",
" const toinsert = this.create_output_subarea(\n",
" metadata,\n",
" CLASS_NAME,\n",
" EXEC_MIME_TYPE\n",
" );\n",
" this.keyboard_manager.register_events(toinsert);\n",
" // Render to node\n",
" const props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n",
" render(props, toinsert[toinsert.length - 1]);\n",
" element.append(toinsert);\n",
" return toinsert\n",
" }\n",
"\n",
" /* Handle when an output is cleared or removed */\n",
" events.on('clear_output.CodeCell', handleClearOutput);\n",
" events.on('delete.Cell', handleClearOutput);\n",
"\n",
" /* Handle when a new output is added */\n",
" events.on('output_added.OutputArea', handleAddOutput);\n",
"\n",
" /**\n",
" * Register the mime type and append_mime function with output_area\n",
" */\n",
" OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n",
" /* Is output safe? */\n",
" safe: true,\n",
" /* Index of renderer in `output_area.display_order` */\n",
" index: 0\n",
" });\n",
" }\n",
"\n",
" // register the mime type if in Jupyter Notebook environment and previously unregistered\n",
" if (root.Jupyter !== undefined) {\n",
" const events = require('base/js/events');\n",
" const OutputArea = require('notebook/js/outputarea').OutputArea;\n",
"\n",
" if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n",
" register_renderer(events, OutputArea);\n",
" }\n",
" }\n",
" if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n",
" root._bokeh_timeout = Date.now() + 5000;\n",
" root._bokeh_failed_load = false;\n",
" }\n",
"\n",
" const NB_LOAD_WARNING = {'data': {'text/html':\n",
" \"\\n\"+\n",
" \"
\\n\"+\n",
" \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n",
" \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n",
" \"
\\n\"+\n",
" \"
\\n\"+\n",
" \"- re-rerun `output_notebook()` to attempt to load from CDN again, or
\\n\"+\n",
" \"- use INLINE resources instead, as so:
\\n\"+\n",
" \"
\\n\"+\n",
" \"
\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"\\n\"+\n",
" \"
\"}};\n",
"\n",
" function display_loaded(error = null) {\n",
" const el = document.getElementById(\"e64f16c5-b3ba-4981-a5bf-bb6a6c259460\");\n",
" if (el != null) {\n",
" const html = (() => {\n",
" if (typeof root.Bokeh === \"undefined\") {\n",
" if (error == null) {\n",
" return \"BokehJS is loading ...\";\n",
" } else {\n",
" return \"BokehJS failed to load.\";\n",
" }\n",
" } else {\n",
" const prefix = `BokehJS ${root.Bokeh.version}`;\n",
" if (error == null) {\n",
" return `${prefix} successfully loaded.`;\n",
" } else {\n",
" return `${prefix} encountered errors while loading and may not function as expected.`;\n",
" }\n",
" }\n",
" })();\n",
" el.innerHTML = html;\n",
"\n",
" if (error != null) {\n",
" const wrapper = document.createElement(\"div\");\n",
" wrapper.style.overflow = \"auto\";\n",
" wrapper.style.height = \"5em\";\n",
" wrapper.style.resize = \"vertical\";\n",
" const content = document.createElement(\"div\");\n",
" content.style.fontFamily = \"monospace\";\n",
" content.style.whiteSpace = \"pre-wrap\";\n",
" content.style.backgroundColor = \"rgb(255, 221, 221)\";\n",
" content.textContent = error.stack ?? error.toString();\n",
" wrapper.append(content);\n",
" el.append(wrapper);\n",
" }\n",
" } else if (Date.now() < root._bokeh_timeout) {\n",
" setTimeout(() => display_loaded(error), 100);\n",
" }\n",
" }\n",
"\n",
" function run_callbacks() {\n",
" try {\n",
" root._bokeh_onload_callbacks.forEach(function(callback) {\n",
" if (callback != null)\n",
" callback();\n",
" });\n",
" } finally {\n",
" delete root._bokeh_onload_callbacks\n",
" }\n",
" console.debug(\"Bokeh: all callbacks have finished\");\n",
" }\n",
"\n",
" function load_libs(css_urls, js_urls, callback) {\n",
" if (css_urls == null) css_urls = [];\n",
" if (js_urls == null) js_urls = [];\n",
"\n",
" root._bokeh_onload_callbacks.push(callback);\n",
" if (root._bokeh_is_loading > 0) {\n",
" console.debug(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n",
" return null;\n",
" }\n",
" if (js_urls == null || js_urls.length === 0) {\n",
" run_callbacks();\n",
" return null;\n",
" }\n",
" console.debug(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n",
" root._bokeh_is_loading = css_urls.length + js_urls.length;\n",
"\n",
" function on_load() {\n",
" root._bokeh_is_loading--;\n",
" if (root._bokeh_is_loading === 0) {\n",
" console.debug(\"Bokeh: all BokehJS libraries/stylesheets loaded\");\n",
" run_callbacks()\n",
" }\n",
" }\n",
"\n",
" function on_error(url) {\n",
" console.error(\"failed to load \" + url);\n",
" }\n",
"\n",
" for (let i = 0; i < css_urls.length; i++) {\n",
" const url = css_urls[i];\n",
" const element = document.createElement(\"link\");\n",
" element.onload = on_load;\n",
" element.onerror = on_error.bind(null, url);\n",
" element.rel = \"stylesheet\";\n",
" element.type = \"text/css\";\n",
" element.href = url;\n",
" console.debug(\"Bokeh: injecting link tag for BokehJS stylesheet: \", url);\n",
" document.body.appendChild(element);\n",
" }\n",
"\n",
" for (let i = 0; i < js_urls.length; i++) {\n",
" const url = js_urls[i];\n",
" const element = document.createElement('script');\n",
" element.onload = on_load;\n",
" element.onerror = on_error.bind(null, url);\n",
" element.async = false;\n",
" element.src = url;\n",
" console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n",
" document.head.appendChild(element);\n",
" }\n",
" };\n",
"\n",
" function inject_raw_css(css) {\n",
" const element = document.createElement(\"style\");\n",
" element.appendChild(document.createTextNode(css));\n",
" document.body.appendChild(element);\n",
" }\n",
"\n",
" const js_urls = [\"https://cdn.bokeh.org/bokeh/release/bokeh-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-gl-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-widgets-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-tables-3.6.0.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-mathjax-3.6.0.min.js\"];\n",
" const css_urls = [];\n",
"\n",
" const inline_js = [ function(Bokeh) {\n",
" Bokeh.set_log_level(\"info\");\n",
" },\n",
"function(Bokeh) {\n",
" }\n",
" ];\n",
"\n",
" function run_inline_js() {\n",
" if (root.Bokeh !== undefined || force === true) {\n",
" try {\n",
" for (let i = 0; i < inline_js.length; i++) {\n",
" inline_js[i].call(root, root.Bokeh);\n",
" }\n",
"\n",
" } catch (error) {display_loaded(error);throw error;\n",
" }if (force === true) {\n",
" display_loaded();\n",
" }} else if (Date.now() < root._bokeh_timeout) {\n",
" setTimeout(run_inline_js, 100);\n",
" } else if (!root._bokeh_failed_load) {\n",
" console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n",
" root._bokeh_failed_load = true;\n",
" } else if (force !== true) {\n",
" const cell = $(document.getElementById(\"e64f16c5-b3ba-4981-a5bf-bb6a6c259460\")).parents('.cell').data().cell;\n",
" cell.output_area.append_execute_result(NB_LOAD_WARNING)\n",
" }\n",
" }\n",
"\n",
" if (root._bokeh_is_loading === 0) {\n",
" console.debug(\"Bokeh: BokehJS loaded, going straight to plotting\");\n",
" run_inline_js();\n",
" } else {\n",
" load_libs(css_urls, js_urls, function() {\n",
" console.debug(\"Bokeh: BokehJS plotting callback run at\", now());\n",
" run_inline_js();\n",
" });\n",
" }\n",
"}(window));"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"(function(root) {\n",
" function embed_document(root) {\n",
" const docs_json = {\"8bf55f82-b6ea-472f-870a-a9402d064768\":{\"version\":\"3.6.0\",\"title\":\"Bokeh Application\",\"roots\":[{\"type\":\"object\",\"name\":\"Figure\",\"id\":\"p1057\",\"attributes\":{\"width\":500,\"height\":300,\"x_range\":{\"type\":\"object\",\"name\":\"FactorRange\",\"id\":\"p1067\",\"attributes\":{\"factors\":[\"impf\",\"impv\",\"infa\",\"infc\",\"perf\",\"ptca\",\"ptcp\",\"wayq\"]}},\"y_range\":{\"type\":\"object\",\"name\":\"Range1d\",\"id\":\"p1056\",\"attributes\":{\"end\":129.8,\"bounds\":[0,129.8]}},\"x_scale\":{\"type\":\"object\",\"name\":\"CategoricalScale\",\"id\":\"p1068\"},\"y_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p1069\"},\"title\":{\"type\":\"object\",\"name\":\"Title\",\"id\":\"p1060\",\"attributes\":{\"text\":\"Parasha #14: Va\\u2019era - Verb form distribution\"}},\"renderers\":[{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p1100\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p1053\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p1054\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p1055\"},\"data\":{\"type\":\"map\",\"entries\":[[\"form\",[\"impf\",\"impv\",\"infa\",\"infc\",\"perf\",\"ptca\",\"ptcp\",\"wayq\"]],[\"count_narr\",[1,0,1,16,51,6,0,118]],[\"count_direct\",[51,40,1,23,61,18,1,0]]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p1101\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p1102\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1097\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"type\":\"object\",\"name\":\"Dodge\",\"id\":\"p1093\",\"attributes\":{\"value\":-0.15,\"range\":{\"id\":\"p1067\"}}}},\"width\":{\"type\":\"value\",\"value\":0.3},\"top\":{\"type\":\"field\",\"field\":\"count_direct\"},\"line_color\":{
\"type\":\"value\",\"value\":\"#718dbf\"},\"fill_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"hatch_color\":{\"type\":\"value\",\"value\":\"#718dbf\"}}},\"nonselection_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1098\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"id\":\"p1093\"}},\"width\":{\"type\":\"value\",\"value\":0.3},\"top\":{\"type\":\"field\",\"field\":\"count_direct\"},\"line_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.1},\"fill_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.1},\"hatch_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.1}}},\"muted_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1099\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"id\":\"p1093\"}},\"width\":{\"type\":\"value\",\"value\":0.3},\"top\":{\"type\":\"field\",\"field\":\"count_direct\"},\"line_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.2},\"fill_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.2},\"hatch_color\":{\"type\":\"value\",\"value\":\"#718dbf\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.2}}}}},{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p1112\",\"attributes\":{\"data_source\":{\"id\":\"p1053\"},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p1113\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p1114\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1109\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"type\":\"object\",\"name\":\"Dodge\",\"id\":\"p1105\",\"attributes\":{\"value\":0.15,\"range\":{\"id\":\"p1067\"}}}},\"width\":{\"type\":\"value\",\"value\":0.3},\"top\":{\"type\":\"field\",\"field\":\"count_narr\"},\"
line_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"fill_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"hatch_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"}}},\"nonselection_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1110\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"id\":\"p1105\"}},\"width\":{\"type\":\"value\",\"value\":0.3},\"top\":{\"type\":\"field\",\"field\":\"count_narr\"},\"line_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.1},\"fill_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.1},\"hatch_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.1}}},\"muted_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1111\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"id\":\"p1105\"}},\"width\":{\"type\":\"value\",\"value\":0.3},\"top\":{\"type\":\"field\",\"field\":\"count_narr\"},\"line_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.2},\"fill_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.2},\"hatch_color\":{\"type\":\"value\",\"value\":\"#c9d9d3\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.2}}}}}],\"toolbar\":{\"type\":\"object\",\"name\":\"Toolbar\",\"id\":\"p1066\",\"attributes\":{\"tools\":[{\"type\":\"object\",\"name\":\"PanTool\",\"id\":\"p1080\"},{\"type\":\"object\",\"name\":\"WheelZoomTool\",\"id\":\"p1081\",\"attributes\":{\"renderers\":\"auto\"}},{\"type\":\"object\",\"name\":\"BoxZoomTool\",\"id\":\"p1082\",\"attributes\":{\"overlay\":{\"type\":\"object\",\"name\":\"BoxAnnotation\",\"id\":\"p1083\",\"attributes\":{\"syncable\":false,\"line_color\":\"black\",\"line_alpha\":1.0,\"line_width\":2,\"line_dash\":[4,4],\"fill_color\":\"lightgrey\",\"fill_alpha\":0.5,\"level\":\"overlay\",\"visible\":false,\"l
eft\":{\"type\":\"number\",\"value\":\"nan\"},\"right\":{\"type\":\"number\",\"value\":\"nan\"},\"top\":{\"type\":\"number\",\"value\":\"nan\"},\"bottom\":{\"type\":\"number\",\"value\":\"nan\"},\"left_units\":\"canvas\",\"right_units\":\"canvas\",\"top_units\":\"canvas\",\"bottom_units\":\"canvas\",\"handles\":{\"type\":\"object\",\"name\":\"BoxInteractionHandles\",\"id\":\"p1089\",\"attributes\":{\"all\":{\"type\":\"object\",\"name\":\"AreaVisuals\",\"id\":\"p1088\",\"attributes\":{\"fill_color\":\"white\",\"hover_fill_color\":\"lightgray\"}}}}}}}},{\"type\":\"object\",\"name\":\"SaveTool\",\"id\":\"p1090\"},{\"type\":\"object\",\"name\":\"ResetTool\",\"id\":\"p1091\"},{\"type\":\"object\",\"name\":\"HelpTool\",\"id\":\"p1092\"}],\"active_drag\":null,\"active_scroll\":null}},\"left\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p1075\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p1076\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p1077\"},\"axis_label\":\"Frequency\",\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1078\"}}}],\"below\":[{\"type\":\"object\",\"name\":\"CategoricalAxis\",\"id\":\"p1070\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"CategoricalTicker\",\"id\":\"p1071\"},\"formatter\":{\"type\":\"object\",\"name\":\"CategoricalTickFormatter\",\"id\":\"p1072\"},\"axis_label\":\"Verb 
Form\",\"major_label_orientation\":1.0,\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1073\"}}}],\"center\":[{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1074\",\"attributes\":{\"axis\":{\"id\":\"p1070\"}}},{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1079\",\"attributes\":{\"dimension\":1,\"axis\":{\"id\":\"p1075\"}}},{\"type\":\"object\",\"name\":\"Legend\",\"id\":\"p1103\",\"attributes\":{\"location\":\"top_left\",\"items\":[{\"type\":\"object\",\"name\":\"LegendItem\",\"id\":\"p1104\",\"attributes\":{\"label\":{\"type\":\"value\",\"value\":\"Direct speech\"},\"renderers\":[{\"id\":\"p1100\"}]}},{\"type\":\"object\",\"name\":\"LegendItem\",\"id\":\"p1115\",\"attributes\":{\"label\":{\"type\":\"value\",\"value\":\"Narrative\"},\"renderers\":[{\"id\":\"p1112\"}]}}]}},{\"type\":\"object\",\"name\":\"LabelSet\",\"id\":\"p1117\",\"attributes\":{\"source\":{\"id\":\"p1053\"},\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"type\":\"object\",\"name\":\"Dodge\",\"id\":\"p1116\",\"attributes\":{\"value\":-0.15,\"range\":{\"id\":\"p1067\"}}}},\"y\":{\"type\":\"field\",\"field\":\"count_direct\"},\"text\":{\"type\":\"field\",\"field\":\"count_direct\"},\"text_align\":{\"type\":\"value\",\"value\":\"center\"}}},{\"type\":\"object\",\"name\":\"LabelSet\",\"id\":\"p1122\",\"attributes\":{\"source\":{\"id\":\"p1053\"},\"x\":{\"type\":\"field\",\"field\":\"form\",\"transform\":{\"type\":\"object\",\"name\":\"Dodge\",\"id\":\"p1121\",\"attributes\":{\"value\":0.15,\"range\":{\"id\":\"p1067\"}}}},\"y\":{\"type\":\"field\",\"field\":\"count_narr\"},\"text\":{\"type\":\"field\",\"field\":\"count_narr\"},\"text_align\":{\"type\":\"value\",\"value\":\"center\"}}}]}}]}};\n",
" const render_items = [{\"docid\":\"8bf55f82-b6ea-472f-870a-a9402d064768\",\"roots\":{\"p1057\":\"e80771b5-4b75-4574-a346-c3a5615d1c63\"},\"root_ids\":[\"p1057\"]}];\n",
" void root.Bokeh.embed.embed_items_notebook(docs_json, render_items);\n",
" }\n",
" if (root.Bokeh !== undefined) {\n",
" embed_document(root);\n",
" } else {\n",
" let attempts = 0;\n",
" const timer = setInterval(function(root) {\n",
" if (root.Bokeh !== undefined) {\n",
" clearInterval(timer);\n",
" embed_document(root);\n",
" } else {\n",
" attempts++;\n",
" if (attempts > 100) {\n",
" clearInterval(timer);\n",
" console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n",
" }\n",
" }\n",
" }, 10, root)\n",
" }\n",
"})(window);"
],
"application/vnd.bokehjs_exec.v0+json": ""
},
"metadata": {
"application/vnd.bokehjs_exec.v0+json": {
"id": "p1057"
}
},
"output_type": "display_data"
}
],
"source": [
"from bokeh.layouts import column\n",
"from bokeh.models import ColumnDataSource, Range1d, LabelSet\n",
"from bokeh.plotting import figure, output_file, show\n",
"from bokeh.transform import dodge\n",
"from collections import Counter\n",
"from bokeh.io import output_notebook\n",
"\n",
"# Prepare the Bokeh output to display in the notebook\n",
"output_notebook()\n",
"\n",
"# Prepare data for the grouped bar chart:\n",
"# Count verb forms for narrative and direct speech\n",
"verb_form_counts_narr = Counter(F.vt.v(w) for w in verb_words if F.domain.v(L.u(w, \"clause\")[0]) != \"Q\")\n",
"verb_form_counts_direct = Counter(F.vt.v(w) for w in verb_words if F.domain.v(L.u(w, \"clause\")[0]) == \"Q\")\n",
"\n",
"# Define the categories (verb forms) to plot (excluding \"NA\")\n",
"verb_forms = [vf for vf in set(verb_form_counts_narr.keys()) | set(verb_form_counts_direct.keys()) if vf != \"NA\"]\n",
"verb_forms.sort() # sort alphabetically\n",
"\n",
"# Create data source for Bokeh\n",
"data = {\n",
" 'form': verb_forms,\n",
" 'count_narr': [verb_form_counts_narr.get(vf, 0) for vf in verb_forms],\n",
" 'count_direct': [verb_form_counts_direct.get(vf, 0) for vf in verb_forms]\n",
"}\n",
"source = ColumnDataSource(data=data)\n",
"\n",
"# Compute maximum count for setting the y_range (with a 10% padding)\n",
"max_count = max(max(data['count_narr']), max(data['count_direct']))\n",
"y_end = max_count * 1.1\n",
"\n",
"# Create a Bokeh figure for the grouped bar chart\n",
"p = figure(x_range=verb_forms, height=300, width=500,\n",
" title=f\"Parasha #{parasha_num}: {parasha_name_trans} - Verb form distribution\",\n",
" x_axis_label=\"Verb Form\", y_axis_label=\"Frequency\",\n",
" toolbar_location='right',\n",
" y_range=Range1d(start=0, end=y_end))\n",
"\n",
"# Draw bars for direct speech and narrative using dodge to position them side by side\n",
"bar_width = 0.3\n",
"p.vbar(x=dodge('form', -0.15, range=p.x_range), top='count_direct', width=bar_width,\n",
" source=source, color=\"#718dbf\", legend_label=\"Direct speech\")\n",
"p.vbar(x=dodge('form', 0.15, range=p.x_range), top='count_narr', width=bar_width,\n",
" source=source, color=\"#c9d9d3\", legend_label=\"Narrative\")\n",
"\n",
"# Add count labels on top of each bar (removed render_mode attribute)\n",
"labels_direct = LabelSet(x=dodge('form', -0.15, range=p.x_range), y='count_direct',\n",
" text='count_direct', source=source,\n",
" text_align='center', text_baseline='bottom')\n",
"labels_narr = LabelSet(x=dodge('form', 0.15, range=p.x_range), y='count_narr',\n",
" text='count_narr', source=source,\n",
" text_align='center', text_baseline='bottom')\n",
"p.add_layout(labels_direct)\n",
"p.add_layout(labels_narr)\n",
"\n",
"# Lock the y_range so it cannot be changed by interactions\n",
"p.y_range.bounds = (0, y_end)\n",
"\n",
"# Optionally, disable active drag/scroll tools\n",
"if p.toolbar:\n",
" p.toolbar.active_drag = None\n",
" p.toolbar.active_scroll = None\n",
"\n",
"# Improve x-axis label readability\n",
"p.xaxis.major_label_orientation = 1.0\n",
"p.legend.location = \"top_left\"\n",
"\n",
"# Display the plot\n",
"show(p)\n",
"\n",
"# note: save image as 'verbform_distribution.png'"
]
},
{
"cell_type": "markdown",
"id": "02eb4da8-e178-4770-b608-8659faa03e3a",
"metadata": {},
"source": [
"## 4.2 - Ratio of direct speech versus narrative \n",
"\n",
"Next, let's quantify the portion of the parasha that is direct speech versus narrative description. We can measure this by the number of words in each category. Using the clause text-type (`domain`), we mark each word as belonging to a *Quotation (Q)* or *Non-quotation (N/D)* context. For simplicity, we'll treat both Narrative and Discursive (`N` and `D`) as \"narrative\" here (i.e., not direct speech). \n",
"\n",
"We'll calculate the percentage of words in direct speech vs narrative, and display it as a pie chart for a quick overview:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "988f63c4-f93e-4507-9076-c435bcdfc83d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Direct speech word count: 1167 (46.5%)\n",
"Narrative word count: 1345 (53.5%)\n"
]
}
],
"source": [
"# Calculate number of words in direct speech vs narrative\n",
"words_direct = [w for w in words_in_parasha if F.domain.v(L.u(w, \"clause\")[0]) == \"Q\"]\n",
"words_narrative = [w for w in words_in_parasha if F.domain.v(L.u(w, \"clause\")[0]) in (\"N\", \"D\")]\n",
"\n",
"count_direct = len(words_direct)\n",
"count_narr = len(words_narrative)\n",
"total_words = count_direct + count_narr\n",
"pct_direct = (count_direct / total_words * 100) if total_words else 0\n",
"pct_narr = (count_narr / total_words * 100) if total_words else 0\n",
"\n",
"print(f\"Direct speech word count: {count_direct} ({pct_direct:.1f}%)\")\n",
"print(f\"Narrative word count: {count_narr} ({pct_narr:.1f}%)\")"
]
},
{
"cell_type": "markdown",
"id": "03e2e4a7-0235-4030-bec6-479377881451",
"metadata": {},
"source": [
"This prints out the raw counts and percentage. Now, we create a pie chart using Bokeh's wedge glyph. We will also attach a hover tooltip to display the percentages:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "feff5a4a-c097-4793-a3a4-73dd8b8ab6a4",
"metadata": {},
"outputs": [
{
"data": {
"application/javascript": [
"(function(root) {\n",
" function embed_document(root) {\n",
" const docs_json = {\"f7faa278-7b75-4509-85b4-2a969b8e534f\":{\"version\":\"3.6.0\",\"title\":\"Bokeh Application\",\"roots\":[{\"type\":\"object\",\"name\":\"Figure\",\"id\":\"p1129\",\"attributes\":{\"width\":400,\"height\":250,\"x_range\":{\"type\":\"object\",\"name\":\"Range1d\",\"id\":\"p1139\",\"attributes\":{\"start\":-0.5}},\"y_range\":{\"type\":\"object\",\"name\":\"DataRange1d\",\"id\":\"p1131\"},\"x_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p1140\"},\"y_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p1141\"},\"title\":{\"type\":\"object\",\"name\":\"Title\",\"id\":\"p1132\",\"attributes\":{\"text\":\"Parasha #14: Va\\u2019era - Speech vs Narrative Ratio\"}},\"renderers\":[{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p1174\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p1126\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p1127\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p1128\"},\"data\":{\"type\":\"map\",\"entries\":[[\"index\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"AAAAAAEAAAA=\"},\"shape\":[2],\"dtype\":\"int32\",\"order\":\"little\"}],[\"category\",{\"type\":\"ndarray\",\"array\":[\"Direct 
Speech\",\"Narrative\"],\"shape\":[2],\"dtype\":\"object\",\"order\":\"little\"}],[\"count\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"jwQAAEEFAAA=\"},\"shape\":[2],\"dtype\":\"int32\",\"order\":\"little\"}],[\"angle\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"aBqdFBJaB0DHP+uT5OkKQA==\"},\"shape\":[2],\"dtype\":\"float64\",\"order\":\"little\"}],[\"percentage\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"NGBJL386R0DLn7bQgMVKQA==\"},\"shape\":[2],\"dtype\":\"float64\",\"order\":\"little\"}],[\"color\",{\"type\":\"ndarray\",\"array\":[\"#ff7f0e\",\"#1f77b4\"],\"shape\":[2],\"dtype\":\"object\",\"order\":\"little\"}],[\"label\",{\"type\":\"ndarray\",\"array\":[\"1167 (46.5%)\",\"1345 (53.5%)\"],\"shape\":[2],\"dtype\":\"object\",\"order\":\"little\"}],[\"angle_cumsum\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"aBqdFBJaB0AYLURU+yEZQA==\"},\"shape\":[2],\"dtype\":\"float64\",\"order\":\"little\"}],[\"angle_mid\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"aBqdFBJa9z8mXUkvgmcSQA==\"},\"shape\":[2],\"dtype\":\"float64\",\"order\":\"little\"}],[\"x\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"YrV4D7cPoT9ntXgPtw+hvw==\"},\"shape\":[2],\"dtype\":\"float64\",\"order\":\"little\"}],[\"y\",{\"type\":\"ndarray\",\"array\":{\"type\":\"bytes\",\"data\":\"1tPhrsgU0z/W0+GuyBTTvw==\"},\"shape\":[2],\"dtype\":\"float64\",\"order\":\"little\"}]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p1175\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p1176\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"Wedge\",\"id\":\"p1171\",\"attributes\":{\"x\":{\"type\":\"value\",\"value\":0},\"y\":{\"type\":\"value\",\"value\":0},\"radius\":{\"type\":\"value\",\"value\":0.4},\"start_angle\":{\"type\":\"expr\",\"expr\":{\"type\":\"object\",\"name\":\"CumSum\",\"id\":\"p1166\",\"attributes\":{\"field\":\"angle\",\"include_zero\":true}}}
,\"end_angle\":{\"type\":\"expr\",\"expr\":{\"type\":\"object\",\"name\":\"CumSum\",\"id\":\"p1167\",\"attributes\":{\"field\":\"angle\"}}},\"line_color\":{\"type\":\"value\",\"value\":\"white\"},\"fill_color\":{\"type\":\"field\",\"field\":\"color\"}}},\"nonselection_glyph\":{\"type\":\"object\",\"name\":\"Wedge\",\"id\":\"p1172\",\"attributes\":{\"x\":{\"type\":\"value\",\"value\":0},\"y\":{\"type\":\"value\",\"value\":0},\"radius\":{\"type\":\"value\",\"value\":0.4},\"start_angle\":{\"type\":\"expr\",\"expr\":{\"id\":\"p1166\"}},\"end_angle\":{\"type\":\"expr\",\"expr\":{\"id\":\"p1167\"}},\"line_color\":{\"type\":\"value\",\"value\":\"white\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.1},\"fill_color\":{\"type\":\"field\",\"field\":\"color\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.1},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.1}}},\"muted_glyph\":{\"type\":\"object\",\"name\":\"Wedge\",\"id\":\"p1173\",\"attributes\":{\"x\":{\"type\":\"value\",\"value\":0},\"y\":{\"type\":\"value\",\"value\":0},\"radius\":{\"type\":\"value\",\"value\":0.4},\"start_angle\":{\"type\":\"expr\",\"expr\":{\"id\":\"p1166\"}},\"end_angle\":{\"type\":\"expr\",\"expr\":{\"id\":\"p1167\"}},\"line_color\":{\"type\":\"value\",\"value\":\"white\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.2},\"fill_color\":{\"type\":\"field\",\"field\":\"color\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.2},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.2}}}}}],\"toolbar\":{\"type\":\"object\",\"name\":\"Toolbar\",\"id\":\"p1138\",\"attributes\":{\"tools\":[{\"type\":\"object\",\"name\":\"PanTool\",\"id\":\"p1152\"},{\"type\":\"object\",\"name\":\"WheelZoomTool\",\"id\":\"p1153\",\"attributes\":{\"renderers\":\"auto\"}},{\"type\":\"object\",\"name\":\"BoxZoomTool\",\"id\":\"p1154\",\"attributes\":{\"overlay\":{\"type\":\"object\",\"name\":\"BoxAnnotation\",\"id\":\"p1155\",\"attributes\":{\"syncable\":false,\"line_color\":\"black\",\"line_alpha\":1.0,\"line_width\":2,\"line_dash\
":[4,4],\"fill_color\":\"lightgrey\",\"fill_alpha\":0.5,\"level\":\"overlay\",\"visible\":false,\"left\":{\"type\":\"number\",\"value\":\"nan\"},\"right\":{\"type\":\"number\",\"value\":\"nan\"},\"top\":{\"type\":\"number\",\"value\":\"nan\"},\"bottom\":{\"type\":\"number\",\"value\":\"nan\"},\"left_units\":\"canvas\",\"right_units\":\"canvas\",\"top_units\":\"canvas\",\"bottom_units\":\"canvas\",\"handles\":{\"type\":\"object\",\"name\":\"BoxInteractionHandles\",\"id\":\"p1161\",\"attributes\":{\"all\":{\"type\":\"object\",\"name\":\"AreaVisuals\",\"id\":\"p1160\",\"attributes\":{\"fill_color\":\"white\",\"hover_fill_color\":\"lightgray\"}}}}}}}},{\"type\":\"object\",\"name\":\"SaveTool\",\"id\":\"p1162\"},{\"type\":\"object\",\"name\":\"ResetTool\",\"id\":\"p1163\"},{\"type\":\"object\",\"name\":\"HelpTool\",\"id\":\"p1164\"},{\"type\":\"object\",\"name\":\"HoverTool\",\"id\":\"p1165\",\"attributes\":{\"renderers\":\"auto\",\"tooltips\":\"@category: @count words (@percentage{0.0}%)\"}}]}},\"left\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p1147\",\"attributes\":{\"visible\":false,\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p1148\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p1149\"},\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1150\"}}}],\"below\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p1142\",\"attributes\":{\"visible\":false,\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p1143\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p1144\"},\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1145\"}}}],\"center\":[{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1146\",\"attributes\":{\"axis\":{\"id\":\"p1142\"},\"grid_line_color\":null}},{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1151\",\"attributes
\":{\"dimension\":1,\"axis\":{\"id\":\"p1147\"},\"grid_line_color\":null}},{\"type\":\"object\",\"name\":\"Legend\",\"id\":\"p1177\",\"attributes\":{\"label_text_font_size\":\"8pt\",\"items\":[{\"type\":\"object\",\"name\":\"LegendItem\",\"id\":\"p1178\",\"attributes\":{\"label\":{\"type\":\"field\",\"field\":\"category\"},\"renderers\":[{\"id\":\"p1174\"}]}}]}},{\"type\":\"object\",\"name\":\"LabelSet\",\"id\":\"p1179\",\"attributes\":{\"source\":{\"id\":\"p1126\"},\"x\":{\"type\":\"field\",\"field\":\"x\"},\"y\":{\"type\":\"field\",\"field\":\"y\"},\"text\":{\"type\":\"field\",\"field\":\"label\"},\"text_align\":{\"type\":\"value\",\"value\":\"center\"}}}],\"min_border_left\":53}}]}};\n",
" const render_items = [{\"docid\":\"f7faa278-7b75-4509-85b4-2a969b8e534f\",\"roots\":{\"p1129\":\"a9c71735-c259-4671-94e7-f419f2cf15b1\"},\"root_ids\":[\"p1129\"]}];\n",
" void root.Bokeh.embed.embed_items_notebook(docs_json, render_items);\n",
" }\n",
" if (root.Bokeh !== undefined) {\n",
" embed_document(root);\n",
" } else {\n",
" let attempts = 0;\n",
" const timer = setInterval(function(root) {\n",
" if (root.Bokeh !== undefined) {\n",
" clearInterval(timer);\n",
" embed_document(root);\n",
" } else {\n",
" attempts++;\n",
" if (attempts > 100) {\n",
" clearInterval(timer);\n",
" console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n",
" }\n",
" }\n",
" }, 10, root)\n",
" }\n",
"})(window);"
],
"application/vnd.bokehjs_exec.v0+json": ""
},
"metadata": {
"application/vnd.bokehjs_exec.v0+json": {
"id": "p1129"
}
},
"output_type": "display_data"
}
],
"source": [
"from math import pi\n",
"import numpy as np\n",
"import pandas as pd\n",
"from bokeh.io import output_file, output_notebook, show\n",
"from bokeh.plotting import figure\n",
"from bokeh.transform import cumsum\n",
"from bokeh.models import ColumnDataSource, LabelSet\n",
"\n",
"# Set up output\n",
"output_notebook()\n",
"\n",
"# Prepare data for the pie chart\n",
"pie_data = pd.DataFrame({\n",
" 'category': ['Direct Speech', 'Narrative'],\n",
" 'count': [count_direct, count_narr]\n",
"})\n",
"total_count = pie_data['count'].sum()\n",
"pie_data['angle'] = pie_data['count'] / total_count * 2 * pi\n",
"pie_data['percentage'] = pie_data['count'] / total_count * 100 # Compute percentages\n",
"pie_data['color'] = [\"#ff7f0e\", \"#1f77b4\"] # Colors for the two categories\n",
"\n",
"# Create a label column combining count and percentage\n",
"pie_data['label'] = pie_data['count'].astype(str) + \" (\" + pie_data['percentage'].round(1).astype(str) + \"%)\"\n",
"\n",
"# Calculate cumulative angles to determine the middle angle for each wedge\n",
"pie_data['angle_cumsum'] = pie_data['angle'].cumsum()\n",
"pie_data['angle_mid'] = pie_data['angle_cumsum'] - pie_data['angle'] / 2\n",
"\n",
"# Calculate label positions (adjust the radius factor as needed)\n",
"pie_data['x'] = 0.3 * np.cos(pie_data['angle_mid'])\n",
"pie_data['y'] = 0.3 * np.sin(pie_data['angle_mid'])\n",
"\n",
"source_pie = ColumnDataSource(pie_data)\n",
"\n",
"# Create pie chart figure\n",
"p_pie = figure(\n",
" height=250, width=400,\n",
" title=f\"Parasha #{parasha_num}: {parasha_name_trans} - Speech vs Narrative Ratio\",\n",
" tooltips=\"@category: @count words (@percentage{0.0}%)\",\n",
" x_range=(-0.5, 1.0), toolbar_location=\"right\"\n",
")\n",
"\n",
"# Shift the plot area to align with other plots\n",
"p_pie.min_border_left = 53\n",
"\n",
"p_pie.wedge(\n",
" x=0, y=0, radius=0.4, \n",
" start_angle=cumsum('angle', include_zero=True), end_angle=cumsum('angle'),\n",
" line_color=\"white\", fill_color='color', legend_field='category', source=source_pie\n",
")\n",
"\n",
"# Add labels showing count and percentage inside each wedge\n",
"labels = LabelSet(x='x', y='y', text='label', source=source_pie, text_align='center')\n",
"p_pie.add_layout(labels)\n",
"\n",
"p_pie.legend.location = \"top_right\"\n",
"p_pie.legend.label_text_font_size = \"8pt\"\n",
"p_pie.axis.visible = False # Hide axes\n",
"p_pie.grid.grid_line_color = None\n",
"\n",
"show(p_pie)\n",
"\n",
"# note: save image as 'speech_narrative_ratio.png'"
]
},
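{
"cell_type": "markdown",
"id": "sketch-pie-geometry",
"metadata": {},
"source": [
"The wedge angles and label positions used for the pie chart above follow from simple arithmetic: each category gets a share of the full circle proportional to its count, and its label is placed along the direction of the wedge's middle angle. The sketch below reproduces that arithmetic with the standard library only. The two counts are made-up placeholders (not the values computed for this parasha), and `wedge_geometry` is a hypothetical helper name that is not used elsewhere in this notebook.\n",
"\n",
"```python\n",
"from math import cos, pi, sin\n",
"\n",
"\n",
"def wedge_geometry(counts, label_radius=0.3):\n",
"    # Hypothetical helper: returns (start, end, label_x, label_y) per count\n",
"    total = sum(counts)\n",
"    geometry = []\n",
"    start = 0.0\n",
"    for count in counts:\n",
"        angle = count / total * 2 * pi  # share of the full circle\n",
"        end = start + angle\n",
"        mid = start + angle / 2  # direction of the wedge centre\n",
"        geometry.append((start, end, label_radius * cos(mid), label_radius * sin(mid)))\n",
"        start = end\n",
"    return geometry\n",
"\n",
"\n",
"# Placeholder counts: 300 direct-speech words and 100 narrative words\n",
"geometry = wedge_geometry([300, 100])\n",
"for start, end, x, y in geometry:\n",
"    print(f'{start:.3f} -> {end:.3f}, label at ({x:.3f}, {y:.3f})')\n",
"```\n",
"\n",
"The last wedge always ends at `2 * pi`, which is a quick sanity check that the angles cover the whole circle."
]
},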
{
"cell_type": "markdown",
"id": "901fd0ab-3800-4cc5-b509-5cf5aa660037",
"metadata": {},
"source": [
"## 4.3 - Term frequency (excluding stopwords) \n",
"\n",
"Now we analyze the vocabulary of the parasha. We want to find the most frequent words (lexemes) and exclude very common \"stop words\" that are not content-rich. In Hebrew, such stopwords include conjunctions (like \"ו\" = \"and\"), prepositions (\"ב/כ/ל\" = \"in/like/to\", etc.), the definite article (\"ה\" = \"the\"), pronouns, and a few particles like the negative \"לא\" or the accusative marker \"את\". \n",
"\n",
"Words carrying any of the following part-of-speech tags (the `sp` feature; see [Sp - BHSA](https://etcbc.github.io/bhsa/features/sp/#)) are treated as \"stop words\" for this analysis and are filtered out:\n",
" - `prep` (preposition),\n",
" - `conj` (conjunction),\n",
" - `art` (article),\n",
" - `prps`/`prde`/`prin` (pronouns: personal, demonstrative, interrogative),\n",
" - `nega` (negative),\n",
" - `inrg` (interrogative particle),\n",
" - `intj` (interjection).\n",
"\n",
"Next, the remaining lexemes are counted. BHSA provides a `lex` feature (the lexical root form) and `gloss` (an English gloss/translation) for each word. We count by lexeme so that different inflected forms are counted together. For display, we show the English gloss of each top lexeme for readability, along with the Hebrew lexeme."
]
},
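{
"cell_type": "markdown",
"id": "sketch-lexeme-vs-surface",
"metadata": {},
"source": [
"To see why counting by lexeme matters, consider a minimal sketch on toy data. The tokens below are invented English stand-ins (not BHSA data), each given as a surface form, a lexeme and a part-of-speech tag; the names `tokens`, `stop_pos`, `surface_counts` and `lexeme_counts` are illustrative only.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"# Toy tokens: (surface form, lexeme, part of speech). Illustrative only.\n",
"tokens = [\n",
"    ('and', 'and', 'conj'),\n",
"    ('said', 'say', 'verb'),\n",
"    ('the', 'the', 'art'),\n",
"    ('king', 'king', 'subs'),\n",
"    ('and', 'and', 'conj'),\n",
"    ('says', 'say', 'verb'),\n",
"    ('to', 'to', 'prep'),\n",
"    ('kings', 'king', 'subs'),\n",
"    ('saying', 'say', 'verb'),\n",
"]\n",
"stop_pos = {'prep', 'conj', 'art'}  # function-word tags to skip\n",
"\n",
"# Count content words by surface form and by lexeme\n",
"surface_counts = Counter(form for form, _, pos in tokens if pos not in stop_pos)\n",
"lexeme_counts = Counter(lex for _, lex, pos in tokens if pos not in stop_pos)\n",
"\n",
"print(surface_counts.most_common())\n",
"print(lexeme_counts.most_common())\n",
"```\n",
"\n",
"Counted by surface form, every content word in this toy sample occurs only once; counted by lexeme, the inflected forms are pooled, so the recurring vocabulary becomes visible. The function words never enter either counter."
]
},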
{
"cell_type": "code",
"execution_count": 9,
"id": "f44c44bb-0be7-46f7-91ba-16c749fe3fc0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Top 10 frequent content lexemes:\n",
" יהוה֜ (YHWH): 73\n",
" פרעה֜ (pharaoh): 51\n",
" אמר (say): 48\n",
" משׁה֜ (Moses): 47\n",
" מצרים֜ (Egypt): 45\n",
" ארץ֜ (earth): 41\n",
" כל֜ (whole): 30\n",
" בן֜ (son): 29\n",
" היה (be): 29\n",
" עם֜ (people): 28\n"
]
}
],
"source": [
"# Define parts of speech to treat as stop words (function words)\n",
"stop_sp = {\"prep\", \"conj\", \"art\", \"prps\", \"prde\", \"prin\", \"nega\", \"inrg\", \"intj\"}\n",
"\n",
"# Build a frequency counter of lexemes (excluding stopword POS)\n",
"lex_counts = Counter()\n",
"for w in words_in_parasha:\n",
" if F.sp.v(w) in stop_sp:\n",
" continue\n",
" lex_node = L.u(w, 'lex')[0] # get the lexeme object for this word\n",
" lex_counts[lex_node] += 1\n",
"\n",
"# Get the 10 most frequent lexemes (excluding stops)\n",
"top_lexemes = [lex for lex, cnt in lex_counts.most_common(10)]\n",
"print(\"Top 10 frequent content lexemes:\")\n",
"for lex in top_lexemes:\n",
" gloss = F.gloss.v(lex) # English gloss of the lexeme\n",
" heb = F.lex_utf8.v(lex) # Hebrew lexeme in Hebrew script\n",
" freq = lex_counts[lex]\n",
" print(f\" {heb} ({gloss}): {freq}\")\n"
]
},
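{
"cell_type": "markdown",
"id": "sketch-most-common-ties",
"metadata": {},
"source": [
"Some lexemes in the list above share the same count (two of them occur 29 times). `Counter.most_common` lists elements with equal counts in the order in which they were first encountered, so the relative position of tied lexemes reflects where they first appear in the text rather than a difference in rank. A minimal sketch with invented data:\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"# Invented sequence: 'son' and 'be' both occur twice, 'son' is seen first\n",
"counts = Counter()\n",
"for lexeme in ['son', 'be', 'be', 'son', 'people']:\n",
"    counts[lexeme] += 1\n",
"\n",
"top_two = counts.most_common(2)\n",
"print(top_two)\n",
"```\n",
"\n",
"When reading the top-10 list, treat tied entries as sharing a rank."
]
},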
{
"cell_type": "markdown",
"id": "d620df0b-a370-45f5-9522-4d9ab5fb4c27",
"metadata": {},
"source": [
"Now let's visualize these frequencies. We first collect the 15 most frequent lexemes, with their Hebrew form and English gloss, in a table:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "d83cf672-ca5c-4f11-be1c-53c47067db3a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Lexeme (Hebrew)</th><th>Gloss (English)</th><th>Frequency</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>יהוה֜</td><td>YHWH</td><td>73</td></tr>\n",
"    <tr><th>1</th><td>פרעה֜</td><td>pharaoh</td><td>51</td></tr>\n",
"    <tr><th>2</th><td>אמר</td><td>say</td><td>48</td></tr>\n",
"    <tr><th>3</th><td>משׁה֜</td><td>Moses</td><td>47</td></tr>\n",
"    <tr><th>4</th><td>מצרים֜</td><td>Egypt</td><td>45</td></tr>\n",
"    <tr><th>5</th><td>ארץ֜</td><td>earth</td><td>41</td></tr>\n",
"    <tr><th>6</th><td>כל֜</td><td>whole</td><td>30</td></tr>\n",
"    <tr><th>7</th><td>בן֜</td><td>son</td><td>29</td></tr>\n",
"    <tr><th>8</th><td>היה</td><td>be</td><td>29</td></tr>\n",
"    <tr><th>9</th><td>עם֜</td><td>people</td><td>28</td></tr>\n",
"    <tr><th>10</th><td>שׁלח</td><td>send</td><td>26</td></tr>\n",
"    <tr><th>11</th><td>אהרן֜</td><td>Aaron</td><td>26</td></tr>\n",
"    <tr><th>12</th><td>דבר</td><td>speak</td><td>22</td></tr>\n",
"    <tr><th>13</th><td>ישׂראל֜</td><td>Israel</td><td>19</td></tr>\n",
"    <tr><th>14</th><td>עבד֜</td><td>servant</td><td>16</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Lexeme (Hebrew) Gloss (English) Frequency\n",
"0 יהוה֜ YHWH 73\n",
"1 פרעה֜ pharaoh 51\n",
"2 אמר say 48\n",
"3 משׁה֜ Moses 47\n",
"4 מצרים֜ Egypt 45\n",
"5 ארץ֜ earth 41\n",
"6 כל֜ whole 30\n",
"7 בן֜ son 29\n",
"8 היה be 29\n",
"9 עם֜ people 28\n",
"10 שׁלח send 26\n",
"11 אהרן֜ Aaron 26\n",
"12 דבר speak 22\n",
"13 ישׂראל֜ Israel 19\n",
"14 עבד֜ servant 16"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from bokeh.io import output_notebook\n",
"\n",
"# Prepare the Bokeh output to display in the notebook\n",
"output_notebook()\n",
"\n",
"# Prepare data for bar chart of top terms\n",
"top_lex_nodes = [lex for lex, cnt in lex_counts.most_common(15)]\n",
"top_lex_heb = [F.lex_utf8.v(lex) for lex in top_lex_nodes] # Hebrew form\n",
"top_lex_gloss = [F.gloss.v(lex) or \"\" for lex in top_lex_nodes] # English gloss (if available)\n",
"top_lex_freq = [lex_counts[lex] for lex in top_lex_nodes]\n",
"\n",
"# Collect the top terms in a DataFrame for display\n",
"freq_df = pd.DataFrame({\n",
" \"Lexeme (Hebrew)\": top_lex_heb,\n",
" \"Gloss (English)\": top_lex_gloss,\n",
" \"Frequency\": top_lex_freq\n",
"})\n",
"# Display the table of top words\n",
"display(freq_df)"
]
},
{
"cell_type": "markdown",
"id": "b02faa1c-dea6-40f1-9e3c-be35e67636c4",
"metadata": {},
"source": [
"The table above lists the top 15 content lexemes with their glosses and counts. Next, we plot the same data as a Bokeh bar chart, labelling each bar with the lexeme's English gloss and Hebrew form:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "214d38d0-3970-4750-a705-73d52097e51c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
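"\n",
"  <div id=\"d9bbf82a-60de-4ec5-8aa9-c2ea182bd24c\" data-root-id=\"p1192\" style=\"display: contents;\"></div>\n"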
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"(function(root) {\n",
" function embed_document(root) {\n",
" const docs_json = {\"1bd8d2ee-cd0d-49a8-8887-658f84140cc7\":{\"version\":\"3.6.0\",\"title\":\"Bokeh Application\",\"roots\":[{\"type\":\"object\",\"name\":\"Figure\",\"id\":\"p1192\",\"attributes\":{\"width\":500,\"height\":300,\"x_range\":{\"type\":\"object\",\"name\":\"FactorRange\",\"id\":\"p1202\",\"attributes\":{\"factors\":[\"YHWH (\\u05d9\\u05d4\\u05d5\\u05d4\\u059c)\",\"pharaoh (\\u05e4\\u05e8\\u05e2\\u05d4\\u059c)\",\"say (\\u05d0\\u05de\\u05e8)\",\"Moses (\\u05de\\u05e9\\u05c1\\u05d4\\u059c)\",\"Egypt (\\u05de\\u05e6\\u05e8\\u05d9\\u05dd\\u059c)\",\"earth (\\u05d0\\u05e8\\u05e5\\u059c)\",\"whole (\\u05db\\u05dc\\u059c)\",\"son (\\u05d1\\u05df\\u059c)\",\"be (\\u05d4\\u05d9\\u05d4)\",\"people (\\u05e2\\u05dd\\u059c)\",\"send (\\u05e9\\u05c1\\u05dc\\u05d7)\",\"Aaron (\\u05d0\\u05d4\\u05e8\\u05df\\u059c)\",\"speak (\\u05d3\\u05d1\\u05e8)\",\"Israel (\\u05d9\\u05e9\\u05c2\\u05e8\\u05d0\\u05dc\\u059c)\",\"servant (\\u05e2\\u05d1\\u05d3\\u059c)\"]}},\"y_range\":{\"type\":\"object\",\"name\":\"Range1d\",\"id\":\"p1191\",\"attributes\":{\"end\":80.30000000000001,\"bounds\":[0,80.30000000000001]}},\"x_scale\":{\"type\":\"object\",\"name\":\"CategoricalScale\",\"id\":\"p1203\"},\"y_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p1204\"},\"title\":{\"type\":\"object\",\"name\":\"Title\",\"id\":\"p1195\",\"attributes\":{\"text\":\"Parasha #14: Va\\u2019era - Top terms (excluding stopwords)\"}},\"renderers\":[{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p1237\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p1228\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p1229\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p1230\"},\"data\":{\"type\":\"map\",\"entries\":[[\"x\",[\"YHWH (\\u05d9\\u05d4\\u05d5\\u05d4\\u059c)\",\"pharaoh (\\u05e4\\u05e8\\u05e2\\u05d4\\u059c)\",\"say 
(\\u05d0\\u05de\\u05e8)\",\"Moses (\\u05de\\u05e9\\u05c1\\u05d4\\u059c)\",\"Egypt (\\u05de\\u05e6\\u05e8\\u05d9\\u05dd\\u059c)\",\"earth (\\u05d0\\u05e8\\u05e5\\u059c)\",\"whole (\\u05db\\u05dc\\u059c)\",\"son (\\u05d1\\u05df\\u059c)\",\"be (\\u05d4\\u05d9\\u05d4)\",\"people (\\u05e2\\u05dd\\u059c)\",\"send (\\u05e9\\u05c1\\u05dc\\u05d7)\",\"Aaron (\\u05d0\\u05d4\\u05e8\\u05df\\u059c)\",\"speak (\\u05d3\\u05d1\\u05e8)\",\"Israel (\\u05d9\\u05e9\\u05c2\\u05e8\\u05d0\\u05dc\\u059c)\",\"servant (\\u05e2\\u05d1\\u05d3\\u059c)\"]],[\"top\",[73,51,48,47,45,41,30,29,29,28,26,26,22,19,16]]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p1238\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p1239\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1234\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"x\"},\"width\":{\"type\":\"value\",\"value\":0.8},\"top\":{\"type\":\"field\",\"field\":\"top\"},\"line_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"fill_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"hatch_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"}}},\"nonselection_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1235\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"x\"},\"width\":{\"type\":\"value\",\"value\":0.8},\"top\":{\"type\":\"field\",\"field\":\"top\"},\"line_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.1},\"fill_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.1},\"hatch_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.1}}},\"muted_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1236\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"x\"},\"width\":{\"type\":\"value\",\"value\":0.8},\"top\":{\"type\":\"field\",\"field\":\"top\"},\"line_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"line_alpha\"
:{\"type\":\"value\",\"value\":0.2},\"fill_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.2},\"hatch_color\":{\"type\":\"value\",\"value\":\"#2ca02c\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.2}}}}}],\"toolbar\":{\"type\":\"object\",\"name\":\"Toolbar\",\"id\":\"p1201\",\"attributes\":{\"tools\":[{\"type\":\"object\",\"name\":\"PanTool\",\"id\":\"p1215\"},{\"type\":\"object\",\"name\":\"WheelZoomTool\",\"id\":\"p1216\",\"attributes\":{\"renderers\":\"auto\"}},{\"type\":\"object\",\"name\":\"BoxZoomTool\",\"id\":\"p1217\",\"attributes\":{\"overlay\":{\"type\":\"object\",\"name\":\"BoxAnnotation\",\"id\":\"p1218\",\"attributes\":{\"syncable\":false,\"line_color\":\"black\",\"line_alpha\":1.0,\"line_width\":2,\"line_dash\":[4,4],\"fill_color\":\"lightgrey\",\"fill_alpha\":0.5,\"level\":\"overlay\",\"visible\":false,\"left\":{\"type\":\"number\",\"value\":\"nan\"},\"right\":{\"type\":\"number\",\"value\":\"nan\"},\"top\":{\"type\":\"number\",\"value\":\"nan\"},\"bottom\":{\"type\":\"number\",\"value\":\"nan\"},\"left_units\":\"canvas\",\"right_units\":\"canvas\",\"top_units\":\"canvas\",\"bottom_units\":\"canvas\",\"handles\":{\"type\":\"object\",\"name\":\"BoxInteractionHandles\",\"id\":\"p1224\",\"attributes\":{\"all\":{\"type\":\"object\",\"name\":\"AreaVisuals\",\"id\":\"p1223\",\"attributes\":{\"fill_color\":\"white\",\"hover_fill_color\":\"lightgray\"}}}}}}}},{\"type\":\"object\",\"name\":\"SaveTool\",\"id\":\"p1225\"},{\"type\":\"object\",\"name\":\"ResetTool\",\"id\":\"p1226\"},{\"type\":\"object\",\"name\":\"HelpTool\",\"id\":\"p1227\"}],\"active_drag\":null,\"active_scroll\":null}},\"left\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p1210\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p1211\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p1212\"},\"axis_label\":\"Frequency\",\"major_la
bel_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1213\"}}}],\"below\":[{\"type\":\"object\",\"name\":\"CategoricalAxis\",\"id\":\"p1205\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"CategoricalTicker\",\"id\":\"p1206\"},\"formatter\":{\"type\":\"object\",\"name\":\"CategoricalTickFormatter\",\"id\":\"p1207\"},\"axis_label\":\"Lexeme\",\"major_label_orientation\":0.9,\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1208\"},\"major_label_text_font_size\":\"8pt\"}}],\"center\":[{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1209\",\"attributes\":{\"axis\":{\"id\":\"p1205\"}}},{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1214\",\"attributes\":{\"dimension\":1,\"axis\":{\"id\":\"p1210\"}}},{\"type\":\"object\",\"name\":\"LabelSet\",\"id\":\"p1240\",\"attributes\":{\"level\":\"glyph\",\"source\":{\"id\":\"p1228\"},\"x\":{\"type\":\"field\",\"field\":\"x\"},\"y\":{\"type\":\"field\",\"field\":\"top\"},\"text\":{\"type\":\"field\",\"field\":\"top\"},\"x_offset\":{\"type\":\"value\",\"value\":-10},\"y_offset\":{\"type\":\"value\",\"value\":1}}}]}}]}};\n",
" const render_items = [{\"docid\":\"1bd8d2ee-cd0d-49a8-8887-658f84140cc7\",\"roots\":{\"p1192\":\"d9bbf82a-60de-4ec5-8aa9-c2ea182bd24c\"},\"root_ids\":[\"p1192\"]}];\n",
" void root.Bokeh.embed.embed_items_notebook(docs_json, render_items);\n",
" }\n",
" if (root.Bokeh !== undefined) {\n",
" embed_document(root);\n",
" } else {\n",
" let attempts = 0;\n",
" const timer = setInterval(function(root) {\n",
" if (root.Bokeh !== undefined) {\n",
" clearInterval(timer);\n",
" embed_document(root);\n",
" } else {\n",
" attempts++;\n",
" if (attempts > 100) {\n",
" clearInterval(timer);\n",
" console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n",
" }\n",
" }\n",
" }, 10, root)\n",
" }\n",
"})(window);"
],
"application/vnd.bokehjs_exec.v0+json": ""
},
"metadata": {
"application/vnd.bokehjs_exec.v0+json": {
"id": "p1192"
}
},
"output_type": "display_data"
}
],
"source": [
"\n",
"from bokeh.models import Range1d, HoverTool, ColumnDataSource, LabelSet\n",
"from bokeh.plotting import figure, output_file, show\n",
"from bokeh.io import export_svgs\n",
"from bokeh.embed import components\n",
"from bokeh.models import Div\n",
"\n",
"# Create the list of terms, combining gloss and Hebrew as needed\n",
"terms = [f\"{gloss} ({heb})\" if gloss else heb for gloss, heb in zip(top_lex_gloss, top_lex_heb)]\n",
"\n",
"# Compute the maximum frequency and add padding (e.g., 10%)\n",
"max_freq = max(top_lex_freq)\n",
"y_end = max_freq * 1.1\n",
"\n",
"# Create the figure with a fixed y_range\n",
"p_terms = figure(x_range=terms, height=300, width=500,\n",
" title=f\"Parasha #{parasha_num}: {parasha_name_trans} - Top terms (excluding stopwords)\",\n",
" x_axis_label=\"Lexeme\", y_axis_label=\"Frequency\",\n",
" toolbar_location=\"right\",\n",
" y_range=Range1d(start=0, end=y_end))\n",
"\n",
"# Create a ColumnDataSource to hold the data\n",
"source = ColumnDataSource(data=dict(x=terms, top=top_lex_freq))\n",
"\n",
"# Draw the bars\n",
"p_terms.vbar(x='x', top='top', width=0.8, color=\"#2ca02c\", source=source)\n",
"\n",
"# Lock the y_range to prevent zooming or panning\n",
"p_terms.y_range.bounds = (0, y_end)\n",
"\n",
"# If a toolbar exists, ensure no active drag or scroll tools are set\n",
"if p_terms.toolbar:\n",
" p_terms.toolbar.active_drag = None\n",
" p_terms.toolbar.active_scroll = None\n",
"\n",
"# Rotate x-axis labels and adjust font size to prevent overlap\n",
"p_terms.xaxis.major_label_orientation = 0.9\n",
"p_terms.xaxis.major_label_text_font_size = \"8pt\"\n",
"\n",
"# Add labels above each bar\n",
"labels = LabelSet(x='x', y='top', text='top', level='glyph', x_offset=-10, y_offset=1, source=source)\n",
"p_terms.add_layout(labels)\n",
"\n",
"# Display the plot\n",
"show(p_terms)\n",
"\n",
"# note: save image as 'top_terms.png'"
]
},
{
"cell_type": "markdown",
"id": "6a509809-2d8b-41b4-a462-d6e9a83277cf",
"metadata": {},
"source": [
"## 4.4 - Clause types and phrase functions \n",
"\n",
"Finally, we consider some syntactic patterns and markers in the parasha. The BHSA dataset provides detailed syntactic analysis:\n",
"- *Clause types:* Each clause is classified by its structure and leading element. For example, a clause beginning with a *wayyiqtol* verb is labeled as `Wayyiqtol-X clause (WayX)` if it has an explicit subject, or `Wayyiqtol-0 clause (Way0)` if no subject is explicitly present ([Typ - BHSA](https://etcbc.github.io/bhsa/features/typ/#)). There are many such codes (e.g., clauses starting with *weqatal*, infinitives, participles, etc., as listed in BHSA documentation ([Typ - BHSA](https://etcbc.github.io/bhsa/features/typ/#)) ([Typ - BHSA](https://etcbc.github.io/bhsa/features/typ/#))). These clause type patterns often correlate with narrative structure (Wayyiqtol chains for narrative sequence, X-qatal for past background, etc.).\n",
"- *Phrase functions:* Each phrase in the syntax tree has a grammatical function, such as Subject (Subj), Object (Objc), Predicate (Pred), Adjunct (Adju), etc. ([Function - BHSA](https://etcbc.github.io/bhsa/features/function/#)) ([Function - BHSA](https://etcbc.github.io/bhsa/features/function/#)). This tells us the role a phrase plays in the clause.\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "26ce7f52-18ce-45b1-a5cd-7c2465df25ea",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most common phrase functions:\n",
" Pred: 343\n",
" Conj: 286\n",
" Subj: 201\n",
" Cmpl: 185\n",
" Objc: 117\n",
" PreC: 85\n",
" Adju: 38\n",
" Nega: 35\n",
" Loca: 31\n",
" Rela: 27\n"
]
}
],
"source": [
"# Count phrase functions in the parasha\n",
"function_counts = Counter(F.function.v(ph) for ph in phrases_in_parasha if F.function.v(ph))\n",
"# Exclude Unknown or None\n",
"if None in function_counts: \n",
" del function_counts[None]\n",
"if \"Unkn\" in function_counts: \n",
" del function_counts[\"Unkn\"]\n",
"\n",
"# Get the most frequent functions\n",
"common_funcs = function_counts.most_common(10)\n",
"print(\"Most common phrase functions:\")\n",
"for func, count in common_funcs:\n",
" print(f\" {func}: {count}\")"
]
},
{
"cell_type": "markdown",
"id": "87984c34-aca0-4057-88d5-5493770bce43",
"metadata": {},
"source": [
"Let's visualize this in a bar chart:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "19a33478-6fb7-473b-bf62-914a4af6725a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"(function(root) {\n",
" function embed_document(root) {\n",
" const docs_json = {\"e13278fe-2151-4265-8c5c-bb7c43170c49\":{\"version\":\"3.6.0\",\"title\":\"Bokeh Application\",\"roots\":[{\"type\":\"object\",\"name\":\"Figure\",\"id\":\"p1245\",\"attributes\":{\"width\":400,\"height\":300,\"x_range\":{\"type\":\"object\",\"name\":\"FactorRange\",\"id\":\"p1255\",\"attributes\":{\"factors\":[\"Pred\",\"Conj\",\"Subj\",\"Cmpl\",\"Objc\",\"PreC\",\"Adju\",\"Nega\",\"Loca\",\"Rela\"]}},\"y_range\":{\"type\":\"object\",\"name\":\"Range1d\",\"id\":\"p1244\",\"attributes\":{\"end\":377.3,\"bounds\":[0,377.3]}},\"x_scale\":{\"type\":\"object\",\"name\":\"CategoricalScale\",\"id\":\"p1256\"},\"y_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p1257\"},\"title\":{\"type\":\"object\",\"name\":\"Title\",\"id\":\"p1248\",\"attributes\":{\"text\":\"Parasha 14: Va\\u2019era - Phrase function distribution\"}},\"renderers\":[{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p1287\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p1281\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p1282\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p1283\"},\"data\":{\"type\":\"map\",\"entries\":[[\"x\",[\"Pred\",\"Conj\",\"Subj\",\"Cmpl\",\"Objc\",\"PreC\",\"Adju\",\"Nega\",\"Loca\",\"Rela\"]],[\"top\",[343,286,201,185,117,85,38,35,31,27]]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p1288\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p1289\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1284\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"x\"},\"width\":{\"type\":\"value\",\"value\":0.6},\"top\":{\"type\":\"field\",\"field\":\"top\"},\"line_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"fill_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"hatch_color\":{\"type\":\"value\",\"valu
e\":\"#8c564b\"}}},\"nonselection_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1285\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"x\"},\"width\":{\"type\":\"value\",\"value\":0.6},\"top\":{\"type\":\"field\",\"field\":\"top\"},\"line_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.1},\"fill_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.1},\"hatch_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.1}}},\"muted_glyph\":{\"type\":\"object\",\"name\":\"VBar\",\"id\":\"p1286\",\"attributes\":{\"x\":{\"type\":\"field\",\"field\":\"x\"},\"width\":{\"type\":\"value\",\"value\":0.6},\"top\":{\"type\":\"field\",\"field\":\"top\"},\"line_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.2},\"fill_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.2},\"hatch_color\":{\"type\":\"value\",\"value\":\"#8c564b\"},\"hatch_alpha\":{\"type\":\"value\",\"value\":0.2}}}}}],\"toolbar\":{\"type\":\"object\",\"name\":\"Toolbar\",\"id\":\"p1254\",\"attributes\":{\"tools\":[{\"type\":\"object\",\"name\":\"PanTool\",\"id\":\"p1268\"},{\"type\":\"object\",\"name\":\"WheelZoomTool\",\"id\":\"p1269\",\"attributes\":{\"renderers\":\"auto\"}},{\"type\":\"object\",\"name\":\"BoxZoomTool\",\"id\":\"p1270\",\"attributes\":{\"overlay\":{\"type\":\"object\",\"name\":\"BoxAnnotation\",\"id\":\"p1271\",\"attributes\":{\"syncable\":false,\"line_color\":\"black\",\"line_alpha\":1.0,\"line_width\":2,\"line_dash\":[4,4],\"fill_color\":\"lightgrey\",\"fill_alpha\":0.5,\"level\":\"overlay\",\"visible\":false,\"left\":{\"type\":\"number\",\"value\":\"nan\"},\"right\":{\"type\":\"number\",\"value\":\"nan\"},\"top\":{\"type\":\"number\",\"value\":\"nan\"},\"bottom\":{\"type\":\"number\",\"value\":\"nan\"},\"left_units\":\"canvas\",\"right_units\":\"can
vas\",\"top_units\":\"canvas\",\"bottom_units\":\"canvas\",\"handles\":{\"type\":\"object\",\"name\":\"BoxInteractionHandles\",\"id\":\"p1277\",\"attributes\":{\"all\":{\"type\":\"object\",\"name\":\"AreaVisuals\",\"id\":\"p1276\",\"attributes\":{\"fill_color\":\"white\",\"hover_fill_color\":\"lightgray\"}}}}}}}},{\"type\":\"object\",\"name\":\"SaveTool\",\"id\":\"p1278\"},{\"type\":\"object\",\"name\":\"ResetTool\",\"id\":\"p1279\"},{\"type\":\"object\",\"name\":\"HelpTool\",\"id\":\"p1280\"}],\"active_drag\":null,\"active_scroll\":null}},\"left\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p1263\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p1264\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p1265\"},\"axis_label\":\"Count\",\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1266\"}}}],\"below\":[{\"type\":\"object\",\"name\":\"CategoricalAxis\",\"id\":\"p1258\",\"attributes\":{\"ticker\":{\"type\":\"object\",\"name\":\"CategoricalTicker\",\"id\":\"p1259\"},\"formatter\":{\"type\":\"object\",\"name\":\"CategoricalTickFormatter\",\"id\":\"p1260\"},\"axis_label\":\"Phrase 
Function\",\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p1261\"}}}],\"center\":[{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1262\",\"attributes\":{\"axis\":{\"id\":\"p1258\"}}},{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p1267\",\"attributes\":{\"dimension\":1,\"axis\":{\"id\":\"p1263\"}}},{\"type\":\"object\",\"name\":\"LabelSet\",\"id\":\"p1293\",\"attributes\":{\"level\":\"glyph\",\"source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p1290\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p1291\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p1292\"},\"data\":{\"type\":\"map\",\"entries\":[[\"func\",[\"Pred\",\"Conj\",\"Subj\",\"Cmpl\",\"Objc\",\"PreC\",\"Adju\",\"Nega\",\"Loca\",\"Rela\"]],[\"count\",[343,286,201,185,117,85,38,35,31,27]],[\"pos\",[343,286,201,185,117,85,38,35,31,27]]]}}},\"x\":{\"type\":\"field\",\"field\":\"func\"},\"y\":{\"type\":\"field\",\"field\":\"pos\"},\"text\":{\"type\":\"field\",\"field\":\"count\"},\"x_offset\":{\"type\":\"value\",\"value\":-13},\"y_offset\":{\"type\":\"value\",\"value\":3}}}]}}]}};\n",
" const render_items = [{\"docid\":\"e13278fe-2151-4265-8c5c-bb7c43170c49\",\"roots\":{\"p1245\":\"cf1b1ac5-d37a-40dc-a984-9e4a31d6ba64\"},\"root_ids\":[\"p1245\"]}];\n",
" void root.Bokeh.embed.embed_items_notebook(docs_json, render_items);\n",
" }\n",
" if (root.Bokeh !== undefined) {\n",
" embed_document(root);\n",
" } else {\n",
" let attempts = 0;\n",
" const timer = setInterval(function(root) {\n",
" if (root.Bokeh !== undefined) {\n",
" clearInterval(timer);\n",
" embed_document(root);\n",
" } else {\n",
" attempts++;\n",
" if (attempts > 100) {\n",
" clearInterval(timer);\n",
" console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n",
" }\n",
" }\n",
" }, 10, root)\n",
" }\n",
"})(window);"
],
"application/vnd.bokehjs_exec.v0+json": ""
},
"metadata": {
"application/vnd.bokehjs_exec.v0+json": {
"id": "p1245"
}
},
"output_type": "display_data"
}
],
"source": [
"from bokeh.layouts import column\n",
"from bokeh.models import Range1d, LabelSet, ColumnDataSource\n",
"from bokeh.plotting import figure, output_file, show\n",
"\n",
"# Prepare data for phrase function chart\n",
"func_labels = [func for func, cnt in common_funcs]\n",
"func_counts = [cnt for func, cnt in common_funcs]\n",
"\n",
"# Compute maximum count and add a little padding (e.g., 10%)\n",
"max_count = max(func_counts)\n",
"y_end = max_count * 1.1\n",
"\n",
"# Create the figure with a fixed y_range\n",
"p_funcs = figure(x_range=func_labels, height=300, width=400,\n",
" title=f\"Parasha {parasha_num}: {parasha_name_trans} - Phrase function distribution\",\n",
" x_axis_label=\"Phrase Function\", y_axis_label=\"Count\",\n",
" toolbar_location='right',\n",
" y_range=Range1d(start=0, end=y_end))\n",
"\n",
"p_funcs.vbar(x=func_labels, top=func_counts, width=0.6, color=\"#8c564b\")\n",
"\n",
"# Lock the y_range to prevent any zooming/panning adjustments\n",
"p_funcs.y_range.bounds = (0, y_end)\n",
"\n",
"# If a toolbar exists, deactivate any active drag or scroll tools\n",
"if p_funcs.toolbar:\n",
" p_funcs.toolbar.active_drag = None\n",
" p_funcs.toolbar.active_scroll = None\n",
"\n",
"# Add labels above bars (optional)\n",
"func_source = ColumnDataSource(data={'func': func_labels, 'count': func_counts, 'pos': func_counts})\n",
"labels = LabelSet(x='func', y='pos', text='count', level='glyph', x_offset=-13, y_offset=3, source=func_source)\n",
"p_funcs.add_layout(labels)\n",
"\n",
"show(p_funcs)\n",
"\n",
"# note: save image as 'phrase_function_distribution.png'"
]
},
{
"cell_type": "markdown",
"id": "f39a9e96-1486-4bf8-8fde-4c4ae2f2cf77",
"metadata": {},
"source": [
"# 5 - References \n",
"\n",
"For more details on the BHSA dataset and features, see the [ETCBC BHSA documentation](https://etcbc.github.io/bhsa/).\n",
"\n",
"The BHSaddons repository ([GitHub - tonyjurg/BHSaddons](https://tonyjurg.github.io/BHSaddons/)) provides a list of the parasha specific features we use.\n",
"\n",
"Additionally, the BHSA feature documentation covers morphological and syntactic features, e.g.:\n",
" - the `vt` codes for verb tense ([Vt - BHSA](https://etcbc.github.io/bhsa/features/vt/)),\n",
" - the `domain` clause text-type codes ([Domain - BHSA](https://etcbc.github.io/bhsa/features/domain/)),\n",
" - the `function` codes for phrase roles ([Function - BHSA](https://etcbc.github.io/bhsa/features/function/))."
]
},
{
"cell_type": "markdown",
"id": "0475f4a9-5e70-40f5-81e7-c04ed8ccdca3",
"metadata": {},
"source": [
"# 6 - Notebook version details\n",
"##### [Back to ToC](#TOC)\n",
"\n",
"\n",
"
\n",
" \n",
" | Author | \n",
" Tony Jurg | \n",
"
\n",
" \n",
" | Version | \n",
" 1.0 | \n",
"
\n",
" \n",
" | Date | \n",
" 26 March 2025 | \n",
"
\n",
"
\n",
"
"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}