{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Social Network Analysis of Leviticus 17-26\n",
    "\n",
    "This notebook combines the participant references and semantic roles computed in other phases of this research project. The two datatypes are combined to create a social network model of the data and to explore this model by social network analytical tools. The first SNA-measures are given in this notobook, while more detailed studies of participant roles are reserved for other notebooks in this repo.\n",
    "\n",
    "**Content**\n",
    "1. Import of data\n",
    "2. Cross-tabulating participant and semantic roles\n",
    "3. Creation of network model\n",
    "4. Validation of the model\n",
    "5. First social network analyses"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning: uncompiled fa2util module.  Compile with cython for a 10-100x speed boost.\n"
     ]
    }
   ],
   "source": [
    "#Dataset path\n",
    "PATH = 'datasets/'\n",
    "\n",
    "import csv, collections, html\n",
    "from operator import itemgetter\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import scipy\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from adjustText import adjust_text\n",
    "import networkx as nx\n",
    "import forceatlas2\n",
    "import random"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  1. Import data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<b title=\"local commit\">TF-app:</b> <span title=\"#113c0687cfce3077734dac1844d244d20f4ace6f offline under ~/text-fabric-data\">C:\\Users\\Ejer/text-fabric-data/annotation/app-bhsa/code</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">data:</b> <span title=\"rv1.6=#bac4a9f5a2bbdede96ba6caea45e762fe88f88c5 offline under ~/text-fabric-data\">C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">data:</b> <span title=\"r1.2=#1ac68e976ee4a7f23eb6bb4c6f401a033d0ec169 offline under ~/text-fabric-data\">C:\\Users\\Ejer/text-fabric-data/etcbc/phono/tf/c</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">data:</b> <span title=\"r1.2=#395dfe2cb69c261862fab9f0289e594a52121d5c offline under ~/text-fabric-data\">C:\\Users\\Ejer/text-fabric-data/etcbc/parallels/tf/c</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b title=\"local release\">data:</b> <span title=\"rv.1.3.1=#6efbd7adb3ccfd3d4dcd780f76c5f86672395eb8 offline under ~/text-fabric-data\">C:\\Users\\Ejer/text-fabric-data/etcbc/heads/tf/c</span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b>Text-Fabric:</b> <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/cheatsheet.html\" title=\"text-fabric-api\">Text-Fabric API 8.3.3</a>, <a target=\"_blank\" href=\"https://github.com/annotation/app-bhsa\" title=\"bhsa TF-app\">app-bhsa</a>, <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/about/searchusage.html\" title=\"Search Templates Introduction and Reference\">Search Reference</a><br><b>Data:</b> <a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/0_home\" title=\"provenance of BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis\">BHSA</a>, <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/writing/hebrew.html\" title=\"How TF features represent text\">Character table</a>, <a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/0_home\" title=\"BHSA feature documentation\">Feature docs</a><br><b>Features:</b><br><details><summary><b>etcbc/heads/tf</b></summary><a target=\"_blank\" href=\"https://github.com/etcbc/heads/tree/master/tf\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/heads/tf/c/sem_set.tf\">sem_set</a><br><b><i><a target=\"_blank\" href=\"https://github.com/etcbc/heads/tree/master/tf\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/heads/tf/c/head.tf\">head</a></i></b><br><b><i><a target=\"_blank\" href=\"https://github.com/etcbc/heads/tree/master/tf\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/heads/tf/c/nhead.tf\">nhead</a></i></b><br><b><i><a target=\"_blank\" href=\"https://github.com/etcbc/heads/tree/master/tf\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/heads/tf/c/obj_prep.tf\">obj_prep</a></i></b><br></details><details><summary><b>BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis</b></summary><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/book\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/book.tf\">book</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/book@ll\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/book@am.tf\">book@ll</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/chapter\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/chapter.tf\">chapter</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/code\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/code.tf\">code</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/det\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/det.tf\">det</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/domain\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/domain.tf\">domain</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/freq_lex\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/freq_lex.tf\">freq_lex</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/function\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/function.tf\">function</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/g_cons\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/g_cons.tf\">g_cons</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/g_cons_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/g_cons_utf8.tf\">g_cons_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/g_lex\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/g_lex.tf\">g_lex</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/g_lex_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/g_lex_utf8.tf\">g_lex_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/g_word\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/g_word.tf\">g_word</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/g_word_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/g_word_utf8.tf\">g_word_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/gloss\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/gloss.tf\">gloss</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/gn\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/gn.tf\">gn</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/label\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/label.tf\">label</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/language\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/language.tf\">language</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/lex\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/lex.tf\">lex</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/lex_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/lex_utf8.tf\">lex_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/ls\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/ls.tf\">ls</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/nametype\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/nametype.tf\">nametype</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/nme\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/nme.tf\">nme</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/nu\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/nu.tf\">nu</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/number\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/number.tf\">number</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/otype\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/otype.tf\">otype</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/pargr\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/pargr.tf\">pargr</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/pdp\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/pdp.tf\">pdp</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/pfm\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/pfm.tf\">pfm</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/prs\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/prs.tf\">prs</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/prs_gn\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/prs_gn.tf\">prs_gn</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/prs_nu\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/prs_nu.tf\">prs_nu</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/prs_ps\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/prs_ps.tf\">prs_ps</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/ps\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/ps.tf\">ps</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/qere\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/qere.tf\">qere</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/qere_trailer\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/qere_trailer.tf\">qere_trailer</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/qere_trailer_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/qere_trailer_utf8.tf\">qere_trailer_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/qere_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/qere_utf8.tf\">qere_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/rank_lex\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/rank_lex.tf\">rank_lex</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/rela\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/rela.tf\">rela</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/sp\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/sp.tf\">sp</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/st\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/st.tf\">st</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/tab\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/tab.tf\">tab</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/trailer\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/trailer.tf\">trailer</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/trailer_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/trailer_utf8.tf\">trailer_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/txt\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/txt.tf\">txt</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/typ\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/typ.tf\">typ</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/uvf\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/uvf.tf\">uvf</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/vbe\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/vbe.tf\">vbe</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/vbs\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/vbs.tf\">vbs</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/verse\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/verse.tf\">verse</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/voc_lex\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/voc_lex.tf\">voc_lex</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/voc_lex_utf8\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/voc_lex_utf8.tf\">voc_lex_utf8</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/vs\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/vs.tf\">vs</a><br><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/vt\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/vt.tf\">vt</a><br><b><i><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/mother\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/mother.tf\">mother</a></i></b><br><b><i><a target=\"_blank\" href=\"https://etcbc.github.io/bhsa/features/oslots\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/bhsa/tf/c/oslots.tf\">oslots</a></i></b><br></details><details><summary><b>Parallel Passages</b></summary><b><i><a target=\"_blank\" href=\"https://nbviewer.jupyter.org/github/etcbc/parallels/blob/master/programs/parallels.ipynb\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/parallels/tf/c/crossref.tf\">crossref</a></i></b><br></details><details><summary><b>Phonetic Transcriptions</b></summary><a target=\"_blank\" href=\"https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/phono/tf/c/phono.tf\">phono</a><br><a target=\"_blank\" href=\"https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb\" title=\"C:\\Users\\Ejer/text-fabric-data/etcbc/phono/tf/c/phono_trailer.tf\">phono_trailer</a><br></details>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<style>tr.tf.ltr, td.tf.ltr, th.tf.ltr { text-align: left ! important;}\n",
       "tr.tf.rtl, td.tf.rtl, th.tf.rtl { text-align: right ! important;}\n",
       "@font-face {\n",
       "  font-family: \"Gentium Plus\";\n",
       "  src: local('Gentium Plus'), local('GentiumPlus'),\n",
       "    url('/server/static/fonts/GentiumPlus-R.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/GentiumPlus-R.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"Ezra SIL\";\n",
       "  src: local('Ezra SIL'), local('EzraSIL'),\n",
       "    url('/server/static/fonts/SILEOT.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/SILEOT.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"SBL Hebrew\";\n",
       "  src: local('SBL Hebrew'), local('SBLHebrew'),\n",
       "    url('/server/static/fonts/SBL_Hbrw.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/SBL_Hbrw.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"Estrangelo Edessa\";\n",
       "  src: local('Estrangelo Edessa'), local('EstrangeloEdessa');\n",
       "    url('/server/static/fonts/SyrCOMEdessa.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/SyrCOMEdessa.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: AmiriQuran;\n",
       "  font-style: normal;\n",
       "  font-weight: 400;\n",
       "  src: local('Amiri Quran'), local('AmiriQuran'),\n",
       "    url('/server/static/fonts/AmiriQuran.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/AmiriQuran.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: AmiriQuranColored;\n",
       "  font-style: normal;\n",
       "  font-weight: 400;\n",
       "  src: local('Amiri Quran Colored'), local('AmiriQuranColored'),\n",
       "    url('/server/static/fonts/AmiriQuranColored.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/AmiriQuranColored.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"Santakku\";\n",
       "  src: local('Santakku'),\n",
       "    url('/server/static/fonts/Santakku.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/Santakku.woff?raw=true') format('woff');\n",
       "}\n",
       "\n",
       "@font-face {\n",
       "  font-family: \"SantakkuM\";\n",
       "  src: local('SantakkuM'),\n",
       "    url('/server/static/fonts/SantakkuM.woff') format('woff'),\n",
       "    url('https://github.com/annotation/text-fabric/blob/master/tf/server/static/fonts/SantakkuM.woff?raw=true') format('woff');\n",
       "}\n",
       "/* bypassing some classical notebook settings */\n",
       "div#notebook {\n",
       "  line-height: unset;\n",
       "}\n",
       "/* neutral text */\n",
       ".txtn,.txtn a:visited,.txtn a:link {\n",
       "    font-family: sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* transcription text */\n",
       ".txtt,.txtt a:visited,.txtt a:link {\n",
       "    font-family: monospace;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* source text */\n",
       ".txto,.txto a:visited,.txto a:link {\n",
       "    font-family: serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* phonetic text */\n",
       ".txtp,.txtp a:visited,.txtp a:link {\n",
       "    font-family: Gentium, sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* original script text */\n",
       ".txtu,.txtu a:visited,.txtu a:link {\n",
       "    font-family: Gentium, sans-serif;\n",
       "    font-size: medium;\n",
       "    text-decoration: none;\n",
       "    color: var(--text-color);\n",
       "}\n",
       "/* hebrew */\n",
       ".txtu.hbo,.lex.hbo {\n",
       "    font-family: \"Ezra SIL\", \"SBL Hebrew\", sans-serif;\n",
       "    font-size: large;\n",
       "    direction: rtl ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* syriac */\n",
       ".txtu.syc,.lex.syc {\n",
       "    font-family: \"Estrangelo Edessa\", sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: rtl ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* neo aramaic */\n",
       ".txtu.cld,.lex.cld {\n",
       "    font-family: \"CharisSIL-R\", sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* standard arabic */\n",
       ".txtu.ara,.lex.ara {\n",
       "    font-family: \"AmiriQuran\", sans-serif;\n",
       "    font-size: large;\n",
       "    direction: rtl ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* cuneiform */\n",
       ".txtu.akk,.lex.akk {\n",
       "    font-family: Santakku, sans-serif;\n",
       "    font-size: large;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "/* greek */\n",
       ".txtu.grc,.lex.grc a:link {\n",
       "    font-family: Gentium, sans-serif;\n",
       "    font-size: medium;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       "a:hover {\n",
       "    text-decoration: underline | important;\n",
       "    color: #0000ff | important;\n",
       "}\n",
       ".ltr {\n",
       "    direction: ltr ! important;\n",
       "}\n",
       ".rtl {\n",
       "    direction: rtl ! important;\n",
       "}\n",
       ".features {\n",
       "    font-family: monospace;\n",
       "    font-size: medium;\n",
       "    font-weight: bold;\n",
       "    color: var(--features);\n",
       "    display: flex;\n",
       "    flex-flow: column nowrap;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "    padding: 0.1rem;\n",
       "    margin: 0.1rem;\n",
       "    direction: ltr;\n",
       "    border: var(--meta-width) solid var(--meta-color);\n",
       "    border-radius: var(--meta-width);\n",
       "}\n",
       ".features div,.features span {\n",
       "    padding: 0;\n",
       "    margin: -0.1rem 0;\n",
       "}\n",
       ".features .f {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    font-weight: normal;\n",
       "    color: #5555bb;\n",
       "}\n",
       ".features .xft {\n",
       "  color: #000000;\n",
       "  background-color: #eeeeee;\n",
       "  font-size: medium;\n",
       "  margin: 0.1rem 0rem;\n",
       "}\n",
       ".features .xft .f {\n",
       "  color: #000000;\n",
       "  background-color: #eeeeee;\n",
       "  font-size: small;\n",
       "  font-weight: normal;\n",
       "}\n",
       ".section {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    font-weight: bold;\n",
       "    color: var(--section);\n",
       "    unicode-bidi: embed;\n",
       "    text-align: start;\n",
       "}\n",
       ".structure {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    font-weight: bold;\n",
       "    color: var(--structure);\n",
       "    unicode-bidi: embed;\n",
       "    text-align: start;\n",
       "}\n",
       ".comments {\n",
       "    display: flex;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "    flex-flow: column nowrap;\n",
       "}\n",
       ".nd, a:link.nd {\n",
       "    font-family: sans-serif;\n",
       "    font-size: small;\n",
       "    color: var(--node);\n",
       "    vertical-align: super;\n",
       "    direction: ltr ! important;\n",
       "    unicode-bidi: embed;\n",
       "}\n",
       ".lex {\n",
       "  color: var(--lex-color);;\n",
       "}\n",
       ".children,.children.ltr {\n",
       "    display: flex;\n",
       "    border: 0;\n",
       "    background-color: #ffffff;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "}\n",
       ".children.stretch {\n",
       "    align-items: stretch;\n",
       "}\n",
       ".children.hor {\n",
       "    flex-flow: row nowrap;\n",
       "}\n",
       ".children.hor.wrap {\n",
       "    flex-flow: row wrap;\n",
       "}\n",
       ".children.ver {\n",
       "    flex-flow: column nowrap;\n",
       "}\n",
       ".children.ver.wrap {\n",
       "    flex-flow: column wrap;\n",
       "}\n",
       ".contnr {\n",
       "    width: fit-content;\n",
       "    display: flex;\n",
       "    justify-content: flex-start;\n",
       "    align-items: flex-start;\n",
       "    align-content: flex-start;\n",
       "    flex-flow: column nowrap;\n",
       "    background: #ffffff none repeat scroll 0 0;\n",
       "    padding:  0.5rem 0.1rem 0.1rem 0.1rem;\n",
       "    margin: 0.8rem 0.1rem 0.1rem 0.1rem;\n",
       "    border-style: solid;\n",
       "    font-size: small;\n",
       "}\n",
       ".contnr.trm {\n",
       "    background-attachment: local;\n",
       "}\n",
       ".contnr.cnul {\n",
       "    padding:  0;\n",
       "    margin: 0;\n",
       "    border-style: solid;\n",
       "    font-size: xx-small;\n",
       "}\n",
       ".contnr.cnul,.lbl.cnul {\n",
       "    border-color: var(--border-color-nul);\n",
       "    border-width: var(--border-width-nul);\n",
       "    border-radius: var(--border-width-nul);\n",
       "}\n",
       ".contnr.c0,.lbl.c0 {\n",
       "    border-color: var(--border-color0);\n",
       "    border-width: var(--border-width0);\n",
       "    border-radius: var(--border-width0);\n",
       "}\n",
       ".contnr.c1,.lbl.c1 {\n",
       "    border-color: var(--border-color1);\n",
       "    border-width: var(--border-width1);\n",
       "    border-radius: var(--border-width1);\n",
       "}\n",
       ".contnr.c2,.lbl.c2 {\n",
       "    border-color: var(--border-color2);\n",
       "    border-width: var(--border-width2);\n",
       "    border-radius: var(--border-width2);\n",
       "}\n",
       ".contnr.c3,.lbl.c3 {\n",
       "    border-color: var(--border-color3);\n",
       "    border-width: var(--border-width3);\n",
       "    border-radius: var(--border-width3);\n",
       "}\n",
       ".contnr.c4,.lbl.c4 {\n",
       "    border-color: var(--border-color4);\n",
       "    border-width: var(--border-width4);\n",
       "    border-radius: var(--border-width4);\n",
       "}\n",
       "span.plain {\n",
       "    display: inline-block;\n",
       "    white-space: pre-wrap;\n",
       "}\n",
       ".plain {\n",
       "    background-color: #ffffff;\n",
       "}\n",
       ".plain.l,.contnr.l,.contnr.l>.lbl {\n",
       "    border-left-style: dotted\n",
       "}\n",
       ".plain.r,.contnr.r,.contnr.r>.lbl {\n",
       "    border-right-style: dotted\n",
       "}\n",
       ".plain.lno,.contnr.lno,.contnr.lno>.lbl {\n",
       "    border-left-style: none\n",
       "}\n",
       ".plain.rno,.contnr.rno,.contnr.rno>.lbl {\n",
       "    border-right-style: none\n",
       "}\n",
       ".plain.l {\n",
       "    padding-left: 0.2rem;\n",
       "    margin-left: 0.1rem;\n",
       "    border-width: var(--border-width-plain);\n",
       "}\n",
       ".plain.r {\n",
       "    padding-right: 0.2rem;\n",
       "    margin-right: 0.1rem;\n",
       "    border-width: var(--border-width-plain);\n",
       "}\n",
       ".lbl {\n",
       "    font-family: monospace;\n",
       "    margin-top: -1.2rem;\n",
       "    margin-left: 1rem;\n",
       "    background: #ffffff none repeat scroll 0 0;\n",
       "    padding: 0 0.3rem;\n",
       "    border-style: solid;\n",
       "    display: block;\n",
       "    color: var(--label)\n",
       "}\n",
       ".lbl.trm {\n",
       "    background-attachment: local;\n",
       "    margin-top: 0.1rem;\n",
       "    margin-left: 0.1rem;\n",
       "    padding: 0.1rem 0.1rem;\n",
       "    border-style: none;\n",
       "}\n",
       ".lbl.cnul {\n",
       "    font-size: xx-small;\n",
       "}\n",
       ".lbl.c0 {\n",
       "    font-size: small;\n",
       "}\n",
       ".lbl.c1 {\n",
       "    font-size: small;\n",
       "}\n",
       ".lbl.c2 {\n",
       "    font-size: medium;\n",
       "}\n",
       ".lbl.c3 {\n",
       "    font-size: medium;\n",
       "}\n",
       ".lbl.c4 {\n",
       "    font-size: large;\n",
       "}\n",
       ".occs, a:link.occs {\n",
       "    font-size: small;\n",
       "}\n",
       "\n",
       "/* PROVENANCE */\n",
       "\n",
       "div.prov {\n",
       "\tmargin: 2rem;\n",
       "\tpadding: 1rem;\n",
       "\tborder: 0.1rem solid var(--fog-rim);\n",
       "}\n",
       "div.pline {\n",
       "\tdisplay: flex;\n",
       "\tflex-flow: row nowrap;\n",
       "\tjustify-content: stretch;\n",
       "\talign-items: baseline;\n",
       "}\n",
       "div.p2line {\n",
       "\tmargin-left: 2em;\n",
       "\tdisplay: flex;\n",
       "\tflex-flow: row nowrap;\n",
       "\tjustify-content: stretch;\n",
       "\talign-items: baseline;\n",
       "}\n",
       "div.psline {\n",
       "\tdisplay: flex;\n",
       "\tflex-flow: row nowrap;\n",
       "\tjustify-content: stretch;\n",
       "\talign-items: baseline;\n",
       "\tbackground-color: var(--gold-mist-back);\n",
       "}\n",
       "div.pname {\n",
       "\tflex: 0 0 5rem;\n",
       "\tfont-weight: bold;\n",
       "}\n",
       "div.pval {\n",
       "    flex: 1 1 auto;\n",
       "}\n",
       "\n",
       ":root {\n",
       "\t--node:               hsla(120, 100%,  20%, 1.0  );\n",
       "\t--label:              hsla(  0, 100%,  20%, 1.0  );\n",
       "\t--section:            hsla(  0, 100%,  25%, 1.0  );\n",
       "\t--structure:          hsla(120, 100%,  25%, 1.0  );\n",
       "\t--features:           hsla(  0,   0%,  30%, 1.0  );\n",
       "  --text-color:         hsla( 60,  80%,  10%, 1.0  );\n",
       "  --lex-color:          hsla(220,  90%,  60%, 1.0  );\n",
       "  --meta-color:         hsla(  0,   0%,  90%, 0.7  );\n",
       "  --meta-width:         0.15rem;\n",
       "  --border-color-nul:   hsla(  0,   0%,  90%, 0.5  );\n",
       "  --border-color0:      hsla(  0,   0%,  90%, 0.9  );\n",
       "  --border-color1:      hsla(  0,   0%,  80%, 0.9  );\n",
       "  --border-color2:      hsla(  0,   0%,  70%, 0.9  );\n",
       "  --border-color3:      hsla(  0,   0%,  80%, 0.8  );\n",
       "  --border-color4:      hsla(  0,   0%,  60%, 0.9  );\n",
       "  --border-width-nul:   0.1rem;\n",
       "  --border-width0:      0.1rem;\n",
       "  --border-width1:      0.15rem;\n",
       "  --border-width2:      0.2rem;\n",
       "  --border-width3:      0.3rem;\n",
       "  --border-width4:      0.25rem;\n",
       "  --border-width-plain: 0.1rem;\n",
       "}\n",
       ".hl {\n",
       "  background-color: var(--hl-strong);\n",
       "}\n",
       "span.hl {\n",
       "\tbackground-color: var(--hl-strong);\n",
       "\tborder-width: 0;\n",
       "\tborder-radius: 0.1rem;\n",
       "\tborder-style: solid;\n",
       "}\n",
       "div.contnr.hl,div.lbl.hl {\n",
       "  background-color: var(--hl-strong);\n",
       "}\n",
       "div.contnr.hl {\n",
       "  border-color: var(--hl-rim) ! important;\n",
       "\tborder-width: 0.2rem ! important;\n",
       "}\n",
       "\n",
       "span.hlbx {\n",
       "\tborder-color: var(--hl-rim);\n",
       "\tborder-width: 0.2rem ! important;\n",
       "\tborder-style: solid;\n",
       "\tborder-radius: 0.3rem;\n",
       "  padding: 0.2rem;\n",
       "  margin: 0.2rem;\n",
       "}\n",
       "\n",
       "span.plain {\n",
       "  display: inline-block;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "\n",
       ":root {\n",
       "\t--hl-strong:        hsla( 60, 100%,  70%, 0.9  );\n",
       "\t--hl-rim:           hsla( 55,  80%,  50%, 1.0  );\n",
       "}\n",
       "</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div><b>Text-Fabric API:</b> names <a target=\"_blank\" href=\"https://annotation.github.io/text-fabric/cheatsheet.html\" title=\"doc\">N F E L T S C TF</a> directly usable</div><hr>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#Importing the Hebrew data and Text-Fabric\n",
    "from tf.app import use\n",
    "A = use('bhsa', hoist=globals(), mod='etcbc/heads/tf')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.a Import of participant reference data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>participant</th>\n",
       "      <th>refs</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>944128 946176 946179 946182 946184 944142 9441...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MCH=</td>\n",
       "      <td>945152 945537 945155 945540 945547 945555 9449...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>&gt;HRN</td>\n",
       "      <td>944640 944641 65555 944662 65561 944666 944667...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>67584 944132 944133 944139 946216 946217 94417...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>&gt;JC &gt;JC</td>\n",
       "      <td>945664 64514 945666 945668 944135 944136 94567...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  participant                                               refs\n",
       "0        JHWH  944128 946176 946179 946182 946184 944142 9441...\n",
       "1        MCH=  945152 945537 945155 945540 945547 945555 9449...\n",
       "2        >HRN  944640 944641 65555 944662 65561 944666 944667...\n",
       "3    BN JFR>L  67584 944132 944133 944139 946216 946217 94417...\n",
       "4     >JC >JC  945664 64514 945666 945668 944135 944136 94567..."
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv(f'{PATH}participants_FINAL.csv')\n",
    "df.columns = ['participant','refs']\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The references are transformed to lists and their respective frequencies in the corpus are counted "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "ref_list = []\n",
    "participant_freq = []\n",
    "\n",
    "for row in df.iterrows():\n",
    "    refs = [int(r) for r in row[1].refs.split()]\n",
    "    ref_list.append(refs)\n",
    "    participant_freq.append(len(refs))\n",
    "    \n",
    "df.insert(2, 'ref_list', ref_list)\n",
    "df.insert(3, 'freq', participant_freq)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>participant</th>\n",
       "      <th>refs</th>\n",
       "      <th>ref_list</th>\n",
       "      <th>freq</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>944128 946176 946179 946182 946184 944142 9441...</td>\n",
       "      <td>[944128, 946176, 946179, 946182, 946184, 94414...</td>\n",
       "      <td>476</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MCH=</td>\n",
       "      <td>945152 945537 945155 945540 945547 945555 9449...</td>\n",
       "      <td>[945152, 945537, 945155, 945540, 945547, 94555...</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>&gt;HRN</td>\n",
       "      <td>944640 944641 65555 944662 65561 944666 944667...</td>\n",
       "      <td>[944640, 944641, 65555, 944662, 65561, 944666,...</td>\n",
       "      <td>164</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>67584 944132 944133 944139 946216 946217 94417...</td>\n",
       "      <td>[67584, 944132, 944133, 944139, 946216, 946217...</td>\n",
       "      <td>579</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>&gt;JC &gt;JC</td>\n",
       "      <td>945664 64514 945666 945668 944135 944136 94567...</td>\n",
       "      <td>[945664, 64514, 945666, 945668, 944135, 944136...</td>\n",
       "      <td>277</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  participant                                               refs  \\\n",
       "0        JHWH  944128 946176 946179 946182 946184 944142 9441...   \n",
       "1        MCH=  945152 945537 945155 945540 945547 945555 9449...   \n",
       "2        >HRN  944640 944641 65555 944662 65561 944666 944667...   \n",
       "3    BN JFR>L  67584 944132 944133 944139 946216 946217 94417...   \n",
       "4     >JC >JC  945664 64514 945666 945668 944135 944136 94567...   \n",
       "\n",
       "                                            ref_list  freq  \n",
       "0  [944128, 946176, 946179, 946182, 946184, 94414...   476  \n",
       "1  [945152, 945537, 945155, 945540, 945547, 94555...    60  \n",
       "2  [944640, 944641, 65555, 944662, 65561, 944666,...   164  \n",
       "3  [67584, 944132, 944133, 944139, 946216, 946217...   579  \n",
       "4  [945664, 64514, 945666, 945668, 944135, 944136...   277  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of participants: 75\n"
     ]
    }
   ],
   "source": [
    "print(f'Number of participants: {len(df)}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two functions fetch the participant label from any given word or phrase in the text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def getLabel(ref, df=df):\n",
    "    '''\n",
    "    This function fetches the actor/participant reference from the participant dataframe.\n",
    "    '''\n",
    "    \n",
    "    actor_list = []\n",
    "    \n",
    "    for row in df.iterrows():\n",
    "        if ref in row[1].ref_list:\n",
    "            actor_list.append(row[1].participant)\n",
    "    \n",
    "    return actor_list\n",
    "\n",
    "def Actor(ref, df=df):\n",
    "    '''\n",
    "    This function takes a reference as input and returns the participant label. Phrases are treated differently, becuase \n",
    "    non-verbal phrases require additional measures to find the nominal head of the phrase and return the label for that \n",
    "    particular constituent.\n",
    "    '''\n",
    "    \n",
    "    nom_head = E.nhead.t(ref) #Finding the nominal head(s) of the phrase\n",
    "    \n",
    "    if F.otype.v(ref) == 'word': #Identifying object suffixes\n",
    "        return getLabel(ref, df=df)\n",
    "    \n",
    "    elif F.typ.v(ref) == 'VP':\n",
    "        return getLabel(L.d(ref, 'phrase_atom')[0], df=df)\n",
    "    \n",
    "    elif F.typ.v(ref) == 'PP':\n",
    "        if len(nom_head) > 1:\n",
    "            return getLabel(L.d(ref, 'phrase_atom')[0], df=df)\n",
    "        if nom_head != E.head.t(ref): #If equal, the reference is a simple preposition with a suffix\n",
    "            return getLabel(L.u(nom_head[0], 'phrase_atom')[0], df=df)\n",
    "        else:\n",
    "            if getLabel(E.head.t(ref)[0], df=df):\n",
    "                return getLabel(E.head.t(ref)[0], df=df)\n",
    "            else:\n",
    "                return getLabel(L.u(nom_head[0], 'phrase_atom')[0], df=df)\n",
    "        \n",
    "    elif F.typ.v(ref) in {'NP','PrNP','PPrP','DPrP','CP'}:\n",
    "        return getLabel(L.u(nom_head[0], 'phrase_atom')[0], df=df)\n",
    "        \n",
    "    else:\n",
    "        return \"error\"\n",
    "\n",
    "#Actor(65418)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.b Import agency ranks of participants"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Vol</th>\n",
       "      <th>Inst</th>\n",
       "      <th>Aff</th>\n",
       "      <th>neg</th>\n",
       "      <th>role</th>\n",
       "      <th>new_role</th>\n",
       "      <th>new_rank</th>\n",
       "      <th>rank</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>688348</th>\n",
       "      <td>y</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Agent</td>\n",
       "      <td>Agent</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688349</th>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>y</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688350</th>\n",
       "      <td>y</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Agent</td>\n",
       "      <td>Agent</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688351</th>\n",
       "      <td>y</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Agent</td>\n",
       "      <td>Agent</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688352</th>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>y</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       Vol Inst Aff  neg                  role              new_role  \\\n",
       "688348   y    y   n  NaN                 Agent                 Agent   \n",
       "688349   y    n   y  NaN  Volitional Undergoer  Volitional Undergoer   \n",
       "688350   y    y   n  NaN                 Agent                 Agent   \n",
       "688351   y    y   n  NaN                 Agent                 Agent   \n",
       "688352   y    n   y  NaN  Volitional Undergoer  Volitional Undergoer   \n",
       "\n",
       "        new_rank  rank  \n",
       "688348         5     5  \n",
       "688349        -1    -1  \n",
       "688350         5     5  \n",
       "688351         5     5  \n",
       "688352        -1    -1  "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ranks_df = pd.read_csv(f'{PATH}role_ranks.csv', index_col=0)\n",
    "ranks_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A function is defined to return the agency of any given reference"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def Agency(ref, colname, df=ranks_df):\n",
    "    \n",
    "    if ref in list(df.index):\n",
    "        return df[df.index == ref][colname].item()\n",
    "\n",
    "#Agency(68032, 'new_rank')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Cross-tabulating participants and roles\n",
    "\n",
    "This section cross-tabulates the participant and role data to calculate the mean agency of each participant."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "actor_list = [Actor(ph) for ph in list(ranks_df.index)]\n",
    "ranks_df.insert(8, 'Actor', actor_list) #The actor is inserted as a new column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Vol</th>\n",
       "      <th>Inst</th>\n",
       "      <th>Aff</th>\n",
       "      <th>neg</th>\n",
       "      <th>role</th>\n",
       "      <th>new_role</th>\n",
       "      <th>new_rank</th>\n",
       "      <th>rank</th>\n",
       "      <th>Actor</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>688348</th>\n",
       "      <td>y</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Agent</td>\n",
       "      <td>Agent</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>[JHWH]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688349</th>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>y</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>[MCH=]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688350</th>\n",
       "      <td>y</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Agent</td>\n",
       "      <td>Agent</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>[JHWH]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688351</th>\n",
       "      <td>y</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Agent</td>\n",
       "      <td>Agent</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>[MCH=]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>688352</th>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>y</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>Volitional Undergoer</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>[&gt;HRN, BN JFR&gt;L, BN &gt;HRN]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       Vol Inst Aff  neg                  role              new_role  \\\n",
       "688348   y    y   n  NaN                 Agent                 Agent   \n",
       "688349   y    n   y  NaN  Volitional Undergoer  Volitional Undergoer   \n",
       "688350   y    y   n  NaN                 Agent                 Agent   \n",
       "688351   y    y   n  NaN                 Agent                 Agent   \n",
       "688352   y    n   y  NaN  Volitional Undergoer  Volitional Undergoer   \n",
       "\n",
       "        new_rank  rank                      Actor  \n",
       "688348         5     5                     [JHWH]  \n",
       "688349        -1    -1                     [MCH=]  \n",
       "688350         5     5                     [JHWH]  \n",
       "688351         5     5                     [MCH=]  \n",
       "688352        -1    -1  [>HRN, BN JFR>L, BN >HRN]  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ranks_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cross-tabulation of the data to count how often each participant obtains a certain agency level:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>5</th>\n",
       "      <th>4</th>\n",
       "      <th>3</th>\n",
       "      <th>1</th>\n",
       "      <th>0</th>\n",
       "      <th>-1</th>\n",
       "      <th>-2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>JHWH</th>\n",
       "      <td>118</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>8</td>\n",
       "      <td>29</td>\n",
       "      <td>30</td>\n",
       "      <td>17</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MCH=</th>\n",
       "      <td>36</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>19</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>&gt;HRN</th>\n",
       "      <td>16</td>\n",
       "      <td>0</td>\n",
       "      <td>11</td>\n",
       "      <td>31</td>\n",
       "      <td>1</td>\n",
       "      <td>19</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BN JFR&gt;L</th>\n",
       "      <td>99</td>\n",
       "      <td>0</td>\n",
       "      <td>44</td>\n",
       "      <td>72</td>\n",
       "      <td>28</td>\n",
       "      <td>83</td>\n",
       "      <td>31</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BN &gt;HRN</th>\n",
       "      <td>16</td>\n",
       "      <td>0</td>\n",
       "      <td>6</td>\n",
       "      <td>22</td>\n",
       "      <td>5</td>\n",
       "      <td>17</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            5   4   3   1   0  -1  -2\n",
       "JHWH      118   0   1   8  29  30  17\n",
       "MCH=       36   0   1   0   1  19   0\n",
       ">HRN       16   0  11  31   1  19  10\n",
       "BN JFR>L   99   0  44  72  28  83  31\n",
       "BN >HRN    16   0   6  22   5  17   5"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dic = collections.defaultdict(lambda: collections.defaultdict(int))\n",
    "\n",
    "for row in ranks_df.iterrows():\n",
    "    for n in row[1].Actor:\n",
    "        dic[n][row[1].new_rank] += 1\n",
    "        \n",
    "agency_df = pd.DataFrame(dic).fillna(0).astype('Int64').T\n",
    "agency_df = agency_df[[5,4,3,1,0,-1,-2]]\n",
    "agency_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The mean agency is calculated"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "agency_mean = []\n",
    "\n",
    "for row in agency_df.iterrows():\n",
    "    n=0\n",
    "    total = 0\n",
    "    for v in row[1]:\n",
    "        total += (v * agency_df.columns[n])\n",
    "        n+=1\n",
    "    agency_mean.append(round(total/row[1].sum(), 3))\n",
    "    \n",
    "agency_df.insert(7, 'mean', agency_mean)\n",
    "\n",
    "#Inserting labels\n",
    "labels = [label_gloss[l] if l in label_gloss else l for l in list(agency_df.index)]\n",
    "agency_df.insert(0, 'label', labels)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>label</th>\n",
       "      <th>5</th>\n",
       "      <th>4</th>\n",
       "      <th>3</th>\n",
       "      <th>1</th>\n",
       "      <th>0</th>\n",
       "      <th>-1</th>\n",
       "      <th>-2</th>\n",
       "      <th>mean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>MCH=</th>\n",
       "      <td>Moses</td>\n",
       "      <td>36</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>19</td>\n",
       "      <td>0</td>\n",
       "      <td>2.877</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>JHWH</th>\n",
       "      <td>YHWH</td>\n",
       "      <td>118</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>8</td>\n",
       "      <td>29</td>\n",
       "      <td>30</td>\n",
       "      <td>17</td>\n",
       "      <td>2.645</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>&gt;JC &gt;JC</th>\n",
       "      <td>an_Israelite</td>\n",
       "      <td>60</td>\n",
       "      <td>0</td>\n",
       "      <td>22</td>\n",
       "      <td>7</td>\n",
       "      <td>4</td>\n",
       "      <td>6</td>\n",
       "      <td>38</td>\n",
       "      <td>2.124</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2ms</th>\n",
       "      <td>2msg</td>\n",
       "      <td>21</td>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>57</td>\n",
       "      <td>8</td>\n",
       "      <td>8</td>\n",
       "      <td>2</td>\n",
       "      <td>1.698</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BN JFR&gt;L</th>\n",
       "      <td>Israelites</td>\n",
       "      <td>99</td>\n",
       "      <td>0</td>\n",
       "      <td>44</td>\n",
       "      <td>72</td>\n",
       "      <td>28</td>\n",
       "      <td>83</td>\n",
       "      <td>31</td>\n",
       "      <td>1.552</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>GR</th>\n",
       "      <td>sojourner</td>\n",
       "      <td>45</td>\n",
       "      <td>0</td>\n",
       "      <td>16</td>\n",
       "      <td>5</td>\n",
       "      <td>13</td>\n",
       "      <td>9</td>\n",
       "      <td>38</td>\n",
       "      <td>1.532</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BN &gt;HRN</th>\n",
       "      <td>Aaron's_sons</td>\n",
       "      <td>16</td>\n",
       "      <td>0</td>\n",
       "      <td>6</td>\n",
       "      <td>22</td>\n",
       "      <td>5</td>\n",
       "      <td>17</td>\n",
       "      <td>5</td>\n",
       "      <td>1.310</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>&gt;HRN</th>\n",
       "      <td>Aaron</td>\n",
       "      <td>16</td>\n",
       "      <td>0</td>\n",
       "      <td>11</td>\n",
       "      <td>31</td>\n",
       "      <td>1</td>\n",
       "      <td>19</td>\n",
       "      <td>10</td>\n",
       "      <td>1.193</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>&gt;X -2ms</th>\n",
       "      <td>brother</td>\n",
       "      <td>11</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>16</td>\n",
       "      <td>10</td>\n",
       "      <td>13</td>\n",
       "      <td>0.537</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>HM</th>\n",
       "      <td>remnants</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>5</td>\n",
       "      <td>13</td>\n",
       "      <td>0.138</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>&lt;M</th>\n",
       "      <td>foreign_nations</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>10</td>\n",
       "      <td>-0.227</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                    label    5  4   3   1   0  -1  -2   mean\n",
       "MCH=                Moses   36  0   1   0   1  19   0  2.877\n",
       "JHWH                 YHWH  118  0   1   8  29  30  17  2.645\n",
       ">JC >JC      an_Israelite   60  0  22   7   4   6  38  2.124\n",
       "2ms                  2msg   21  0  10  57   8   8   2  1.698\n",
       "BN JFR>L       Israelites   99  0  44  72  28  83  31  1.552\n",
       "GR              sojourner   45  0  16   5  13   9  38  1.532\n",
       "BN >HRN      Aaron's_sons   16  0   6  22   5  17   5  1.310\n",
       ">HRN                Aaron   16  0  11  31   1  19  10  1.193\n",
       ">X -2ms           brother   11  0   3   1  16  10  13  0.537\n",
       "HM               remnants    3  2   4   0   2   5  13  0.138\n",
       "<M        foreign_nations    3  0   1   0   5   3  10 -0.227"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agency_df = agency_df[agency_df.sum(axis=1) > 20]\n",
    "agency_df.sort_values(by='mean', ascending=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Creating nodes and edges\n",
    "\n",
    "The network model combines participant data and semantic roles. The primary principle is to isolate those clauses where at least two participants occur (they can be identical) which means that isolated participants are ignored. Secondly, the edges are made from the participant with the highest agency level toward the participant with the lowest agency level within the same clause. We can assume that the participant with the highest agency level is also most active in the event and therefore the source of the event."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "def createEdges(colname, df=df, ranks_df=ranks_df, verb_list = [], relation='function', label_text='gloss', mode=str()):\n",
    "    '''\n",
    "    Input: dictionary of actors + nodes (references), plus preferred text type, that is, English gloss (default)\n",
    "        or transcription of the Hebrew lexeme (= trans)\n",
    "        colname is name of the rank column (usually \"rank\" or \"new_rank\")\n",
    "    Output: dictionary of edges and labels\n",
    "    '''\n",
    "    \n",
    "    error_list = []\n",
    "    \n",
    "    #Finding intersection between nodes\n",
    "    clause_node_list = []\n",
    "    for i, row in df.iterrows():\n",
    "        refs = [int(r) for r in row.refs.split()]\n",
    "        clause_node_list += list(set([L.u(n, 'clause')[0] for n in refs]))\n",
    "        \n",
    "    #Intersections are calculated by counting the frequency of unique clauses. If a clause appears more than once, there is\n",
    "    #an intersection\n",
    "    counter = collections.Counter(clause_node_list)\n",
    "    intersection = [n for n in counter if counter[n] > 1]\n",
    "         \n",
    "    edges = []\n",
    "    \n",
    "    if intersection:\n",
    "        \n",
    "        for cl in intersection: #Looping over clauses with intersecting actors\n",
    "            \n",
    "            clause_inventory = []\n",
    "            pred = False\n",
    "            \n",
    "            for ph in L.d(cl, 'phrase'):\n",
    "                ph_info = {}\n",
    "                sfx_info = {} #Directory for object suffixes\n",
    "                \n",
    "                rank = Agency(ph, colname, ranks_df)\n",
    "                \n",
    "                #Get verb gloss if Predicate\n",
    "                if F.function.v(ph) in {'Pred','PreS','PreO','PtcO','PreC'}:\n",
    "                    pred = True\n",
    "                    \n",
    "                    #Finding verb gloss:\n",
    "                    for w in L.d(ph, 'word'):\n",
    "                        if F.sp.v(w) == 'verb':\n",
    "                            pred_gloss, pred_lex = F.gloss.v(L.u(w, 'lex')[0]), F.lex.v(w)\n",
    "                \n",
    "                #If the phrase is annotated with a rank (agency), it is fetched.\n",
    "                if rank or rank == 0:\n",
    "                    \n",
    "                    ph_info['ref'] = ph\n",
    "                    ph_info['function'] = F.function.v(ph)\n",
    "                    ph_info['rank'] = rank\n",
    "                    \n",
    "                    clause_inventory.append(ph_info)\n",
    "                    \n",
    "                #If object suffix, the suffix info is stored separately and added to the clause inventory\n",
    "                if F.function.v(ph) in {'PreO','PtcO'}:\n",
    "                    for w in L.d(ph, 'word'):\n",
    "                        if F.sp.v(w) == 'verb' and (Agency(w, colname, ranks_df) or Agency(w, colname, ranks_df) == 0):\n",
    "                            sfx_info['ref'] = w\n",
    "                            sfx_info['function'] = F.function.v(ph)\n",
    "                            sfx_info['rank'] = Agency(w, colname, ranks_df)\n",
    "                                \n",
    "                            clause_inventory.append(sfx_info)\n",
    "            \n",
    "            if pred == True and pred_lex!= 'HJH[' and len(clause_inventory) > 1:\n",
    "                ranked = sorted(clause_inventory, key=itemgetter('rank'), reverse = True)                    \n",
    "                    \n",
    "                #Getting Actor and labels\n",
    "                Actor_ref = ranked[0]['ref']\n",
    "                Actor_rank = ranked[0]['rank']\n",
    "                Actors = Actor(Actor_ref, df=df) #A list of Actors\n",
    "                \n",
    "                if Actors == 'error':\n",
    "                    error_list.append((cl, Actor_ref))\n",
    "                        \n",
    "                #Creating edges from Actor to Undergoer(s)\n",
    "                for Undergoer in ranked[1:]:\n",
    "                    Undergoer_ref = Undergoer['ref']\n",
    "                    Undergoer_rank = Undergoer['rank']\n",
    "                    Undergoers = Actor(Undergoer_ref, df=df)\n",
    "                    \n",
    "                    if Undergoers == 'error':\n",
    "                        error_list.append((cl, Undergoer_ref))\n",
    "                    \n",
    "                    if (Actors and Undergoers) and (Undergoers != 'error') and (Actors != 'error'):\n",
    "                        for A in Actors:\n",
    "                            for U in Undergoers:\n",
    "                                \n",
    "                                if mode == 'one-mode':\n",
    "                                    edge = (A, Actor_ref, Actor_rank, U, Undergoer_ref, Undergoer_rank, pred_gloss, cl)\n",
    "                                    edges.append(edge)\n",
    "                                elif mode == 'two-mode':\n",
    "                                    Actor_edge = (A, Actor_ref, Actor_rank, pred_gloss, cl)\n",
    "                                    Undergoer_edge = (pred_gloss, U, Undergoer_ref, Undergoer_rank, cl)\n",
    "                                    edges.append(Actor_edge), edges.append(Undergoer_edge)\n",
    "                                else:\n",
    "                                    print(\"You need to specify mode\")\n",
    "                                \n",
    "        return edges, error_list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two models are created to account for two versions of the agency data. The 'old' data does not account negations in the clause, while the 'new' data involves a recalculation of the agency (NB: the recalculation is done in another notebook)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "482\n"
     ]
    }
   ],
   "source": [
    "old_edges = createEdges(colname='rank',df=df, mode='one-mode')\n",
    "print(len(old_edges[0]))\n",
    "\n",
    "#With new ranks because of negatives (e.g. Agent -> Frustrative)\n",
    "new_edges = createEdges(colname='new_rank',df=df,mode='one-mode')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Explore errors:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "errors = old_edges[1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for e in errors:\n",
    "    A.pretty(e[0], highlights={e[1]:'gold'})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both errors concern adverbial phrases, both referring to a location, so they are not important."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will remove edges for which both the Actor and Undergoer i 0 (Neutral) in Agency. In these cases, there is no interaction so those relations are not important:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "def removeNeutral(edge_list):\n",
    "    upd_edge_list = []\n",
    "    \n",
    "    for e in edge_list[0]:\n",
    "        Actor_rank = e[2]\n",
    "        Undergoer_rank = e[5]\n",
    "        \n",
    "        if Actor_rank == 0 and Undergoer_rank == 0:\n",
    "            continue\n",
    "        else:\n",
    "            upd_edge_list.append(e)\n",
    "            \n",
    "    return upd_edge_list\n",
    "            \n",
    "old_edges = removeNeutral(old_edges)\n",
    "new_edges = removeNeutral(new_edges)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Validation and export of the network model\n",
    "\n",
    "#### 4a. Validation\n",
    "\n",
    "Before the final export the edges need review. Several issues need validation:\n",
    "\n",
    "* Are all relevant clauses included?\n",
    "* Are the participants annotated correctly?\n",
    "* Are the roles annotated correctly?\n",
    "\n",
    "The review is carried out manually but assisted by an interface and colorcoding. 'Green' signals that the clause is included in the network, 'salmon' signals absence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "first_verse = T.nodeFromSection(('Leviticus',17,1))\n",
    "last_verse = T.nodeFromSection(('Leviticus',26,46))\n",
    "\n",
    "clauses = range(L.d(first_verse, 'clause')[0], L.d(last_verse, 'clause')[0]+2)\n",
    "verbal_clauses = []\n",
    "for cl in clauses:\n",
    "    pred = False\n",
    "    for ph in L.d(cl, 'phrase'):\n",
    "        if F.function.v(ph) in {'Pred','PreS','PreO','PtcO','PreC'}:\n",
    "            pred = True\n",
    "            for w in L.d(ph, 'word'):\n",
    "                if F.sp.v(w) == 'verb' and F.lex.v(w) != 'HJH[':\n",
    "                    verbal_clauses.append(cl)\n",
    "\n",
    "print(f'Number of clauses to review: {len(verbal_clauses)}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def validate(clauses, edges, n):\n",
    "    print(f'Nr {n}: {clauses[n]}')\n",
    "    \n",
    "    df = pd.DataFrame(edges)\n",
    "    edge_clauses = list(df[7])\n",
    "    \n",
    "    if clauses[n] in edge_clauses:\n",
    "        subset = df[df[7] == clauses[n]]\n",
    "        \n",
    "        for i, row in subset.iterrows():\n",
    "            print(f'Actor: {row[0]} - Agency: {row[2]}')\n",
    "            print(f'Undergoer: {row[3]} - Agency: {row[5]}\\n')\n",
    "        \n",
    "        A.pretty(clauses[n], highlights={clauses[n]:'lightgreen'})\n",
    "    \n",
    "    else:\n",
    "        A.pretty(clauses[n], highlights={clauses[n]:'salmon'})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "n=0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "validate(verbal_clauses, old_edges, n)\n",
    "n+=1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### -----Update: All corrections made -----\n",
    "\n",
    "\n",
    "\n",
    "**Lev 17**\n",
    "* BN >HRN added to participant: Need to be listed as part-whole relations across the entire text\n",
    "* 'MN QRB/ <M/ ->JC >JC' added to Nodes\n",
    "* KHN added to roles\n",
    "* 'F<JR=' added to Nodes\n",
    "* 'B TWK/ -BJT JFR>L' and 'B TWK/ -BJT JFR>L#2' added to Nodes\n",
    "* L KM added to Roles\n",
    "\n",
    "**Lev 18**\n",
    "* '<RWH >CH >B -2ms' added to Nodes\n",
    "* '<RWH >CH >X -2ms' added to Nodes\n",
    "* '<RWH/ ->CH#2' corrected in Nodes\n",
    "* 'MN QRB/ <M/ -NPC' added in Nodes\n",
    "* 'L PNH/ -BN JFR>L' added in Nodes\n",
    "    \n",
    "**Lev 19**\n",
    "* '>B >JC' and >M >JC added to participants: Need to be listed as part-whole relations across the entire text\n",
    "* '<NJ' added to participants\n",
    "* 'B CM/ -JHWH' and '>T CM/ >LHJM/ -2ms' added to Nodes\n",
    "* 'XRC=/' added to Nodes\n",
    "* 'PNH/ DL/' and 'PNH/ GDWL/' added to Nodes\n",
    "* '>T BN/ <M/ -2ms' changed in collocations\n",
    "* Changing Aktionsart of QWM (4_Export_Aktionsart)\n",
    "* 'PNH/ ZQN/' added to Nodes\n",
    "\n",
    "**Lev 20**\n",
    "* '>T CM/ QDC/ -JHWH' added to Nodes\n",
    "* '>JC' changed role\n",
    "* '>T >CH/ >JC/' changed role\n",
    "* 'MN QRB/ <M/ ->JC', 'MN QRB/ <M/ -KL', 'MN QRB/ <M/ -NPC', and 'MN QRB/ <M/ -CNJM -' added to Nodes\n",
    "\n",
    "**Lev 21**\n",
    "* Passive corrected in 2c_Instigation\n",
    "* MCH changed in participants\n",
    "* '>CH_2' added in participants\n",
    "* BN JFR>L added in participants: Need to be listed as part-whole relations across the entire text\n",
    "\n",
    "**Lev 22**\n",
    "* NPC_2 added in participants (Aaron's offspring)\n",
    "* NPC_3 added in participants (A chattel-slave)\n",
    "* Make sure that the compound reference \">HRN BN >HRN\" is only deleted if the references have been succesfully distributed to either >HRN or BN >HRN\n",
    "* BN JFR>L changed in participants\n",
    "* Hypernyms accross the text: >JC (top-level) refers both to GR and a native\n",
    "* '>JC#2' added to Nodes: Refers to \"any man\" within the household of the priest\n",
    "\n",
    "**Lev 23**\n",
    "* Hypernyms accross the text: 'L H <NJ/ W L H GR/' refers to the poor and the stranger. If this hypernym is removed before the reference are distributed to the involved participants, the references will be missing.\n",
    "\n",
    "**Lev 24**\n",
    "* BN >CH changed in Affectedness\n",
    "* 3mp removed from Participants (one instance)\n",
    "* JHWH changed in Affectedness\n",
    "\n",
    "**Lev 25**\n",
    "* When '<BD -2ms>MH -2ms' is removed (because it is a hypernym) the participants are missing. Hypernyms need to be constructed on top-level before removal.\n",
    "* Skip clauses with HJH? They are not interactions\n",
    "* 'JD ->JC', 'JD ->X -2ms' and 'JD GR TWCB' added to Nodes (as synonyms)\n",
    "* '>X -2ms' changed in Affectedness\n",
    "* GR TWCB#2 has been changed in Participants to distinguish \"your brother\" from \"foreigners\" although they are sometimes given the same label.\n",
    "\n",
    "**Lev 26**\n",
    "* NPL changed in Akstionsart to stative\n",
    "* XMC changed in Nodes from XMC/\n",
    "* PNH changed in Affectedness\n",
    "* 'L PNH/ >JB[ -<M' changed in Nodes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4b. Export\n",
    "\n",
    "#### Export nodes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "new_df = pd.DataFrame(new_edges)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>new_rank_Actor</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>new_rank_Undergoer</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>BN &gt;HRN</td>\n",
       "      <td>690343</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690347</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>swing</td>\n",
       "      <td>440323</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>690383</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>MCH=</td>\n",
       "      <td>690384</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>speak</td>\n",
       "      <td>440335</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>690397</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690399</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>approach</td>\n",
       "      <td>440341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>690402</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>MCH=</td>\n",
       "      <td>690403</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>speak</td>\n",
       "      <td>440342</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>690415</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690417</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>approach</td>\n",
       "      <td>440347</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          0       1  2  new_rank_Actor     3       4  5  new_rank_Undergoer  \\\n",
       "0   BN >HRN  690343  5               5  JHWH  690347  0                   0   \n",
       "1      JHWH  690383  5               5  MCH=  690384 -1                  -1   \n",
       "2  BN JFR>L  690397  5               5  JHWH  690399 -1                  -1   \n",
       "3      JHWH  690402  5               5  MCH=  690403 -1                  -1   \n",
       "4  BN JFR>L  690415  5               5  JHWH  690417 -1                  -1   \n",
       "\n",
       "          6       7  \n",
       "0     swing  440323  \n",
       "1     speak  440335  \n",
       "2  approach  440341  \n",
       "3     speak  440342  \n",
       "4  approach  440347  "
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "old_df = pd.DataFrame(old_edges)\n",
    "old_df.insert(3, 'new_rank_Actor', new_df.iloc[:,2])\n",
    "old_df.insert(7, 'new_rank_Undergoer', new_df.iloc[:,5])\n",
    "old_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The labels (generated from the ETCBC-transliteration) will be replaced more readable ones:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "label_gloss = {'>CH BN -2ms': 'daughter-in-law',\n",
    "               '>DM': 'human_being',\n",
    "               'GR': 'sojourner',\n",
    "               '>CH#2': 'woman_in_menstruation',\n",
    "               '>X -2ms': 'brother',\n",
    "               'BFR/ BN/ -<M': 'children',\n",
    "               '>T BT/ BN/ ->CH W >T BT/ BT/ ->CH': 'granddaughter_of_woman',\n",
    "               'MLK=': 'idols',\n",
    "               'NPC#3': 'slave',\n",
    "               '<M': 'foreign_nations',\n",
    "               '>T >CH/ >JC/': \"fellow's_wife\",\n",
    "               'ZR': 'lay-person',\n",
    "               '>B -2ms': 'father',\n",
    "               'JHWH': 'YHWH',\n",
    "               '2mp_sfx': '2mpl',\n",
    "               '>HRN': 'Aaron',\n",
    "               'MN >JC/ ->CH': 'husband',\n",
    "               'DWDH -2ms': 'aunt-in-law',\n",
    "               '>CH >M ->CH': 'woman_and_her_mother',\n",
    "               'RDP': 'no-one',\n",
    "               'BN >CH': 'blasphemer',\n",
    "               'XRC=/': 'deaf',\n",
    "               'BN JFR>L': 'Israelites',\n",
    "               'C>R >B -2ms': 'aunt',\n",
    "               'KL': 'group_of_people',\n",
    "               '<RWH/ -<RWH -2ms': 'granddaughter',\n",
    "               'PNH/ ZQN/': 'elderly',\n",
    "               'BTWLH/': 'virgin',\n",
    "               '>JC >JC': 'an_Israelite',\n",
    "               'BN ->X -2ms': 'son_of_brother',\n",
    "               'QNH': 'purchaser',\n",
    "               '>JC >CH': 'man/woman',\n",
    "               '<RWH/ >CH/ W BT/ ->CH': 'woman_and_her_daughter',\n",
    "               '3mp': 'witnesses',\n",
    "               '>L MCPXT/ ->JC': 'clan',\n",
    "               'BT >B -2msBT >M -2ms': 'sister',\n",
    "               'PNH/ GDWL/': 'rich',\n",
    "               '>XD': \"brother's_brother\",\n",
    "               '>T== ZKR=/': 'male',\n",
    "               '2ms': '2msg',\n",
    "               '>XWT ->CH': 'sister_of_woman',\n",
    "               'BN TWCB': 'sons_of_sojourners',\n",
    "               '>M -2ms': 'mother',\n",
    "               'L >JC/': 'man',\n",
    "               'ZR< ->JC': 'offspring',\n",
    "               'PNH/ DL/': 'poor',\n",
    "               'L PNH/ <WR/': 'blind',\n",
    "               '>CH': 'woman',\n",
    "               '>CH >B -2ms': \"father's_wife\",\n",
    "               'MCH=': 'Moses',\n",
    "               'BN >HRN': \"Aaron's_sons\",\n",
    "               'BT -2ms': 'daughter',\n",
    "               'CPXH': 'handmaid',\n",
    "               'C>R -HW>': 'relative',\n",
    "               '>LMNH GRC XLL': 'widowed/expelled/defiled_woman',\n",
    "               'HM': 'remnants',\n",
    "               '>T PGR/ -<M': 'corpse',\n",
    "               'DWD ->X -2ms': \"brother's_uncle\",\n",
    "               'B <M/ -2ms': 'kinsmen'\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "edges_df = old_df\n",
    "\n",
    "Source = []\n",
    "Target = []\n",
    "\n",
    "for n, row in edges_df.iterrows():\n",
    "    source = row[0]\n",
    "    target = row[3]\n",
    "    \n",
    "    Source.append(label_gloss[source])\n",
    "    Target.append(label_gloss[target])\n",
    "    \n",
    "edges_df.insert(1, 'Source', Source)\n",
    "edges_df.insert(6, 'Target', Target)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>Source</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>new_rank_Actor</th>\n",
       "      <th>Target</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>new_rank_Undergoer</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>BN &gt;HRN</td>\n",
       "      <td>Aaron's_sons</td>\n",
       "      <td>690343</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690347</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>swing</td>\n",
       "      <td>440323</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>690383</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>Moses</td>\n",
       "      <td>MCH=</td>\n",
       "      <td>690384</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>speak</td>\n",
       "      <td>440335</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>Israelites</td>\n",
       "      <td>690397</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690399</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>approach</td>\n",
       "      <td>440341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>690402</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>Moses</td>\n",
       "      <td>MCH=</td>\n",
       "      <td>690403</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>speak</td>\n",
       "      <td>440342</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>Israelites</td>\n",
       "      <td>690415</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690417</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>approach</td>\n",
       "      <td>440347</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>472</th>\n",
       "      <td>DWD -&gt;X -2ms</td>\n",
       "      <td>brother's_uncle</td>\n",
       "      <td>691326</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>brother</td>\n",
       "      <td>&gt;X -2ms</td>\n",
       "      <td>68032</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>redeem</td>\n",
       "      <td>440637</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>473</th>\n",
       "      <td>L &gt;JC/</td>\n",
       "      <td>man</td>\n",
       "      <td>689041</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>handmaid</td>\n",
       "      <td>CPXH</td>\n",
       "      <td>689040</td>\n",
       "      <td>-2</td>\n",
       "      <td>-2</td>\n",
       "      <td>spend autumn</td>\n",
       "      <td>439885</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>474</th>\n",
       "      <td>MN &gt;JC/ -&gt;CH</td>\n",
       "      <td>husband</td>\n",
       "      <td>689652</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>widowed/expelled/defiled_woman</td>\n",
       "      <td>&gt;LMNH GRC XLL</td>\n",
       "      <td>689651</td>\n",
       "      <td>-2</td>\n",
       "      <td>-2</td>\n",
       "      <td>drive out</td>\n",
       "      <td>440088</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>475</th>\n",
       "      <td>3mp</td>\n",
       "      <td>witnesses</td>\n",
       "      <td>690660</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>blasphemer</td>\n",
       "      <td>BN &gt;CH</td>\n",
       "      <td>66980</td>\n",
       "      <td>-2</td>\n",
       "      <td>-2</td>\n",
       "      <td>settle</td>\n",
       "      <td>440424</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>476</th>\n",
       "      <td>3mp</td>\n",
       "      <td>witnesses</td>\n",
       "      <td>690675</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>blasphemer</td>\n",
       "      <td>BN &gt;CH</td>\n",
       "      <td>690677</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>support</td>\n",
       "      <td>440429</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>477 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                0           Source       1  2  new_rank_Actor  \\\n",
       "0         BN >HRN     Aaron's_sons  690343  5               5   \n",
       "1            JHWH             YHWH  690383  5               5   \n",
       "2        BN JFR>L       Israelites  690397  5               5   \n",
       "3            JHWH             YHWH  690402  5               5   \n",
       "4        BN JFR>L       Israelites  690415  5               5   \n",
       "..            ...              ...     ... ..             ...   \n",
       "472  DWD ->X -2ms  brother's_uncle  691326  5               5   \n",
       "473        L >JC/              man  689041  0               0   \n",
       "474  MN >JC/ ->CH          husband  689652  5               5   \n",
       "475           3mp        witnesses  690660  5               5   \n",
       "476           3mp        witnesses  690675  5               5   \n",
       "\n",
       "                             Target              3       4  5  \\\n",
       "0                              YHWH           JHWH  690347  0   \n",
       "1                             Moses           MCH=  690384 -1   \n",
       "2                              YHWH           JHWH  690399 -1   \n",
       "3                             Moses           MCH=  690403 -1   \n",
       "4                              YHWH           JHWH  690417 -1   \n",
       "..                              ...            ...     ... ..   \n",
       "472                         brother        >X -2ms   68032 -1   \n",
       "473                        handmaid           CPXH  689040 -2   \n",
       "474  widowed/expelled/defiled_woman  >LMNH GRC XLL  689651 -2   \n",
       "475                      blasphemer         BN >CH   66980 -2   \n",
       "476                      blasphemer         BN >CH  690677  0   \n",
       "\n",
       "     new_rank_Undergoer             6       7  \n",
       "0                     0         swing  440323  \n",
       "1                    -1         speak  440335  \n",
       "2                    -1      approach  440341  \n",
       "3                    -1         speak  440342  \n",
       "4                    -1      approach  440347  \n",
       "..                  ...           ...     ...  \n",
       "472                  -1        redeem  440637  \n",
       "473                  -2  spend autumn  439885  \n",
       "474                  -2     drive out  440088  \n",
       "475                  -2        settle  440424  \n",
       "476                   0       support  440429  \n",
       "\n",
       "[477 rows x 12 columns]"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "edges_df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The weight of the ties between the participants is defined as the difference between Actor and Undergoer Rank. We create time stamps to include original rank and new rank (new rank takes negations into account): "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "old_weight = (edges_df[2]-edges_df[5])**2\n",
    "new_weight = (edges_df['new_rank_Actor']-edges_df['new_rank_Undergoer'])**2\n",
    "\n",
    "#Insert Weight: calculated as the difference between the Actor rank and the Undergoer rank\n",
    "edges_df.insert(12, 'old_weight', old_weight)\n",
    "edges_df.insert(13, 'new_weight', new_weight)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>Source</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>new_rank_Actor</th>\n",
       "      <th>Target</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>new_rank_Undergoer</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "      <th>old_weight</th>\n",
       "      <th>new_weight</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>BN &gt;HRN</td>\n",
       "      <td>Aaron's_sons</td>\n",
       "      <td>690343</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690347</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>swing</td>\n",
       "      <td>440323</td>\n",
       "      <td>25</td>\n",
       "      <td>25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>690383</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>Moses</td>\n",
       "      <td>MCH=</td>\n",
       "      <td>690384</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>speak</td>\n",
       "      <td>440335</td>\n",
       "      <td>36</td>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>Israelites</td>\n",
       "      <td>690397</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690399</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>approach</td>\n",
       "      <td>440341</td>\n",
       "      <td>36</td>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>JHWH</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>690402</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>Moses</td>\n",
       "      <td>MCH=</td>\n",
       "      <td>690403</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>speak</td>\n",
       "      <td>440342</td>\n",
       "      <td>36</td>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>BN JFR&gt;L</td>\n",
       "      <td>Israelites</td>\n",
       "      <td>690415</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>YHWH</td>\n",
       "      <td>JHWH</td>\n",
       "      <td>690417</td>\n",
       "      <td>-1</td>\n",
       "      <td>-1</td>\n",
       "      <td>approach</td>\n",
       "      <td>440347</td>\n",
       "      <td>36</td>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          0        Source       1  2  new_rank_Actor Target     3       4  5  \\\n",
       "0   BN >HRN  Aaron's_sons  690343  5               5   YHWH  JHWH  690347  0   \n",
       "1      JHWH          YHWH  690383  5               5  Moses  MCH=  690384 -1   \n",
       "2  BN JFR>L    Israelites  690397  5               5   YHWH  JHWH  690399 -1   \n",
       "3      JHWH          YHWH  690402  5               5  Moses  MCH=  690403 -1   \n",
       "4  BN JFR>L    Israelites  690415  5               5   YHWH  JHWH  690417 -1   \n",
       "\n",
       "   new_rank_Undergoer         6       7  old_weight  new_weight  \n",
       "0                   0     swing  440323          25          25  \n",
       "1                  -1     speak  440335          36          36  \n",
       "2                  -1  approach  440341          36          36  \n",
       "3                  -1     speak  440342          36          36  \n",
       "4                  -1  approach  440347          36          36  "
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "edges_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We produce two files, one for dynamic networks and one for static networks:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "static = edges_df[['Source','new_rank_Actor','Target','new_rank_Undergoer',6,'new_weight',7]]\n",
    "static.columns = ['Source','Source_agency','Target','Target_agency','Label','Weight','Clause']\n",
    "\n",
    "#Export\n",
    "static.to_excel('Lev17-26.edges.Static.xlsx', index=None)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4.c Compare with older datasets\n",
    "\n",
    "For the sake of consistency, it is possible to easily compare the changes that are made in new models in comparison to old ones. This helps to update the data without going through a manual validation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_old = pd.read_excel('Lev17-26.edges.Static_Old.xlsx')\n",
    "data_new = pd.read_excel('Lev17-26.edges.Static.xlsx')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_new.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "len(data_new)-len(data_old)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### i. Check if edges have been removed or added"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "review_edges1 = []\n",
    "review_edges2 = []\n",
    "\n",
    "for n, row in data_new.iterrows():\n",
    "    if row.Clause in list(data_old.Clause):\n",
    "        subset_old = data_old[data_old.Clause == row.Clause]\n",
    "        match = False\n",
    "        for n1, row1 in subset_old.iterrows():\n",
    "            if row1.Source_label == row.Source_label and row1.Target_label == row.Target_label and row1.Label == row.Label:\n",
    "                match = True\n",
    "        if not match:\n",
    "            review_edges1.append(row.Clause)        \n",
    "    else:\n",
    "        review_edges1.append(row.Clause) #Clause is added in new dataset\n",
    "        \n",
    "for n, row in data_old.iterrows():\n",
    "    if row.Clause in list(data_old.Clause):\n",
    "        subset_new = data_new[data_new.Clause == row.Clause]\n",
    "        match = False\n",
    "        for n1, row1 in subset_new.iterrows():\n",
    "            if row1.Source_label == row.Source_label and row1.Target_label == row.Target_label and row1.Label == row.Label:\n",
    "                match = True\n",
    "        if not match:\n",
    "            review_edges2.append(row.Clause)        \n",
    "    else:\n",
    "        review_edges2.append(row.Clause) #Clause is added in new dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#review_edges1\n",
    "#review_edges2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### ii. Check if identical edges have same weight"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "review_edges3 = []\n",
    "\n",
    "for n, row in data_new.iterrows():\n",
    "    if row.Clause in list(data_old.Clause):\n",
    "        subset_old = data_old[data_old.Clause == row.Clause]\n",
    "        match = False\n",
    "        for n1, row1 in subset_old.iterrows():\n",
    "            if row1.Source_label == row.Source_label and row1.Target_label == row.Target_label and row1.Label == row.Label and row1.Weight == row.Weight:\n",
    "                match = True\n",
    "        if not match:\n",
    "            review_edges3.append(row.Clause)        \n",
    "    else:\n",
    "        review_edges3.append(row.Clause) #Clause is added in new dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "review_edges3 = [e for e in review_edges3 if e not in review_edges1 and e not in review_edges2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "review_edges3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Social Network Analysis\n",
    "\n",
    "The network model can now be explored with SNA-tools, in this case NetworkX.\n",
    "\n",
    "### 5.a Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = pd.read_excel('Lev17-26.edges.Static.xlsx')\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "G = nx.MultiGraph()\n",
    "\n",
    "for n, row in data.iterrows():  \n",
    "    G.add_edge(row.Source_label, row.Target_label)\n",
    "    \n",
    "pos = { i : (random.random(), random.random()) for i in G.nodes()}\n",
    "l = forceatlas2.forceatlas2_networkx_layout(G, pos, niter=2000, gravity=30, scalingRatio=2.0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "weight = collections.Counter(G.edges())\n",
    "\n",
    "for u, v, d in G.edges(data=True):\n",
    "    d['weight'] = weight[u, v]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize = (15,15))\n",
    "\n",
    "nx.draw_networkx(G, l, node_color='violet', node_size=[n[1]*10 for n in G.degree()], \n",
    "                 edge_color='grey', width=[d['weight']/3 for _, _, d in G.edges(data=True)])\n",
    "\n",
    "plt.axis('off')\n",
    "plt.margins(x=0.1, y=0.1)\n",
    "\n",
    "plt.savefig('screenshots/Leviticus_SNA.png', dpi=500)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Number of nodes and edges:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f'Nodes: {len(G.nodes())}\\nEdges: {len(G.edges())}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Having created the edges and computed a multiple directed graph (MultiDiGraph), we can now explore the resulting network. We will begin with a general inspection:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5.b Cohesion and network density\n",
    "\n",
    "One of the simplest measures of cohession (\"knittedness\") is probably density. Density is simply the number of ties in the network proportional to the possible number of ties."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nx.density(G)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Density is sensitive to the size of the network, and large networks tend to have lower density than small networks, simply because it is more realistic for a member of a small network to be connected with most of the remaining participants than in a large network.\n",
    "\n",
    "Therefore, another approach is average degree:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "degree = G.degree()\n",
    "sum_degree = sum(dict(degree).values())\n",
    "print(f'Average degree: {sum_degree/len(G.nodes())}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "G = nx.MultiDiGraph()\n",
    "\n",
    "for n, row in data.iterrows():  \n",
    "    G.add_edge(row.Source_label, row.Target_label)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "outdegree_sequence = collections.Counter(sorted([d for n, d in G.out_degree()], reverse=True))\n",
    "indegree_sequence = collections.Counter(sorted([d for n, d in G.in_degree()], reverse=True))\n",
    "\n",
    "outdegree_df = pd.DataFrame(outdegree_sequence, index=[0]).T\n",
    "indegree_df = pd.DataFrame([indegree_sequence]).T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "degree_df = pd.concat([indegree_df, outdegree_df], axis=1, sort=False)\n",
    "degree_df.columns = ['indegree','outdegree']\n",
    "degree_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(figsize=(15,7))\n",
    "\n",
    "plt.bar(degree_df.index, degree_df.indegree, width=0.33)\n",
    "plt.bar(degree_df.index+0.33, degree_df.outdegree, color='tomato', width=0.33)\n",
    "\n",
    "ax.legend(labels=['indegree', 'outdegree'], fontsize=14)\n",
    "plt.ylabel(\"Count\", size=14)\n",
    "plt.xlabel(\"Degree\", size=14)\n",
    "plt.xticks(size=12)\n",
    "plt.yticks(size=12)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cumulative:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "len(G.nodes())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "indegree_cum = [n/len(G.nodes())*100 for n in np.cumsum(degree_df.fillna(0).indegree)]\n",
    "outdegree_cum = [n/len(G.nodes())*100 for n in np.cumsum(degree_df.fillna(0).outdegree)]\n",
    "degree_df.insert(2, \"indegree_cum (%)\", indegree_cum)\n",
    "degree_df.insert(3, \"outdegree_cum (%)\", outdegree_cum)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "degree_df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Most connected participants:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "top_degree = sorted(dict(degree).items(), key=itemgetter(1), reverse=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A cummulative view:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cum_degree = pd.DataFrame(top_degree)\n",
    "cum_degree.columns = ['participant','degree']\n",
    "\n",
    "degree_cum = [n/(len((G.edges()))*2)*100 for n in np.cumsum(cum_degree.degree)]\n",
    "cum_degree.insert(2, \"degree_cum (%)\", degree_cum)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cum_degree.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Updated graph:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax1 = plt.subplots(figsize=(15,7))\n",
    "ax2 = ax1.twinx()\n",
    "\n",
    "ax1.bar(degree_df.index, degree_df.indegree, width=0.33)\n",
    "ax1.bar(degree_df.index+0.33, degree_df.outdegree, color='tomato', width=0.33)\n",
    "\n",
    "ax2.plot(degree_df.index, degree_df['indegree_cum (%)'], linestyle='--', alpha=0.5)\n",
    "ax2.plot(degree_df.index, degree_df['outdegree_cum (%)'], linestyle='--', alpha=0.5)\n",
    "\n",
    "ax1.legend(frameon=1, labels=['indegree', 'outdegree'], fontsize=14, facecolor='white', framealpha=1)\n",
    "ax1.set_ylabel(\"Count\", size=14)\n",
    "ax2.set_ylabel(\"Cumulative %\", size=14)\n",
    "ax1.set_xlabel(\"Degree\", size=14)\n",
    "plt.xticks(size=12)\n",
    "plt.yticks(size=12)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Inspect values:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "G.degree()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "G.out_degree()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Degree proportion of selected participants:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sel_part = sum(dict(G.degree(['YHWH', 'Moses','Israelites','sojourner','2ms','an_Israelite'])).values())\n",
    "\n",
    "print(f'{round(sel_part/sum(dict(G.degree()).values())*100, 2)}%')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5.c Reciprocity\n",
    "\n",
    "Reciprocity concerns whether an interaction from one actor to another is returned, or whether the relation is one-sided. A simple measure of reciprocity is to count the number of reciprocal ties and divide these by the total number of ties. For this analysis, we are not interested in the weights of the edges but simply the binary value (connected or not)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "digraph = nx.DiGraph()\n",
    "\n",
    "for n, row in data.iterrows():\n",
    "    digraph.add_edge(row.Source_label, row.Target_label)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nx.reciprocity(digraph)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "reci_df = pd.DataFrame([nx.reciprocity(digraph, digraph.nodes())]).T.sort_values(by=0, ascending=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(figsize=(15,5))\n",
    "\n",
    "plt.bar(reci_df.index, reci_df[0], width=0.33)\n",
    "plt.ylabel(\"fraction\", size=14)\n",
    "plt.xticks(size=11, rotation=45, ha='right')\n",
    "plt.yticks(size=12)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5.d Centrality\n",
    "\n",
    "We use 4 measures for measuring the centrality of individual nodes. That will give an image of core and periphery of the network. The four measures are Degree, Closeness, Betweenness, and Eigenvector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "indegree = nx.in_degree_centrality(digraph)\n",
    "outdegree = nx.out_degree_centrality(digraph)\n",
    "betweenness = nx.betweenness_centrality(digraph)\n",
    "pagerank = nx.pagerank(digraph)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "centrality = pd.DataFrame([indegree, outdegree, betweenness, pagerank]).T\n",
    "centrality.columns = ['indegree','outdegree','betweeness','pagerank']\n",
    "centrality"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Top five scores for centrality measures:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def top(measure, df=centrality):\n",
    "    return df.sort_values(by=measure, ascending=False)[measure][:10]\n",
    "\n",
    "fig, (ax1, ax2, ax3, ax4) = plt.subplots(1, 4, figsize=(15,5), sharey=True)\n",
    "\n",
    "ax1.bar(top('outdegree').index, top('outdegree'))\n",
    "ax1.set_title(\"Outdegree\", size=16)\n",
    "ax2.bar(top('indegree').index, top('indegree'))\n",
    "ax2.set_title(\"Indegree\", size=16)\n",
    "ax3.bar(top('betweeness').index, top('betweeness'))\n",
    "ax3.set_title(\"Betweenness\", size=16)\n",
    "ax4.bar(top('pagerank').index, top('pagerank'))\n",
    "ax4.set_title(\"PageRank\", size=16)\n",
    "\n",
    "for ax in fig.axes:\n",
    "    plt.sca(ax)\n",
    "    plt.xticks(rotation=45, ha='right', size=12)\n",
    "\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}