{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step - 3: Legacy of Slavery Certificate of Freedom - Data Visualization and Analysis\n", "### Computing the Legacy of Slavery: Applying Computational Thinking to an Archival Dataset\n", "* **Student Contributors:** K. Sarah Ostrach, Natalie Salive, Olivia Isaacs\n", "* **Faculty Mentor:** Richard Marciano\n", "* **Community Mentor:** Ryan Cox (Maryland State Archives)\n", "* **Source Available:** https://github.com/cases-umd/Legacy-of-Slavery\n", "* **License:** [Creative Commons - Attribution 4.0 Intl](https://creativecommons.org/licenses/by/4.0/)\n", "* [Lesson Plan for Instructors](./lesson-plan.ipynb)\n", "* **Related Publications:**\n", " * **IEEE Big Data 2019 CAS Workshop:** [A Case Study in Creating Transparency in Using Cultural Big Data: The Legacy of Slavery Project](https://dcicblog.umd.edu/cas/wp-content/uploads/sites/13/2018/12/12.Cox_-2.pdf)\n", "* **More Information:**\n", " * **SAA Outlook March/April 2019:** [Turning Data into People in Maryland's Slave Records](https://twitter.com/archivists_org/status/1116132520255479809)\n", "\n", "We organized the data preparation step around [David Weintrop’s model of computational thinking](https://link.springer.com/content/pdf/10.1007%2Fs10956-015-9581-5.pdf) and documented it using a [questionnaire](TNA_Questionnaire.ipynb) developed by The National Archives, London, UK. 
\n", "\n", "\n", "\n", "### **C**omputational Thinking Practices\n", "* Data Practices\n", " * Visualizing Data\n", " * Manipulating Data\n", "* Systems Thinking Practices\n", " * Thinking in Levels\n", "\n", "### **E**thics and Values Considerations\n", " * Historical and Cultural Context Based Exploration and Cleaning\n", " * Understanding the sensitivity of the data\n", "\n", "### **A**rchival Practices\n", " * Digital Records and Access Systems\n", "\n", "### Learning Goals\n", "A step-by-step understanding of using computational thinking practices on a digitally archived Maryland State Archives Legacy of Slavery dataset collection." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import networkx\n", "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " DataID DataItem County Owner_FirstName Owner_LastName \\\n", "0 AR7-46 1 AA Ann Ailsworth \n", "1 AR7-46 2 AA Ann Ailsworth \n", "2 AR7-46 3 AA Ann Ailsworth \n", "3 AR7-46 4 AA William Alexander \n", "4 AR7-46 5 AA Thomas Allen \n", "5 AR7-46 6 AA Thomas Allen \n", "6 AR7-46 7 AA James Alleson \n", "7 AR7-46 8 AA Mary Alwell \n", "8 AR7-46 9 AA Mary Armiger \n", "9 AR7-46 10 AA Mary Atcock \n", "\n", " Witness Date Freed_FirstName Freed_LastName Alias ... \\\n", "0 NaN None Keziah Cromwell NaN ... \n", "1 Zachariah Duvall 1811-06-24 Resiah Cromwell NaN ... \n", "2 Jenifer Duvall 1811-06-24 Kesiah Cromwell NaN ... \n", "3 NaN 1815-03-28 Handy McCeomey NaN ... \n", "4 NaN 1837-07-10 Nancy Ennis NaN ... \n", "5 NaN 1837-08-03 Jim Sharpe NaN ... \n", "6 NaN 1826-10-28 Belly NaN NaN ... \n", "7 NaN 1844-11-08 Howard Davis NaN ... \n", "8 NaN 1819-01-27 Abigail NaN NaN ... \n", "9 Jacob Franklin, Jr. 1812-12-30 Ned NaN NaN ... 
\n", "\n", " Page Entry DatasetName \\\n", "0 42686.0 12.0 FF \n", "1 24.0 3.0 FF \n", "2 NaN NaN FF \n", "3 50.0 2.0 FF \n", "4 257.0 1.0 FF \n", "5 257.0 2.0 FF \n", "6 242.0 1.0 FF \n", "7 372.0 1.0 FF \n", "8 126.0 2.0 FF \n", "9 31.0 3.0 FF \n", "\n", " Notes isWorking isError \\\n", "0 NaN 0 0 \n", "1 NaN 0 0 \n", "2 Freed by will of Mrs. Ann Ailsworth. 0 0 \n", "3 Freed by manumission, dated 27 March 1815. Rai... 0 0 \n", "4 Freed by petition to Anne Arundel County Court... 0 0 \n", "5 Freed by petition to Anne Arundel County Court... 0 0 \n", "6 Freed by manumission, dated 28 Oct 1826. Raise... 0 0 \n", "7 son of Nelly. Freed by manumission, dated 12 A... 0 0 \n", "8 along with Richard G. Stetton. Freed by manumi... 0 0 \n", "9 NaN 0 0 \n", "\n", " ChangeDate CreateDate DateFormatted Height_Inches \n", "0 39:20.3 39:20.3 NaN 63.00 \n", "1 39:20.3 39:20.3 1811-06-24 63.00 \n", "2 39:20.3 39:20.3 1811-06-24 63.00 \n", "3 39:20.3 39:20.3 1815-03-28 67.75 \n", "4 39:20.3 39:20.3 1837-07-10 57.50 \n", "5 39:20.3 39:20.3 1837-08-03 61.50 \n", "6 39:20.3 39:20.3 1826-10-28 61.50 \n", "7 39:20.3 39:20.3 1844-11-08 66.50 \n", "8 39:20.3 39:20.3 1819-01-27 61.00 \n", "9 39:20.3 39:20.3 1812-12-30 66.25 \n", "\n", "[10 rows x 30 columns]\n" ] } ], "source": [ "# Re-import the CSV saved at the end of the previous step (Step 2).\n", "# A forward slash keeps the relative path portable across operating systems.\n", "df = pd.read_csv(\"Datasets/LoS_Prep_Output.csv\")\n", "print(df.head(10))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "from bokeh.io import output_notebook, show, save" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Standard plotly imports\n", "# import plotly.chart-studio as py\n", "import plotly.graph_objs as go\n", 
"from plotly.offline import iplot, init_notebook_mode\n", "# Using plotly + cufflinks in offline mode\n", "import cufflinks\n", "cufflinks.go_offline(connected=True)\n", "init_notebook_mode(connected=True)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": "\n(function(root) {\n function now() {\n return new Date();\n }\n\n var force = true;\n\n if (typeof root._bokeh_onload_callbacks === \"undefined\" || force === true) {\n root._bokeh_onload_callbacks = [];\n root._bokeh_is_loading = undefined;\n }\n\n var JS_MIME_TYPE = 'application/javascript';\n var HTML_MIME_TYPE = 'text/html';\n var EXEC_MIME_TYPE = 'application/vnd.bokehjs_exec.v0+json';\n var CLASS_NAME = 'output_bokeh rendered_html';\n\n /**\n * Render data to the DOM node\n */\n function render(props, node) {\n var script = document.createElement(\"script\");\n node.appendChild(script);\n }\n\n /**\n * Handle when an output is cleared or removed\n */\n function handleClearOutput(event, handle) {\n var cell = handle.cell;\n\n var id = cell.output_area._bokeh_element_id;\n var server_id = cell.output_area._bokeh_server_id;\n // Clean up Bokeh references\n if (id != null && id in Bokeh.index) {\n Bokeh.index[id].model.document.clear();\n delete Bokeh.index[id];\n }\n\n if (server_id !== undefined) {\n // Clean up Bokeh references\n var cmd = \"from bokeh.io.state import curstate; print(curstate().uuid_to_server['\" + server_id + \"'].get_sessions()[0].document.roots[0]._id)\";\n cell.notebook.kernel.execute(cmd, {\n iopub: {\n output: function(msg) {\n var id = msg.content.text.trim();\n if (id in Bokeh.index) {\n Bokeh.index[id].model.document.clear();\n delete Bokeh.index[id];\n }\n }\n }\n });\n // Destroy server and session\n var cmd = \"import bokeh.io.notebook as ion; ion.destroy_server('\" + server_id + \"')\";\n cell.notebook.kernel.execute(cmd);\n }\n }\n\n /**\n * Handle when a new output is added\n */\n function handleAddOutput(event, handle) {\n var output_area = handle.output_area;\n var output = handle.output;\n\n // limit handleAddOutput to display_data with EXEC_MIME_TYPE content only\n if ((output.output_type != \"display_data\") || 
(!Object.prototype.hasOwnProperty.call(output.data, EXEC_MIME_TYPE))) {\n return\n }\n\n var toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n\n if (output.metadata[EXEC_MIME_TYPE][\"id\"] !== undefined) {\n toinsert[toinsert.length - 1].firstChild.textContent = output.data[JS_MIME_TYPE];\n // store reference to embed id on output_area\n output_area._bokeh_element_id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n }\n if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n var bk_div = document.createElement(\"div\");\n bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n var script_attrs = bk_div.children[0].attributes;\n for (var i = 0; i < script_attrs.length; i++) {\n toinsert[toinsert.length - 1].firstChild.setAttribute(script_attrs[i].name, script_attrs[i].value);\n toinsert[toinsert.length - 1].firstChild.textContent = bk_div.children[0].textContent\n }\n // store reference to server id on output_area\n output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n }\n }\n\n function register_renderer(events, OutputArea) {\n\n function append_mime(data, metadata, element) {\n // create a DOM node to render to\n var toinsert = this.create_output_subarea(\n metadata,\n CLASS_NAME,\n EXEC_MIME_TYPE\n );\n this.keyboard_manager.register_events(toinsert);\n // Render to node\n var props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n render(props, toinsert[toinsert.length - 1]);\n element.append(toinsert);\n return toinsert\n }\n\n /* Handle when an output is cleared or removed */\n events.on('clear_output.CodeCell', handleClearOutput);\n events.on('delete.Cell', handleClearOutput);\n\n /* Handle when a new output is added */\n events.on('output_added.OutputArea', handleAddOutput);\n\n /**\n * Register the mime type and append_mime function with output_area\n */\n OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n /* Is output safe? 
*/\n safe: true,\n /* Index of renderer in `output_area.display_order` */\n index: 0\n });\n }\n\n // register the mime type if in Jupyter Notebook environment and previously unregistered\n if (root.Jupyter !== undefined) {\n var events = require('base/js/events');\n var OutputArea = require('notebook/js/outputarea').OutputArea;\n\n if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n register_renderer(events, OutputArea);\n }\n }\n\n \n if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n root._bokeh_timeout = Date.now() + 5000;\n root._bokeh_failed_load = false;\n }\n\n var NB_LOAD_WARNING = {'data': {'text/html':\n \"\\n\"+\n \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n \"
\\n\"+\n \"\\n\"+\n \"from bokeh.resources import INLINE\\n\"+\n \"output_notebook(resources=INLINE)\\n\"+\n \"\\n\"+\n \"