{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading Punchcards\n", "### Using Simple Image-Processing Techniques to Interpret Punchcard Images\n", "* Contributors: Gregory N. Jansen\n", "* Source Available: https://github.com/cases-umd/reading-punchcards\n", "* License: [Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)\n", "* [Lesson Plan for Instructors (coming soon)](./lesson-plan.ipynb)\n", "* Attribution: This work is based in part upon [blog posts and GPL licensed code](http://codeincluded.blogspot.com/2012/07/punchcard-reader-software.html) by Michael Hamilton (2011)\n", "\n", "## Introduction\n", "Until the mid-1970s, most computer data and programs were stored on punched paper cards. These were fed into\n", "machines that read the data mechanically into the computer. A common format was the 80-column IBM punched\n", "card, typically prepared on a keypunch such as the IBM 029. Other kinds of punched cards have been used as ballots in\n", "elections (famously in Florida with \"hanging chads\").\n", "\n", "[This video](https://www.youtube.com/watch?v=oaVwzYN6BP4&list=PLTtdqWIuOb6ekBQWfskfcCFJPyvjmZkKf) explains the process of preparing punchcards for computer work in the 1960s and 1970s.\n", "\n", "### Further Information\n", "\n", "To learn more about punchcards, please see the Columbia Computing History Project [website](http://www.columbia.edu/cu/computinghistory/cards.html).\n", "\n", "## Objectives\n", "The goal of this CASES project is to demonstrate some simple image-processing techniques that can be used to extract data from a known legacy format. In the data preparation notebook we will explore how to detect the edges of a document and crop it, how to normalize for white or black backgrounds, and how to flip and/or rotate an image to achieve a known orientation. 
Then we will apply some pre-existing Python code to these normalized images to read the data on them.\n", "\n", "## Learning Goals\n", "* Computational Practices:\n", " * [Collecting Data](#collecting_data)\n", " * [Modeling Data](#modeling_Data)\n", "* Archival Practices:\n", " * ??" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Software and Tools\n", "\n", "* [Python 2.7.x](http://python.org/tool1)\n", "* [Pillow Image Processing Library](http://pillow.readthedocs.io/)\n", "* [Punchcards Python Module](https://github.com/UMD-DCIC/punchcards)\n", "\n", "## Installing\n", "These notebooks use a `requirements.txt` file to install the punchcards module and its dependencies in your JupyterLab kernel.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Punchcard Images\n", "This notebook project contains a small sample of punchcard images obtained from the United States National Archives and Records Administration. You can find these in the `cards` subfolder.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Notebooks\n", "1. [Data Overview](notebook-1.ipynb)\n", "1. [Image Preparation](notebook-2.ipynb)\n", "1. [Data Extraction](notebook-3.ipynb)\n", "\n", "Next: [Data Overview](notebook-1.ipynb)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "cases_info": { "computational_practices": [ "cas:dp_collecting_data", "cas:dp_creating_data" ] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }