{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Volumes 26-27 (1 July 1944 - 23 July 1945): “Security Bonds”\n",
"by CA, BM, EO \n",
"\n",
"## Data Overview, Telling Our Story\n",
"This computational analysis of Volumes 26 and 27 the Henry Morgenthau, Jr. Press Conferences collection will find useful data points to manipulate, extract them for refinement, and use them to visually express the source content, as well as generate avenues of expansive research and analysis. The collection is primarily the transcripts of Henry Morgenthau’s press conferences as Secretary of the Treasury under Franklin Roosevelt. Volumes 26 and 27 are the final volumes, covering July 1944 to July 1945, as Morgenthau resigned after Roosevelt’s death.\n",
"\n",
"Our first goal was identifying the most prominent subjects and computing the relevant text into quantifiable keyword data, followed by finding supplemental data to expand on the context. Each volume begins with a table of contents listed by subject instead of chronology, providing a convenient means of identifying a pool of the most important topics. Because the sources are microfilm copies scanned to PDF files, the full text had to be extracted before specific usable data points could be extracted. That was difficult as they were low-resolution scans of originals which appear to have been of inconsistent physical condition (EO). It was accomplished by running the PDF files through the Docdrop OCR tool, then transferring it into a Microsoft Word document. The document was then manually proofread for errors in the OCR processing to optimize searchability. With a reliably searchable text, keyword data could then be measured, and was manipulated by **coding** a Python script in Google Colab to visualize it in a word cloud – with war bonds and Bretton Woods standing out – as well as using supplemental data to visually chart the goals and achievements of **each war bond drive**. Findings from this story allow for insight into future projects based on gaps located (CA). Lastly, points of consideration regarding topical or archival ethics are explored (EO).\n",
"\n",
"*Morgenthau's Legacy: Security Bonds & Securing Post-War World Finance*. The years covered in Volumes 26-27 represent a transitional period for the United States both intra- and internationally. The Roosevelt administration came to an end with the President’s death in April of 1945, and government representatives suddenly had to adjust to the policies of his successor, President Harry S. Truman (Duke University Libraries, n.d.). The end of the war in Europe followed just one month later, when Germany announced its unconditional surrender to the Allies (Duke University Libraries, n.d.). Even though the war with Japan was far from over, the former events shifted the mindset of the U.S. federal government (at least in part) and prompted a forward-looking legislative campaign to reinstate the international economy. Our datasets reflect that through their steady references to the Bretton Woods System and the Morgenthau Plan (Morgenthau, 1944-1945). For a more detailed description of those plans, see the below Modeling: Visualization section (BM)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dataset Exploration related to Scale and Levels within Collection \r\n",
"Each press conference transcript begins with a title, accompanied by the time and date. From there, lines of dialogue assigned to Morgenthau are either designated with the tag “H.M.JR” or “A,” while the reporters posing questions are tagged merely as “Q.” Occasionally, other figures participate in these conferences, but they are always denoted by their name (i.e. “Mr. O’Connell” or “Mr. D.W. Bell\").\r\n",
"\r\n",
"In addition to the transcripts, there are three other documents worth noting. The first occurs at the end of Volume 26 and contains a report on the appointment of E. F. Bartelt to the position of Fiscal Assistant Secretary of the Treasury (Morgenthau, 1944-1945). The second and third documents are linked to Morgenthau’s resignation. One is titled “Digest of Report to Congress,” and gives a broad explanation as to why he had chosen to end his work with the federal government. The other is a summary report he wrote before he was replaced, to be presented by the next Secretary of the Treasury. Morgenthau characterizes it as an annual report but includes information dating all the way back to 1934, when he was first appointed (Morgenthau, 1945) (BM).\r\n",
"\r\n",
"We consider three **research questions**:\r\n",
"\r\n",
"1. How should the subjects addressed in this data be quantified?\r\n",
"2. By translating these subjects into keywords, is it possible to see which issues were the most pressing to Morgenthau, the government, and the public during this time period?\r\n",
"3. Can supplemental data help place these datasets within the larger context of WWII? (BM)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Cleaning and Preparation\r\n",
"After identifying recurring subjects in the Morgenthau datasets by examining the volumes’ tables of contents, the group populated a selection pool of keywords. In doing so, it was possible to transform the legislative issues of this time into quantitative data which could then be more effectively visualized. The selection pool for this study includes the following 13 **keywords**: Tax(es), International Monetary Fund, Reconstruction, Gold, Silver, Bonds, Black market, Resignation, International Bank, Inflation, Currency, Morgenthau Plan, & Truman (BM).\r\n",
"\r\n",
"## Modeling: Computation and Transformation\r\n",
"Using Docdrop’s OCR, we transcribed Volume 26 (Morgenthau, 1944-1945) and Volume 27 (Morgenthau, 1945) of Henry Morgenthau Jr.’s Press Conferences into a Word file. From there, we meticulously went through the transcription from the OCR and fixed any spelling errors present that would hinder data collection. This was to ensure that we received an accurate reading when using the find function on Microsoft Word to compile the numerical amount each time a key word or phrase was used in the document. These key words and phrases were gathered using the Table of Contents to discover prominent themes within the document.\r\n",
"\r\n",
"Once receiving these values after utilizing the search function of Microsoft Word, we began writing a Python script that would allow this data to be properly visualized by a reader. The code used allows readers to see the prevalent words that exist within the document and thus the “weight” that these words hold within the document. The more frequently a word was used within the document, the larger it would appear in the word cloud, which helps viewers understand the prevalence of these words and their importance during this period. That process was done in **Google Colab**, a resource with a built in Python reader, as well as a safe place to host additional add-ons to Python that allows users access to tools such as graphs where data visualization can take place (CA).\r\n",
"\r\n",
"## Modeling: Visualization\r\n",
"After the word cloud was created (**Figure 1**), it was determined that it may be important to underscore the importance of the War Bond Drives. To accomplish that, data was used from Duke University Libraries, which had data regarding how much money each War Bond Drive was projected to make versus the actual amount of money received (Duke University Libraries, 2011). The task required using Google Colab and Python once more, and the result was a side-by-side bar graph (**Figure 2**). This graph highlights how American citizens often surpassed the goals of a War Bond Drive, which in turn allowed the government to increase spending during the War. One thing of note from this process was that a data point was missing for the Sixth Drive, and as such one of the data bars is missing.\r\n",
"\r\n",
"
\r\n",
"
\r\n",
"\r\n",
"