{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Calculating and Comparing Product Production Time for VS One-Health Products\n", "
\n", "Sheldon Waugh MSc, PhD
\n", "Epidemiologist
\n", "One-Health Division
\n", "Veterinary Services and Public Health Sanitation Directorate
\n", "US Army Public Health Center
\n", "\n", "### This Python notebook is a workable, scalable example of how to build a notebook that cleans and analyzes data in a transparent way.\n", "We have an Excel workbook that tracks the progress of all One-Health/Veterinary Services products, including posters, brochures and newsletters. \n", "Looking at the initial structure of this spreadsheet, some products are missing dates for certain stages. This may be because certain products do not need certain levels of approval. It may therefore be advantageous to split the data by product type and then look at the time spent on each stage for each type.
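
As a minimal sketch of that idea, the cell below drops incomplete records, parses the date strings and takes the median elapsed time per product type. The rows are invented for illustration and are not records from the workbook; the column names mirror the spreadsheet headers, except that the real 'Submitted to CRC ' header carries a trailing space, which is left off here.

```python
import pandas as pd

# Toy rows invented for illustration; they are not records from the workbook.
toy = pd.DataFrame({
    'Type': ['Brochure', 'Brochure', 'Newsletter', 'Newsletter', 'Poster'],
    'Submitted to CRC': ['6/1/2016', '6/3/2016', '4/28/2016', '6/3/2016', None],
    'CRC Review Complete': ['6/7/2016', '6/13/2016', '5/9/2016', '6/7/2016', '7/1/2016'],
})
date_cols = ['Submitted to CRC', 'CRC Review Complete']

# Drop incomplete records, then convert the date strings to datetimes.
complete = toy.dropna(subset=date_cols).copy()
for col in date_cols:
    complete[col] = pd.to_datetime(complete[col], format='%m/%d/%Y')

# Elapsed days per record, then the median per product type.
complete['CRC_days'] = (complete['CRC Review Complete'] - complete['Submitted to CRC']).dt.days
median_by_type = complete.groupby('Type')['CRC_days'].median()
print(median_by_type)
```

The poster row has no submission date, so it drops out before any elapsed time is computed.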
\n", "For my sanity, I will drop the incomplete records." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's load the .csv into Python." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>Type</th><th>Draft Review</th><th>SME Content Review</th><th>PMD Content Review</th><th>VID Design Work/Edits</th><th>Submitted to CRC</th><th>CRC Review Complete</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>5/17/2016</td><td>5/23/2016</td><td>6/1/2016</td><td>6/7/2016</td></tr>\n",
"<tr><th>1</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>5/17/2016</td><td>5/23/2016</td><td>6/3/2016</td><td>6/13/2016</td></tr>\n",
"<tr><th>2</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>5/17/2016</td><td>10/20/2016</td><td>11/1/2016</td><td>12/20/2016</td></tr>\n",
"<tr><th>3</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>7/5/2016</td><td>7/5/2016</td><td>8/19/2016</td><td>9/2/2016</td></tr>\n",
"<tr><th>4</th><td>Brochure</td><td>5/9/2017</td><td>5/16/2017</td><td>6/6/2017</td><td>6/23/2017</td><td>8/4/2017</td><td>10/10/2017</td></tr>\n",
"</tbody>\n",
"</table>" ], "text/plain": [ " Type Draft Review SME Content Review PMD Content Review \\\n", "0 Brochure NaN NaN 5/17/2016 \n", "1 Brochure NaN NaN 5/17/2016 \n", "2 Brochure NaN NaN 5/17/2016 \n", "3 Brochure NaN NaN 7/5/2016 \n", "4 Brochure 5/9/2017 5/16/2017 6/6/2017 \n", "\n", " VID Design Work/Edits Submitted to CRC CRC Review Complete \n", "0 5/23/2016 6/1/2016 6/7/2016 \n", "1 5/23/2016 6/3/2016 6/13/2016 \n", "2 10/20/2016 11/1/2016 12/20/2016 \n", "3 7/5/2016 8/19/2016 9/2/2016 \n", "4 6/23/2017 8/4/2017 10/10/2017 " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd # import the module and alias it as pd\n", "\n", "# Raw string so the backslashes in the Windows path are not read as escape sequences.\n", "# Note: parse_dates=True only tries to parse the index, so the date columns load as strings.\n", "timedateVOH_data = pd.read_csv(r'C:\\Users\\waugh\\Dropbox\\Documents\\One_Health\\VS_Product_Analysis_dates.csv', parse_dates=True)\n", "timedateVOH_data.head()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>Type</th><th>Draft Review</th><th>SME Content Review</th><th>PMD Content Review</th><th>VID Design Work/Edits</th><th>Submitted to CRC</th><th>CRC Review Complete</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>count</th><td>81</td><td>52</td><td>48</td><td>62</td><td>70</td><td>75</td><td>69</td></tr>\n",
"<tr><th>unique</th><td>3</td><td>36</td><td>29</td><td>31</td><td>33</td><td>33</td><td>21</td></tr>\n",
"<tr><th>top</th><td>Newsletter</td><td>6/1/2017</td><td>4/20/2016</td><td>7/11/2016</td><td>4/20/2016</td><td>8/29/2017</td><td>8/15/2016</td></tr>\n",
"<tr><th>freq</th><td>41</td><td>4</td><td>5</td><td>8</td><td>6</td><td>8</td><td>9</td></tr>\n",
"</tbody>\n",
"</table>" ], "text/plain": [ " Type Draft Review SME Content Review PMD Content Review \\\n", "count 81 52 48 62 \n", "unique 3 36 29 31 \n", "top Newsletter 6/1/2017 4/20/2016 7/11/2016 \n", "freq 41 4 5 8 \n", "\n", " VID Design Work/Edits Submitted to CRC CRC Review Complete \n", "count 70 75 69 \n", "unique 33 33 21 \n", "top 4/20/2016 8/29/2017 8/15/2016 \n", "freq 6 8 9 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "timedateVOH_data.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that the dates in the csv file were not read into Python as datetime objects: the summary reports string-style statistics such as unique and top. Without datetime values, we cannot compute the elapsed time between columns.\n", "\n", "Let's fix this for the CRC columns first, since a product wouldn't exist without CRC approval." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>Type</th><th>Draft Review</th><th>SME Content Review</th><th>PMD Content Review</th><th>VID Design Work/Edits</th><th>Submitted to CRC</th><th>CRC Review Complete</th><th>CRCtime</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>5/17/2016</td><td>5/23/2016</td><td>2016-06-01</td><td>2016-06-07</td><td>6 days</td></tr>\n",
"<tr><th>1</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>5/17/2016</td><td>5/23/2016</td><td>2016-06-03</td><td>2016-06-13</td><td>10 days</td></tr>\n",
"<tr><th>2</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>5/17/2016</td><td>10/20/2016</td><td>2016-11-01</td><td>2016-12-20</td><td>49 days</td></tr>\n",
"<tr><th>3</th><td>Brochure</td><td>NaN</td><td>NaN</td><td>7/5/2016</td><td>7/5/2016</td><td>2016-08-19</td><td>2016-09-02</td><td>14 days</td></tr>\n",
"<tr><th>4</th><td>Brochure</td><td>5/9/2017</td><td>5/16/2017</td><td>6/6/2017</td><td>6/23/2017</td><td>2017-08-04</td><td>2017-10-10</td><td>67 days</td></tr>\n",
"</tbody>\n",
"</table>" ], "text/plain": [ " Type Draft Review SME Content Review PMD Content Review \\\n", "0 Brochure NaN NaN 5/17/2016 \n", "1 Brochure NaN NaN 5/17/2016 \n", "2 Brochure NaN NaN 5/17/2016 \n", "3 Brochure NaN NaN 7/5/2016 \n", "4 Brochure 5/9/2017 5/16/2017 6/6/2017 \n", "\n", " VID Design Work/Edits Submitted to CRC CRC Review Complete CRCtime \n", "0 5/23/2016 2016-06-01 2016-06-07 6 days \n", "1 5/23/2016 2016-06-03 2016-06-13 10 days \n", "2 10/20/2016 2016-11-01 2016-12-20 49 days \n", "3 7/5/2016 2016-08-19 2016-09-02 14 days \n", "4 6/23/2017 2017-08-04 2017-10-10 67 days " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Keep only the records that have both CRC dates\n", "df_clean = timedateVOH_data[pd.notnull(timedateVOH_data['Submitted to CRC '])]\n", "df_clean_1 = df_clean[pd.notnull(df_clean['CRC Review Complete'])]\n", "df_clean_2 = df_clean_1.copy()\n", "\n", "#Convert the CRC date strings to datetime objects\n", "df_clean_2['CRC Review Complete'] = pd.to_datetime(df_clean_2['CRC Review Complete'], format = '%m/%d/%Y')\n", "df_clean_2['Submitted to CRC '] = pd.to_datetime(df_clean_2['Submitted to CRC '], format = '%m/%d/%Y')\n", "\n", "#Elapsed CRC review time\n", "df_clean_2['CRCtime'] = df_clean_2['CRC Review Complete'] - df_clean_2['Submitted to CRC ']\n", "\n", "df_clean_2.head()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the CRC columns are converted, we have the elapsed CRC review time!\n", "We see, however, that certain products don't have dates in every column. \n", "This probably means that certain products do not need certain approvals. \n", "So from now on, let's split the DataFrame by the two main One-Health products (Newsletters and Brochures)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>Type</th><th>Draft Review</th><th>SME Content Review</th><th>PMD Content Review</th><th>VID Design Work/Edits</th><th>Submitted to CRC</th><th>CRC Review Complete</th><th>CRCtime</th><th>VID_Edittime</th><th>PMD_Reviewtime</th><th>SME_Reviewtime</th><th>Draft_Reviewtime</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>40</th><td>Newsletter</td><td>2016-04-13</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-28</td><td>2016-05-09</td><td>11 days</td><td>8 days</td><td>0 days</td><td>0 days</td><td>7 days</td></tr>\n",
"<tr><th>41</th><td>Newsletter</td><td>2016-04-15</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-28</td><td>2016-05-09</td><td>11 days</td><td>8 days</td><td>0 days</td><td>0 days</td><td>5 days</td></tr>\n",
"<tr><th>42</th><td>Newsletter</td><td>2016-04-15</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-28</td><td>2016-05-09</td><td>11 days</td><td>8 days</td><td>0 days</td><td>0 days</td><td>5 days</td></tr>\n",
"<tr><th>43</th><td>Newsletter</td><td>2016-04-15</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-20</td><td>2016-04-28</td><td>2016-05-09</td><td>11 days</td><td>8 days</td><td>0 days</td><td>0 days</td><td>5 days</td></tr>\n",
"<tr><th>48</th><td>Newsletter</td><td>2016-05-17</td><td>2016-05-17</td><td>2016-05-17</td><td>2016-05-17</td><td>2016-06-03</td><td>2016-06-07</td><td>4 days</td><td>17 days</td><td>0 days</td><td>0 days</td><td>0 days</td></tr>\n",
"<tr><th>54</th><td>Newsletter</td><td>2016-07-01</td><td>2016-07-05</td><td>2016-07-11</td><td>2016-07-11</td><td>2016-07-22</td><td>2016-08-15</td><td>24 days</td><td>11 days</td><td>0 days</td><td>6 days</td><td>4 days</td></tr>\n",
"<tr><th>56</th><td>Newsletter</td><td>2016-06-27</td><td>2016-06-27</td><td>2016-07-11</td><td>2016-07-11</td><td>2016-07-22</td><td>2016-08-15</td><td>24 days</td><td>11 days</td><td>0 days</td><td>14 days</td><td>0 days</td></tr>\n",
"<tr><th>58</th><td>Newsletter</td><td>2016-11-17</td><td>2016-11-23</td><td>2016-11-28</td><td>2016-11-29</td><td>2017-01-27</td><td>2017-03-30</td><td>62 days</td><td>59 days</td><td>1 days</td><td>5 days</td><td>6 days</td></tr>\n",
"<tr><th>59</th><td>Newsletter</td><td>2016-11-08</td><td>2016-11-23</td><td>2016-11-28</td><td>2016-11-29</td><td>2017-01-27</td><td>2017-03-30</td><td>62 days</td><td>59 days</td><td>1 days</td><td>5 days</td><td>15 days</td></tr>\n",
"<tr><th>61</th><td>Newsletter</td><td>2017-02-25</td><td>2017-03-15</td><td>2017-03-15</td><td>2017-03-17</td><td>2017-03-17</td><td>2017-05-08</td><td>52 days</td><td>0 days</td><td>2 days</td><td>0 days</td><td>18 days</td></tr>\n",
"<tr><th>62</th><td>Newsletter</td><td>2017-02-10</td><td>2017-03-13</td><td>2017-03-15</td><td>2017-03-20</td><td>2017-04-04</td><td>2017-06-01</td><td>58 days</td><td>15 days</td><td>5 days</td><td>2 days</td><td>31 days</td></tr>\n",
"<tr><th>63</th><td>Newsletter</td><td>2017-02-10</td><td>2017-03-10</td><td>2017-03-13</td><td>2017-03-15</td><td>2017-04-04</td><td>2017-06-01</td><td>58 days</td><td>20 days</td><td>2 days</td><td>3 days</td><td>28 days</td></tr>\n",
"<tr><th>64</th><td>Newsletter</td><td>2017-03-01</td><td>2017-03-16</td><td>2017-03-16</td><td>2017-03-20</td><td>2017-04-04</td><td>2017-06-01</td><td>58 days</td><td>15 days</td><td>4 days</td><td>0 days</td><td>15 days</td></tr>\n",
"<tr><th>66</th><td>Newsletter</td><td>2017-06-01</td><td>2017-06-23</td><td>2017-06-29</td><td>2017-07-25</td><td>2017-08-29</td><td>2017-08-29</td><td>0 days</td><td>35 days</td><td>26 days</td><td>6 days</td><td>22 days</td></tr>\n",
"<tr><th>68</th><td>Newsletter</td><td>2017-06-01</td><td>2017-06-23</td><td>2017-06-28</td><td>2017-07-25</td><td>2017-08-29</td><td>2017-08-29</td><td>0 days</td><td>35 days</td><td>27 days</td><td>5 days</td><td>22 days</td></tr>\n",
"<tr><th>70</th><td>Newsletter</td><td>2016-07-01</td><td>2016-07-05</td><td>2016-07-11</td><td>2017-08-28</td><td>2017-08-29</td><td>2017-08-29</td><td>0 days</td><td>1 days</td><td>413 days</td><td>6 days</td><td>4 days</td></tr>\n",
"<tr><th>71</th><td>Newsletter</td><td>2016-06-07</td><td>2016-06-30</td><td>2016-07-11</td><td>2017-08-28</td><td>2017-08-29</td><td>2017-08-29</td><td>0 days</td><td>1 days</td><td>413 days</td><td>11 days</td><td>23 days</td></tr>\n",
"<tr><th>72</th><td>Newsletter</td><td>2016-06-27</td><td>2016-06-27</td><td>2016-07-11</td><td>2017-08-28</td><td>2017-08-29</td><td>2017-08-29</td><td>0 days</td><td>1 days</td><td>413 days</td><td>14 days</td><td>0 days</td></tr>\n",
"<tr><th>77</th><td>Newsletter</td><td>2018-08-15</td><td>2018-08-24</td><td>2018-08-24</td><td>2018-08-29</td><td>2018-08-30</td><td>2018-09-04</td><td>5 days</td><td>1 days</td><td>5 days</td><td>0 days</td><td>9 days</td></tr>\n",
"<tr><th>78</th><td>Newsletter</td><td>2018-08-19</td><td>2018-08-30</td><td>2018-08-29</td><td>2018-08-29</td><td>2018-08-30</td><td>2018-09-04</td><td>5 days</td><td>1 days</td><td>0 days</td><td>-1 days</td><td>11 days</td></tr>\n",
"<tr><th>79</th><td>Newsletter</td><td>2018-08-15</td><td>2018-08-24</td><td>2018-08-24</td><td>2018-08-29</td><td>2018-08-30</td><td>2018-09-04</td><td>5 days</td><td>1 days</td><td>5 days</td><td>0 days</td><td>9 days</td></tr>\n",
"<tr><th>80</th><td>Newsletter</td><td>2018-08-16</td><td>2018-08-23</td><td>2018-08-24</td><td>2018-08-29</td><td>2018-08-30</td><td>2018-09-04</td><td>5 days</td><td>1 days</td><td>5 days</td><td>1 days</td><td>7 days</td></tr>\n",
"</tbody>\n",
"</table>" ], "text/plain": [ " Type Draft Review SME Content Review PMD Content Review \\\n", "40 Newsletter 2016-04-13 2016-04-20 2016-04-20 \n", "41 Newsletter 2016-04-15 2016-04-20 2016-04-20 \n", "42 Newsletter 2016-04-15 2016-04-20 2016-04-20 \n", "43 Newsletter 2016-04-15 2016-04-20 2016-04-20 \n", "48 Newsletter 2016-05-17 2016-05-17 2016-05-17 \n", "54 Newsletter 2016-07-01 2016-07-05 2016-07-11 \n", "56 Newsletter 2016-06-27 2016-06-27 2016-07-11 \n", "58 Newsletter 2016-11-17 2016-11-23 2016-11-28 \n", "59 Newsletter 2016-11-08 2016-11-23 2016-11-28 \n", "61 Newsletter 2017-02-25 2017-03-15 2017-03-15 \n", "62 Newsletter 2017-02-10 2017-03-13 2017-03-15 \n", "63 Newsletter 2017-02-10 2017-03-10 2017-03-13 \n", "64 Newsletter 2017-03-01 2017-03-16 2017-03-16 \n", "66 Newsletter 2017-06-01 2017-06-23 2017-06-29 \n", "68 Newsletter 2017-06-01 2017-06-23 2017-06-28 \n", "70 Newsletter 2016-07-01 2016-07-05 2016-07-11 \n", "71 Newsletter 2016-06-07 2016-06-30 2016-07-11 \n", "72 Newsletter 2016-06-27 2016-06-27 2016-07-11 \n", "77 Newsletter 2018-08-15 2018-08-24 2018-08-24 \n", "78 Newsletter 2018-08-19 2018-08-30 2018-08-29 \n", "79 Newsletter 2018-08-15 2018-08-24 2018-08-24 \n", "80 Newsletter 2018-08-16 2018-08-23 2018-08-24 \n", "\n", " VID Design Work/Edits Submitted to CRC CRC Review Complete CRCtime \\\n", "40 2016-04-20 2016-04-28 2016-05-09 11 days \n", "41 2016-04-20 2016-04-28 2016-05-09 11 days \n", "42 2016-04-20 2016-04-28 2016-05-09 11 days \n", "43 2016-04-20 2016-04-28 2016-05-09 11 days \n", "48 2016-05-17 2016-06-03 2016-06-07 4 days \n", "54 2016-07-11 2016-07-22 2016-08-15 24 days \n", "56 2016-07-11 2016-07-22 2016-08-15 24 days \n", "58 2016-11-29 2017-01-27 2017-03-30 62 days \n", "59 2016-11-29 2017-01-27 2017-03-30 62 days \n", "61 2017-03-17 2017-03-17 2017-05-08 52 days \n", "62 2017-03-20 2017-04-04 2017-06-01 58 days \n", "63 2017-03-15 2017-04-04 2017-06-01 58 days \n", "64 2017-03-20 2017-04-04 2017-06-01 58 days \n", "66 2017-07-25 2017-08-29 2017-08-29 0 days \n", "68 2017-07-25 2017-08-29 2017-08-29 0 days \n", "70 2017-08-28 2017-08-29 2017-08-29 0 days \n", "71 2017-08-28 2017-08-29 2017-08-29 0 days \n", "72 2017-08-28 2017-08-29 2017-08-29 0 days \n", "77 2018-08-29 2018-08-30 2018-09-04 5 days \n", "78 2018-08-29 2018-08-30 2018-09-04 5 days \n", "79 2018-08-29 2018-08-30 2018-09-04 5 days \n", "80 2018-08-29 2018-08-30 2018-09-04 5 days \n", "\n", " VID_Edittime PMD_Reviewtime SME_Reviewtime Draft_Reviewtime \n", "40 8 days 0 days 0 days 7 days \n", "41 8 days 0 days 0 days 5 days \n", "42 8 days 0 days 0 days 5 days \n", "43 8 days 0 days 0 days 5 days \n", "48 17 days 0 days 0 days 0 days \n", "54 11 days 0 days 6 days 4 days \n", "56 11 days 0 days 14 days 0 days \n", "58 59 days 1 days 5 days 6 days \n", "59 59 days 1 days 5 days 15 days \n", "61 0 days 2 days 0 days 18 days \n", "62 15 days 5 days 2 days 31 days \n", "63 20 days 2 days 3 days 28 days \n", "64 15 days 4 days 0 days 15 days \n", "66 35 days 26 days 6 days 22 days \n", "68 35 days 27 days 5 days 22 days \n", "70 1 days 413 days 6 days 4 days \n", "71 1 days 413 days 11 days 23 days \n", "72 1 days 413 days 14 days 0 days \n", "77 1 days 5 days 0 days 9 days \n", "78 1 days 0 days -1 days 11 days \n", "79 1 days 5 days 0 days 9 days \n", "80 1 days 5 days 1 days 7 days " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Create the dataframe subset\n", "df_clean_2_News = df_clean_2.loc[df_clean_2['Type'] == 'Newsletter']\n", "df_clean_2_News.head()\n", "\n", "#Erase rows with NaN values\n", "df_clean_3_News = df_clean_2_News[pd.notnull(df_clean_2_News['VID Design Work/Edits'])]\n", "df_clean_4_News = df_clean_3_News[pd.notnull(df_clean_3_News['PMD Content Review'])]\n", "df_clean_5_News = df_clean_4_News[pd.notnull(df_clean_4_News['SME Content Review'])]\n", "#Copy so the column assignments below act on an independent DataFrame\n", "df_clean_6_News = df_clean_5_News[pd.notnull(df_clean_5_News['Draft Review'])].copy()\n", "\n", "#Convert strings to datetime objects\n", "df_clean_6_News['VID Design Work/Edits'] = pd.to_datetime(df_clean_6_News['VID Design Work/Edits'], format = '%m/%d/%Y')\n", "df_clean_6_News['PMD Content Review'] = pd.to_datetime(df_clean_6_News['PMD Content Review'], format = '%m/%d/%Y')\n", "df_clean_6_News['SME Content Review'] = pd.to_datetime(df_clean_6_News['SME Content Review'], format = '%m/%d/%Y')\n", "df_clean_6_News['Draft Review'] = pd.to_datetime(df_clean_6_News['Draft Review'], format = '%m/%d/%Y')\n", "\n", "df_clean_6_News.head()\n", "\n", "#Create the elapsed time columns\n", "df_clean_6_News['VID_Edittime'] = df_clean_6_News['Submitted to CRC '] - df_clean_6_News['VID Design Work/Edits']\n", "df_clean_6_News['PMD_Reviewtime'] = df_clean_6_News['VID Design Work/Edits'] - df_clean_6_News['PMD Content Review']\n", "df_clean_6_News['SME_Reviewtime'] = df_clean_6_News['PMD Content Review'] - df_clean_6_News['SME Content Review']\n", "df_clean_6_News['Draft_Reviewtime'] = df_clean_6_News['SME Content Review'] - df_clean_6_News['Draft Review']\n", "\n", "#Drop the records with a negative PMD review time (-10 and -6 days)\n", "df_clean_7_News = df_clean_6_News.drop(df_clean_6_News[df_clean_6_News.PMD_Reviewtime == '-10 days'].index)\n", "df_clean_8_News = df_clean_7_News.drop(df_clean_7_News[df_clean_7_News.PMD_Reviewtime == '-6 days'].index)\n", "\n", "#Good to go!\n", "df_clean_8_News" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's take a look at the summary data..." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>CRCtime</th><th>VID_Edittime</th><th>PMD_Reviewtime</th><th>SME_Reviewtime</th><th>Draft_Reviewtime</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>count</th><td>22</td><td>22</td><td>22</td><td>22</td><td>22</td></tr>\n",
"<tr><th>mean</th><td>21 days 04:21:49.090909</td><td>14 days 08:43:38.181818</td><td>60 days 02:10:54.545454</td><td>3 days 12:00:00</td><td>11 days 04:21:49.090909</td></tr>\n",
"<tr><th>std</th><td>24 days 06:22:33.479019</td><td>17 days 14:47:37.238031</td><td>143 days 17:24:49.094471</td><td>4 days 14:02:38.929146</td><td>9 days 05:32:01.438379</td></tr>\n",
"<tr><th>min</th><td>0 days 00:00:00</td><td>0 days 00:00:00</td><td>0 days 00:00:00</td><td>-1 days +00:00:00</td><td>0 days 00:00:00</td></tr>\n",
"<tr><th>25%</th><td>4 days 06:00:00</td><td>1 days 00:00:00</td><td>0 days 00:00:00</td><td>0 days 00:00:00</td><td>5 days 00:00:00</td></tr>\n",
"<tr><th>50%</th><td>11 days 00:00:00</td><td>8 days 00:00:00</td><td>2 days 00:00:00</td><td>1 days 12:00:00</td><td>8 days 00:00:00</td></tr>\n",
"<tr><th>75%</th><td>45 days 00:00:00</td><td>16 days 12:00:00</td><td>5 days 00:00:00</td><td>5 days 18:00:00</td><td>17 days 06:00:00</td></tr>\n",
"<tr><th>max</th><td>62 days 00:00:00</td><td>59 days 00:00:00</td><td>413 days 00:00:00</td><td>14 days 00:00:00</td><td>31 days 00:00:00</td></tr>\n",
"</tbody>\n",
"</table>
" ], "text/plain": [ " CRCtime VID_Edittime \\\n", "count 22 22 \n", "mean 21 days 04:21:49.090909 14 days 08:43:38.181818 \n", "std 24 days 06:22:33.479019 17 days 14:47:37.238031 \n", "min 0 days 00:00:00 0 days 00:00:00 \n", "25% 4 days 06:00:00 1 days 00:00:00 \n", "50% 11 days 00:00:00 8 days 00:00:00 \n", "75% 45 days 00:00:00 16 days 12:00:00 \n", "max 62 days 00:00:00 59 days 00:00:00 \n", "\n", " PMD_Reviewtime SME_Reviewtime \\\n", "count 22 22 \n", "mean 60 days 02:10:54.545454 3 days 12:00:00 \n", "std 143 days 17:24:49.094471 4 days 14:02:38.929146 \n", "min 0 days 00:00:00 -1 days +00:00:00 \n", "25% 0 days 00:00:00 0 days 00:00:00 \n", "50% 2 days 00:00:00 1 days 12:00:00 \n", "75% 5 days 00:00:00 5 days 18:00:00 \n", "max 413 days 00:00:00 14 days 00:00:00 \n", "\n", " Draft_Reviewtime \n", "count 22 \n", "mean 11 days 04:21:49.090909 \n", "std 9 days 05:32:01.438379 \n", "min 0 days 00:00:00 \n", "25% 5 days 00:00:00 \n", "50% 8 days 00:00:00 \n", "75% 17 days 06:00:00 \n", "max 31 days 00:00:00 " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_clean_8_News.describe()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "11 days 00:00:00\n", "8 days 00:00:00\n", "2 days 00:00:00\n", "1 days 12:00:00\n", "8 days 00:00:00\n" ] } ], "source": [ "print(df_clean_8_News.CRCtime.median())\n", "print(df_clean_8_News.VID_Edittime.median())\n", "print(df_clean_8_News.PMD_Reviewtime.median())\n", "print(df_clean_8_News.SME_Reviewtime.median())\n", "print(df_clean_8_News.Draft_Reviewtime.median())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yikes, we have some weird data that led to some negative elapsed times. 
\n", "We could do one of three things:\n", "- Delete the records (rows) with the offending dates.\n", "- Delete the PMD_Reviewtime column and analyze the other criteria.\n", "- Leave the data unedited and ask MAJ Watkins about the dates.\n", "\n", "Let's go ahead and do the other products.\n", "\n", "This type of discrepancy also shows that the data are not being recorded effectively. To produce an analysis and report of any substance, our data must be reliable.
\n", "\n", "A recommendation for the future is that an accurate working document MUST be established in order to produce reliable production time data and reports.
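
One way such a working document could be checked before analysis is to confirm that each record's stage dates run in chronological order. The sketch below is only an illustration of that check: find_out_of_order is a hypothetical helper, the two rows are newsletter records 77 and 78 from the output above rewritten as m/d/Y strings, and the 'Submitted to CRC' name is written without the trailing space found in the real header.

```python
import pandas as pd

# Stage columns in the order a product is expected to pass through them.
STAGES = ['Draft Review', 'SME Content Review', 'PMD Content Review',
          'VID Design Work/Edits', 'Submitted to CRC', 'CRC Review Complete']

def find_out_of_order(df, stages=STAGES):
    '''Return the rows where any stage is dated earlier than the stage before it.'''
    dates = df[stages].apply(pd.to_datetime, format='%m/%d/%Y')
    bad = pd.Series(False, index=df.index)
    for earlier, later in zip(stages, stages[1:]):
        # Comparisons against missing dates (NaT) are False, so gaps are not flagged.
        bad |= dates[later] < dates[earlier]
    return df[bad]

# Newsletter records 77 and 78, written as m/d/Y strings.
rows = pd.DataFrame(
    [['8/15/2018', '8/24/2018', '8/24/2018', '8/29/2018', '8/30/2018', '9/4/2018'],
     ['8/19/2018', '8/30/2018', '8/29/2018', '8/29/2018', '8/30/2018', '9/4/2018']],
    columns=STAGES, index=[77, 78])

flagged = find_out_of_order(rows)
print(flagged.index.tolist())
```

Record 78 is flagged because its PMD Content Review date falls one day before its SME Content Review date.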
\n", "\n", "For this analysis, I've decided to delete the products with a negative PMD review time (-10 and -6 days). Note that one newsletter (row 78) still shows an SME review time of -1 day." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>Type</th><th>Draft Review</th><th>SME Content Review</th><th>PMD Content Review</th><th>VID Design Work/Edits</th><th>Submitted to CRC</th><th>CRC Review Complete</th><th>CRCtime</th><th>VID_Edittime</th><th>PMD_Reviewtime</th><th>SME_Reviewtime</th><th>Draft_Reviewtime</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>4</th><td>Brochure</td><td>2017-05-09</td><td>2017-05-16</td><td>2017-06-06</td><td>2017-06-23</td><td>2017-08-04</td><td>2017-10-10</td><td>67 days</td><td>42 days</td><td>17 days</td><td>21 days</td><td>7 days</td></tr>\n",
"<tr><th>7</th><td>Brochure</td><td>2016-11-23</td><td>2016-12-01</td><td>2016-12-02</td><td>2016-12-06</td><td>2017-01-30</td><td>2017-03-30</td><td>59 days</td><td>55 days</td><td>4 days</td><td>1 days</td><td>8 days</td></tr>\n",
"<tr><th>9</th><td>Brochure</td><td>2016-12-07</td><td>2017-02-09</td><td>2017-02-10</td><td>2017-05-17</td><td>2017-06-06</td><td>2017-06-30</td><td>24 days</td><td>20 days</td><td>96 days</td><td>1 days</td><td>64 days</td></tr>\n",
"<tr><th>10</th><td>Brochure</td><td>2016-12-09</td><td>2017-01-26</td><td>2017-01-27</td><td>2017-01-30</td><td>2017-02-02</td><td>2017-03-30</td><td>56 days</td><td>3 days</td><td>3 days</td><td>1 days</td><td>48 days</td></tr>\n",
"<tr><th>11</th><td>Brochure</td><td>2016-09-06</td><td>2016-09-09</td><td>2016-09-09</td><td>2016-09-10</td><td>2016-11-29</td><td>2017-01-02</td><td>34 days</td><td>80 days</td><td>1 days</td><td>0 days</td><td>3 days</td></tr>\n",
"<tr><th>13</th><td>Brochure</td><td>2017-03-01</td><td>2017-03-10</td><td>2017-03-13</td><td>2017-04-10</td><td>2017-06-05</td><td>2017-06-27</td><td>22 days</td><td>56 days</td><td>28 days</td><td>3 days</td><td>9 days</td></tr>\n",
"</tbody>\n",
"</table>" ], "text/plain": [ " Type Draft Review SME Content Review PMD Content Review \\\n", "4 Brochure 2017-05-09 2017-05-16 2017-06-06 \n", "7 Brochure 2016-11-23 2016-12-01 2016-12-02 \n", "9 Brochure 2016-12-07 2017-02-09 2017-02-10 \n", "10 Brochure 2016-12-09 2017-01-26 2017-01-27 \n", "11 Brochure 2016-09-06 2016-09-09 2016-09-09 \n", "13 Brochure 2017-03-01 2017-03-10 2017-03-13 \n", "\n", " VID Design Work/Edits Submitted to CRC CRC Review Complete CRCtime \\\n", "4 2017-06-23 2017-08-04 2017-10-10 67 days \n", "7 2016-12-06 2017-01-30 2017-03-30 59 days \n", "9 2017-05-17 2017-06-06 2017-06-30 24 days \n", "10 2017-01-30 2017-02-02 2017-03-30 56 days \n", "11 2016-09-10 2016-11-29 2017-01-02 34 days \n", "13 2017-04-10 2017-06-05 2017-06-27 22 days \n", "\n", " VID_Edittime PMD_Reviewtime SME_Reviewtime Draft_Reviewtime \n", "4 42 days 17 days 21 days 7 days \n", "7 55 days 4 days 1 days 8 days \n", "9 20 days 96 days 1 days 64 days \n", "10 3 days 3 days 1 days 48 days \n", "11 80 days 1 days 0 days 3 days \n", "13 56 days 28 days 3 days 9 days " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Create the dataframe subset\n", "df_clean_2_1 = df_clean_2.copy()\n", "df_clean_2_bro = df_clean_2_1.loc[df_clean_2_1['Type'] == 'Brochure']\n", "df_clean_2_bro.head()\n", "#Silence SettingWithCopyWarning for the column assignments below\n", "pd.options.mode.chained_assignment = None\n", "\n", "#Erase rows with NaN values\n", "df_clean_3_bro = df_clean_2_bro[pd.notnull(df_clean_2_bro['VID Design Work/Edits'])]\n", "df_clean_4_bro = df_clean_3_bro[pd.notnull(df_clean_3_bro['PMD Content Review'])]\n", "df_clean_5_bro = df_clean_4_bro[pd.notnull(df_clean_4_bro['SME Content Review'])]\n", "df_clean_6_bro = df_clean_5_bro[pd.notnull(df_clean_5_bro['Draft Review'])]\n", "\n", "#Convert strings to datetime objects\n", "df_clean_6_bro['VID Design Work/Edits'] = pd.to_datetime(df_clean_6_bro['VID Design Work/Edits'], format = '%m/%d/%Y')\n", "df_clean_6_bro['PMD Content Review'] = pd.to_datetime(df_clean_6_bro['PMD Content Review'], format = '%m/%d/%Y')\n", "df_clean_6_bro['SME Content Review'] = pd.to_datetime(df_clean_6_bro['SME Content Review'], format = '%m/%d/%Y')\n", "df_clean_6_bro['Draft Review'] = pd.to_datetime(df_clean_6_bro['Draft Review'], format = '%m/%d/%Y')\n", "\n", "df_clean_6_bro.head()\n", "\n", "#Create the elapsed time columns\n", "df_clean_6_bro['VID_Edittime'] = df_clean_6_bro['Submitted to CRC '] - df_clean_6_bro['VID Design Work/Edits']\n", "df_clean_6_bro['PMD_Reviewtime'] = df_clean_6_bro['VID Design Work/Edits'] - df_clean_6_bro['PMD Content Review']\n", "df_clean_6_bro['SME_Reviewtime'] = df_clean_6_bro['PMD Content Review'] - df_clean_6_bro['SME Content Review']\n", "df_clean_6_bro['Draft_Reviewtime'] = df_clean_6_bro['SME Content Review'] - df_clean_6_bro['Draft Review']\n", "\n", "#Good to go!\n", "df_clean_6_bro" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>CRCtime</th><th>VID_Edittime</th><th>PMD_Reviewtime</th><th>SME_Reviewtime</th><th>Draft_Reviewtime</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>count</th><td>6</td><td>6</td><td>6</td><td>6</td><td>6</td></tr>\n",
"<tr><th>mean</th><td>43 days 16:00:00</td><td>42 days 16:00:00</td><td>24 days 20:00:00</td><td>4 days 12:00:00</td><td>23 days 04:00:00</td></tr>\n",
"<tr><th>std</th><td>19 days 09:32:32.152061</td><td>27 days 14:37:12.911653</td><td>36 days 08:56:33.395270</td><td>8 days 03:25:10.375190</td><td>26 days 00:15:41.341229</td></tr>\n",
"<tr><th>min</th><td>22 days 00:00:00</td><td>3 days 00:00:00</td><td>1 days 00:00:00</td><td>0 days 00:00:00</td><td>3 days 00:00:00</td></tr>\n",
"<tr><th>25%</th><td>26 days 12:00:00</td><td>25 days 12:00:00</td><td>3 days 06:00:00</td><td>1 days 00:00:00</td><td>7 days 06:00:00</td></tr>\n",
"<tr><th>50%</th><td>45 days 00:00:00</td><td>48 days 12:00:00</td><td>10 days 12:00:00</td><td>1 days 00:00:00</td><td>8 days 12:00:00</td></tr>\n",
"<tr><th>75%</th><td>58 days 06:00:00</td><td>55 days 18:00:00</td><td>25 days 06:00:00</td><td>2 days 12:00:00</td><td>38 days 06:00:00</td></tr>\n",
"<tr><th>max</th><td>67 days 00:00:00</td><td>80 days 00:00:00</td><td>96 days 00:00:00</td><td>21 days 00:00:00</td><td>64 days 00:00:00</td></tr>\n",
"</tbody>\n",
"</table>
" ], "text/plain": [ " CRCtime VID_Edittime \\\n", "count 6 6 \n", "mean 43 days 16:00:00 42 days 16:00:00 \n", "std 19 days 09:32:32.152061 27 days 14:37:12.911653 \n", "min 22 days 00:00:00 3 days 00:00:00 \n", "25% 26 days 12:00:00 25 days 12:00:00 \n", "50% 45 days 00:00:00 48 days 12:00:00 \n", "75% 58 days 06:00:00 55 days 18:00:00 \n", "max 67 days 00:00:00 80 days 00:00:00 \n", "\n", " PMD_Reviewtime SME_Reviewtime \\\n", "count 6 6 \n", "mean 24 days 20:00:00 4 days 12:00:00 \n", "std 36 days 08:56:33.395270 8 days 03:25:10.375190 \n", "min 1 days 00:00:00 0 days 00:00:00 \n", "25% 3 days 06:00:00 1 days 00:00:00 \n", "50% 10 days 12:00:00 1 days 00:00:00 \n", "75% 25 days 06:00:00 2 days 12:00:00 \n", "max 96 days 00:00:00 21 days 00:00:00 \n", "\n", " Draft_Reviewtime \n", "count 6 \n", "mean 23 days 04:00:00 \n", "std 26 days 00:15:41.341229 \n", "min 3 days 00:00:00 \n", "25% 7 days 06:00:00 \n", "50% 8 days 12:00:00 \n", "75% 38 days 06:00:00 \n", "max 64 days 00:00:00 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_clean_6_bro.describe()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "45 days 00:00:00\n", "48 days 12:00:00\n", "10 days 12:00:00\n", "1 days 00:00:00\n", "8 days 12:00:00\n" ] } ], "source": [ "print(df_clean_6_bro.CRCtime.median())\n", "print(df_clean_6_bro.VID_Edittime.median())\n", "print(df_clean_6_bro.PMD_Reviewtime.median())\n", "print(df_clean_6_bro.SME_Reviewtime.median())\n", "print(df_clean_6_bro.Draft_Reviewtime.median())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
# Plots and Figures\n", "\n", "Now let's produce some figures to visualize our data and show which production stages take the most and the least time.
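
Before plotting, the same question can be answered numerically. The sketch below is one possible way to do it, not the notebook's own method: it re-enters the elapsed days of the six complete brochure records shown in the brochure table above and takes the median of each stage column. The name bro_days is introduced here for illustration; in the notebook the equivalent values would come from the timedelta columns via .dt.days.

```python
import pandas as pd

# Elapsed days per stage for the six complete brochure records (index 4 to 13).
bro_days = pd.DataFrame({
    'CRCtime':          [67, 59, 24, 56, 34, 22],
    'VID_Edittime':     [42, 55, 20, 3, 80, 56],
    'PMD_Reviewtime':   [17, 4, 96, 3, 1, 28],
    'SME_Reviewtime':   [21, 1, 1, 1, 0, 3],
    'Draft_Reviewtime': [7, 8, 64, 48, 3, 9],
}, index=[4, 7, 9, 10, 11, 13])

# One median per stage column, largest first.
medians = bro_days.median()
print(medians.sort_values(ascending=False))
print('slowest stage:', medians.idxmax())
print('fastest stage:', medians.idxmin())
```

The medians match the values printed for the brochures earlier (45, 48.5, 10.5, 1 and 8.5 days).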
\n", "First let's take a look at the brochure products..." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "plt.rcdefaults()\n", "\n", "objects = ('CRC', 'VID', 'PMD Review', 'SME Review', 'Draft Review')\n", "y_pos = np.arange(len(objects))\n", "# Median elapsed days per stage, in the same order as the labels in objects\n", "stage_cols = ['CRCtime', 'VID_Edittime', 'PMD_Reviewtime', 'SME_Reviewtime', 'Draft_Reviewtime']\n", "performance = [df_clean_6_bro[col].dt.days.median() for col in stage_cols]\n", "\n", "plt.bar(y_pos, performance, align='center', alpha=0.5)\n", "plt.xticks(y_pos, objects)\n", "plt.ylabel('Days (Median)')\n", "plt.title('VS (One-Health) Product Production Time (Brochure)')\n", "# Print each median just above its bar\n", "for i, v in enumerate(performance):\n", "    plt.text(i-.15, v+1 , str(v), color='blue', fontweight='bold', va='center')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's take a look at the Newsletters..." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "plt.rcdefaults()\n", "\n", "objects = ('CRC', 'VID', 'PMD Review', 'SME Review', 'Draft Review')\n", "y_pos = np.arange(len(objects))\n", "# Median elapsed days per stage, in the same order as the labels in objects\n", "stage_cols = ['CRCtime', 'VID_Edittime', 'PMD_Reviewtime', 'SME_Reviewtime', 'Draft_Reviewtime']\n", "performance = [df_clean_8_News[col].dt.days.median() for col in stage_cols]\n", "\n", "plt.bar(y_pos, performance, align='center', alpha=0.5, color='green')\n", "plt.xticks(y_pos, objects)\n", "plt.ylabel('Days (Median)')\n", "plt.title('VS (One-Health) Product Production Time (Newsletter)')\n", "# Print each median just above its bar\n", "for i, v in enumerate(performance):\n", "    plt.text(i-.15, v+.18 , str(v), color='blue', fontweight='bold', va='center')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Discussion\n", "
### So what can we say about these graphs and analyses?\n", "\n", "At first glance, the analysis and graphs do not tell us much about time spent, given the large discrepancies in the data. We can see, however, that for brochures the VID design work and the CRC review take up the majority of One-Health product development time, with median waits of 48.5 and 45 days, respectively.
\n", "
\n", "Comparing brochures and newsletters, the newsletters have shorter median times at every stage except SME review (1.5 days versus 1 day). This may be due to the higher frequency of newsletter production, which could make the newsletter production process more efficient.
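
The comparison can be laid out stage by stage. The sketch below simply re-enters the medians printed earlier in the notebook; the dictionary names and the summed totals are introduced here for illustration, and a sum of stage medians is only a rough proxy for end-to-end production time.

```python
# Median days per stage, copied from the median printouts earlier in the notebook.
brochure = {'CRC': 45.0, 'VID': 48.5, 'PMD Review': 10.5, 'SME Review': 1.0, 'Draft Review': 8.5}
newsletter = {'CRC': 11.0, 'VID': 8.0, 'PMD Review': 2.0, 'SME Review': 1.5, 'Draft Review': 8.0}

for stage in brochure:
    diff = brochure[stage] - newsletter[stage]
    print(f'{stage:<13} brochure {brochure[stage]:>5} d, newsletter {newsletter[stage]:>5} d, difference {diff:+.1f} d')

# Summing stage medians is a rough proxy only: it is not the median of total time.
brochure_total = sum(brochure.values())
newsletter_total = sum(newsletter.values())
print('sum of stage medians:', brochure_total, 'vs', newsletter_total)
```

On these figures the summed medians are 113.5 days for brochures and 30.5 days for newsletters.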
\n", "
\n", "With a potential brochure program using First Year Graduate Veterinary Education (FYGVE) students as authors, a strong, well-defined review process is necessary to eliminate hang-ups during review. \n", "\n", "#### This shows that our processing time depends heavily on the work and collaboration of other directorates for visual output and CRC review. \n", "\n", "With the subject matter, draft and PMD reviews together taking roughly 75% less time than VID and CRC for brochures, recommendations should:\n", "- Include discussions, focus groups and meetings to streamline the CRC process, potentially decreasing the time spent at certain approval levels\n", "- Require submitters of VS One-Health products to have a clear visual outline of what their product should look like before submitting a VID work order\n", "    - A clear, defined end state from the customer will help the VID designer produce a product that both VID and the customer agree upon, potentially decreasing production and processing time. Additionally, focus groups, surveys and interviews should be conducted within VID to identify other potential bottlenecks in the production process.\n", "\n", "    - Additionally, for the FYGVE brochure program mentioned above, a dedicated person within the One-Health Division should be responsible for the overall setup and design of the brochures. This would improve interaction with VID and decrease VID production time, potentially decreasing overall production time significantly." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conclusion\n", "In conclusion, my analysis shows that VID production and CRC approval take up more than 50 percent of the total production time for both One-Health newsletters and brochures. I recommend that the One-Health Division open communication with VID and CRC to develop ideas for decreasing processing and approval times. 
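
As a rough check on that share, the sketch below divides the VID and CRC medians by the sum of all stage medians. It assumes the summed medians are an acceptable stand-in for total production time, which is an approximation; vid_crc_share is a hypothetical helper and the numbers are the medians printed earlier in the notebook.

```python
def vid_crc_share(medians):
    '''Fraction of the summed stage medians that falls in the VID and CRC stages.'''
    return (medians['VID'] + medians['CRC']) / sum(medians.values())

# Median days per stage, copied from the median printouts earlier in the notebook.
brochure = {'CRC': 45.0, 'VID': 48.5, 'PMD Review': 10.5, 'SME Review': 1.0, 'Draft Review': 8.5}
newsletter = {'CRC': 11.0, 'VID': 8.0, 'PMD Review': 2.0, 'SME Review': 1.5, 'Draft Review': 8.0}

brochure_share = vid_crc_share(brochure)
newsletter_share = vid_crc_share(newsletter)
print(f'brochure:   {brochure_share:.0%}')    # 82%
print(f'newsletter: {newsletter_share:.0%}')  # 62%
```

Under that approximation both product types sit above the 50 percent mark, with brochures well above it.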
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }