# Nano Project : POTUS VISITORS

The purpose of this project is to find out *the most frequent visitor of US President Barack Obama in 2015*. We rely on the data shared by [Dataquest][1], available [here][2], which presents the appointment schedule at the White House for that year.

[1]: https://www.dataquest.io/
[2]: https://app.dataquest.io/m/353/working-with-dates-and-times-in-python/4/the-datetime-class

Our analysis proceeds in the following steps:

* First, we compute the length of each appointment, along with the minimum and maximum appointment lengths.
* Then we figure out who spent the most time at the White House.
* Finally, we find out who visited the White House the most each month.

## 1. Exploring the "potus_visitors_2015.csv" data set.

Let's start our journey by opening the data set and looking at the information available.


In [1]:
from csv import reader

#Open and read the file
opened_file = open("potus_visitors_2015.csv", encoding="UTF-8")
read_file = reader(opened_file)
#Convert the file to a list of lists
potus_visitors = list(read_file)
opened_file.close()

#Separate the header row from the data rows
potus_visitors_header = potus_visitors[0]
potus_visitors = potus_visitors[1:]

#Compute the number of appointments in 2015 with Barack OBAMA
#(the header is already removed above, so this slice also leaves out the first data row)
n_appt_2015 = len(potus_visitors[1:])
#Print the header and the number of appointments
print(" POTUS visitor information : ")
print(potus_visitors_header)

#Displaying the first four rows
for visitor in potus_visitors[0:4]:
    print("\n", visitor, "\n")

print("Number of appointments with the US President in 2015 : ", n_appt_2015)

 POTUS visitor information : 
['name', 'appt_made_date', 'appt_start_date', 'appt_end_date', 'visitee_namelast', 'visitee_namefirst', 'meeting_room', 'description']

 ['Joshua T. Blanton', '2014-12-18T00:00:00', '1/6/15 9:30', '1/6/15 23:59', '', 'potus', 'west wing', 'JointService Military Honor Guard'] 


 ['Jack T. Gutting', '2014-12-18T00:00:00', '1/6/15 9:30', '1/6/15 23:59', '', 'potus', 'west wing', 'JointService Military Honor Guard'] 


 ['Bradley T. Guiles', '2014-12-18T00:00:00', '1/6/15 9:30', '1/6/15 23:59', '', 'potus', 'west wing', 'JointService Military Honor Guard'] 


 ['Loryn F. Grieb', '2014-12-18T00:00:00', '1/6/15 9:30', '1/6/15 23:59', '', 'potus', 'west wing', 'JointService Military Honor Guard'] 

Number of appointments with the US President in 2015 : 47953


Above, we can observe that the President is a busy man: about **47953** appointments in one year, which is roughly 131 meetings per day. For the purpose of our analysis, we will use the *name* column, the appointment start date column (*appt_start_date*), and the appointment end date column (*appt_end_date*).
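
The per-day figure is simple arithmetic. A minimal sketch, assuming a 365-day year (2015 was not a leap year) and using the count printed by the cell above:

```python
# Count printed by the cell above
n_appt_2015 = 47953
# 2015 was not a leap year
days_in_2015 = 365

appts_per_day = n_appt_2015 / days_in_2015
print(round(appts_per_day, 1))  # 131.4
```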

Let's continue exploring the data by calculating the length of each appointment, and the minimum and maximum appointment lengths.

## 2. Computing the appointment lengths.

To calculate the length of each appointment in our data set, we subtract the start date in the "appt_start_date" column from the end date in the "appt_end_date" column. The dates in the four rows above are ambiguous about their format, so we need to be careful when parsing them. For instance, from the first row alone we can't say with confidence that *1/6/15 9:30* was typed in *month/day/year* format rather than *day/month/year*. The lack of an "a.m." or "p.m.", together with an end time of *23:59*, indicates that the time is in 24-hour format.
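
To make the ambiguity concrete, the sketch below parses the same string under both interpretations with `datetime.strptime()`. The variable names are illustrative only, and the second string is an assumed raw value with a day number above 12.

```python
import datetime as dt

raw = "1/6/15 9:30"

# Interpreted as month/day/year: 6 January 2015
month_first = dt.datetime.strptime(raw, "%m/%d/%y %H:%M")
# Interpreted as day/month/year: 1 June 2015
day_first = dt.datetime.strptime(raw, "%d/%m/%y %H:%M")

print(month_first)  # 2015-01-06 09:30:00
print(day_first)    # 2015-06-01 09:30:00

# A value whose second number exceeds 12 cannot be day/month/year
try:
    dt.datetime.strptime("3/24/15 9:30", "%d/%m/%y %H:%M")
    day_first_ok = True
except ValueError:
    day_first_ok = False
print(day_first_ok)  # False
```

Rows like the second one are what settle the question: if any date in the file has a second number greater than 12, the file must be in month/day/year order.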

Luckily, Python provides three standard modules designed to help with dates and times. These are:

* The calendar module
* The time module
* The datetime module

We will take advantage of the datetime class from the datetime module to deal with data holding dates and times. Below is the specific datetime.datetime class method we are going to use:
* **datetime.strptime()**: This constructor parses a string into a datetime object, using a special syntax of format codes to describe date and time formats.
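
As a small illustration of that format syntax, here is a sketch that parses a value from the *appt_made_date* column, which uses a different layout from the start and end columns. The format string is inferred from the sample value shown earlier, not something the data set documents.

```python
import datetime as dt

# Sample value taken from the appt_made_date column shown earlier
made_raw = "2014-12-18T00:00:00"
# %Y = four-digit year, %m = month, %d = day, literal "T", then %H:%M:%S
made_format = "%Y-%m-%dT%H:%M:%S"

made_date = dt.datetime.strptime(made_raw, made_format)
print(made_date.year, made_date.month, made_date.day)  # 2014 12 18
```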

Let's move on by converting all the dates and times in *'appt_start_date'* and *'appt_end_date'* from strings to datetime objects, reading them in the *month/day/year hour:minute* format.

In [2]:
#Importing the datetime module under the alias dt
import datetime as dt
#Format of the dates in the file: month/day/year hour:minute
date_format = "%m/%d/%y %H:%M"
#Looping through the potus_visitors data set
for visitor in potus_visitors:
    #Assigning the appointment start and end dates (strings) to variables
    appt_start_date = visitor[2]
    appt_end_date = visitor[3]
    #Take advantage of the strptime constructor to convert the strings to datetime objects
    appt_start_date = dt.datetime.strptime(appt_start_date, date_format)
    appt_end_date = dt.datetime.strptime(appt_end_date, date_format)
    #Store the datetime objects back in the row
    visitor[2] = appt_start_date
    visitor[3] = appt_end_date

print(potus_visitors[5397])

['Benjamin D. Kahl', '2015-03-19T13:24:00', datetime.datetime(2015, 3, 24, 9, 30), datetime.datetime(2015, 3, 24, 23, 59), '', 'POTUS', 'west wing drive', 'military honor guard']


Now each value in both the *appt_start_date* and *appt_end_date* columns is a datetime object, so we can calculate the length of each appointment.

In [3]:
appt_lengths = []
#Iterating over the potus visitors data set
for visitor in potus_visitors:
    #Assign the name of the visitor to the variable visitor_name
    visitor_name = visitor[0]
    #Store the appointment start date in the variable appt_start_date
    appt_start_date = visitor[2]
    #Store the appointment end date in the variable appt_end_date
    appt_end_date = visitor[3]
    #Compute the length of the meeting (a timedelta object)
    length = appt_end_date - appt_start_date
    #Save the visitor's name, dates and length in a list called appt_length
    appt_length = [visitor_name, appt_start_date, appt_end_date, length]
    #Add the row to the appt_lengths data set
    appt_lengths.append(appt_length)
#Print out an example
print("Visitor name | appointment time")
print(appt_lengths[5][0], appt_lengths[5][3])

Visitor name | appointment time
Taylor D. Gibbs 14:29:00
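
Subtracting one datetime from another yields a `datetime.timedelta` object, which is what the length column holds. A minimal sketch using the start and end values from the sample rows shown earlier:

```python
import datetime as dt

start = dt.datetime(2015, 1, 6, 9, 30)
end = dt.datetime(2015, 1, 6, 23, 59)

length = end - start
print(type(length).__name__)   # timedelta
print(length)                  # 14:29:00
print(length.total_seconds())  # 52140.0
```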


 
## 3. Figuring out the visitor who spent the most amount of time at the White House

To discover who spent the longest time at the White House, we first build a frequency table of appointment lengths. Then we compare the maximum length in that table with each visitor's appointment length. The individuals with the largest appointment length are the ones who spent the most time at the White House.

### 3.1 Building the frequency table of appointments length.

To do this, we iterate over the *appt_lengths* data set and calculate the frequency of each length. The number of occurrences of each value in the length column is stored in a frequency dictionary.
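
For comparison, the same counting pattern can be sketched with `collections.Counter` from the standard library. The three durations below are made-up sample values, not rows from the data set.

```python
from collections import Counter
import datetime as dt

# Hypothetical appointment lengths (not taken from the data set)
sample_lengths = [
    dt.timedelta(hours=5, minutes=59),
    dt.timedelta(hours=10, minutes=29),
    dt.timedelta(hours=5, minutes=59),
]

# Counter builds the frequency table in one step
sample_freq = Counter(sample_lengths)
for length, freq in sample_freq.items():
    print(length, freq)
```

This works because timedelta objects are hashable, so they can serve as dictionary keys just as they do in the manual loop.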

In [4]:
#Creating an empty dictionary
appt_freq = {}

#Looping over the appt_lengths list of lists
for row in appt_lengths:
    #Assign the length of each appointment to a variable named length
    length = row[3]
    #If the length is not present in appt_freq, initialize its frequency to 1
    if length not in appt_freq:
        appt_freq[length] = 1
    #Else, increase the length occurrence by 1
    else:
        appt_freq[length] += 1

#Print the appt_freq table
print("Appointment duration", "|", "Frequency")
for length, freq in appt_freq.items():
    print(length, freq)
#Computing and printing the maximum and minimum appointment lengths (the dictionary keys)
max_length = max(appt_freq)
min_length = min(appt_freq)
print(" The lowest amount of time spent in a meeting : ", min_length)
print(" The greatest amount of time spent in a meeting : ", max_length)

Appointment duration | Frequency
14:29:00 1213
13:59:00 1543
13:29:00 696
12:59:00 681
12:29:00 357
11:29:00 1115
14:59:00 511
4:59:00 301
13:04:00 2
11:59:00 1041
10:59:00 1548
10:29:00 5897
9:59:00 996
9:29:00 921
5:59:00 8173
8:29:00 2855
7:59:00 2027
8:59:00 862
13:39:00 12
6:44:00 103
12:14:00 39
11:54:00 6
9:44:00 119
9:04:00 13
11:44:00 32
6:59:00 930
6:39:00 16
6:29:00 457
15:29:00 395
7:29:00 1144
16:59:00 1768
16:29:00 818
15:59:00 460
8:14:00 23
6:14:00 38
5:29:00 985
4:44:00 17
4:29:00 99
3:59:00 185
3:29:00 22
7:44:00 69
8:39:00 36
9:49:00 9
9:14:00 1434
11:04:00 1
10:50:00 1
10:39:00 36
7:04:00 5
13:14:00 347
12:09:00 4
10:44:00 272
6:34:00 24
14:44:00 6
2:29:00 9
6:54:00 6
12:44:00 3
5:44:00 390
16:44:00 40
15:14:00 249
8:44:00 3732
5:39:00 4
5:37:00 1
5:36:00 1
5:26:00 1
5:17:00 1
14:14:00 220
12:39:00 7
7:14:00 224
11:14:00 921
8:54:00 18
8:51:00 1
11:48:00 1
11:37:00 1
11:35:00 1
11:32:00 1
10:11:00 1
10:07:00 1
10:04:00 1
9:56:00 1
9:50:00 1
7:39:00 1
17:59:00 256
16

According to the frequency table of appointment lengths, **102** appointments lasted **16 days, 12:59:00** each, which is the greatest amount of time spent in a meeting. On the other hand, only *9* appointments lasted the minimum of **2:29:00**.
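
Note that calling `max()` or `min()` on a dictionary compares its keys, not its values, which is why the cell above returns durations rather than frequencies. A small sketch with three entries copied from the table above:

```python
import datetime as dt

# Three (duration: frequency) pairs copied from the table above
freq = {
    dt.timedelta(hours=5, minutes=59): 8173,
    dt.timedelta(hours=2, minutes=29): 9,
    dt.timedelta(hours=17, minutes=59): 256,
}

longest = max(freq)                    # largest key (duration)
most_common = max(freq, key=freq.get)  # key with the largest value

print(longest)      # 17:59:00
print(most_common)  # 5:59:00
```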

### 3.2 Finding the visitor who spent the most amount of time at the White House in 2015

Based on this information, we can search among the 102 appointments for the visitors who spent the most time in the White House.

In [5]:
print("Visitor name | appt_start_date | appt_end_date | appt_length")
#Counter for the number of visitors with the longest appointment
n_visitor = 0
#Iterating over the appt_lengths data set
for row in appt_lengths:
    #Fetching the name of the visitor
    visitor_name = row[0]
    #Fetching the start and end date values
    appt_start_date = row[1]
    appt_end_date = row[2]
    #Fetching his/her appointment length
    length = row[3]
    #Searching for the visitors with the highest appointment length
    if length == max_length:
        print(visitor_name, " | ", appt_start_date, " | ", appt_end_date, " | ", length)
        #Increase the number of visitors
        n_visitor += 1

print(" Number of visitors : ", n_visitor)

Visitor name | appt_start_date | appt_end_date | appt_length
Regino B. Madrid | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Glenn A. Dewey | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Matthew J. Harding | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Marquez D. Brown | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Mark A. Questad | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Marco A. Lopez | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Marcio G. Botelho | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Tilden E. Olsen | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Oscar Romano | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Glenn C. Paulson | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Oscar A. Vanegasgonzalez | 2015-12-02 11:00:00 | 2015-12-18 23:59:00 | 16 days, 12:59:00
Nicholas A. Hubbard | 2015-12-02 11:00:00 | 201

Unexpectedly, the 102 people who spent the greatest amount of time in the White House all attended the same appointment, scheduled **from 2 December 2015 at 11:00 to 18 December 2015 at 23:59**. Moreover, the meeting was held on the State Floor with the President himself.
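
As a quick check of the reported length, the sketch below recomputes it from the start and end timestamps printed in the output of the previous cell.

```python
import datetime as dt

start = dt.datetime(2015, 12, 2, 11, 0)
end = dt.datetime(2015, 12, 18, 23, 59)

length = end - start
print(length)       # 16 days, 12:59:00
print(length.days)  # 16
```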

### 3.3 Figuring out the most frequent visitor per month

Now that we know who spent the most time at the White House, let's display the most frequent visitor per month. To do that, we proceed as follows:

* First, we build a frequency table of visitor names to count the occurrences of each visitor's presence at the White House in 2015.
* For each visitor, we'll compute his/her appointment length for each month.
* Then, we'll choose the visitor with the greatest appointment length in each month.
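
One way to sketch the counting step is shown below. It counts visits per (month, name) pair and keeps the name with the highest count in each month, which is only one reading of "most frequent visitor"; ranking by total appointment length instead would need a sum of timedeltas rather than a count. The rows and names are hypothetical sample data in the same layout as *appt_lengths*, and taking the month from the start date is an assumption.

```python
from collections import Counter
import datetime as dt

# Hypothetical rows: [name, start, end], extended below with the length
sample_rows = [
    ["Visitor A", dt.datetime(2015, 1, 6, 9, 30), dt.datetime(2015, 1, 6, 23, 59)],
    ["Visitor A", dt.datetime(2015, 1, 9, 9, 30), dt.datetime(2015, 1, 9, 23, 59)],
    ["Visitor B", dt.datetime(2015, 1, 6, 9, 30), dt.datetime(2015, 1, 6, 23, 59)],
    ["Visitor B", dt.datetime(2015, 2, 3, 9, 30), dt.datetime(2015, 2, 3, 23, 59)],
]
# Append the length so each row matches the appt_lengths layout
for row in sample_rows:
    row.append(row[2] - row[1])

# Count visits per (month, name), using the month of the start date
visits = Counter((row[1].month, row[0]) for row in sample_rows)

# Keep the most frequent visitor for each month
top_per_month = {}
for (month, name), count in visits.items():
    if month not in top_per_month or count > top_per_month[month][1]:
        top_per_month[month] = (name, count)

for month in sorted(top_per_month):
    print(month, top_per_month[month])
```

Ties within a month are resolved in favor of the first name encountered; a real analysis might want to report all tied visitors.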

In [13]:
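for row in appt_lengths:
    #Fetching the name of the visitor
    name_1 = row[0]
    #Fetching the appointment start date
    appt_start_date = row[1]
    #Fetching the month from the appointment end date stored in row[2]
    appt_month = row[2].month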
 