{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "3e8f407e",
"metadata": {},
"source": [
"# Title\n",
"Finding Heavy Traffic Indicators on I-94"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a9b4f3e6",
"metadata": {},
"source": [
"# Project Description\n",
"I'm going to analyze a dataset about the westbound traffic on the I-94 Interstate highway.\n",
"\n",
"My goal is to find out what factors affect heavy traffic on I-94. These factors can be weather type, time of the day, time of the week, etc. "
]
},
{
"cell_type": "markdown",
"id": "2844a930",
"metadata": {},
"source": [
"## Importation\n",
"Here is the section to import all the packages/libraries that will be used throughout this notebook."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9b5e28cb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Ellipsis"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Data handling\n",
"import pandas as pd\n",
"import numpy as np\n",
"from datetime import datetime\n",
"\n",
"# Visualization (Matplotlib, Plotly, Seaborn, etc.)\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# EDA (pandas-profiling, etc. )\n",
"...\n",
"\n",
"# Feature Processing (Scikit-learn processing, etc. )\n",
"...\n",
"\n",
"# Machine Learning (Scikit-learn Estimators, Catboost, LightGBM, etc. )\n",
"...\n",
"\n",
"# Hyperparameters Fine-tuning (Scikit-learn hp search, cross-validation, etc. )\n",
"...\n",
"\n",
"# Other packages\n"
]
},
{
"cell_type": "markdown",
"id": "c42eb2ec",
"metadata": {},
"source": [
"# Data Loading\n",
"Here is the section to load the datasets (train, eval, test) and the additional files"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "13e3ca92",
"metadata": {},
"outputs": [],
"source": [
"traffic=pd.read_csv(\"Metro_Interstate_Traffic_Volume.csv\")"
]
},
{
"cell_type": "markdown",
"id": "6f27de0e",
"metadata": {},
"source": [
"# Exploratory Data Analysis: EDA\n",
"Here is the section to **inspect** the dataset in depth, **present** it, make **hypotheses**, and **plan** the *cleaning, processing and feature creation*."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b387c858",
"metadata": {},
"source": [
"## Dataset overview\n",
"\n",
"Have a look at the loaded dataset using the following methods: `.head()`, `.info()`"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "9b213371",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(48204, 9)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"traffic.shape"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e5f1f876",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" holiday temp rain_1h snow_1h clouds_all weather_main \\\n",
"0 None 288.28 0.0 0.0 40 Clouds \n",
"1 None 289.36 0.0 0.0 75 Clouds \n",
"2 None 289.58 0.0 0.0 90 Clouds \n",
"3 None 290.13 0.0 0.0 90 Clouds \n",
"4 None 291.14 0.0 0.0 75 Clouds \n",
"5 None 291.72 0.0 0.0 1 Clear \n",
"6 None 293.17 0.0 0.0 1 Clear \n",
"7 None 293.86 0.0 0.0 1 Clear \n",
"8 None 294.14 0.0 0.0 20 Clouds \n",
"9 None 293.10 0.0 0.0 20 Clouds \n",
"10 None 290.97 0.0 0.0 20 Clouds \n",
"11 None 289.38 0.0 0.0 1 Clear \n",
"12 None 288.61 0.0 0.0 1 Clear \n",
"13 None 287.16 0.0 0.0 1 Clear \n",
"14 None 285.45 0.0 0.0 1 Clear \n",
"15 None 284.63 0.0 0.0 1 Clear \n",
"16 None 283.47 0.0 0.0 1 Clear \n",
"17 None 281.18 0.0 0.0 1 Clear \n",
"18 None 281.09 0.0 0.0 1 Clear \n",
"19 None 279.53 0.0 0.0 1 Clear \n",
"\n",
" weather_description date_time traffic_volume \n",
"0 scattered clouds 2012-10-02 09:00:00 5545 \n",
"1 broken clouds 2012-10-02 10:00:00 4516 \n",
"2 overcast clouds 2012-10-02 11:00:00 4767 \n",
"3 overcast clouds 2012-10-02 12:00:00 5026 \n",
"4 broken clouds 2012-10-02 13:00:00 4918 \n",
"5 sky is clear 2012-10-02 14:00:00 5181 \n",
"6 sky is clear 2012-10-02 15:00:00 5584 \n",
"7 sky is clear 2012-10-02 16:00:00 6015 \n",
"8 few clouds 2012-10-02 17:00:00 5791 \n",
"9 few clouds 2012-10-02 18:00:00 4770 \n",
"10 few clouds 2012-10-02 19:00:00 3539 \n",
"11 sky is clear 2012-10-02 20:00:00 2784 \n",
"12 sky is clear 2012-10-02 21:00:00 2361 \n",
"13 sky is clear 2012-10-02 22:00:00 1529 \n",
"14 sky is clear 2012-10-02 23:00:00 963 \n",
"15 sky is clear 2012-10-03 00:00:00 506 \n",
"16 sky is clear 2012-10-03 01:00:00 321 \n",
"17 sky is clear 2012-10-03 02:00:00 273 \n",
"18 sky is clear 2012-10-03 03:00:00 367 \n",
"19 sky is clear 2012-10-03 04:00:00 814 "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"traffic.head(20)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a9226ef5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['None', 'Columbus Day', 'Veterans Day', 'Thanksgiving Day',\n",
" 'Christmas Day', 'New Years Day', 'Washingtons Birthday',\n",
" 'Memorial Day', 'Independence Day', 'State Fair', 'Labor Day',\n",
" 'Martin Luther King Jr Day'], dtype=object)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"traffic['holiday'].unique()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "24dea39b",
"metadata": {},
"source": [
"## Hypothesis\n",
"#### Null Hypothesis, H0\n",
"Time is the factor that affects traffic the most.\n",
"\n",
"#### Alternative Hypothesis, H1\n",
"Time is NOT the factor that affects traffic the most."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "91404ebc",
"metadata": {},
"source": [
"## Questions\n",
"1. Which holidays have the most traffic?\n",
"2. Which weekdays have the most traffic?\n",
"3. What time of day has the most traffic? (Is it morning, afternoon, evening, midnight or midday?)\n",
"   - **Morning:** the time from midnight to midday.\n",
"   - **Afternoon:** the time from midday (noon) to evening, from 12:00 hours to approximately 18:00 hours.\n",
"   - **Evening:** the time from the end of the afternoon to midnight, from approximately 18:00 hours to 00:00 hours.\n",
"   - **Midnight:** the middle of the night (00:00 hours).\n",
"   - **Midday:** the middle of the day, also called \"noon\" (12:00 hours).\n",
"4. Which factor affects traffic the most?\n",
"5. Compare rain, snow and temperature based on `traffic_volume`.\n",
"6. What's the highest recorded traffic?\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b3785b0d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"RangeIndex: 48204 entries, 0 to 48203\n",
"Data columns (total 9 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 holiday 48204 non-null object \n",
" 1 temp 48204 non-null float64\n",
" 2 rain_1h 48204 non-null float64\n",
" 3 snow_1h 48204 non-null float64\n",
" 4 clouds_all 48204 non-null int64 \n",
" 5 weather_main 48204 non-null object \n",
" 6 weather_description 48204 non-null object \n",
" 7 date_time 48204 non-null object \n",
" 8 traffic_volume 48204 non-null int64 \n",
"dtypes: float64(3), int64(2), object(4)\n",
"memory usage: 3.3+ MB\n"
]
}
],
"source": [
"traffic.info()"
]
},
{
"cell_type": "markdown",
"id": "ce14e851",
"metadata": {},
"source": [
"## Data Cleaning"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e8b71cde",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" holiday temp rain_1h snow_1h clouds_all weather_main \\\n",
"0 None 288.28 0.0 0.0 40 Clouds \n",
"\n",
" weather_description date_time traffic_volume date \\\n",
"0 scattered clouds 2012-10-02 09:00:00 5545 2012-10-02 \n",
"\n",
" time \n",
"0 09:00:00 "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#splitting the date_time column into date and time\n",
"traffic[['date','time']] = traffic['date_time'].str.split(' ',expand=True)\n",
"traffic.head(1)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "3e5402a5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday',\n",
" 'Monday'], dtype=object)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a weekday column from the date_time column\n",
"traffic['weekday'] = pd.to_datetime(traffic['date_time']).dt.day_name()\n",
"# date_time is no longer needed once date, time and weekday exist\n",
"traffic.drop(columns=['date_time'], inplace=True)\n",
"\n",
"traffic['weekday'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "c2d9f7ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['morning', 'midday', 'afternoon', 'evening', 'midnight'],\n",
" dtype=object)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def stringToTime(timeString):\n",
"    \"\"\"Parse an 'HH:MM:SS' string into a datetime.time object.\"\"\"\n",
"    return datetime.strptime(timeString, '%H:%M:%S').time()\n",
"\n",
"midnight = stringToTime('00:00:00')\n",
"midday = stringToTime('12:00:00')\n",
"sixpm = stringToTime('18:00:00')\n",
"\n",
"# Create the time_of_day column from the time column\n",
"traffic['time_of_day'] = traffic['time'].apply(\n",
"    lambda x: 'morning' if midnight < stringToTime(x) < midday\n",
"    else ('afternoon' if midday < stringToTime(x) <= sixpm\n",
"    else ('evening' if stringToTime(x) > sixpm\n",
"    else ('midday' if stringToTime(x) == midday\n",
"    else ('midnight' if stringToTime(x) == midnight\n",
"    else x))))\n",
")\n",
"\n",
"# Convert the time column to an integer in HHMMSS form (e.g. 09:00:00 -> 90000)\n",
"traffic['time'] = traffic['time'].apply(lambda x: int(stringToTime(x).strftime(\"%H%M%S\")))\n",
"\n",
"traffic['time_of_day'].unique()"
]
},
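{
"cell_type": "markdown",
"id": "tod-sketch-note",
"metadata": {},
"source": [
"The time-of-day rules from the Questions section can also be written as vectorized conditions. The cell below is a minimal sketch for cross-checking the labels, not part of the original analysis. It assumes `traffic['time']` already holds `HHMMSS` integers (as produced by the previous cell) and that `numpy` is imported as `np`. The name `time_of_day_check` is illustrative. The final expression should evaluate to `True` if both approaches agree on every row."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "tod-sketch-code",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical cross-check; assumes traffic['time'] holds HHMMSS integers\n",
"t = traffic['time']\n",
"\n",
"conditions = [\n",
"    t == 0,                        # midnight (00:00:00)\n",
"    t == 120000,                   # midday (12:00:00)\n",
"    (t > 0) & (t < 120000),        # morning\n",
"    (t > 120000) & (t <= 180000),  # afternoon, up to and including 18:00:00\n",
"    t > 180000,                    # evening\n",
"]\n",
"labels = ['midnight', 'midday', 'morning', 'afternoon', 'evening']\n",
"\n",
"# Rows matching none of the conditions are labelled 'unknown'\n",
"time_of_day_check = np.select(conditions, labels, default='unknown')\n",
"\n",
"# Compare with the labels assigned by the apply-based cell\n",
"(traffic['time_of_day'] == time_of_day_check).all()"
]
},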
{
"cell_type": "code",
"execution_count": 11,
"id": "d01a7a20",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" holiday temp rain_1h snow_1h clouds_all weather_main \\\n",
"0 None 288.28 0.0 0.0 40 Clouds \n",
"1 None 289.36 0.0 0.0 75 Clouds \n",
"2 None 289.58 0.0 0.0 90 Clouds \n",
"3 None 290.13 0.0 0.0 90 Clouds \n",
"4 None 291.14 0.0 0.0 75 Clouds \n",
"5 None 291.72 0.0 0.0 1 Clear \n",
"6 None 293.17 0.0 0.0 1 Clear \n",
"7 None 293.86 0.0 0.0 1 Clear \n",
"8 None 294.14 0.0 0.0 20 Clouds \n",
"9 None 293.10 0.0 0.0 20 Clouds \n",
"10 None 290.97 0.0 0.0 20 Clouds \n",
"11 None 289.38 0.0 0.0 1 Clear \n",
"12 None 288.61 0.0 0.0 1 Clear \n",
"13 None 287.16 0.0 0.0 1 Clear \n",
"14 None 285.45 0.0 0.0 1 Clear \n",
"15 None 284.63 0.0 0.0 1 Clear \n",
"16 None 283.47 0.0 0.0 1 Clear \n",
"17 None 281.18 0.0 0.0 1 Clear \n",
"18 None 281.09 0.0 0.0 1 Clear \n",
"19 None 279.53 0.0 0.0 1 Clear \n",
"20 None 278.62 0.0 0.0 1 Clear \n",
"21 None 278.23 0.0 0.0 1 Clear \n",
"22 None 278.12 0.0 0.0 1 Clear \n",
"23 None 282.48 0.0 0.0 1 Clear \n",
"24 None 291.97 0.0 0.0 1 Clear \n",
"25 None 293.23 0.0 0.0 1 Clear \n",
"26 None 294.31 0.0 0.0 1 Clear \n",
"27 None 295.17 0.0 0.0 1 Clear \n",
"28 None 295.13 0.0 0.0 1 Clear \n",
"29 None 293.66 0.0 0.0 20 Clouds \n",
"30 None 290.65 0.0 0.0 20 Clouds \n",
"31 None 288.19 0.0 0.0 20 Clouds \n",
"32 None 287.10 0.0 0.0 1 Clear \n",
"33 None 286.25 0.0 0.0 1 Clear \n",
"34 None 285.26 0.0 0.0 1 Clear \n",
"35 None 284.55 0.0 0.0 1 Clear \n",
"36 None 283.47 0.0 0.0 1 Clear \n",
"37 None 283.17 0.0 0.0 1 Clear \n",
"38 None 282.04 0.0 0.0 1 Clear \n",
"39 None 281.69 0.0 0.0 1 Clear \n",
"40 None 281.32 0.0 0.0 1 Clear \n",
"41 None 280.74 0.0 0.0 1 Clear \n",
"42 None 280.57 0.0 0.0 1 Clear \n",
"43 None 281.86 0.0 0.0 1 Clear \n",
"44 None 284.98 0.0 0.0 1 Clear \n",
"45 None 289.18 0.0 0.0 1 Clear \n",
"46 None 291.55 0.0 0.0 1 Clear \n",
"47 None 294.97 0.0 0.0 1 Clear \n",
"48 None 296.38 0.0 0.0 1 Clear \n",
"49 None 297.32 0.0 0.0 1 Clear \n",
"50 None 298.17 0.0 0.0 1 Clear \n",
"51 None 298.06 0.0 0.0 20 Clouds \n",
"52 None 297.67 0.0 0.0 20 Clouds \n",
"53 None 296.36 0.0 0.0 40 Clouds \n",
"54 None 293.85 0.0 0.0 40 Clouds \n",
"55 None 292.43 0.0 0.0 40 Clouds \n",
"56 None 291.77 0.0 0.0 75 Clouds \n",
"57 None 291.36 0.0 0.0 75 Clouds \n",
"58 None 291.12 0.0 0.0 75 Clouds \n",
"59 None 290.63 0.0 0.0 75 Clouds \n",
"\n",
" weather_description traffic_volume date time weekday \\\n",
"0 scattered clouds 5545 2012-10-02 90000 Tuesday \n",
"1 broken clouds 4516 2012-10-02 100000 Tuesday \n",
"2 overcast clouds 4767 2012-10-02 110000 Tuesday \n",
"3 overcast clouds 5026 2012-10-02 120000 Tuesday \n",
"4 broken clouds 4918 2012-10-02 130000 Tuesday \n",
"5 sky is clear 5181 2012-10-02 140000 Tuesday \n",
"6 sky is clear 5584 2012-10-02 150000 Tuesday \n",
"7 sky is clear 6015 2012-10-02 160000 Tuesday \n",
"8 few clouds 5791 2012-10-02 170000 Tuesday \n",
"9 few clouds 4770 2012-10-02 180000 Tuesday \n",
"10 few clouds 3539 2012-10-02 190000 Tuesday \n",
"11 sky is clear 2784 2012-10-02 200000 Tuesday \n",
"12 sky is clear 2361 2012-10-02 210000 Tuesday \n",
"13 sky is clear 1529 2012-10-02 220000 Tuesday \n",
"14 sky is clear 963 2012-10-02 230000 Tuesday \n",
"15 sky is clear 506 2012-10-03 0 Wednesday \n",
"16 sky is clear 321 2012-10-03 10000 Wednesday \n",
"17 sky is clear 273 2012-10-03 20000 Wednesday \n",
"18 sky is clear 367 2012-10-03 30000 Wednesday \n",
"19 sky is clear 814 2012-10-03 40000 Wednesday \n",
"20 sky is clear 2718 2012-10-03 50000 Wednesday \n",
"21 sky is clear 5673 2012-10-03 60000 Wednesday \n",
"22 sky is clear 6511 2012-10-03 80000 Wednesday \n",
"23 sky is clear 5471 2012-10-03 90000 Wednesday \n",
"24 sky is clear 5097 2012-10-03 120000 Wednesday \n",
"25 sky is clear 4887 2012-10-03 130000 Wednesday \n",
"26 sky is clear 5337 2012-10-03 140000 Wednesday \n",
"27 sky is clear 5692 2012-10-03 150000 Wednesday \n",
"28 sky is clear 6137 2012-10-03 160000 Wednesday \n",
"29 few clouds 4623 2012-10-03 180000 Wednesday \n",
"30 few clouds 3591 2012-10-03 190000 Wednesday \n",
"31 few clouds 2898 2012-10-03 200000 Wednesday \n",
"32 sky is clear 2637 2012-10-03 210000 Wednesday \n",
"33 sky is clear 1777 2012-10-03 220000 Wednesday \n",
"34 sky is clear 1015 2012-10-03 230000 Wednesday \n",
"35 sky is clear 598 2012-10-04 0 Thursday \n",
"36 sky is clear 369 2012-10-04 10000 Thursday \n",
"37 sky is clear 312 2012-10-04 20000 Thursday \n",
"38 sky is clear 367 2012-10-04 30000 Thursday \n",
"39 sky is clear 835 2012-10-04 40000 Thursday \n",
"40 sky is clear 2726 2012-10-04 50000 Thursday \n",
"41 sky is clear 5689 2012-10-04 60000 Thursday \n",
"42 sky is clear 6990 2012-10-04 70000 Thursday \n",
"43 sky is clear 5985 2012-10-04 80000 Thursday \n",
"44 sky is clear 5309 2012-10-04 90000 Thursday \n",
"45 sky is clear 4603 2012-10-04 100000 Thursday \n",
"46 sky is clear 4884 2012-10-04 110000 Thursday \n",
"47 sky is clear 5104 2012-10-04 120000 Thursday \n",
"48 sky is clear 5178 2012-10-04 130000 Thursday \n",
"49 sky is clear 5501 2012-10-04 140000 Thursday \n",
"50 sky is clear 5713 2012-10-04 150000 Thursday \n",
"51 few clouds 6292 2012-10-04 160000 Thursday \n",
"52 few clouds 6057 2012-10-04 170000 Thursday \n",
"53 scattered clouds 4907 2012-10-04 180000 Thursday \n",
"54 scattered clouds 3503 2012-10-04 190000 Thursday \n",
"55 scattered clouds 3037 2012-10-04 200000 Thursday \n",
"56 broken clouds 2822 2012-10-04 210000 Thursday \n",
"57 broken clouds 1992 2012-10-04 220000 Thursday \n",
"58 broken clouds 1166 2012-10-04 230000 Thursday \n",
"59 broken clouds 627 2012-10-05 0 Friday \n",
"\n",
" time_of_day \n",
"0 morning \n",
"1 morning \n",
"2 morning \n",
"3 midday \n",
"4 afternoon \n",
"5 afternoon \n",
"6 afternoon \n",
"7 afternoon \n",
"8 afternoon \n",
"9 afternoon \n",
"10 evening \n",
"11 evening \n",
"12 evening \n",
"13 evening \n",
"14 evening \n",
"15 midnight \n",
"16 morning \n",
"17 morning \n",
"18 morning \n",
"19 morning \n",
"20 morning \n",
"21 morning \n",
"22 morning \n",
"23 morning \n",
"24 midday \n",
"25 afternoon \n",
"26 afternoon \n",
"27 afternoon \n",
"28 afternoon \n",
"29 afternoon \n",
"30 evening \n",
"31 evening \n",
"32 evening \n",
"33 evening \n",
"34 evening \n",
"35 midnight \n",
"36 morning \n",
"37 morning \n",
"38 morning \n",
"39 morning \n",
"40 morning \n",
"41 morning \n",
"42 morning \n",
"43 morning \n",
"44 morning \n",
"45 morning \n",
"46 morning \n",
"47 midday \n",
"48 afternoon \n",
"49 afternoon \n",
"50 afternoon \n",
"51 afternoon \n",
"52 afternoon \n",
"53 afternoon \n",
"54 evening \n",
"55 evening \n",
"56 evening \n",
"57 evening \n",
"58 evening \n",
"59 midnight "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"traffic.head(60)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "234ed0f7",
"metadata": {},
"source": [
"## Analysis\n",
"\n",
"1. Which holidays have the most traffic?"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "25fe655c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" holiday traffic_volume\n",
"6 New Years Day 8136\n",
"3 Labor Day 7092\n",
"9 Thanksgiving Day 5601\n",
"5 Memorial Day 5538\n",
"2 Independence Day 5380\n",
"0 Christmas Day 4965\n",
"4 Martin Luther King Jr Day 3676\n",
"10 Veterans Day 3457\n",
"11 Washingtons Birthday 3176\n",
"8 State Fair 3174\n",
"1 Columbus Day 2597"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"top_holidays=traffic.groupby(\"holiday\")[\"traffic_volume\"].sum().reset_index().sort_values(by=\"traffic_volume\",ascending=False)\n",
"top_holidays=top_holidays[top_holidays.holiday != 'None']\n",
"top_holidays"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "6f96ece4",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
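"fig = plt.figure(figsize=(12,5))\n",
"plt.title(\"Top Holidays by Traffic_Volume\")\n",
"# Plot the four holidays with the highest total traffic volume\n",
"sns.barplot(data=top_holidays.head(4), y=\"holiday\", x=\"traffic_volume\", palette='Blues_d')\n",
"\n",
"# plt.show() avoids the non-GUI backend warning raised by fig.show()\n",
"plt.show()"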
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "96c5abcf",
"metadata": {},
"source": [
"2. Which weekdays have the most traffic?"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "2cf7073a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" weekday traffic_volume\n",
"0 Friday 24994869\n",
"6 Wednesday 24831553\n",
"4 Thursday 24799562\n",
"5 Tuesday 23882653\n",
"1 Monday 23403986\n",
"2 Saturday 18946722\n",
"3 Sunday 16276939"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"top_weekdays=traffic.groupby('weekday')[\"traffic_volume\"].sum().reset_index().sort_values(by=\"traffic_volume\",ascending=False)\n",
"top_weekdays"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "a1759ba3",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
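"fig = plt.figure(figsize=(5,5))\n",
"plt.title(\"Top Weekdays by Traffic_Volume\")\n",
"# Plot the four weekdays with the highest total traffic volume\n",
"sns.barplot(data=top_weekdays.head(4), x=\"weekday\", y=\"traffic_volume\")\n",
"\n",
"# plt.show() avoids the non-GUI backend warning raised by fig.show()\n",
"plt.show()"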
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "13f2df38",
"metadata": {},
"source": [
"3. What time of day has the most traffic? (Is it morning, midday, afternoon, evening, or midnight?)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "643abff2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" time_of_day traffic_volume\n",
"4 morning 62684570\n",
"0 afternoon 58819695\n",
"1 evening 24707307\n",
"2 midday 9224263\n",
"3 midnight 1700449"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"top_hours=traffic.groupby('time_of_day')[\"traffic_volume\"].sum().reset_index().sort_values(by=\"traffic_volume\",ascending=False)\n",
"top_hours"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "063b579c",
"metadata": {},
"source": [
"4. Which factor affects traffic the most?"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "2a42125f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"traffic_volume 1.000000\n",
"time 0.352401\n",
"temp 0.130299\n",
"clouds_all 0.067054\n",
"rain_1h 0.004714\n",
"snow_1h 0.000733\n",
"Name: traffic_volume, dtype: float64"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
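"# Correlate numeric columns only; newer pandas versions raise an error on text columns\n",
"corr_matrix=traffic.select_dtypes(include='number').corr()\n",
"corr_matrix['traffic_volume'].sort_values(ascending=False)"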
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "2b3bc9d3",
"metadata": {},
"source": [
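"No variable has a strong linear correlation with traffic_volume.\n",
"\n",
"Among the numeric variables, time has the strongest linear correlation with traffic volume (about 0.35), followed by temp (about 0.13); clouds_all, rain_1h and snow_1h are close to zero. A weak linear correlation does not mean traffic is independent of these variables, only that any relationship is not well described by a straight line.\n",
"\n",
"These results are consistent with our hypothesis that time affects traffic_volume the most, although correlation alone does not establish causation."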
]
},
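{
"cell_type": "markdown",
"id": "b7e41c02",
"metadata": {},
"source": [
"A linear correlation coefficient can understate a relationship that rises and falls over the day. As a rough sketch, the cell below averages traffic_volume per hour. It assumes the `time` column stores HHMMSS as an integer (for example, 160000 for 16:00), as the earlier table output suggests; `hour` and `hourly_mean` are illustrative names that do not appear elsewhere in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c8f52d13",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: mean traffic volume per hour of day (assumes `time` is an HHMMSS integer)\n",
"hour = traffic[\"time\"] // 10000\n",
"hourly_mean = traffic.groupby(hour)[\"traffic_volume\"].mean()\n",
"hourly_mean.sort_values(ascending=False).head()"
]
},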
{
"attachments": {},
"cell_type": "markdown",
"id": "67659970",
"metadata": {},
"source": [
"5. Compare rain, snow and temperature based on traffic_volume"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d56cbf1e",
"metadata": {},
"source": [
"6. What's the highest recorded traffic?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "a471549c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" holiday temp rain_1h snow_1h clouds_all weather_main \\\n",
"31615 None 270.75 0.0 0.0 90 Clouds \n",
"\n",
" weather_description traffic_volume date time weekday \\\n",
"31615 overcast clouds 7280 2017-03-09 160000 Thursday \n",
"\n",
" time_of_day \n",
"31615 afternoon "
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"top_traffic = traffic.sort_values('traffic_volume', ascending=False)\n",
"top_traffic.head(1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a64934ba",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
},
"vscode": {
"interpreter": {
"hash": "0bc6de047e58dac5738e6fd71455bb891a228ea6d3201e9c8a63259fc0a2df17"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}