### 1. Project Overview

I obtained the data for this project from Kaggle; it can be found [here](https://www.kaggle.com/rohanrao/formula-1-world-championship-1950-2020).

We'll be focusing on the main race on Sunday.

In this project, I use several metrics to compare the two McLaren drivers in the 2007 season: Fernando Alonso (from Spain) and Lewis Hamilton (from the UK). A two-time world champion, Alonso has driven in F1 since 2001 and is known for extracting strong performance from poorly engineered cars. Conversely, this is Hamilton's first season in Formula 1; he previously raced in GP2 (a former F1 feeder series).

Hamilton (22 years old in 2007) is younger and therefore less experienced than Alonso (26 years old in 2007), so we would expect Alonso to perform better.

The metrics we'll be using to evaluate each driver will be their ability to:
 - Begin towards the front of the pack (good grid/starting positioning).
 - Maintain or improve their position on each lap of a given race.
 - Win races.
 - Win points.
 
Wins and points will be weighted more heavily since they directly dictate a driver's standing in a season. However, grid and lap positioning heavily influence these two factors and can help uncover nuances between drivers who have a similar number of wins/points.

### 2. Data Cleaning

Data cleaning mostly consisted of flagging outliers, setting placeholders for missing values, checking consistency between related variables, and renaming columns. After the files were cleaned, I used SQL to join relevant fields from different tables and then made some final changes prior to this analysis.
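
As a rough illustration of these cleaning steps (not the exact code used for this project), the sketch below works on a small made-up table. The column names, the `\N` missing-value marker, the `-1` placeholder, and the "three times the median" outlier rule are all assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical raw rows; names and values are made up for illustration.
raw = pd.DataFrame({
    "raceId": [1, 1, 2, 2],
    "driverRef": ["alonso", "hamilton", "alonso", "hamilton"],
    "grid": [2, 4, 1, 2],
    "positionOrder": ["2", "3", "\\N", "1"],  # "\N" assumed to mark a missing value
    "milliseconds": [5400000, 5410000, 5300000, 99000000],
})

# Rename columns to friendlier names (assumed naming scheme).
clean = raw.rename(columns={"driverRef": "driver", "positionOrder": "position"})

# Set a placeholder for missing values: anything non-numeric becomes -1.
clean["position"] = (
    pd.to_numeric(clean["position"], errors="coerce").fillna(-1).astype(int)
)

# Flag outliers: race times more than 3x the median (an arbitrary threshold).
median_ms = clean["milliseconds"].median()
clean["time_outlier"] = clean["milliseconds"] > 3 * median_ms

# Consistency check: no two drivers should share a grid slot in the same race.
has_duplicate_grid = clean.duplicated(subset=["raceId", "grid"]).any()
```

Flagging outliers in a separate column, rather than dropping them, keeps every race available for the later per-race comparisons.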

### 3. Preparation for Analysis

We'll start out by importing the relevant libraries and data. Please note you won't see anything in this section unless you toggle on the raw code.

### 4. Grid Positioning

Right before the actual race begins, all the drivers line up on the starting grid. The order of this grid is determined by each driver's performance in qualifying. We consider a driver to have outperformed his teammate in a given race if his grid position is ahead of his teammate's.
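
The comparison rule can be sketched as follows. This is a minimal example on made-up grid positions, not the 2007 data; the frame and the `grid_dif` and `better_grid` column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up grid positions for three races (one column per driver).
grid = pd.DataFrame({
    "race": ["Race A", "Race B", "Race C"],
    "Fernando Alonso": [2, 1, 4],
    "Lewis Hamilton": [4, 2, 1],
})

# Negative difference: Alonso started ahead; positive: Hamilton started ahead.
grid["grid_dif"] = grid["Fernando Alonso"] - grid["Lewis Hamilton"]

# Two drivers cannot share a grid slot, so the difference is never zero.
grid["better_grid"] = np.where(grid["grid_dif"] < 0, "Alonso", "Hamilton")

# Number of races in which each driver started ahead of his teammate.
counts = grid["better_grid"].value_counts()
```

Since a lower grid number means a better starting position, the sign of the difference is enough to decide which driver came out ahead in qualifying.
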

```python
# difference in positions
ham_2007_pos_piv['end_dif'] = ham_2007_pos_piv['position']['Fernando Alonso'] - ham_2007_pos_piv['position']['Lewis Hamilton']

# add column to indicate which driver finished ahead
ham_2007_pos_piv['Leading_Driver'] = np.where(ham_2007_pos_piv.end_dif < 0 , 'Alonso', 'Hamilton')

fig = px.bar(ham_2007_pos_piv, x='race', y='end_dif',
             labels={'race': 'Race', 'end_dif': 'Difference'},
             color='end_dif', hover_data=['race', 'Leading_Driver'],
             title='Number of Positions Hamilton Finished Ahead of Alonso (2007 Season)')
fig.show()
```

In the two plots above, we notice that:
 - Hamilton finished ahead of Alonso in 7 of the 17 races, as seen in the pie chart above.
 - Alonso typically finished ahead of his teammate by only a few positions, as shown by the height of the bars below the x-axis in the graph above the pie chart.
 - In 60% of the races where Alonso finished ahead of his teammate, his advantage was a single position.
 - Conversely, Hamilton tended to put more space between himself and Alonso when he was the leading McLaren driver.
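
Figures like the ones in this list can be derived from the position differences alone. The sketch below uses made-up differences (Alonso's finishing position minus Hamilton's), so its numbers do not match the 2007 results; the variable names are hypothetical.

```python
import pandas as pd

# Made-up differences: negative means Alonso finished ahead.
end_dif = pd.Series([-1, -1, -3, 2, 5, -1, 4], name="end_dif")

alonso_ahead = end_dif[end_dif < 0]
hamilton_ahead = end_dif[end_dif > 0]

# Share of Alonso's "ahead" races decided by exactly one position.
share_one_place = (alonso_ahead == -1).mean()

# Typical gap, in positions, when each driver was the leading McLaren.
alonso_median_gap = alonso_ahead.abs().median()
hamilton_median_gap = hamilton_ahead.median()
```

Comparing the median gaps is one simple way to express "more space between the drivers"; the mean would work too, but it is more sensitive to a single unusual race.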
 
Evidently, we have favorable findings for both drivers.
The box plots lead to these conclusions:
 - Hamilton's median finishing position is 2nd place, better than Alonso's median of 3rd place.
 - Hamilton's box plot is skewed right: most of his finishes were close to first place, but in a few races he finished unusually far back.
 - Alonso's box plot is also slightly skewed right, though less so than Hamilton's. This means his finishes were more evenly split between positions near the front and positions farther back.
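
The quantities a box plot displays can also be computed directly. The following is a minimal sketch on made-up finishing positions (not the 2007 results); it uses "mean above median" as a rough heuristic for right skew, which is an assumption of this example rather than a formal test.

```python
import pandas as pd

# Made-up finishing positions for five races.
finishes = pd.DataFrame({
    "Lewis Hamilton": [1, 2, 2, 3, 9],
    "Fernando Alonso": [2, 3, 3, 4, 7],
})

# Median finishing position per driver (the line inside each box).
medians = finishes.median()

# Quartiles per driver (the edges of each box).
quartiles = finishes.quantile([0.25, 0.5, 0.75])

# Rough right-skew check: a few poor finishes pull the mean above the median.
right_skewed = finishes.mean() > finishes.median()
```

In this toy data both drivers come out right-skewed, with one poor result each dragging the mean upward while the median stays near the front.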

While Alonso achieved better placement than his teammate in more races, he usually finished ahead by only a single position. However, when it comes to podium placement, the most important component of final position, Hamilton holds the crown. Furthermore, the descriptive statistics show that Hamilton tended to finish in better positions. While the comparison in this metric is much closer than in the previous ones, __Hamilton comes out as the superior driver__ in terms of final positioning.

### 6b. End of Race - Points Earned

As mentioned earlier, a driver's points determine where they place in the overall standings at the end of a season. While positioning and points are directly linked, we can still explore the subtleties to draw further distinctions between Alonso and Hamilton.
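
To make the link between positions and points concrete: in 2007 the top eight finishers scored on a 10-8-6-5-4-3-2-1 scale. The sketch below applies that scale to made-up finishing positions. The frame and variable names are hypothetical, and the dataset's own points values should take precedence over this mapping.

```python
import pandas as pd

# 2007 scoring scale: points for finishing positions 1 through 8.
POINTS_2007 = {1: 10, 2: 8, 3: 6, 4: 5, 5: 4, 6: 3, 7: 2, 8: 1}

# Made-up finishing positions for three races.
finishes = pd.DataFrame({
    "race": ["Race A", "Race B", "Race C"],
    "Lewis Hamilton": [1, 3, 9],
    "Fernando Alonso": [2, 1, 7],
}).set_index("race")

# Map each position to points; positions outside the top eight score zero.
points = finishes.apply(lambda col: col.map(POINTS_2007).fillna(0).astype(int))

# Season totals per driver for this toy example.
totals = points.sum()
```

Because the scale is not linear, two drivers with similar average finishing positions can still end up with different point totals.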
