{ "cells": [ { "cell_type": "markdown", "id": "b196af77", "metadata": {}, "source": [ "## A study into success and popularity in the NBA\n", "\n", "#### Calvin Muller" ] }, { "cell_type": "markdown", "id": "70f45457", "metadata": {}, "source": [ "### Asking a question\n", "\n", "When thinking about any sports team, the underlying assumption is that the better the team gets, the more popular it becomes. This makes sense: people like to talk about winners. However, could it also be true that people want to talk about a team that is extremely bad? This is the question I am investigating: how does the number of wins a team has affect the popularity of that team? My hypothesis going in is that teams at either extreme, with very many wins or very many losses, will be talked about more than teams that are just mediocre.\n", "\n", "\n", "### Gathering the data\n", "\n", "To find this data, I used trends.google.com, where Google tracks how much interest there is in a certain topic over time. It provides search data over the last five years for each of the NBA teams I wanted data on (Eastern Conference teams), which is the first half of the data I need. The other half is the success of the teams, so I manually entered win data for each of these teams over this five-year period. This will hopefully give me data that I can plot to look for a correlation between these two variables. All of this data was gathered into an Excel spreadsheet, which I exported as a CSV file.\n", "\n", "In addition to the data from these two sources, I created my own variable for extremity and added it to the CSV, which I called \"Extremity Score.\" This is simply the absolute value of a team's wins minus 41 wins, which is the NBA average. I then created a second, more subjective extremity score that uses 32 wins instead, on the thought that this is a better indicator of what a mediocre team looks like." 
] }, { "cell_type": "markdown", "id": "7b7df4bd", "metadata": {}, "source": [ "## Beginning to work with the data\n", "\n", "#### Bringing in the data\n", "\n", "Below I load the CSV data I gathered, first into a regular pandas DataFrame and then into a pandas pivot table, so that the data is grouped by team and easier to read. Some data cleaning also happens here." ] }, { "cell_type": "code", "execution_count": 1, "id": "28a94be7", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "id": "24f5a6de", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
|  | Years | Searches | Wins | Team | Extremity Score | Extremity Score (32 wins) |
|---|---|---|---|---|---|---|
| 1 | 2017.0 | 1237.0 | 51.0 | Cavaliers | 10.0 | 19.0 |
| 2 | 2018.0 | 1072.0 | 50.0 | Cavaliers | 9.0 | 18.0 |
| 3 | 2019.0 | 234.0 | 19.0 | Cavaliers | 22.0 | 13.0 |
| 4 | 2020.0 | 179.0 | 19.0 | Cavaliers | 22.0 | 13.0 |
| 5 | 2021.0 | 255.0 | 22.0 | Cavaliers | 19.0 | 10.0 |

| Team | Searches | Wins | Extremity Score | Extremity Score (32 wins) |
|---|---|---|---|---|
| Bucks | 169.0 | 42.0 | 1.0 | 10.0 |
|  | 218.0 | 44.0 | 3.0 | 12.0 |
|  | 272.0 | 56.0 | 15.0 | 24.0 |
|  | 478.0 | 60.0 | 19.0 | 28.0 |
|  | 701.0 | 46.0 | 5.0 | 14.0 |
| ... | ... | ... | ... | ... |
| Wizards | 583.0 | 25.0 | 16.0 | 7.0 |
|  | 905.0 | 32.0 | 9.0 | 0.0 |
|  | 1106.0 | 43.0 | 2.0 | 11.0 |
|  | 1204.0 | 34.0 | 7.0 | 2.0 |
|  | 1447.0 | 49.0 | 8.0 | 17.0 |

75 rows × 0 columns
\n", "