{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "02rFSC_-6zKz"
},
"source": [
"## Assignment 06\n",
"### Note:\n",
"- For visualization, you should use Matplotlib, seaborn, or Plotly Express\n",
"- Use this notebook as your template and follow the instructions\n",
"\n",
"The first half of this assignment is similar to assignment 04.\n",
"\n",
"This gives you a chance to refresh.\n",
"\n",
"The second half is new and gives you additional practice.\n",
"\n",
"You also get a chance to use some of the Python libraries and techniques.\n",
"\n",
"The links to the zip file are:\n",
"\n",
"- https://collegescorecard.ed.gov/data (This web page contains the link to the zip file)\n",
"\n",
"- https://ed-public-download.app.cloud.gov/downloads/CollegeScorecard_Raw_Data_09012022.zip (The direct link to the zip file)\n",
"\n",
"You can run the `!wget` command in a Colab notebook to retrieve the file directly, then run the `!unzip` command to extract its contents (the cells for both steps are provided below for your convenience).\n",
"\n",
"Your folder structure should look like this in your Colab environment:\n",
"\n",
"- ...\n",
"- 'MERGED1996_97_PP.csv',\n",
"- 'MERGED2015_16_PP.csv',\n",
"- ...\n",
"- 'MERGED2017_18_PP.csv'\n",
"- ...\n",
"\n",
"**Note: you should refresh the folder to see all files.**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RemUiHBXf5EK"
},
"source": [
"## Step 1 - Retrieve the zip file"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "IlnBrYreEoY2",
"outputId": "b4e7bf52-5594-40c0-98ea-6b3f90a69436"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2022-10-07 14:45:58-- https://ed-public-download.app.cloud.gov/downloads/CollegeScorecard_Raw_Data_09012022.zip\n",
"Resolving ed-public-download.app.cloud.gov (ed-public-download.app.cloud.gov)... 3.30.138.208, 160.1.161.208, 2600:1f12:18a:7d01:ad67:f64c:95d6:78ed, ...\n",
"Connecting to ed-public-download.app.cloud.gov (ed-public-download.app.cloud.gov)|3.30.138.208|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 410294884 (391M) [application/zip]\n",
"Saving to: ‘CollegeScorecard_Raw_Data_09012022.zip’\n",
"\n",
"CollegeScorecard_Ra 100%[===================>] 391.29M 44.1MB/s in 9.2s \n",
"\n",
"2022-10-07 14:46:08 (42.6 MB/s) - ‘CollegeScorecard_Raw_Data_09012022.zip’ saved [410294884/410294884]\n",
"\n"
]
}
],
"source": [
"!wget https://ed-public-download.app.cloud.gov/downloads/CollegeScorecard_Raw_Data_09012022.zip"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mY5iKlaef_E3"
},
"source": [
"## Step 2 - Unzip the zip file"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "WXcix8Q4EWzN",
"outputId": "cdaf7075-844f-4e75-94b9-cb09c7c29285"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Archive: CollegeScorecard_Raw_Data_09012022.zip\n",
" extracting: Crosswalks.zip \n",
" inflating: data.yaml \n",
" inflating: FieldOfStudyData1415_1516_PP.csv \n",
" inflating: FieldOfStudyData1516_1617_PP.csv \n",
" inflating: FieldOfStudyData1617_1718_PP.csv \n",
" inflating: FieldOfStudyData1718_1819_PP.csv \n",
" inflating: MERGED1996_97_PP.csv \n",
" inflating: MERGED1997_98_PP.csv \n",
" inflating: MERGED1998_99_PP.csv \n",
" inflating: MERGED1999_00_PP.csv \n",
" inflating: MERGED2000_01_PP.csv \n",
" inflating: MERGED2001_02_PP.csv \n",
" inflating: MERGED2002_03_PP.csv \n",
" inflating: MERGED2003_04_PP.csv \n",
" inflating: MERGED2004_05_PP.csv \n",
" inflating: MERGED2005_06_PP.csv \n",
" inflating: MERGED2006_07_PP.csv \n",
" inflating: MERGED2007_08_PP.csv \n",
" inflating: MERGED2008_09_PP.csv \n",
" inflating: MERGED2009_10_PP.csv \n",
" inflating: MERGED2010_11_PP.csv \n",
" inflating: MERGED2011_12_PP.csv \n",
" inflating: MERGED2012_13_PP.csv \n",
" inflating: MERGED2013_14_PP.csv \n",
" inflating: MERGED2014_15_PP.csv \n",
" inflating: MERGED2015_16_PP.csv \n",
" inflating: MERGED2016_17_PP.csv \n",
" inflating: MERGED2017_18_PP.csv \n",
" inflating: MERGED2018_19_PP.csv \n",
" inflating: MERGED2019_20_PP.csv \n",
" inflating: MERGED2020_21_PP.csv \n",
" inflating: Most-Recent-Cohorts-Field-of-Study.csv \n",
" inflating: Most-Recent-Cohorts-Institution.csv \n"
]
}
],
"source": [
"!unzip CollegeScorecard_Raw_Data_09012022.zip"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3gWAI-4_gCs2"
},
"source": [
"## Step 3 - Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "CfSasSG36zK1"
},
"outputs": [],
"source": [
"#(Write code below)\n",
"import pandas as pd\n",
"import os"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "iMpA2gPdub69",
"outputId": "3e11affd-533e-4673-b089-ce058ee82270"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM \\\n",
"1618 172927 American Indian OIC Inc \n",
"1251 157669 Empire Beauty School-Dixie \n",
"5117 457527 Ambria College of Nursing \n",
"2875 213631 United Lutheran Seminary \n",
"6098 492689 Texas Tech University Health Sciences Center-E... \n",
"4688 444936 Turning Point Beauty College \n",
"5024 455354 Aveda Arts & Sciences Institute-San Antonio \n",
"1567 169983 Kettering University \n",
"1291 159009 Grambling State University \n",
"4687 447865 Trendsetters School of Beauty & Barbering \n",
"\n",
" TUITIONFEE_IN year \n",
"1618 NaN 2020 \n",
"1251 NaN 2020 \n",
"5117 NaN 2019 \n",
"2875 NaN 2019 \n",
"6098 NaN 2019 \n",
"4688 NaN 2019 \n",
"5024 NaN 2019 \n",
"1567 44380.0 2019 \n",
"1291 7683.0 2019 \n",
"4687 NaN 2020 "
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This is not a good practice. A lot of repetitions.\n",
"# Not recommended\n",
"\n",
"df_2020 = pd.read_csv(\"/content/MERGED2020_21_PP.csv\", usecols=['UNITID', 'INSTNM', 'TUITIONFEE_IN'])\n",
"df_2020[\"year\"] = 2020\n",
"df_2019 = pd.read_csv(\"/content/MERGED2019_20_PP.csv\", usecols=['UNITID', 'INSTNM', 'TUITIONFEE_IN'])\n",
"df_2019[\"year\"] = 2019\n",
"#.... (more code)\n",
"# df_2020 = pd.read_csv(\"/content/MERGED2020_21_PP.csv\")\n",
"# df_all = pd.concat([df_1996, df_1997, ...., df_2020])\n",
"df_all = pd.concat([df_2020, df_2019])\n",
"\n",
"df_all.sample(10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zNVWs_R9gKrr"
},
"source": [
"## Step 4 - Display the current working directory using os.getcwd()\n",
"\n",
"You need to import a standard Python library called os, which stands for operating system, so place that import statement in the previous cell. Since your notebook and your data files may or may not be in the same folder, you want to confirm what the current working folder is and know how to access a data file in a different folder."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 36
},
"id": "duwKxi1L6zK5",
"outputId": "2eee998e-d4f3-48fe-faf8-a6ff4bbb9a14"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'/content'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#(Write code below)\n",
"os.getcwd()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SxFHpsKSgxzv"
},
"source": [
"## Step 5 - Get the list of file names\n",
"\n",
"The os library has a function called listdir that returns a list of the entries in a directory/folder. Use it to assign the contents (list of file names) of the data folder to a variable and display it. If necessary, you can use the ../ construct to traverse to the parent folder and then to another folder parallel to the current folder.\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LMBN0HsY6zLA",
"outputId": "6e38f706-5468-4029-d033-ada5c2b96ba1"
},
"outputs": [
{
"data": {
"text/plain": [
"['.config',\n",
" 'MERGED2014_15_PP.csv',\n",
" 'MERGED2005_06_PP.csv',\n",
" 'MERGED2004_05_PP.csv',\n",
" 'Most-Recent-Cohorts-Institution.csv',\n",
" 'MERGED2000_01_PP.csv',\n",
" 'FieldOfStudyData1617_1718_PP.csv',\n",
" 'MERGED2019_20_PP.csv',\n",
" 'MERGED2015_16_PP.csv',\n",
" 'MERGED2018_19_PP.csv',\n",
" 'data.yaml',\n",
" 'MERGED2007_08_PP.csv',\n",
" 'FieldOfStudyData1516_1617_PP.csv',\n",
" 'MERGED2008_09_PP.csv',\n",
" 'FieldOfStudyData1415_1516_PP.csv',\n",
" 'MERGED2003_04_PP.csv',\n",
" 'MERGED2017_18_PP.csv',\n",
" 'MERGED2016_17_PP.csv',\n",
" 'MERGED2012_13_PP.csv',\n",
" 'MERGED2006_07_PP.csv',\n",
" 'MERGED1996_97_PP.csv',\n",
" 'MERGED2002_03_PP.csv',\n",
" 'FieldOfStudyData1718_1819_PP.csv',\n",
" 'MERGED2013_14_PP.csv',\n",
" 'CollegeScorecard_Raw_Data_09012022.zip',\n",
" 'MERGED2011_12_PP.csv',\n",
" 'MERGED2020_21_PP.csv',\n",
" 'Crosswalks.zip',\n",
" 'MERGED2009_10_PP.csv',\n",
" 'MERGED1997_98_PP.csv',\n",
" 'Most-Recent-Cohorts-Field-of-Study.csv',\n",
" 'MERGED1999_00_PP.csv',\n",
" 'MERGED2010_11_PP.csv',\n",
" 'MERGED1998_99_PP.csv',\n",
" 'MERGED2001_02_PP.csv',\n",
" 'sample_data']"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#(Write code below)\n",
"name_list = os.listdir()\n",
"name_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XtKgf-SMhL0z"
},
"source": [
"## Step 6 - Process only the yearly data files\n",
"\n",
"The folder contains files that are not the yearly data files. Write code to remove the unwanted files from the list variable.\n",
"\n",
"Note: don't remove/delete these files from the folder itself.\n",
"For example, use the file extension to keep only the csv files, or use the name pattern: each yearly data file name begins with \"MERGED\". You can use a list comprehension to do this in just *one* line of code, or use a for loop; your choice."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "UCDRqtrf6zLL",
"outputId": "754bfd17-99a5-42bc-c12a-03e2bfab4a11"
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 9, 36, 64]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#(Write code below)\n",
"# using loop\n",
"x = [1,3,6,8]\n",
"y = []\n",
"for num in x:\n",
"    y.append(num ** 2)\n",
"\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "OsTZMRfjzpwB",
"outputId": "1d0335ea-bd6b-4d4a-b94f-c0add1ff6858"
},
"outputs": [
{
"data": {
"text/plain": [
"[36, 64]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# using loop with condition\n",
"x = [1,3,6,8]\n",
"y = []\n",
"\n",
"for num in x:\n",
"    if num % 2 == 0:\n",
"        y.append(num ** 2)\n",
"\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "5eISR8FjyzyH",
"outputId": "89e807f1-7615-413e-dab4-bb2461ec1d71"
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 9, 36, 64]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# using List comprehension\n",
"y = [num **2 for num in x]\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "B6k0ES8GzZQ8",
"outputId": "e22dc168-3562-498a-dc28-89da130ba6d3"
},
"outputs": [
{
"data": {
"text/plain": [
"[36, 64]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# using List comprehension with condition\n",
"y = [num **2 for num in x if num % 2 == 0]\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "j08cIjcf0ZCw",
"outputId": "6802b89f-edcd-4554-e2a7-ff7be22e4308"
},
"outputs": [
{
"data": {
"text/plain": [
"['MERGED2014_15_PP.csv',\n",
" 'MERGED2005_06_PP.csv',\n",
" 'MERGED2004_05_PP.csv',\n",
" 'MERGED2000_01_PP.csv',\n",
" 'MERGED2019_20_PP.csv',\n",
" 'MERGED2015_16_PP.csv',\n",
" 'MERGED2018_19_PP.csv',\n",
" 'MERGED2007_08_PP.csv',\n",
" 'MERGED2008_09_PP.csv',\n",
" 'MERGED2003_04_PP.csv',\n",
" 'MERGED2017_18_PP.csv',\n",
" 'MERGED2016_17_PP.csv',\n",
" 'MERGED2012_13_PP.csv',\n",
" 'MERGED2006_07_PP.csv',\n",
" 'MERGED1996_97_PP.csv',\n",
" 'MERGED2002_03_PP.csv',\n",
" 'MERGED2013_14_PP.csv',\n",
" 'MERGED2011_12_PP.csv',\n",
" 'MERGED2020_21_PP.csv',\n",
" 'MERGED2009_10_PP.csv',\n",
" 'MERGED1997_98_PP.csv',\n",
" 'MERGED1999_00_PP.csv',\n",
" 'MERGED2010_11_PP.csv',\n",
" 'MERGED1998_99_PP.csv',\n",
" 'MERGED2001_02_PP.csv']"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"yearly_list = [file_name for file_name in name_list if file_name.startswith(\"MERGED\")]\n",
"\n",
"yearly_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UbrfZBRThtG8"
},
"source": [
"## Step 7 - Load data files\n",
" \n",
"Now that you have a clean list of the yearly files, you want to loop through them and read them into a dataframe one at a time. Load only six columns: [\"UNITID\", \"INSTNM\", \"STABBR\", \"REGION\", \"ADM_RATE\", \"TUITIONFEE_IN\"].\n",
"\n",
"You should use the \"usecols\" option of Pandas to avoid reading unwanted columns. You also want to add a new column called \"YEAR\" to differentiate the data frames from each other. The YEAR variable should be in yyyy format so that you can convert it to an integer. If you use the format yyyy-yy (such as the 1997-98 school year), you will not be able to convert it directly to an integer. If you use a scatter plot, YEAR needs to be converted to an integer or float. \n",
"\n",
"You would start with an empty list and append each yearly dataframe to it. After all data files are loaded and appended to the list, you would use Pandas to concatenate them into a single new data frame.\n",
"\n",
"Note: this exercise incorporates many techniques we learned before:\n",
"\n",
"- list (creating an empty list, appending an item to the list)\n",
"- for loop \n",
"- read only the needed columns from a file (using the usecols option)\n",
"- add a new column to a data frame\n",
"- concatenate multiple dataframes into a single one\n",
"\n",
"This exercise may appear challenging, but it is worth the effort. You will learn a lot and love it. I promise."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 36
},
"id": "4JleOHyl2QP6",
"outputId": "c48119e5-8863-4700-988b-96b964d15857"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'2020'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = \"MERGED2020_21_PP.csv\"\n",
"\n",
"x[6:10]"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "rtrB8C_R6zLS",
"outputId": "0481b5a0-669e-4385-8a53-3f22a002b3d2"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"processing file MERGED1996_97_PP.csv\n",
"(7007, 4)\n",
"processing file MERGED1997_98_PP.csv\n",
"(13941, 4)\n",
"processing file MERGED1998_99_PP.csv\n",
"(20643, 4)\n",
"processing file MERGED1999_00_PP.csv\n",
"(27252, 4)\n",
"processing file MERGED2000_01_PP.csv\n",
"(33906, 4)\n",
"processing file MERGED2001_02_PP.csv\n",
"(40631, 4)\n",
"processing file MERGED2002_03_PP.csv\n",
"(47283, 4)\n",
"processing file MERGED2003_04_PP.csv\n",
"(53956, 4)\n",
"processing file MERGED2004_05_PP.csv\n",
"(60703, 4)\n",
"processing file MERGED2005_06_PP.csv\n",
"(67602, 4)\n",
"processing file MERGED2006_07_PP.csv\n",
"(74553, 4)\n",
"processing file MERGED2007_08_PP.csv\n",
"(81524, 4)\n",
"processing file MERGED2008_09_PP.csv\n",
"(88579, 4)\n",
"processing file MERGED2009_10_PP.csv\n",
"(95796, 4)\n",
"processing file MERGED2010_11_PP.csv\n",
"(103266, 4)\n",
"processing file MERGED2011_12_PP.csv\n",
"(111012, 4)\n",
"processing file MERGED2012_13_PP.csv\n",
"(118874, 4)\n",
"processing file MERGED2013_14_PP.csv\n",
"(126743, 4)\n",
"processing file MERGED2014_15_PP.csv\n",
"(134509, 4)\n",
"processing file MERGED2015_16_PP.csv\n",
"(142175, 4)\n",
"processing file MERGED2016_17_PP.csv\n",
"(149413, 4)\n",
"processing file MERGED2017_18_PP.csv\n",
"(156525, 4)\n",
"processing file MERGED2018_19_PP.csv\n",
"(163332, 4)\n",
"processing file MERGED2019_20_PP.csv\n",
"(170026, 4)\n",
"processing file MERGED2020_21_PP.csv\n",
"(176707, 4)\n"
]
}
],
"source": [
"#(Write code here)\n",
"\n",
"# Seed df_all with the first file instead of starting from an empty pd.DataFrame().\n",
"\n",
"yearly_list.sort()\n",
"\n",
"print(\"processing file\", yearly_list[0])\n",
"df_all = pd.read_csv(\"/content/\" + yearly_list[0], usecols=['UNITID', 'INSTNM', 'TUITIONFEE_IN'])\n",
"df_all[\"year\"] = yearly_list[0][6:10]  # characters 6-9 of the file name hold the four-digit year\n",
"print(df_all.shape)\n",
"\n",
"for yearly_file in yearly_list[1:]:\n",
"    print(\"processing file\", yearly_file)\n",
"    df1 = pd.read_csv(\"/content/\" + yearly_file, usecols=['UNITID', 'INSTNM', 'TUITIONFEE_IN'])\n",
"    df1[\"year\"] = yearly_file[6:10]\n",
"    # DataFrame.append was removed in pandas 2.0; pd.concat gives the same result.\n",
"    df_all = pd.concat([df_all, df1], ignore_index=True)\n",
"    print(df_all.shape)\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "M-OsDFXFiZd-"
},
"source": [
"## Step 8 - Explore the new dataframe \n",
"\n",
"For example, # of observations, variables, head, tail, sample, missing values, statistics, etc."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "riJxj3vU6zLY",
"outputId": "c0cfeed6-8cdf-426d-9555-3ffe026aa902"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 176707 entries, 0 to 176706\n",
"Data columns (total 4 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 UNITID 176707 non-null int64 \n",
" 1 INSTNM 176707 non-null object \n",
" 2 TUITIONFEE_IN 86867 non-null float64\n",
" 3 year 176707 non-null object \n",
"dtypes: float64(1), int64(1), object(2)\n",
"memory usage: 5.4+ MB\n"
]
}
],
"source": [
"#(Write code here)\n",
"df_all.info()"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"id": "fQm-XtG78eAQ",
"outputId": "13c6de5b-5bc8-4bf4-a239-bdefd59d6791"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM TUITIONFEE_IN year\n",
"0 100636 Community College of the Air Force NaN 1996\n",
"1 100654 Alabama A & M University NaN 1996\n",
"2 100663 University of Alabama at Birmingham NaN 1996\n",
"3 100672 ALABAMA AVIATION AND TECHNICAL COLLEGE NaN 1996\n",
"4 100690 Amridge University NaN 1996"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_all.head()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 300
},
"id": "0Yc3z2Gf8eRc",
"outputId": "4ce37635-edfd-4e18-94f5-b25edccf9f3c"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM \\\n",
"176702 49576719 Pennsylvania State University-Penn State Wilke... \n",
"176703 49576720 Pennsylvania State University-Penn State York \n",
"176704 49576721 Pennsylvania State University-Penn State Great... \n",
"176705 49576722 Pennsylvania State University-Penn State Harri... \n",
"176706 49576723 Pennsylvania State University-Penn State Brand... \n",
"\n",
" TUITIONFEE_IN year \n",
"176702 13604.0 2020 \n",
"176703 14486.0 2020 \n",
"176704 NaN 2020 \n",
"176705 15216.0 2020 \n",
"176706 14486.0 2020 "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_all.tail()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "OIr5Up-P8uDI",
"outputId": "53ceba5c-f5c7-4663-8a53-0e225d7ffb36"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM \\\n",
"47356 102553 University of Alaska Anchorage \n",
"64385 214546 Pennsylvania Academy of Cosmetology Arts and S... \n",
"166483 220613 Lee University \n",
"45161 233842 Union Presbyterian Seminary \n",
"105960 194736 Rabbinical College of Long Island \n",
"81186 449223 Regency Beauty Institute-Fairview Heights \n",
"161362 445364 North-West College-Riverside \n",
"91317 194073 New York College of Podiatric Medicine \n",
"53165 420796 EMPIRE BEAUTY SCHOOL-BALTIMORE \n",
"87282 437103 Baton Rouge Community College \n",
"\n",
" TUITIONFEE_IN year \n",
"47356 3232.0 2003 \n",
"64385 NaN 2005 \n",
"166483 18770.0 2019 \n",
"45161 NaN 2002 \n",
"105960 NaN 2011 \n",
"81186 NaN 2007 \n",
"161362 NaN 2018 \n",
"91317 NaN 2009 \n",
"53165 NaN 2003 \n",
"87282 1884.0 2008 "
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_all.sample(10)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 424
},
"id": "4FgXDHHv8uOd",
"outputId": "c5b1e4e3-47f1-469f-c1a0-8f35352c6b83"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM TUITIONFEE_IN year\n",
"0 False False True False\n",
"1 False False True False\n",
"2 False False True False\n",
"3 False False True False\n",
"4 False False True False\n",
"... ... ... ... ...\n",
"176702 False False False False\n",
"176703 False False False False\n",
"176704 False False True False\n",
"176705 False False False False\n",
"176706 False False False False\n",
"\n",
"[176707 rows x 4 columns]"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_all.isnull()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ViA5VxKz8uc8",
"outputId": "8e88cbc2-9943-428d-a820-9c6aa61eee0f"
},
"outputs": [
{
"data": {
"text/plain": [
"UNITID               0\n",
"INSTNM               0\n",
"TUITIONFEE_IN    89840\n",
"year                 0\n",
"dtype: int64"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_all.isnull().sum()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "odf117B-ip7D"
},
"source": [
"## Step 9 - UMBC \n",
"\n",
"The dataframe contains many years of data for all U.S. colleges. Let's just look at UMBC. Filter/query the dataframe to retrieve only the rows that belong to UMBC (one row represents one year). Save the UMBC data to a new data frame, using a new variable, so that the original large data frame is still available for later use."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 833
},
"id": "NQjKoDmU6zLd",
"outputId": "e4f53c54-ec01-470b-aa50-ce67668f93f5"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM TUITIONFEE_IN year\n",
"2151 163268 University of Maryland-Baltimore County NaN 1996\n",
"9056 163268 University of Maryland-Baltimore County NaN 1997\n",
"15923 163268 University of Maryland-Baltimore County NaN 1998\n",
"22598 163268 University of Maryland-Baltimore County NaN 1999\n",
"29178 163268 University of Maryland-Baltimore County 5490.0 2000\n",
"35821 163268 University of Maryland-Baltimore County 5910.0 2001\n",
"42524 163268 University of Maryland-Baltimore County 6362.0 2002\n",
"49156 163268 University of Maryland-Baltimore County 7388.0 2003\n",
"55786 163268 University of Maryland-Baltimore County 8020.0 2004\n",
"62522 163268 University of Maryland-Baltimore County 8520.0 2005\n",
"69401 163268 University of Maryland-Baltimore County 8622.0 2006\n",
"76332 163268 University of Maryland-Baltimore County 8708.0 2007\n",
"83291 163268 University of Maryland-Baltimore County 8780.0 2008\n",
"90324 163268 University of Maryland-Baltimore County 8872.0 2009\n",
"97506 163268 University of Maryland-Baltimore County 9171.0 2010\n",
"104960 163268 University of Maryland-Baltimore County 9467.0 2011\n",
"112672 163268 University of Maryland-Baltimore County 9764.0 2012\n",
"120503 163268 University of Maryland-Baltimore County 10068.0 2013\n",
"128353 163268 University of Maryland-Baltimore County 10384.0 2014\n",
"136064 163268 University of Maryland-Baltimore County 11006.0 2015\n",
"143685 163268 University of Maryland-Baltimore County 11264.0 2016\n",
"150891 163268 University of Maryland-Baltimore County 11518.0 2017\n",
"157940 163268 University of Maryland-Baltimore County 11778.0 2018\n",
"164723 163268 University of Maryland-Baltimore County 12028.0 2019\n",
"171401 163268 University of Maryland-Baltimore County 9420.0 2020"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#(Write code here)\n",
"df_UMBC=df_all[df_all[\"INSTNM\"].str.contains(\"University of Maryland-Baltimore County\")]\n",
"df_UMBC"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JEwabb0PjCmT"
},
"source": [
"## Step 10 - Explore the new dataframe\n",
"\n",
"For example: the number of observations and variables, head, tail, sample, missing values, summary statistics, etc."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "d5X87ugp6zLt",
"outputId": "451b2bc1-4aee-4d7a-cdd9-3554eef1ae5c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 25 entries, 2151 to 171401\n",
"Data columns (total 4 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 UNITID 25 non-null int64 \n",
" 1 INSTNM 25 non-null object \n",
" 2 TUITIONFEE_IN 21 non-null float64\n",
" 3 year 25 non-null object \n",
"dtypes: float64(1), int64(1), object(2)\n",
"memory usage: 1000.0+ bytes\n"
]
}
],
"source": [
"#(Write code here)\n",
"df_UMBC.info()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"id": "wBF9yUNE9sGF",
"outputId": "2865a666-20fc-4d4c-bef4-d377fa83a72d"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM TUITIONFEE_IN year\n",
"2151 163268 University of Maryland-Baltimore County NaN 1996\n",
"9056 163268 University of Maryland-Baltimore County NaN 1997\n",
"15923 163268 University of Maryland-Baltimore County NaN 1998\n",
"22598 163268 University of Maryland-Baltimore County NaN 1999\n",
"29178 163268 University of Maryland-Baltimore County 5490.0 2000"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_UMBC.head()"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"id": "qxU20vIu9sJO",
"outputId": "77768fe0-2c44-4d8d-de3c-dac1370bc473"
},
"outputs": [
{
"data": {
"text/plain": [
" UNITID INSTNM TUITIONFEE_IN year\n",
"143685 163268 University of Maryland-Baltimore County 11264.0 2016\n",
"150891 163268 University of Maryland-Baltimore County 11518.0 2017\n",
"157940 163268 University of Maryland-Baltimore County 11778.0 2018\n",
"164723 163268 University of Maryland-Baltimore County 12028.0 2019\n",
"171401 163268 University of Maryland-Baltimore County 9420.0 2020"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_UMBC.tail()"
]
},
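{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two of the items listed in Step 10, the size of the dataframe and its missing values, can be checked directly. The block below is a minimal sketch, assuming a small hypothetical dataframe named `df_demo` (made-up rows, not the College Scorecard data).\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical sample with one missing tuition value\n",
"df_demo = pd.DataFrame({\n",
"    'year': ['1999', '2000', '2001'],\n",
"    'TUITIONFEE_IN': [float('nan'), 5490.0, 5910.0],\n",
"})\n",
"\n",
"n_rows, n_cols = df_demo.shape   # number of observations and variables\n",
"missing = df_demo.isna().sum()   # count of missing values per column\n",
"print(n_rows, n_cols)\n",
"print(missing)\n",
"```\n",
"\n",
"On the notebook's data, the same two calls on `df_UMBC` should report 25 rows and 4 missing tuition values, in line with the `info()` output above."
]
},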
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 300
},
"id": "GiXIOU9V9sL3",
"outputId": "67fc505b-fa89-426b-868b-c52e969699e6"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"#(Write code here)\n",
"sns.lineplot(x='year',y='TUITIONFEE_IN',data=df_UMBC)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5FH2h2_ajlKE"
},
"source": [
"## Step 13 - Tuition Growth Rate\n",
"\n",
"Now let's look at the tuition growth rate year over year. We need to calculate the percentage change in UMBC's tuition for each year relative to the prior year. First, convert the `TUITIONFEE_IN` column to a Python list."
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xQLB2f7J6zL-",
"outputId": "810c269f-663d-449f-83a5-f6902af9473e"
},
"outputs": [
{
"data": {
"text/plain": [
"[0,\n",
" nan,\n",
" nan,\n",
" nan,\n",
" nan,\n",
" 7.650273224043716,\n",
" 7.648054145516074,\n",
" 16.12700408676517,\n",
" 8.554412560909583,\n",
" 6.234413965087282,\n",
" 1.1971830985915493,\n",
" 0.9974483878450475,\n",
" 0.8268259072117593,\n",
" 1.0478359908883828,\n",
" 3.370153291253381,\n",
" 3.227565151019518,\n",
" 3.1372134783986483,\n",
" 3.11347808275297,\n",
" 3.1386571315057608,\n",
" 5.989984591679507,\n",
" 2.344175904052335,\n",
" 2.254971590909091,\n",
" 2.2573363431151243,\n",
" 2.122601460349805,\n",
" -21.68274027269704]"
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#(Write code here)\n",
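"# Convert the tuition column to a plain Python list\n",
"tuition_Growth = df_UMBC['TUITIONFEE_IN'].tolist()\n",
"# The first year has no prior year to compare against, so its change is set to 0\n",
"percent_Growth = [0]\n",
"for i in range(1, len(tuition_Growth)):\n",
"    # Change relative to the prior year; a NaN in either year yields NaN\n",
"    difference = (tuition_Growth[i] - tuition_Growth[i-1]) / tuition_Growth[i-1]\n",
"    percent_Growth.append(difference * 100)\n",
"percent_Growth"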
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "e4OhAzjlj3tw"
},
"source": [
"## Step 14 - Loop through the list and calculate the % change each year over the prior year\n",
"\n",
"This takes some effort. Not hard, just some abstract/logical thinking and some experiments. Have fun on this one."
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LRUyQCtJ6zMB",
"outputId": "704c1874-2f50-408b-d67a-a40b90d7aa6c"
},
"outputs": [
{
"data": {
"text/plain": [
"[0,\n",
" nan,\n",
" nan,\n",
" nan,\n",
" nan,\n",
" 7.650273224043716,\n",
" 7.648054145516074,\n",
" 16.12700408676517,\n",
" 8.554412560909583,\n",
" 6.234413965087282,\n",
" 1.1971830985915493,\n",
" 0.9974483878450475,\n",
" 0.8268259072117593,\n",
" 1.0478359908883828,\n",
" 3.370153291253381,\n",
" 3.227565151019518,\n",
" 3.1372134783986483,\n",
" 3.11347808275297,\n",
" 3.1386571315057608,\n",
" 5.989984591679507,\n",
" 2.344175904052335,\n",
" 2.254971590909091,\n",
" 2.2573363431151243,\n",
" 2.122601460349805,\n",
" -21.68274027269704]"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
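"#(Write code here)\n",
"# Avoid naming the variable `list`, which would shadow the built-in type\n",
"pct_change_list = []\n",
"for i in percent_Growth:\n",
"    pct_change_list.append(i)\n",
"pct_change_list"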
]
},
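{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same year-over-year calculation can also be done with pandas itself. The block below is a minimal sketch, not part of the assignment: it assumes a small hypothetical Series named `tuition` (values borrowed from the UMBC rows for 1999 to 2002) and uses `Series.pct_change`, which returns `NaN` for the first row rather than the `0` used in the loop.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical sample standing in for df_UMBC['TUITIONFEE_IN']\n",
"tuition = pd.Series([float('nan'), 5490.0, 5910.0, 6362.0])\n",
"\n",
"# fill_method=None keeps missing years as NaN instead of forward-filling them\n",
"pct = tuition.pct_change(fill_method=None) * 100\n",
"pct_rounded = pct.round(2).tolist()\n",
"print(pct_rounded)\n",
"```\n",
"\n",
"Applied to `df_UMBC['TUITIONFEE_IN']`, this should reproduce the percentages from the loop, apart from the first entry."
]
},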
{
"cell_type": "markdown",
"metadata": {
"id": "vLogjxM1kSDH"
},
"source": [
"## Step 15 - Round the percentages to two decimal places\n",
"\n",
"The resulting numbers have many decimal places, which are unnecessary and not visually appealing. You can use a for loop or, better, a list comprehension for simplicity and brevity."
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "9sxMk1hc6zML",
"outputId": "622972d4-9c52-4865-fc3a-c41538aa8677"
},
"outputs": [
{
"data": {
"text/plain": [
"[0,\n",
" nan,\n",
" nan,\n",
" nan,\n",
" nan,\n",
" 7.65,\n",
" 7.65,\n",
" 16.13,\n",
" 8.55,\n",
" 6.23,\n",
" 1.2,\n",
" 1.0,\n",
" 0.83,\n",
" 1.05,\n",
" 3.37,\n",
" 3.23,\n",
" 3.14,\n",
" 3.11,\n",
" 3.14,\n",
" 5.99,\n",
" 2.34,\n",
" 2.25,\n",
" 2.26,\n",
" 2.12,\n",
" -21.68]"
]
},
"execution_count": 72,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#(Write code here)\n",
"# Round each percentage to two decimal places with a list comprehension\n",
"rounded_list = [round(i, 2) for i in percent_Growth]\n",
"rounded_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cWK8vDitkLpx"
},
"source": [
"## Step 16 - Add the list of the percentages to the dataframe as a new column (\"PCT_CHANGE\")\n",
"\n",
"Not as hard as you may think. If you get stuck, you are probably overthinking it. Google it and you will find the answer."
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 937
},
"id": "_04rVaZ16zMR",
"outputId": "b24b01fa-9c3e-4211-c62b-79477e1f3649"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" \n"
]
},
{
"data": {
"text/plain": [
" UNITID INSTNM TUITIONFEE_IN year \\\n",
"2151 163268 University of Maryland-Baltimore County NaN 1996 \n",
"9056 163268 University of Maryland-Baltimore County NaN 1997 \n",
"15923 163268 University of Maryland-Baltimore County NaN 1998 \n",
"22598 163268 University of Maryland-Baltimore County NaN 1999 \n",
"29178 163268 University of Maryland-Baltimore County 5490.0 2000 \n",
"35821 163268 University of Maryland-Baltimore County 5910.0 2001 \n",
"42524 163268 University of Maryland-Baltimore County 6362.0 2002 \n",
"49156 163268 University of Maryland-Baltimore County 7388.0 2003 \n",
"55786 163268 University of Maryland-Baltimore County 8020.0 2004 \n",
"62522 163268 University of Maryland-Baltimore County 8520.0 2005 \n",
"69401 163268 University of Maryland-Baltimore County 8622.0 2006 \n",
"76332 163268 University of Maryland-Baltimore County 8708.0 2007 \n",
"83291 163268 University of Maryland-Baltimore County 8780.0 2008 \n",
"90324 163268 University of Maryland-Baltimore County 8872.0 2009 \n",
"97506 163268 University of Maryland-Baltimore County 9171.0 2010 \n",
"104960 163268 University of Maryland-Baltimore County 9467.0 2011 \n",
"112672 163268 University of Maryland-Baltimore County 9764.0 2012 \n",
"120503 163268 University of Maryland-Baltimore County 10068.0 2013 \n",
"128353 163268 University of Maryland-Baltimore County 10384.0 2014 \n",
"136064 163268 University of Maryland-Baltimore County 11006.0 2015 \n",
"143685 163268 University of Maryland-Baltimore County 11264.0 2016 \n",
"150891 163268 University of Maryland-Baltimore County 11518.0 2017 \n",
"157940 163268 University of Maryland-Baltimore County 11778.0 2018 \n",
"164723 163268 University of Maryland-Baltimore County 12028.0 2019 \n",
"171401 163268 University of Maryland-Baltimore County 9420.0 2020 \n",
"\n",
" PCT_CHANGE \n",
"2151 0.00 \n",
"9056 NaN \n",
"15923 NaN \n",
"22598 NaN \n",
"29178 NaN \n",
"35821 7.65 \n",
"42524 7.65 \n",
"49156 16.13 \n",
"55786 8.55 \n",
"62522 6.23 \n",
"69401 1.20 \n",
"76332 1.00 \n",
"83291 0.83 \n",
"90324 1.05 \n",
"97506 3.37 \n",
"104960 3.23 \n",
"112672 3.14 \n",
"120503 3.11 \n",
"128353 3.14 \n",
"136064 5.99 \n",
"143685 2.34 \n",
"150891 2.25 \n",
"157940 2.26 \n",
"164723 2.12 \n",
"171401 -21.68 "
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#(Write code here)\n",
"df_UMBC['PCT_CHANGE'] = rounded_list\n",
"df_UMBC"
]
},
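{
"cell_type": "markdown",
"metadata": {},
"source": [
"Assigning a new column to a dataframe that was produced by filtering another one can trigger pandas' `SettingWithCopyWarning`. One common way to avoid it is to take an explicit `.copy()` of the filtered rows first. The block below is a minimal sketch of that idea, assuming a small hypothetical dataframe named `df_demo` in place of `df_all`; the college names and values are made up.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical stand-in for df_all\n",
"df_demo = pd.DataFrame({\n",
"    'INSTNM': ['College A', 'College B', 'College A'],\n",
"    'TUITIONFEE_IN': [1000.0, 2000.0, 1100.0],\n",
"})\n",
"\n",
"# .copy() makes the filtered result an independent dataframe\n",
"df_a = df_demo[df_demo['INSTNM'] == 'College A'].copy()\n",
"\n",
"# Adding a column now changes only df_a, not df_demo\n",
"df_a['PCT_CHANGE'] = [0, 10.0]\n",
"print(df_a)\n",
"```\n",
"\n",
"The original `df_demo` is left without a `PCT_CHANGE` column, which is usually what is wanted when the big dataframe is reused later."
]
},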
{
"cell_type": "markdown",
"metadata": {
"id": "aVcm3NltknpE"
},
"source": [
"## Step 17 - Finally, we can plot the tuition growth rate year over year - bar chart first, then line chart"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 296
},
"id": "5Evbjx-g6zMX",
"outputId": "e8b795de-53eb-4002-e1e4-d198cdb0528c"
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEGCAYAAABy53LJAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAdJElEQVR4nO3deZwddZnv8c+XBBARZGsIksSAhnEiQhgavLggCLJkkEzCMqAj+428JFdQnBEGF7hzGWfQsIzgElldkIvEDDthUUGugOnIlhAJYZPEQJolQkASEp77x+/X9EnndJ+qpM/S6e/79epX16mq33meU6dOPfWrqlNHEYGZmVlR6zU7ATMzG1hcOMzMrBQXDjMzK8WFw8zMSnHhMDOzUoY2O4H+stVWW8WoUaOanYaZ2YAya9asFyKirUybdaZwjBo1io6OjmanYWY2oEh6pmwbH6oyM7NSXDjMzKwUFw4zMyvFhcPMzEpx4TAzs1JcOMzMrBQXDjMzK8WFw8zMSllnvgBozfOdnx9QeN6vHDWjjpmYWSO4x2FmZqW4cJiZWSlNLxySLpO0WNLsinFnSVoo6cH8N66ZOZqZWbemFw7gCuDAKuPPj4ix+e/mBudkZma9aHrhiIi7gZeanYeZmRXT9MLRh8mSHs6HsjavNoOkSZI6JHV0dnY2Oj8zs0GpVQvH94H3AWOBRcCUajNFxNSIaI+I9ra2Ur9DYmZma6glC0dEPB8RKyPiLeBHwB7NzsnMzJKWLByStq14OAGY3du8ZmbWWE3/5riknwN7A1tJWgB8E9hb0lgggKeBzzctQTMzW0XTC0dEHFVl9KUNT8TMzAppyUNVZmbWulw4zMysFBcOMzMrxYXDzMxKceEwM7NSXDjMzKwUFw4zMyvFhcPMzEpx4TAzs1JcOMzMrBQXDjMzK8WFw8zMSnHhMDOzUlw4zMysFBcOMzMrxYXDzMxKceEwM7NSml44JF0mabGk2RXjtpB0u6TH8//Nm5mjmZl1a3rhAK4ADuwx7nTgzogYDdyZH5uZWQtoeuGIiLuBl3qMHg9cmYevBP6hoUmZmVmvml44erFNRCzKw88B21SbSdIkSR2SOjo7OxuXnZnZINaqheNtERFA9DJtakS0R0R7W1tbgzMzMxucWrVwPC9pW4D8f3GT8zEzs6xVC8f1wDF5+BjguibmYmZmFZpeOCT9HLgX+BtJCySdAPwH8ClJjwP75cdmZtYChjY7gYg4qpdJ+zY0ETMzK6TpPQ4zMxtYXDjMzKyUph+qssHr1Gk9bxhQ3QWH3lrnTMysDPc4zMysFBcOMzMrxYXDzMxKceEwM7NSfHJ8HXbt5cVOPh92XPfJ50t/fEDh5z/h6BmlczKzgc89DjMzK8WFw8zMSnHhMDOzUlw4zMysFBcOMzMrxYXDzMxK8eW4NqAcdN0Jhee9ZfyldczEbPBy4TCzAe+OqzoLz7vfZ9rqmMng4ENVZmZWSkv3OCQ9DbwKrARWRER7czMyM7OWLhzZPhHxQrOTMDOzxIeqzMyslFYvHAHcJmmWpEk9J0qaJKlDUkdnZ/GTY2ZmtuZavXB8LCL+DjgIOFnSXpUTI2JqRLRHRHtbm6+UMDNrhJYuHBGxMP9fDEwH9mhuRmZm1rKFQ9LGkjbpGgb2B2Y3NyszM2vlq6q2AaZLgpTnVRFxa99NzMys3lq2cETEk8Auzc7DzMxW1bKHqszMrDW5cJiZWSkuHGZmVkrLnuMws76Nv/bmwvNed9i4OmZig81aFQ5JQyNiRX8lY2b1NWHaPYXnnX7ox94ePmLao4XbXXPoGADOmv7nwm3OmvCet4d/Nq3YXSA+e6i/9NssNQ9VSbqnYvgnPSb/vt8zMjOzllbkHMfGFcMf7DFN/ZiLmZkNAEUKR6zhNDMzWwcVOcexmaQJpCKzmaSJebyAd9ctMzOzOnrgksWF5931xK3rmMnAU6Rw3AUcUj
H86Yppd/d7RmZmLezZKc8VnnfEacMAeO684hcXDPvymLeHn7/w/xVqs80pHy38/P2hZuGIiOMakYiZmQ0MNQuHpOHAqIi4Jz/+MvCuPPmqiJhfx/zMzKzFFDk5/m1gs4rHnwdeI50YP7seSZmZWesqco7jbyLixorHr0fEFABJv61PWmZm1qqKFI539Hi8b8XwVv2Yi1ldjJv+fwrPe/OEr9UxE7N1Q5HC8aqkHSNiHkBEvAQg6QPAq/VMzqyZ/v6XFxWe96aJk9cq1sHXXlNovhsPO2Kt4pj1hyKF45vAjZLOAf6Qx+0G/CtwSr0SMxuIDp52ReF5bzz02LrlYVZPNU+O559rnUg6RHVF/tsHmBgRt9QzOUkHSnpM0nxJp9czlpmZFVPo7rgRMRs4us65rELSEOBi4FPAAmCmpOsjovg3aczMrN8V+R7H5fR+T6qIiBP6N6W37QHMz789jqSrgfGAC4eZWRMV6XHcWGXcCOBLwJD+TWcV2wHPVjxeAHy4jvHMzKwARRS/wa2kHUgnxfcCzgcujYjldUlMOgw4MCJOzI8/B3w4IiZXzDMJmAQwcuTI3Z555pl6pGJmNmAsvqj4L0NuPXkckmZFRHuZGIV+c1zSByT9FLgBuAcYExHfr1fRyBaSejZdhudxb4uIqRHRHhHtbW3+NTAzs0Yo8guAvwBuBu4F9gauBzaVtIWkLeqY20xgtKTtJW0AHJljm5lZExU5x7E76eT4V4DT8riuX/4LYIc65EVErJA0GZhBOpdyWUTMqUcsMzMrrsht1Uc1II/eYt9M6u2YmVmLKHSOoydJ75P0dUnuAZiZDTKFC4ek90j6kqSZwJzc9si6ZWZmZi2pyMnxSZJ+DfwG2BI4AVgUEWdHxCN1zs/MzFpMkZPjF5GuqPpMRHQASCr+5Q8zM1unFCkc2wKHA1MkDQOuAdava1ZmZtayitwd98WI+EFEfIJ0h9wlwPOS5kr697pnaGZmLaXUVVURsSAipuSvpx8CvFGftMzMrFUVuTvuxD4mz+7HXMzMbAAoco7j0z2Gb6h4HMAv+zUjMzNraUW+OX5c17CkByofm5nZ4FP2m+O+DNfMbJBbo1uOmJnZ4FXk5PgNdPc0dpC0yq3NI+KQeiRmZmatqcjJ8e9UDE+pVyJmZjYwFCkcjwJtEfFo5UhJY4DOumRlZmYtq8g5ju8CW1UZvyVwYf+mY2Zmra5I4Xh/RNzdc2RE/BbYuf9TMjOzVlakcGzSxzTf7NDMbJApUjjmSxrXc6Skg4An+z8lkHSWpIWSHsx/q8U3M7PmKHJy/FTgJklHALPyuHZgT+DgeiUGnB8R36k9m5mZNVKR26o/DnwIuAsYlf/uAnaOiHn1TM7MzFpPkR4HEbEMuLyveSTdGxF79ktWyWRJRwMdwGkR8XKVmJOASQAjR47sx9BmZtab/rzlyDvKzCzpDkmzq/yNB74PvA8YCyyily8eRsTUiGiPiPa2tra1fgFmZlZboR5HQaVugBgR+xWZT9KPgBvXKCMzM+t3LXmTQ0nbVjycgH8wysysZfRnj0P9+FznShpL6sU8DXy+H5/bzMzWQpG7494WEfsXeK7P9UM+AEREvz2XmZn1ryKHqgqddY4IH04yMxsEihyqerekib1NjAj/5riZ2SBSqHCQviFe7RxGAC4cZmaDSJHC8UxEHF/3TMzMbEAoco6jP6+WMjOzAa5I4ThG0kd7jpT0UUnvq0NOZmbWwooUjn8HXqky/hXggv5Nx8zMWl2RwrFNRDzSc2QeN6rfMzIzs5ZWpHBs1se0jforETMzGxiKFI4OSf+z50hJJ9L9w05mZjZIFP0FwOmSPsuqvwC4AekGhGZmNojULBwR8TzwEUn7ADvl0TdFxK/qmpmZmbWkIjc5fAdwEvB+4BHg0ohYUe/EzMysNRU5x3El6dDUI8BBwHfqmpGZmbW0Iuc4xkTEhwAkXQr8vr4pmZlZKyvS43iza8CHqMzMrEiPYxdJXd
8cF7BRfiwgImLTumVnZmYtp2aPIyKGRMSm+W+TiBhaMbzGRUPS4ZLmSHpLUnuPaWdImi/pMUkHrGkMMzPrf/35m+NlzQYmAj+sHClpDHAk8EHgPcAdknaMiJWNT9HMzHoqco6jLiJibkQ8VmXSeODqiFgWEU8B84E9GpudmZn1pmmFow/bAc9WPF6Qx61G0iRJHZI6Ojs7G5KcmdlgV9dDVZLuAIZVmXRmRFy3ts8fEVOBqQDt7e2xts9nZma11bVwRMR+a9BsITCi4vHwPM7MzFpAKx6quh44UtKGkrYHRuMvHZqZtYymFQ5JEyQtAPYEbpI0AyAi5gDXAI8CtwIn+4oqM7PW0bTLcSNiOjC9l2nnAOc0NiMzMyuiFQ9VmZlZC3PhMDOzUlw4zMysFBcOMzMrxYXDzMxKceEwM7NSXDjMzKwUFw4zMyvFhcPMzEpx4TAzs1JcOMzMrBQXDjMzK8WFw8zMSnHhMDOzUlw4zMysFBcOMzMrxYXDzMxKaeZPxx4uaY6ktyS1V4wfJemvkh7Mfz9oVo5mZra6pv10LDAbmAj8sMq0JyJibIPzMTOzApr5m+NzASQ1KwUzM1sDrXqOY3tJD0i6S9LHe5tJ0iRJHZI6Ojs7G5mfmdmgVdceh6Q7gGFVJp0ZEdf10mwRMDIiXpS0G/Dfkj4YEa/0nDEipgJTAdrb26O/8jYzs97VtXBExH5r0GYZsCwPz5L0BLAj0NHP6ZmZ2RpouUNVktokDcnDOwCjgSebm5WZmXVp5uW4EyQtAPYEbpI0I0/aC3hY0oPAtcBJEfFSs/I0M7NVNfOqqunA9CrjpwHTGp+RmZkV0XKHqszMrLW5cJiZWSkuHGZmVooLh5mZleLCYWZmpbhwmJlZKS4cZmZWiguHmZmV4sJhZmaluHCYmVkpLhxmZlaKC4eZmZXiwmFmZqW4cJiZWSkuHGZmVooLh5mZleLCYWZmpTTzp2O/LemPkh6WNF3SZhXTzpA0X9Jjkg5oVo5mZra6ZvY4bgd2ioidgXnAGQCSxgBHAh8EDgS+J2lI07I0M7NVNK1wRMRtEbEiP7wPGJ6HxwNXR8SyiHgKmA/s0Ywczcxsda1yjuN44JY8vB3wbMW0BXncaiRNktQhqaOzs7POKZqZGcDQej65pDuAYVUmnRkR1+V5zgRWAD8r+/wRMRWYCtDe3h5rkaqZmRVU18IREfv1NV3SscDBwL4R0bXhXwiMqJhteB5nZmYtoJlXVR0I/AtwSES8XjHpeuBISRtK2h4YDfy+GTmamdnq6trjqOEiYEPgdkkA90XESRExR9I1wKOkQ1gnR8TKJuZpZmYVmlY4IuL9fUw7BzingemYmVlBrXJVlZmZDRAuHGZmVooLh5mZldLMk+NmZtbPtp48ru4x3OMwM7NSXDjMzKwUFw4zMyvFhcPMzEpx4TAzs1JcOMzMrBQXDjMzK8WFw8zMSnHhMDOzUtT9+0kDm6RO4JleJm8FvFDyKdekTSNjtXp+jYzV6vk1Mlar59fIWK2eXyNj9dXmvRHRVurZImKd/wM6GtGmkbFaPT8vCy+LZsdq9fwGwrLo7c+HqszMrBQXDjMzK2WwFI6pDWrTyFitnl8jY7V6fo2M1er5NTJWq+fXyFhrml9V68zJcTMza4zB0uMwM7N+4sJhZmbl9OclWo36Ay4DFgOzK8btAtwLPALcAGyax28AXJ7HvwS83NUuT7sW+CvwBjCzSruXgDeBp/L4TYA/Aq/ldsuA7/XVJk/7akWbxcCoKvk9CjyQ/88BrgAezvGeAh4Hbgc2z22Vl8XrOf8ngFPytC/k8cuB2QXb3Jnzfq1KnP8CngaWAk/m/M7Ny/yPwKvAoiL5Ae8FHsptlgFzC8TqyvG9+bmWlFgWKytiLS7Y5kPAi3n5LQV2rpHfd4EH8/v4KvBWflwk1n/l51oO/KnIsgD+My+3pVWW+weAWTmH53ssv8Py+7uctD4VafPTPP+yKnHuzePn073efh
P4NWm9WAp0FskPeEce3/VePVEwvy3y87+RY9XK75S8TB/Nsd4osfxG5XFd68X+NWKdw6rrxUrgsYKxzswxlgHPAcMKtDmjok1lnM+StiWPAL8DdqnYNh2Y550PnF5oG9zsIrCGhWMv4O9YtXDMBD6Rh48H/i0PnwxcnocPyW9oV+E4G/gz8AlS7+uLPdvlWPuSNvjrVYn1NHBlX21IP9H7JnBInu8W4K4q+e2U81uPtIF8E/goaQM9Kz/n6cB/5vnHAb/Ky+J/5LzmAXuSNqxnA5uTNoAX1GgzBriaVKxurBLnFmBb4BjgflIBfSqvdOeSPiCLgLMK5LcLMCXHeBepmF9cI1ZXjh2kD+JFBZfFGNKH/PQ8X9E2fwIuyfN9A5hSML9z83J/Cfh6gVifARaQPvBDSF9ivapGrIWkD/+3c4yZpI11V6yt8/tyKfCVivx2ysv6XNIOy6KK19hbmzHAVaQCNrvH8tsa2B24EDi/YsfqCeDQHOcb+XnOK5DfGOD8HGP9/B78pECbc0nr7FWkYlorv3l5GX6X6utFX7FmA9MqNuwXFIjVleMZpALwHwVi7U36DH8tz/cwcFONNoeQCufXSducx4Ef5jYfobuIHATcn4eH5PdrB9I68RAwptY2eEAeqoqIu0kfzEo7Anfn4dtJKy6kN+1Xud31pA/ORnna8cDGwN0R8RYwvWe7HOsJ0p5Ce2UsSTuSFvbuNdqI9AY9JEmkb3COrpLfbNKHuZ20ciwB3gmMJxWkQ4ErgX/IbccDP4qIP0TEfcCmpL3ST+fpP4iIl4GbgSNqtNkO2BW4Ps/XM86PI2JRRFwJbEba4D+cX+N40oZ8MXBbgfy2Bg7OMTYk7QEfUCPWXNKHace8LKrl2NvrGprnLdpmT9I3bb+e5/tRzrdWftvl6UtJG/xLCsTaKv9dnZfFkhy/r1hL6N5QXJLfh0VdsSJicUTcCjybH7+a8/skaX09PyKW5+U4vkab7YDdgJ/0XH65zUzSZ2phRbvZwCtdrzk/z/0F8tuOVCyvJBWOJaSdsFptDgXenZfFggL5zSV9Hg+iynrRR6zRpPVvcm5zWX6OWrG61ounSNuFCwvEGkbaNk2TNJTUs9i5Rpvd83K7JCJWAL8EJuR5fpe3BQD3AcPz8B7A/Ih4Mq8TV+dc+zQgC0cv5tD9gg8HRuThh4BDJA2VtD3pEMT6kjbL018D5kv6BXBctXakhbxRxbSuWEeSunh9tomIN0nVfy6ph/MR0l5Ctfx2y8+3jNRbWARsA+yTxz+XH0NaIZ+tWAYvAGPza9owIhbl8fPyc/XV5v78vF0Fua84C0jFbdeKdiNIG6X7CuR3P+mDMSNP/xZp41kr1lFAAH8pkGNlLAE3SLoP+HCBNq+TPoAXS3oA+HLJZXEQ8POC+V2RHz9Aeq9vpPf3qitWG+lwxTakXuk+pPVpG6qQNCrn9zIwtGK9mEsqYH216XpNnXlS5Wsq0m7D/PiWAvl1tbmFtBNyC6kg1GozAjiVdOhmWcH83iAddrpR0qS+XldFmxdI28xv5fXi3wrG6npd+7P6etFbm5tJ7+1M0nqxmO5tRm9trsvzLJf0TlLRrbb8TiAtW6i+jm3X22vqsi4VjuOBL0iaRV54efxlpIXRAVxAOuQTpL3Q4aRu4xOkwweH9tLuG6SNycrKWKSu52O12khan7RBnkUqHC+Q9niq5fc70kbrclL39lLSXubTwMpI/cvVrqGW9C5SUZxC+vD01GebiHhllZl7iZMNybmdmtuJtEd6XO65FYkVEbEz8H7SYZjedMWaQTp39fZzF1kWOdYrEdFOOix0Qc631zakD+xQ0mGA3Und+PVLLIsPATMKvldb5+cYTvrAfpLudaO3WCflZbEJaUN0L2ndrPZ+bQBMI21Y/1plep9tyqwX+XX1bFf5uFCsiBhLWh57UOW96vGa9sppzarxmlbJj3QIeCmpyJ8MfLxWfqTP1RDg+xGxK3kHrVasimVxCP
CLPpZhZawhpHVue+A9pN5HtXWwcvl15BxvA24lHdJdJY6kfUiF46vV8i5qnSkcEfHHiNg/InYjfZieyONXRMSXImJsRIwn7WEtJx33fx24KCL2J/UC3lmtHTCJ9EbO64oF/DOpUp9XoM1Y4PWI+HjO78KcQ7X8Nid1hX8WEf8aER/Oz/8cME/StqS9D0hd4hG5ME0j7UX9NI9flueF1L1eUqMNpO7wFgDV4uTx65M2pldHxC8lbUr68Hw7Iu4rmB/A85K2jYg/59f3Wl+x8nszmfQBmgIcLem7BWM9l2M9Sdr7W1qjzQLSRvavucv/a7p3DnpdFnn6G8CtEfFmwWUxgdSD2iQilgK/pXsD32usiDiHdDLzaNLGtbMiVpf1SEX5Zzm/hcCKivXib0knbPtqA2m9aMt5VL6mnu2mdbXL+a4PXJ8fV2tXNVZ+r5aQ9rZ77gT1bPNRYD1Jz5LWk09Sfbu2Sn4RsTC/riGkQ9T7FshvAbCCdO4F0nqxktWtEiuPe410bvX5gstiv9xmaD5icSfd62BvbSBtkw6OiL3y/F29cyTtTDqcNz4iXsyj317HsuF5XJ/WmcIhaev8fz3ga8AP8uN3Sto4D3+K9MYvy1X/BrqPQe9HWtFXawd8jLRX82hFrKNIBapqrB5tFgI7SWrL+Z1O2oBVy28H0g3Jzut6TaS97S+R3vRjSF1SSOcjjib1Sl4Gns6HIWbk6SdJ2hz4e9LVY3216Zp2YB5eLU4+P3Md8GpEnC1pA9KH7nd0H2qqmZ+k4aSu+DE5v/0qcq4aKyI+GxEjST2GXwE/zsu1VqzNgZtyrK1IG4hbayyLmaTi8oU837GkHmGv+dFtJd0b/iLv1Z9yrOPyxnYiqXj0tdyHSNoyT/8q6dj3eytikduMB56PiPPy6Jmk3tSp+b07ju5zWr216crjsCqvqdIRwNy83iq/zrl0b/hXaVctlqQ20npwjKSNSIecf9tXm4g4g7ROXEw6dPwM8MMa+W0saZP8uk4kHUIaUSu/iHiOVGxOy7NNJl3M0musinGv0H24r+ayIK0XK4AT8/RjSUcs+moDcEdefiNJ26ir8/wjSec8PhcR8yrmnwmMlrR9XieOpGKd6FW0wFVSZf9IG+xFpA/BAlLX6xTS3v080uGnrm/FjyIdTppL2mtfXNHuX0h7bctIvY/vVWn3lzy9Z6zlpJNd1WJVa/N/87hlpPMdW1bJbyapa/kwqZv5MmkFeox0wvHxvGJskdsqrwxB2lg9ltuNA/4X3ZfjzinY5t48/1ukPeLDKtpcTNpQR87jQdKHdAXdlxouI53jqBXrzJxT5eW4W9SI1ZXjlqQP65KCy+I0ui+7XJaHiyyLiRVt/gxsUyC/Y0nr5Z0l3quDSec5XsuxnimwLB6ie734S25TGWsYaT0PUiFbTlqnxgH/SPd6Mb9gm2k5tyCtF1+saLMg5x4VbR7Pjyvf4wcKxPpC/l95OW6R/LbMy3wBaeO8RY38HiVdmDCbVLRfKLH8PkEqAl2Xdm9fI9aDpF7lS8BvWH296CvWuXSvF4vovhy3rzb3VrSZVRHnEtL2pGtd7ajYno4jbTefAM4ssg32LUfMzKyUdeZQlZmZNYYLh5mZleLCYWZmpbhwmJlZKS4cZmZWiguHmZmV4sJh1kIk9Xa7EbOW4cJhtoYk/W9Jp1Y8PkfSKZL+WdJMSQ9LOrti+n9LmiVpTr6xXtf4pZKmSHqI7jvjmrUsFw6zNXcZ6TYiXbe6OZJ0d4LRpBv0jQV2k7RXnv/4SPcqawe+mG8bAun+W/dHxC4RcU8jX4DZmhja7ATMBqqIeFrSi5J2Jd0m+wHSjQj3z8OQ7mw8mvRbMV+UNCGPH5HHv0i6bcS0RuZutjZcOMzWziWke1QNI/VA9gW+FRGr3GhP0t6kmznuGRGvS/oN6adSAd6IiGp3WTVrST5UZbZ2ppPuKLw76c6uM4Dj8+8xIGm7fJfjdwMv56LxAdLvv5gNSO5xmK
2FiFgu6dfAktxruE3S3wL3pjtfsxT4J9Kt3E+SNJd0V9v7mpWz2dry3XHN1kI+Kf4H4PCIeLzZ+Zg1gg9Vma0hSWNIv2lxp4uGDSbucZiZWSnucZiZWSkuHGZmVooLh5mZleLCYWZmpbhwmJlZKf8flq2UO53MDLIAAAAASUVORK5CYII=\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.lineplot(x='year',y='PCT_CHANGE', data=df_UMBC)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "V1TiCardmjU2"
},
"source": [
"## Step 18 - Define Growth Rate Function\n",
"\n",
"We want to do the same calculation for JHU. Instead of repeating it piecemeal as we did for UMBC, let's create a function that can be reused for any college. This function takes a list of tuitions and returns a list of year-over-year percentage changes."
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {
"id": "eCqEe_646zMc"
},
"outputs": [],
"source": [
"#(Write code here)\n",
"def percent_changed(tuition_Growth):\n",
"    # The first entry has no prior year, so its change is reported as 0\n",
"    percent_Growth = [0]\n",
"    for i in range(1, len(tuition_Growth)):\n",
"        difference = (tuition_Growth[i] - tuition_Growth[i-1]) / tuition_Growth[i-1]\n",
"        percent_Growth.append(difference * 100)\n",
"    # Round every percentage to two decimal places\n",
"    return [round(i, 2) for i in percent_Growth]"
]
},
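{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick way to check a function like this is to call it on a short list whose answer is easy to work out by hand. The block below is a minimal sketch: it repeats the function so that it runs on its own, and the sample tuition values are made up.\n",
"\n",
"```python\n",
"def percent_changed(tuition_Growth):\n",
"    # The first entry has no prior year, so its change is reported as 0\n",
"    percent_Growth = [0]\n",
"    for i in range(1, len(tuition_Growth)):\n",
"        difference = (tuition_Growth[i] - tuition_Growth[i-1]) / tuition_Growth[i-1]\n",
"        percent_Growth.append(difference * 100)\n",
"    return [round(p, 2) for p in percent_Growth]\n",
"\n",
"# Made-up tuition values: up 10%, then down 10%\n",
"sample_changes = percent_changed([100, 110, 99])\n",
"print(sample_changes)\n",
"```\n",
"\n",
"Here 110 is 10% above 100 and 99 is 10% below 110, so the call returns `[0, 10.0, -10.0]`."
]
},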
{
"cell_type": "markdown",
"metadata": {
"id": "U2jqd5GYmy3Z"
},
"source": [
"## Step 19 - Get JHU Data\n",
"\n",
"The dataframe contains many years of data of all U.S. colleges. Let's just look at JHU. Filter/query the dataframe to retrieve only rows that belong to JHU. Save the JHU data to a new data frame using a new variable so that the old big data frame is still available for later use."
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "f2mhYOjV6zMj",
"outputId": "5a1767dc-3a2e-4bc4-891a-3129e8fda95c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 25 entries, 2139 to 171396\n",
"Data columns (total 4 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 UNITID 25 non-null int64 \n",
" 1 INSTNM 25 non-null object \n",
" 2 TUITIONFEE_IN 21 non-null float64\n",
" 3 year 25 non-null object \n",
"dtypes: float64(1), int64(1), object(2)\n",
"memory usage: 1000.0+ bytes\n"
]
}
],
"source": [
"#(Write code here)\n",
"df_JHU = df_all[df_all['INSTNM']=='Johns Hopkins University']\n",
"df_JHU.info()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VkyEUa6mnKgD"
},
"source": [
"## Step 20 - Plot JHU's in-state tuition over time\n",
"\n",
"Let's plot a bar chart and then a line chart."
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 296
},
"id": "wGXcNTJ46zMo",
"outputId": "73594b62-a9e0-4975-a72f-854120d78a93"
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZgAAAEGCAYAAABYV4NmAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3debgdVZnv8e9LAorKECQETIJBideOtkxh8MKDCBpCpAlDQJCGAMFcL6DhtvYlqO0Acp0BUVovDZHABSMSkIiBGBBEWgJJmMN4GCInZoIEQuBKDL79x3o3KTZ7n6qTZO0z/T7Pc55Te1WtWm/Vrl1vraratc3dERER2dg26eoARESkd1KCERGRLJRgREQkCyUYERHJQglGRESy6N/VAbTatttu68OGDevqMEREepT58+c/7+4DO1OnzyWYYcOGMW/evK4OQ0SkRzGzhZ2to1NkIiKShRKMiIhkoQQjIiJZKMGIiEgWSjAiIpKFEoyIiGShBCMiIlkowYiISBZKMCIikkWf+ya/iEhvs/RH/1l52kGT9s0YyZupByMiIlkowYiISBZKMCIikoWuwYiI9EHLfjKz8rTbnTFmvdpQD0ZERLJQD0ZEpJtYcv4jlafd/l9GZIxk41APRkREslCCERGRLJRgREQkCyUYERHJQglGRESy0F1kIiIZPPfDJZWnHfrF7TNG0nXUgxERkSyUYEREJAslGBERyUIJRkREssieYMzsWTN7yMzuN7N5UbaNmc02syfj/4AoNzO7yMzazOxBM9u9MJ/xMf2TZja+UL5HzL8t6lruZRIRkXKtuovs4+7+fOH1ZOBWd/+OmU2O12cBhwDD429v4KfA3ma2DfB1YCTgwHwzm+HuK2OazwJ3AzOB0cBNrVksEekL7rt0WaXpdjt1u8yR9CxddYpsLDA1hqcChxfKr/BkDrC1me0AHAzMdvcVkVRmA6Nj3JbuPsfdHbiiMC8REelCrUgwDvzOzOab2cQoG+Tui2N4CTAohgcDzxXqtkdZR+XtDcrfxMwmmtk8M5u3fPnyDV0eERGpoBWnyPZz90Vmth0w28weK450dzczzxmAu18CXAIwcuTIrG2JiEiSvQfj7ovi/zLgemAvYGmc3iL+105wLgKGFqoPibKOyoc0KBcRkS6WNcGY2TvNbIvaMDAKeBiYAdTuBBsP3BDDM4AT426yfYCX4lTaLGCUmQ2IO85GAbNi3Coz2yfuHjuxMC8REelCuU+RDQKujzuH+wNXu/vNZjYXuMbMJgALgWNi+pnAGKANeBU4GcDdV5jZucDcmO4cd18Rw6cBlwObk+4e0x1kIiLdQNYE4+5PA7s0KH8BOKhBuQOnN5nXFGBKg/J5wIc3OFgR6fVuubr6TT6f+MzAjJH0Dfomv4iIZKEEIyIiWSjBiIhIFkowIiKShRKMiIhkoZ9MFpEe6arp1e8IO/4o3RHWFdSDERGRLJRgREQkCyUYERHJQglGRESyUIIREZEsdBeZiHSpb1z/l+rTHvGejJHIxqYejIiIZKEEIyIiWSjBiIhIFkowIiKShRKMiIhkoQQjIiJZ6DZlEdlojpn+SOVprzlqRMZIpDtQD0ZERLJQghERkSyUYEREJAslGBERyUIX+UWkoSOm31lpuuuP2i9zJNJTqQcjIiJZKMGIiEgWSjAiIpKFEoyIiGTRkgRjZv3M7D4zuzFe72Rmd5tZm5n90sw2i/K3xeu2GD+sMI+zo/xxMzu4UD46ytrMbHIrlkdERMq16i6yScCjwJbx+rvABe4+zcx+BkwAfhr/V7r7zmZ2bEz3aTMbARwLfAh4D3CLmX0g5nUx8EmgHZhrZjPcvfrzKkR6ubHXzqw87Q3jxmSMRPqa7D0YMxsCfAq4NF4bcCBwbUwyFTg8hsfGa2L8QTH9WGCau7/m7s8AbcBe8dfm7k+7+xpgWkwrIiJdrBWnyC4E/jfw93j9buBFd1
8br9uBwTE8GHgOIMa/FNO/UV5Xp1n5m5jZRDObZ2bzli9fvjGWSURESmRNMGZ2KLDM3efnbKeMu1/i7iPdfeTAgQO7MhQRkT4j9zWYfYHDzGwM8HbSNZgfAVubWf/opQwBFsX0i4ChQLuZ9Qe2Al4olNcU6zQrFxGRLpQ1wbj72cDZAGZ2APAldz/ezH4FjCNdMxkP3BBVZsTru2L8793dzWwGcLWZnU+6yD8cuAcwYLiZ7URKLMcCn8m5TCJd6dBrr6k87Y3jjskYiUi5rnoW2VnANDP7FnAfcFmUXwZcaWZtwApSwsDdF5jZNcAjwFrgdHd/HcDMzgBmAf2AKe6+oKVLIiIiDbUswbj77cDtMfw06Q6w+mn+ChzdpP55wHkNymcC1e/DFBGRltA3+UVEJIsOezBm9jLgtZfx36PeZu6ux/2LiEhDHSYId9+i+NrM3gWcDvwP4PqMcYmISA9XqQdiZlsDZwInAlcDe7r7CzkDE+ntDp1+eaXpbjzqpKxxiORSdopsW+CLwKeBKcBu7v5SKwITEZGerawHsxBYDvwceBWYkB4Nlrj7+flCExGRnqwswXyfdRf5t+hoQhERkaKyi/zfaFEcIiLSy5Rdg7moo/Hu/oWNG46IiPQWZafIuvQpyCI9waeu+0nlaX975BkZIxHpXspOkU3taHyNmf3Y3T+/cUISEZHeYGM9KmbfjTQfERHpJfQsMhERyUIJRkREsthYCcbKJxERkb5kvRNM/KRxzY82QiwiItKLdJhgzOzOwvCVdaPvqQ24++UbNywREenpyr4H887C8Ifqxum0mPQ6Y67/VqXpZh7x1cyRiPR8ZafIfD3HiYhIH1fWg9nazI4gJaKtzezIKDdgq6yRiYhIj1aWYO4ADovhPwD/VDdORESkobJHxZzUojhERKSXKbuL7MLC8KS6cZdniklERHqBsov8+xeGx9eN+8hGjkVERHqRsgRjTYZFREQ6VHaRfxMzG0BKRLXhWqLplzUykQ1wyA0TKk9709jLMkYi0neVJZitSD86Vksq9xbG6XswIiLSVNldZMNaFIeIiPQyZXeRnVEYrn9UTCkze7uZ3WNmD5jZAjP7ZpTvZGZ3m1mbmf3SzDaL8rfF67YYP6wwr7Oj/HEzO7hQPjrK2sxscmdjFBGRPMou8p9SGK5/2GUVrwEHuvsuwK7AaDPbB/gucIG77wysBGonzCcAK6P8gpgOMxsBHEt6Htpo4N/NrJ+Z9QMuBg4BRgDHxbQiItLFOvO4/k7fRebJ6ni5afw5cCBwbZRPBQ6P4bHxmhh/kJlZlE9z99fc/RmgDdgr/trc/Wl3XwNMi2lFRKSLdeZZZFsWnkUGgLtfV9ZA9DLmAzuTehtPAS+6+9qYpB0YHMODgedi3mvN7CXg3VE+pzDbYp3n6sr3LotJepYzp4+uPO2FR92cMRIR6YyyBPMH1j2L7A7e/CwyB0oTjLu/DuxqZlsD1wMfXI84N4iZTQQmAuy4446tbl5EpE8qu4vs5I3VkLu/aGa3AR8l9Yz6Ry9mCLAoJlsEDAXa4xcztwJeKJTXFOs0Ky+2fQlwCcDIkSN1e7WISAt0mGDM7MQORru7d3jh38wGAn+L5LI58EnShfvbgHGkaybjgRuiyox4fVeM/727u5nNAK42s/OB9wDDSb+oacBwM9uJlFiOBT7TUUwiItIaZafI9mxSfhjpGkjZnWU7AFPjOswmwDXufqOZPQJMM7NvAfcBta9SXwZcaWZtwApSwsDdF5jZNcAjwFrg9Dj1VruVehbpyQJT3H1BSUwiItICZafIPl8bjru5jgfOIl1wP69s5u7+ILBbg/KnSXeA1Zf/FTi6ybzOa9Smu88EZpbFIt3DD35xcPlEwJeOm5U5EhHJrawHQ1wLOQn4EimxjHP3xzPHJSIiPVzZNZjTgUnArcBod3+2FUGJiEjPV9aD+TGwDNgP2DedJQPSxXV3d/0mjIiINFSWYHZqSRQiItLrlF3kXw
jp4ZSk54ABPBIX6aUPu+yKahfrASacqAv2In1R2TWYLYFLgZHA/VG8q5nNBya4+6rM8YmISA9V9rDLi0jfPdnZ3Y909yOB9wMPAT/JHZyIiPRcZddg9nX3k4oF7u7AOWb2ZLaoRESkx+vM4/rrdfrx/SIi0neU9WD+ZGZfA86NngsAZvZvpOeFSS9w7c+rPQ5/3Ml6FL6IVFeWYD5Pej5Ym5m9cZGf9PywU3MGJiIiPVvZbcqrgKPN7P2knySGdJvyU9kjExGRHq3sNuXdCy9rv7OyVa3c3e/NFZiIiPRsZafIftjBOAcO3IixiIhIL1KWYL7s7rqY30PccumYytN+4lT9woGI5FV2m/LFLYlCRER6nbIEo++6iIjIeil9mrKZzWg20t0P28jxiIhIL1GWYJbT8YV+ERGRhsoSzGp3/0NLIhERkV6l7BrMMy2JQkREep2yHsxVZnZks5Huft1GjkdERHqJsgRzaAfjHFCCERGRhsqeRXZyqwIREZHepexZZP9SV+TA88Cd7q7rMyIi0lTZRf4t6v62BEYCN5nZsZljExGRHqzsFNk3G5Wb2TbALcC0HEGJiEjPt14/mezuK9BjZEREpAPrlWDM7OPAyo0ci4iI9CIdJhgze9jMHqz7awe+C5xWNnMzG2pmt5nZI2a2wMwmRfk2ZjbbzJ6M/wOi3MzsIjNri7Z2L8xrfEz/pJmNL5TvYWYPRZ2LzEw9KxGRbqDsezCDgV0Lrx14wd1fqTj/tcAX3f1eM9sCmG9ms4GTgFvd/TtmNhmYDJwFHAIMj7+9gZ8Ce8c1n6+TbjDwmM8Md18Z03wWuBuYCYwGbqoYn4iIZFL6qBh3X1j4+3Mnkgvuvrj2s8ru/jLwKClpjQWmxmRTgcNjeCxwhSdzgK3NbAfgYGC2u6+IpDIbGB3jtnT3Oe7uwBWFeYmISBcq68Fs1+C7MG9w9/OrNmRmw4DdSD2NQe6+OEYtAQbF8GDguUK19ijrqLy9QXl92xOBiQA77rhj1ZBFRGQDlPVg+gHv4q3fh6n9VWJm7wKmA2e6+6riuOh5eCdi7jR3v8TdR7r7yIEDB+ZsSkREQlkPZrG7n7MhDZjZpqTkclXh4ZhLzWwHd18cp7mWRfkiYGih+pAoWwQcUFd+e5QPaTC9iIh0saw/mRx3dF0GPFp3Om0GULsTbDxwQ6H8xLibbB/gpTiVNgsYZWYD4o6zUcCsGLfKzPaJtk4szEtERLpQWQ/moA2c/77ACcBDZnZ/lH0Z+A5wjZlNABYCx8S4mcAYoA14FTgZ0hc7zexcYG5Md0582RPS7dKXA5uT7h7THWQiIt1A2aNiVnQ0voy730nzXtBbkldcjzm9ybymAFMalM8DPrwBYYqISAbr9U1+ERGRMkowIiKShRKMiIhkoQQjIiJZKMGIiEgWSjAiIpKFEoyIiGShBCMiIlkowYiISBZKMCIikoUSjIiIZKEEIyIiWSjBiIhIFkowIiKShRKMiIhkoQQjIiJZKMGIiEgWSjAiIpKFEoyIiGShBCMiIlkowYiISBZKMCIikoUSjIiIZKEEIyIiWSjBiIhIFkowIiKShRKMiIhkoQQjIiJZZE0wZjbFzJaZ2cOFsm3MbLaZPRn/B0S5mdlFZtZmZg+a2e6FOuNj+ifNbHyhfA8zeyjqXGRmlnN5RESkutw9mMuB0XVlk4Fb3X04cGu8BjgEGB5/E4GfQkpIwNeBvYG9gK/XklJM89lCvfq2RESki2RNMO5+B7CirngsMDWGpwKHF8qv8GQOsLWZ7QAcDMx29xXuvhKYDYyOcVu6+xx3d+CKwrxERKSLdcU1mEHuvjiGlwCDYngw8FxhuvYo66i8vUH5W5jZRDObZ2bzli9fvuFLICIipbr0In/0PLwF7Vzi7iPdfeTAgQNzNyciInRNglkap7eI/8uifBEwtDDdkCjrqHxIg3IREekGuiLBzABqd4KNB24olJ8Yd5PtA7wUp9JmAaPMbEBc3B8FzI
pxq8xsn7h77MTCvEREpIv1zzlzM/sFcACwrZm1k+4G+w5wjZlNABYCx8TkM4ExQBvwKnAygLuvMLNzgbkx3TnuXrtx4DTSnWqbAzfFn4iIdANZE4y7H9dk1EENpnXg9CbzmQJMaVA+D/jwhsQoIiJ56Jv8IiKShRKMiIhkoQQjIiJZKMGIiEgWSjAiIpKFEoyIiGShBCMiIlkowYiISBZKMCIikoUSjIiIZKEEIyIiWSjBiIhIFkowIiKShRKMiIhkoQQjIiJZKMGIiEgWSjAiIpKFEoyIiGShBCMiIlkowYiISBZKMCIikoUSjIiIZKEEIyIiWSjBiIhIFkowIiKShRKMiIhkoQQjIiJZKMGIiEgWSjAiIpJFr0gwZjbazB43szYzm9zV8YiISC9IMGbWD7gYOAQYARxnZiO6NioREenxCQbYC2hz96fdfQ0wDRjbxTGJiPR55u5dHcMGMbNxwGh3PzVenwDs7e5nFKaZCEyMl/8NeLzJ7LYFnl+PMNanXqvq9Na2FF/Paau7x9fKtrp7fB3Ve6+7D+zUnNy9R/8B44BLC69PAH6ynvOa16p6rarTW9tSfD2nre4en9bFxqnX6K83nCJbBAwtvB4SZSIi0oV6Q4KZCww3s53MbDPgWGBGF8ckItLn9e/qADaUu681szOAWUA/YIq7L1jP2V3SwnqtqtNb21J8Paet7h5fK9vq7vFtSL236PEX+UVEpHvqDafIRESkG1KCERGRPDbW7Wjd9Q+YAiwDHi6U7QLcBTwE/AbYMso3A34OrAD+BjxTqLMHsBR4DXgZ+OeO6gBbAI8BrwD/P+r9e12dh6Leylp8wHHAk1HvZdK1pfr4HgNWA88CC4BJwKdjeHXMczYwIOpZrIdXgb8CTwGTYtxpUb4GeLiuzkXRxmrg6UJbNwMvAcsj1tK2gF1JN2SsjnXxYIW2vgHcG+tpdbyPVZdrG+D38Z78uWKd12OdvxZtVVkXO0Y7teX6Y4V18fFYplpbrwPHV4jvomhnTYNlahbfd4FHo3xx3fr7IDAf+Dtp215QaGscaRtcE+9xsc5dEXcb8EihrSmkbWI1b90uGrZFugP0j4X191iFOm+P8tr6e6pKfDF+W2BVLFvVdfHnQlsvVawzLMrWxLKNKonvPOB+1m0Xr5O+r1elra8U1t8SYPsK79Uk1m0Xy+vWxfGkz+dDwJ+AXQr7wdERVxswuXT/29UJoAUJZn9gd96cYOYCH4vhU4BzY/h00g58f+AgUmLYJMYtAq6I4QnADyrUKbbzLDC1WCeGD4s3/mHSTRfLgPuAjwHfA25oEN8O0dZ8YMt4s/8C/BiYDEwF/gP4btQbQ9oJ7g7sE3E9AXwUeBH4JjAAeAG4sFDnpmhrPHA3KWk+EetsOvBoTDu5QluHREyTgfeQPuRV2tol1sPXYh2eV6GtEVHnP4GrSR+SKnXWEB+aBsvULL57gF/G9O8C/q0T8U0mJcJXgR+W1PkM0A6cTbqZZSFwdUl8i2LZvx9xzQW+XohvO9IO4zLgS4Vl+jDpoOd7pIOaxcR3zaLOnsCPgAt83cHUE6TvoE0FljRYh83a2j/qTI6y5XVtNaozArgg6mxKSgBXVohvRKzb+4EbK8Y3gpRUzunEMo0gfZ6nFxLAhRXj+168x0uA71Ro6wDSZ/irMd2DwG9L2nqWdABwPvBl4BbSNlJr67+zLtkcAtwdw/1Iyfx9pO3iAWBER/vfXn+KzN3vIB3RF30AuCOGZwNHxfAI4PdR5ynSUcTIGLcD8D9j+HekD3VZnQ8Ad5jZB0hvyJ7FOhHfDNKHeXPSkagB74/4tiTtIOrjW+zut5I2rA+SdiRLgVGkD+stpA/e4VFvLPAf7n6vu8+J+T4N/FOM/5m7rwRmAscU6lwRbU0FtibtQB8FnotlaY9pp1Zoay2xM3H3v8Q8DqvQ1nYx/lrSKd1fVWhrcCzH8nivHqtYp38sS6NlahTfc8A7SL2zqe6+Gri0Yltjo4
1xpORwaEmdbeNvGvA20nv/0ZL4XiTthA6LuB4kJYvDAdx9mbvfHMuBu78c6/xA0vZ6gafHL/082qjVmUvaZhfV1VsC7BftvmkddtDW20iJdGqUzQcOLqkzmPT5m0razl8kbVtl8e0C7A18qxPxDY73+JedqDOc9NmvPU1kCmlHXRZfbbt4hrQv+VGFtrYH3glMN7P+pP3AR0raWk46KP1UvLd/IPWWa239KfYHAHNI3y2E9XgsV69PME0sYN2KOZp1X9R8ADgs3qghpJ3+UDPbmnSq5Wozu5d0Wm3HjurUtXMsqVv5lnbMbCfgH4FN3f1vpCT2DlJvYgRpx9ys3h6kI933kXbEO5A2nsOBdwODot5gYsMMz5N2iq8Ab3P3xVH+BKkn06hOOylx7kY6Qt6G1PWGtGMpa+tuYJC7LzazvUjb3jYV2moHdgbmkU73PFChrXtI70XtA/5qxfgM+I2ZzSHtiJrVqcX3kRjeCZhpZt8n9UArrwvStvHzCnUuj9f3kZLEjZS/VwNJByCDSNvvx0lHsINowMyGkdb5SqB/Ybt4lJToGirUuzvaXBujittFR3Vq28Uw0ja/RZU6pMS8LP5vVSG+T5NOMdUeg1I1vk2AK81sPumgrKzO81Hn22Z2H3BuszpNlmsU8IuK8c0kvbdzSdvFMpqsv0K9waQkuD2pdzaGdFDSqK0JpPULjbezwc3agr6bYE4BTosNZgvSqRFIRxrtpJ3Z10g7ptdJR7abAe8FnPTGb1pS5412SF3ex5u0cyHpqM3NbFNSgjkkxn2Q9OY3q3c3qbs8CfgcKTH9kdQFfj1ifRMzexcpof2QdQmiqNl96/1I3fcz3X3Vmyqk/nOHbdXqmNkOwJXAyRXbepR0Tnpn0umf7Sos1wnA39y9vTBJlfhWuftI0umoC0kJp6P4Lif1Iv5K6tG9Dzipk+viH0nX2cqWabtodwjpQ31gvO4ovs+RDoa2IO2w7qLJdkHavqcDZ5JO89Zr+F5FjNPpxHZRbKuwLt6YT9U67r4raX3sRZP3qjDfK0mnkWufzc7Et9rddyd9Lk+n8Xovrr/XYpqfuvtuxIFcSXzF9XcY8Ksq8UU7m5IOct5D6s1s2qBOsa0zgG/HtDeTThm+Zbsws4+TEsxZjeZXRZ9MMO7+mLuPcvc9SB+8p6J8rbv/r9hwJ5LevCdIvYlXgd2jzv8hNtQO6uDujwH/Ssr65zdqx93Hko4O15COVHH32e4+inT6pF+T+MaRPljT3f06d/9NTHckKZktIh3NEMNDI4FNJ+0Q/1+UvxY7Okjd+heLdQCi3p7ANHe/LsavID40Ub+sLUi9q1mkc9IL6+t00NZS0sb/MOkIsqytjwL9zew54AekxPT3CvEtMbMd3P1pUvJe3VF8Uf9+0pHjQODXpFNEVdbFUuBU4HrSqa+yOkeQjja3iFNxf2RdImi6/tz9PNLpkBNJO+HlhbZqNol1dFWs80XA2sJ28Q+kJF9vk4jxqsJ7tZz4AnfddtGsrdq6+A1wFSkJVqoT79WLpKP3RgdLb8QXw4eRdqrXAAea2bUV26ptF7WbTOoTcH2ddlIv7s8x/jYKia1RfIW2XiFdL15acf19Iur0jzMgt7LugLRpW+5+GWl/8WlSj3VJsS0z+wjptOpYd38hijv9WK4+mWDMbLv4vwnwVeBn8fodZvbOmGw/0kHOI3Ek8TvggKjzbdLpr6Z1Cu0cR0piDdsxs0+SNsbXSG/WCDP7h5jfJ0k7rjfVMzMjdY1fcfezCm3NIB21nkb6ENwQ85lB2sFcRtqYno3TH7Ni/OfMbADpnOy1xTrR1g3Ay+7+zcJq/BPrzs2OL2srHuOzGbDQ3a9tVKe+LTMbYmabF5ZrP+BDZW25+/GkHsjFpB7eQzFNR/ENAH4LjDezbUk3Udxcsi7mkk4tzI7lOZB02qpsvdfGfZa0bZSuP9LOajVwciSRI0lJpqP118/M3h3jzyKd0ntvoS2izlhgqbufH8
VzSaddzoz37WQaP37pGNKNHucXym6JdULdcjVsK8rWkk4Rn1+xzkDStjs+to+jC+uiYXzufra7DyFtF78mXQOdV6Gtd5JOEY2P4aNJ1yya1nH3JaSk+cWY7AzSjTxV1t8qUpKutP5I28Va4NQYfxLpjEiHbRX2F18gbUtvr7VlZjsC1wEnuPsThXl0/rFc3g3u9Mr5R/oALyZ9YNpJXb5JpF7GE6Q7NWpPNBhGOvp/ibTDL9b5BqkX8xrpFNSOFepMIh1NPNOkndqF0WWFeldE2WukHcqFDeo9Szqif5R0BH0/cHuMeyWW9xZgm6hnscE4KfE8HnXGAJ9n3W3KC+rqXExKek6666TW1sOk88yvR9z3Vmjr+zFt7XbPl4H9S9pqi3X3cEy/tBPL9W7S0dwSUg+yrM4XSTuBWnyPVFwXX2Hd7eGros0q8e1K6p08WXGZDiWdknsl4ltYIb4HSDug2va5sK6t7Unbnsd7uYZ0I8AY0pFtbbtoq6vTHnF4oU5tG1xC6i3+rW4dNmvrS1FW3C6OKalzWvwv3qZcJb7adjE/xldZF6ewbvur3e5bZf19jLQ91G5536lCfEeQzgzczlu3i47a+h7rtovFrLtNuaO2HiZtFy/HNMW2LiUd3NS28XmF/ekY0n7zKeArZftfPSpGRESy6JOnyEREJD8lGBERyUIJRkREslCCERGRLJRgREQkCyUYERHJQglGpAcys2aPiRHpNpRgRDIzs3PM7MzC6/PMbJKZ/auZzTWzB83sm4Xxvzaz+Wa2wMwmFspXm9kPzewB1j1JWaTbUoIRyW8K6fEvtccTHUv6xvtw0vPkdgX2MLP9Y/pTPD3zbiTwhXjcC6TnaN3t7ru4+52tXACR9dG/qwMQ6e3c/Vkze8HMdiM9Ev0+0gMpR8UwpN9vGU76HaAvmNkRUT40yl8gPSJkeitjF9kQSjAirXEp6UGE25N6NAcB33b3/1ucyMwOID0h96Pu/qqZ3U56ECHAX9290VN5RbolnSITaY3rST95uyfpScCzgFPiNzows8HxhNutgJWRXD5I+rVHkR5JPRiRFnD3NWZ2G/Bi9EJ+Fz/LcFd6yjqrgX8m/UTA58zsUdLTbud0VcwiG0pPUxZpgRavcPoAAABHSURBVLi4fy9wtLs/2dXxiLSCTpGJZGZmI0i/I3Krkov0JerBiIhIFurBiIhIFkowIiKShRKMiIhkoQQjIiJZKMGIiEgW/wVnTVDZ9u9iUwAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure: seaborn line plot of PCT_CHANGE by year for df_JHU>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.lineplot(x='year',y='PCT_CHANGE',data=df_JHU)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CBHqIJBcoXv-"
},
"source": [
"## Step 24 - Compare UMBC and JHU\n",
"\n",
"To plot the UMBC and JHU tuition changes over time in the same plot, we need to combine the two datasets on their common key, the \"year\" column.\n",
"\n",
"First, make a umbc2 dataframe with only the two columns needed; we don't need the other columns. Also rename the column \"PCT_CHANGE\" to \"UMBC_PCT\" in preparation for the merge. Both the UMBC and JHU dataframes have a column named \"PCT_CHANGE\", so we rename them to avoid a collision during the merge. By the way, pandas handles such collisions gracefully by appending suffixes (\"_x\" and \"_y\" by default) to the overlapping column names. Feel free to try it without changing the column names."
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 937
},
"id": "bGYYULq16zNA",
"outputId": "bca4b2dd-b8a9-438e-dd59-ba048c2ed346"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:3: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" This is separate from the ipykernel package so we can avoid doing imports until\n"
]
},
{
"data": {
"text/plain": [
" year UMBC_PCT\n",
"2151 1996 0.00\n",
"9056 1997 NaN\n",
"15923 1998 NaN\n",
"22598 1999 NaN\n",
"29178 2000 NaN\n",
"35821 2001 7.65\n",
"42524 2002 7.65\n",
"49156 2003 16.13\n",
"55786 2004 8.55\n",
"62522 2005 6.23\n",
"69401 2006 1.20\n",
"76332 2007 1.00\n",
"83291 2008 0.83\n",
"90324 2009 1.05\n",
"97506 2010 3.37\n",
"104960 2011 3.23\n",
"112672 2012 3.14\n",
"120503 2013 3.11\n",
"128353 2014 3.14\n",
"136064 2015 5.99\n",
"143685 2016 2.34\n",
"150891 2017 2.25\n",
"157940 2018 2.26\n",
"164723 2019 2.12\n",
"171401 2020 -21.68"
]
},
"execution_count": 82,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
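"#(Write code here)\n",
"# Select both columns and rename in one step; rename() returns a new dataframe,\n",
"# which avoids the SettingWithCopyWarning raised by assigning into a slice.\n",
"umbc2 = df_UMBC[['year', 'PCT_CHANGE']].rename(columns={'PCT_CHANGE': 'UMBC_PCT'})\n",
"umbc2"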
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "W23PSBYmozzi"
},
"source": [
"## Step 25\n",
"\n",
"Then make a jhu2 dataframe with only the two columns needed; we don't need the other columns. Also rename the column \"PCT_CHANGE\" to \"JHU_PCT\" in preparation for the merge."
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 937
},
"id": "DX95nFyG6zNF",
"outputId": "f78aee30-e1d6-4300-8201-dad581692289"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:3: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" This is separate from the ipykernel package so we can avoid doing imports until\n"
]
},
{
"data": {
"text/html": [
"\n",
"