{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pandas Merge and String Methods"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lecture Notes and in-class exercises"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"▶️ First, run the code cell below to import `unittest`, a module used for **🧭 Check Your Work** sections and the autograder."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import unittest\n",
"tc = unittest.TestCase()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 👇 Tasks\n",
"\n",
"- ✔️ Import the following Python packages.\n",
" 1. `pandas`: Use alias `pd`.\n",
" 2. `numpy`: Use alias `np`."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"### BEGIN SOLUTION\n",
"import pandas as pd\n",
"import numpy as np\n",
"### END SOLUTION"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check your work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"tc.assertTrue('pd' in globals(), 'Check whether you have correctly import Pandas with an alias.')\n",
"tc.assertTrue('np' in globals(), 'Check whether you have correctly import NumPy with an alias.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 📌 Load employees and work laptops data\n",
"\n",
"For the first part, we're going to work with a small DataFrame to see how we merge two DataFrames together.\n",
"\n",
"▶️ Run the code cell below to create `df_employees` and `df_laptops`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"df_employees = pd.DataFrame({\n",
" 'emp_id': [1, 2, 3],\n",
" 'name': ['Jasper', 'Gary', 'Sally'],\n",
" 'laptop_id': ['A', 'B', np.nan]\n",
"})\n",
"\n",
"df_laptops = pd.DataFrame({\n",
" 'laptop_id': ['A', 'B', 'C', 'D'],\n",
" 'model': ['Red Touchbook', 'BlueGo', 'Eco Green', 'Hackbook Pro']\n",
"})\n",
"\n",
"# Used for 🧭 Check Your Work sections\n",
"df_employees_check = df_employees.copy()\n",
"df_laptops_check = df_laptops.copy()\n",
"df_join_check = df_employees_check.merge(df_laptops, on='laptop_id', how='outer')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"▶️ Run the code cell below to display `df_employees`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" emp_id | \n",
" name | \n",
" laptop_id | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" Jasper | \n",
" A | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" Gary | \n",
" B | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
" Sally | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" emp_id name laptop_id\n",
"0 1 Jasper A\n",
"1 2 Gary B\n",
"2 3 Sally NaN"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_employees"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"▶️ Run the code cell below to display `df_laptops`."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" laptop_id | \n",
" model | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" A | \n",
" Red Touchbook | \n",
"
\n",
" \n",
" 1 | \n",
" B | \n",
" BlueGo | \n",
"
\n",
" \n",
" 2 | \n",
" C | \n",
" Eco Green | \n",
"
\n",
" \n",
" 3 | \n",
" D | \n",
" Hackbook Pro | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" laptop_id model\n",
"0 A Red Touchbook\n",
"1 B BlueGo\n",
"2 C Eco Green\n",
"3 D Hackbook Pro"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_laptops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 🎯 Exercise 1: Inner merge\n",
"\n",
"#### 👇 Tasks\n",
"\n",
"- ✔️ Find employees who have been assigned a work laptop.\n",
"- ✔️ In other words, merge `df_employees` and `df_laptop` using an inner merge.\n",
"- ✔️ Store the merged result to a new variable named `df_inner`.\n",
"\n",
"#### 🚀 Sample Code\n",
"\n",
"```python\n",
"df_inner = pd.merge(\n",
" left=...,\n",
" right=...,\n",
" on='...',\n",
" how='...'\n",
")\n",
"```\n",
"\n",
"#### 🔑 Expected Output\n",
"\n",
"| | emp_id | name | laptop_id | model |\n",
"|---:|---------:|:-------|:------------|:--------------|\n",
"| 0 | 1 | Jasper | A | Red Touchbook |\n",
"| 1 | 2 | Gary | B | BlueGo |"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" emp_id | \n",
" name | \n",
" laptop_id | \n",
" model | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" Jasper | \n",
" A | \n",
" Red Touchbook | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" Gary | \n",
" B | \n",
" BlueGo | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" emp_id name laptop_id model\n",
"0 1 Jasper A Red Touchbook\n",
"1 2 Gary B BlueGo"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### BEGIN SOLUTION\n",
"df_inner = pd.merge(\n",
" left=df_employees,\n",
" right=df_laptops,\n",
" on='laptop_id',\n",
" how='inner'\n",
")\n",
"### END SOLUTION\n",
"\n",
"display(df_inner)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check Your Work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "exercise-01",
"locked": true,
"points": "1",
"solution": false
}
},
"outputs": [],
"source": [
"# DO NOT CHANGE THE CODE IN THIS CELL\n",
"df_jc = df_join_check\n",
"df_sol = df_jc[df_jc['emp_id'].notna() & df_jc['laptop_id'].notna()].reset_index(drop=True)\n",
"\n",
"pd.testing.assert_frame_equal(\n",
" df_inner.reset_index(drop=True),\n",
" df_sol.reset_index(drop=True),\n",
" check_dtype=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 🎯 Exercise 2: Left merge\n",
"\n",
"#### 👇 Tasks\n",
"\n",
"- ✔️ List all employees and their assigned work laptops - if they are assigned one.\n",
"- ✔️ If an employee has not been assigned a work laptop, leave `'laptop_id'` and `'model'` as np.NaN (or any other null-like value).\n",
"- ✔️ In other words, merge `df_employees` and `df_laptop` using a left merge.\n",
"- ✔️ Store the merged result to a new variable named `df_left`.\n",
"\n",
"#### 🔑 Expected Output\n",
"\n",
"| | emp_id | name | laptop_id | model |\n",
"|---:|---------:|:-------|:------------|:--------------|\n",
"| 0 | 1 | Jasper | A | Red Touchbook |\n",
"| 1 | 2 | Gary | B | BlueGo |\n",
"| 2 | 3 | Sally | NaN | NaN |"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" emp_id | \n",
" name | \n",
" laptop_id | \n",
" model | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" Jasper | \n",
" A | \n",
" Red Touchbook | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" Gary | \n",
" B | \n",
" BlueGo | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
" Sally | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" emp_id name laptop_id model\n",
"0 1 Jasper A Red Touchbook\n",
"1 2 Gary B BlueGo\n",
"2 3 Sally NaN NaN"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### BEGIN SOLUTION\n",
"df_left = pd.merge(\n",
" left=df_employees,\n",
" right=df_laptops,\n",
" on='laptop_id',\n",
" how='left'\n",
")\n",
"### END SOLUTION\n",
"\n",
"display(df_left)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check Your Work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "exercise-01",
"locked": true,
"points": "1",
"solution": false
}
},
"outputs": [],
"source": [
"# DO NOT CHANGE THE CODE IN THIS CELL\n",
"df_jc = df_join_check\n",
"df_sol = df_jc[df_jc['emp_id'].notna()].reset_index(drop=True)\n",
"\n",
"pd.testing.assert_frame_equal(\n",
" df_left.reset_index(drop=True),\n",
" df_sol.reset_index(drop=True),\n",
" check_dtype=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 🎯 Exercise 3: Right merge\n",
"\n",
"#### 👇 Tasks\n",
"\n",
"- ✔️ List all laptops and their associated owners - if they are assigned one.\n",
"- ✔️ If a laptop has not been assigned to an employee, leave `'emp_id'` and `'name'` as np.NaN (or any other null-like value).\n",
"- ✔️ In other words, merge `df_employees` and `df_laptop` using a right merge.\n",
"- ✔️ Store the merged result to a new variable named `df_right`.\n",
"\n",
"#### 🔑 Expected Output\n",
"\n",
"| | emp_id | name | laptop_id | model |\n",
"|---:|---------:|:-------|:------------|:--------------|\n",
"| 0 | 1 | Jasper | A | Red Touchbook |\n",
"| 1 | 2 | Gary | B | BlueGo |\n",
"| 2 | NaN | NaN | C | Eco Green |\n",
"| 3 | NaN | NaN | D | Hackbook Pro |"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" emp_id | \n",
" name | \n",
" laptop_id | \n",
" model | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1.0 | \n",
" Jasper | \n",
" A | \n",
" Red Touchbook | \n",
"
\n",
" \n",
" 1 | \n",
" 2.0 | \n",
" Gary | \n",
" B | \n",
" BlueGo | \n",
"
\n",
" \n",
" 2 | \n",
" NaN | \n",
" NaN | \n",
" C | \n",
" Eco Green | \n",
"
\n",
" \n",
" 3 | \n",
" NaN | \n",
" NaN | \n",
" D | \n",
" Hackbook Pro | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" emp_id name laptop_id model\n",
"0 1.0 Jasper A Red Touchbook\n",
"1 2.0 Gary B BlueGo\n",
"2 NaN NaN C Eco Green\n",
"3 NaN NaN D Hackbook Pro"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### BEGIN SOLUTION\n",
"df_right = pd.merge(\n",
" left=df_employees,\n",
" right=df_laptops,\n",
" on='laptop_id',\n",
" how='right'\n",
")\n",
"### END SOLUTION\n",
"\n",
"display(df_right)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check Your Work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "exercise-01",
"locked": true,
"points": "1",
"solution": false
},
"tags": []
},
"outputs": [],
"source": [
"# DO NOT CHANGE THE CODE IN THIS CELL\n",
"df_jc = df_join_check\n",
"df_sol = df_jc[df_jc['laptop_id'].notna()].reset_index(drop=True)\n",
"\n",
"pd.testing.assert_frame_equal(\n",
" df_right.reset_index(drop=True),\n",
" df_sol.reset_index(drop=True),\n",
" check_dtype=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 🎯 Exercise 4: Full outer merge\n",
"\n",
"#### 👇 Tasks\n",
"\n",
"- ✔️ List all employees and all work laptops - regardless of whether they are associated with one another.\n",
"- ✔️ If an employee has not been assigned a work laptop, leave `'laptop_id'` and `'model'` as np.NaN (or any other null-like value).\n",
"- ✔️ If a laptop has not been assigned to an employee, leave `'emp_id'` and `'name'` as np.NaN (or any other null-like value).\n",
"- ✔️ In other words, merge `df_employees` and `df_laptop` using an outer merge.\n",
"- ✔️ Store the merged result to a new variable named `df_outer`.\n",
"\n",
"#### 🔑 Expected Output\n",
"\n",
"| | emp_id | name | laptop_id | model |\n",
"|---:|---------:|:-------|:------------|:--------------|\n",
"| 0 | 1 | Jasper | A | Red Touchbook |\n",
"| 1 | 2 | Gary | B | BlueGo |\n",
"| 2 | 3 | Sally | NaN | NaN |\n",
"| 3 | NaN | NaN | C | Eco Green |\n",
"| 4 | NaN | NaN | D | Hackbook Pro |"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" emp_id | \n",
" name | \n",
" laptop_id | \n",
" model | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1.0 | \n",
" Jasper | \n",
" A | \n",
" Red Touchbook | \n",
"
\n",
" \n",
" 1 | \n",
" 2.0 | \n",
" Gary | \n",
" B | \n",
" BlueGo | \n",
"
\n",
" \n",
" 2 | \n",
" 3.0 | \n",
" Sally | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 3 | \n",
" NaN | \n",
" NaN | \n",
" C | \n",
" Eco Green | \n",
"
\n",
" \n",
" 4 | \n",
" NaN | \n",
" NaN | \n",
" D | \n",
" Hackbook Pro | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" emp_id name laptop_id model\n",
"0 1.0 Jasper A Red Touchbook\n",
"1 2.0 Gary B BlueGo\n",
"2 3.0 Sally NaN NaN\n",
"3 NaN NaN C Eco Green\n",
"4 NaN NaN D Hackbook Pro"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### BEGIN SOLUTION\n",
"df_outer = pd.merge(\n",
" left=df_employees,\n",
" right=df_laptops,\n",
" on='laptop_id',\n",
" how='outer'\n",
")\n",
"### END SOLUTION\n",
"\n",
"display(df_outer)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check Your Work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "exercise-01",
"locked": true,
"points": "1",
"solution": false
}
},
"outputs": [],
"source": [
"# DO NOT CHANGE THE CODE IN THIS CELL\n",
"df_jc = df_join_check\n",
"df_sol = df_jc[df_jc['emp_id'].notna()].reset_index(drop=True)\n",
"\n",
"pd.testing.assert_frame_equal(\n",
" df_left.reset_index(drop=True),\n",
" df_sol.reset_index(drop=True),\n",
" check_dtype=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## Pandas string methods\n",
"\n",
"### 📌 Load textual data\n",
"\n",
"▶️ Run the code cell below to create `df_libraries`."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"df_libraries = pd.DataFrame({\n",
" 'name': ['ACES (Funk)', 'Grainger', 'Law', 'Main'],\n",
" 'amenities': [\n",
" 'Rooms,Scanner,Printer',\n",
" 'Rooms,Scanner,Printer,Cafe',\n",
" 'Cafe',\n",
" 'Rooms,Scanner,Printer,Cafe'\n",
" ],\n",
"})\n",
"\n",
"# Used for 🧭 Check Your Work sections\n",
"df_libraries_check = df_libraries.copy()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"▶️ Run the code cell below to display `df_libraries`."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" amenities | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" ACES (Funk) | \n",
" Rooms,Scanner,Printer | \n",
"
\n",
" \n",
" 1 | \n",
" Grainger | \n",
" Rooms,Scanner,Printer,Cafe | \n",
"
\n",
" \n",
" 2 | \n",
" Law | \n",
" Cafe | \n",
"
\n",
" \n",
" 3 | \n",
" Main | \n",
" Rooms,Scanner,Printer,Cafe | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name amenities\n",
"0 ACES (Funk) Rooms,Scanner,Printer\n",
"1 Grainger Rooms,Scanner,Printer,Cafe\n",
"2 Law Cafe\n",
"3 Main Rooms,Scanner,Printer,Cafe"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_libraries"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 🎯 Exercise 5: Length of library names\n",
"\n",
"#### 👇 Tasks\n",
"\n",
"- ✔️ Find the number of characters (i.e., string length) of each library.\n",
"- ✔️ Store the result to a new column named `'name_length'` in `df_libraries`.\n",
"\n",
"#### 🔑 Expected Output\n",
"\n",
"| | name | amenities | name_length |\n",
"|---:|:------------|:---------------------------|--------------:|\n",
"| 0 | ACES (Funk) | Rooms,Scanner,Printer | 11 |\n",
"| 1 | Grainger | Rooms,Scanner,Printer,Cafe | 8 |\n",
"| 2 | Law | Cafe | 3 |\n",
"| 3 | Main | Rooms,Scanner,Printer,Cafe | 4 |"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" amenities | \n",
" name_length | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" ACES (Funk) | \n",
" Rooms,Scanner,Printer | \n",
" 11 | \n",
"
\n",
" \n",
" 1 | \n",
" Grainger | \n",
" Rooms,Scanner,Printer,Cafe | \n",
" 8 | \n",
"
\n",
" \n",
" 2 | \n",
" Law | \n",
" Cafe | \n",
" 3 | \n",
"
\n",
" \n",
" 3 | \n",
" Main | \n",
" Rooms,Scanner,Printer,Cafe | \n",
" 4 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name amenities name_length\n",
"0 ACES (Funk) Rooms,Scanner,Printer 11\n",
"1 Grainger Rooms,Scanner,Printer,Cafe 8\n",
"2 Law Cafe 3\n",
"3 Main Rooms,Scanner,Printer,Cafe 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### BEGIN SOLUTION\n",
"df_libraries['name_length'] = df_libraries['name'].str.len()\n",
"### END SOLUTION\n",
"\n",
"display(df_libraries)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check Your Work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "exercise-01",
"locked": true,
"points": "1",
"solution": false
},
"tags": []
},
"outputs": [],
"source": [
"# DO NOT CHANGE THE CODE IN THIS CELL\n",
"df_lc = df_libraries_check\n",
"df_lc['name_length'] = df_lc['name'].str.len()\n",
"\n",
"pd.testing.assert_frame_equal(\n",
" df_libraries.reset_index(drop=True),\n",
" df_lc.reset_index(drop=True),\n",
" check_dtype=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 🎯 Exercise 6: Uppercase library names\n",
"\n",
"#### 👇 Tasks\n",
"\n",
"- ✔️ Convert the library names to uppercase.\n",
"- ✔️ Directly update the `'name'` column in `df_libraries`.\n",
"\n",
"#### 🔑 Expected Output\n",
"\n",
"| | name | amenities | name_length |\n",
"|---:|:------------|:---------------------------|--------------:|\n",
"| 0 | ACES (FUNK) | Rooms,Scanner,Printer | 11 |\n",
"| 1 | GRAINGER | Rooms,Scanner,Printer,Cafe | 8 |\n",
"| 2 | LAW | Cafe | 3 |\n",
"| 3 | MAIN | Rooms,Scanner,Printer,Cafe | 4 |"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" amenities | \n",
" name_length | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" ACES (FUNK) | \n",
" Rooms,Scanner,Printer | \n",
" 11 | \n",
"
\n",
" \n",
" 1 | \n",
" GRAINGER | \n",
" Rooms,Scanner,Printer,Cafe | \n",
" 8 | \n",
"
\n",
" \n",
" 2 | \n",
" LAW | \n",
" Cafe | \n",
" 3 | \n",
"
\n",
" \n",
" 3 | \n",
" MAIN | \n",
" Rooms,Scanner,Printer,Cafe | \n",
" 4 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name amenities name_length\n",
"0 ACES (FUNK) Rooms,Scanner,Printer 11\n",
"1 GRAINGER Rooms,Scanner,Printer,Cafe 8\n",
"2 LAW Cafe 3\n",
"3 MAIN Rooms,Scanner,Printer,Cafe 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### BEGIN SOLUTION\n",
"df_libraries['name'] = df_libraries['name'].str.upper()\n",
"### END SOLUTION\n",
"\n",
"display(df_libraries)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check Your Work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "exercise-01",
"locked": true,
"points": "1",
"solution": false
},
"tags": []
},
"outputs": [],
"source": [
"# DO NOT CHANGE THE CODE IN THIS CELL\n",
"df_lc = df_libraries_check\n",
"df_lc['name_length'] = df_lc['name'].str.len()\n",
"df_lc['name'] = df_lc['name'].str.upper()\n",
"\n",
"pd.testing.assert_frame_equal(\n",
" df_libraries.reset_index(drop=True),\n",
" df_lc.reset_index(drop=True),\n",
" check_dtype=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"### 🎯 Exercise 7: Split amenities into lists\n",
"\n",
"#### 👇 Tasks\n",
"\n",
"- ✔️ Split the items in the `'amenities'` column using the comma (`,`) as a delimiter.\n",
"- ✔️ Store the splitted result to a new column named `'amenities_list'` in `df_libraries`.\n",
"\n",
"#### 🔑 Expected Output\n",
"\n",
"| | name | amenities | name_length | amenities_list |\n",
"|---:|:------------|:---------------------------|--------------:|:----------------------------------------|\n",
"| 0 | ACES (FUNK) | Rooms,Scanner,Printer | 11 | ['Rooms', 'Scanner', 'Printer'] |\n",
"| 1 | GRAINGER | Rooms,Scanner,Printer,Cafe | 8 | ['Rooms', 'Scanner', 'Printer', 'Cafe'] |\n",
"| 2 | LAW | Cafe | 3 | ['Cafe'] |\n",
"| 3 | MAIN | Rooms,Scanner,Printer,Cafe | 4 | ['Rooms', 'Scanner', 'Printer', 'Cafe'] |"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" amenities | \n",
" name_length | \n",
" amenities_list | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" ACES (FUNK) | \n",
" Rooms,Scanner,Printer | \n",
" 11 | \n",
" [Rooms, Scanner, Printer] | \n",
"
\n",
" \n",
" 1 | \n",
" GRAINGER | \n",
" Rooms,Scanner,Printer,Cafe | \n",
" 8 | \n",
" [Rooms, Scanner, Printer, Cafe] | \n",
"
\n",
" \n",
" 2 | \n",
" LAW | \n",
" Cafe | \n",
" 3 | \n",
" [Cafe] | \n",
"
\n",
" \n",
" 3 | \n",
" MAIN | \n",
" Rooms,Scanner,Printer,Cafe | \n",
" 4 | \n",
" [Rooms, Scanner, Printer, Cafe] | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name amenities name_length \\\n",
"0 ACES (FUNK) Rooms,Scanner,Printer 11 \n",
"1 GRAINGER Rooms,Scanner,Printer,Cafe 8 \n",
"2 LAW Cafe 3 \n",
"3 MAIN Rooms,Scanner,Printer,Cafe 4 \n",
"\n",
" amenities_list \n",
"0 [Rooms, Scanner, Printer] \n",
"1 [Rooms, Scanner, Printer, Cafe] \n",
"2 [Cafe] \n",
"3 [Rooms, Scanner, Printer, Cafe] "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### BEGIN SOLUTION\n",
"df_libraries['amenities_list'] = df_libraries['amenities'].str.split(',')\n",
"### END SOLUTION\n",
"\n",
"display(df_libraries)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🧭 Check Your Work\n",
"\n",
"- Once you're done, run the code cell below to test correctness.\n",
"- ✔️ If the code cell runs without an error, you're good to move on.\n",
"- ❌ If the code cell throws an error, go back and fix incorrect parts."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "exercise-01",
"locked": true,
"points": "1",
"solution": false
},
"tags": []
},
"outputs": [],
"source": [
"# DO NOT CHANGE THE CODE IN THIS CELL\n",
"df_lc = df_libraries_check\n",
"df_lc['name_length'] = df_lc['name'].str.len()\n",
"df_lc['name'] = df_lc['name'].str.upper()\n",
"df_lc['amenities_list'] = df_lc['amenities'].str.split(',')\n",
"\n",
"pd.testing.assert_frame_equal(\n",
" df_libraries.reset_index(drop=True),\n",
" df_lc.reset_index(drop=True),\n",
" check_dtype=False\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}