{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pandas Merge and String Methods"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Lecture Notes and in-class exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "▶️ First, run the code cell below to import `unittest`, a module used for **🧭 Check Your Work** sections and the autograder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import unittest\n",
    "tc = unittest.TestCase()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ Import the following Python packages.\n",
    "    1. `pandas`: Use alias `pd`.\n",
    "    2. `numpy`: Use alias `np`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "### BEGIN SOLUTION\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "### END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check your work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "tc.assertTrue('pd' in globals(), 'Check whether you have correctly import Pandas with an alias.')\n",
    "tc.assertTrue('np' in globals(), 'Check whether you have correctly import NumPy with an alias.')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 📌 Load employees and work laptops data\n",
    "\n",
    "For the first part, we're going to work with a small DataFrame to see how we merge two DataFrames together.\n",
    "\n",
    "▶️ Run the code cell below to create `df_employees` and `df_laptops`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_employees = pd.DataFrame({\n",
    "    'emp_id': [1, 2, 3],\n",
    "    'name': ['Jasper', 'Gary', 'Sally'],\n",
    "    'laptop_id': ['A', 'B', np.nan]\n",
    "})\n",
    "\n",
    "df_laptops = pd.DataFrame({\n",
    "    'laptop_id': ['A', 'B', 'C', 'D'],\n",
    "    'model': ['Red Touchbook', 'BlueGo', 'Eco Green', 'Hackbook Pro']\n",
    "})\n",
    "\n",
    "# Used for 🧭 Check Your Work sections\n",
    "df_employees_check = df_employees.copy()\n",
    "df_laptops_check = df_laptops.copy()\n",
    "df_join_check = df_employees_check.merge(df_laptops, on='laptop_id', how='outer')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "▶️ Run the code cell below to display `df_employees`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>emp_id</th>\n",
       "      <th>name</th>\n",
       "      <th>laptop_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Jasper</td>\n",
       "      <td>A</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Gary</td>\n",
       "      <td>B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Sally</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   emp_id    name laptop_id\n",
       "0       1  Jasper         A\n",
       "1       2    Gary         B\n",
       "2       3   Sally       NaN"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_employees"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "▶️ Run the code cell below to display `df_laptops`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>laptop_id</th>\n",
       "      <th>model</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>Red Touchbook</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B</td>\n",
       "      <td>BlueGo</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>C</td>\n",
       "      <td>Eco Green</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>D</td>\n",
       "      <td>Hackbook Pro</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  laptop_id          model\n",
       "0         A  Red Touchbook\n",
       "1         B         BlueGo\n",
       "2         C      Eco Green\n",
       "3         D   Hackbook Pro"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_laptops"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 🎯 Exercise 1: Inner merge\n",
    "\n",
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ Find employees who have been assigned a work laptop.\n",
    "- ✔️ In other words, merge `df_employees` and `df_laptop` using an inner merge.\n",
    "- ✔️ Store the merged result to a new variable named `df_inner`.\n",
    "\n",
    "#### 🚀 Sample Code\n",
    "\n",
    "```python\n",
    "df_inner = pd.merge(\n",
    "    left=...,\n",
    "    right=...,\n",
    "    on='...',\n",
    "    how='...'\n",
    ")\n",
    "```\n",
    "\n",
    "#### 🔑 Expected Output\n",
    "\n",
    "|    |   emp_id | name   | laptop_id   | model         |\n",
    "|---:|---------:|:-------|:------------|:--------------|\n",
    "|  0 |        1 | Jasper | A           | Red Touchbook |\n",
    "|  1 |        2 | Gary   | B           | BlueGo        |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>emp_id</th>\n",
       "      <th>name</th>\n",
       "      <th>laptop_id</th>\n",
       "      <th>model</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Jasper</td>\n",
       "      <td>A</td>\n",
       "      <td>Red Touchbook</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Gary</td>\n",
       "      <td>B</td>\n",
       "      <td>BlueGo</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   emp_id    name laptop_id          model\n",
       "0       1  Jasper         A  Red Touchbook\n",
       "1       2    Gary         B         BlueGo"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "df_inner = pd.merge(\n",
    "    left=df_employees,\n",
    "    right=df_laptops,\n",
    "    on='laptop_id',\n",
    "    how='inner'\n",
    ")\n",
    "### END SOLUTION\n",
    "\n",
    "display(df_inner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check Your Work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "exercise-01",
     "locked": true,
     "points": "1",
     "solution": false
    }
   },
   "outputs": [],
   "source": [
    "# DO NOT CHANGE THE CODE IN THIS CELL\n",
    "df_jc = df_join_check\n",
    "df_sol = df_jc[df_jc['emp_id'].notna() & df_jc['laptop_id'].notna()].reset_index(drop=True)\n",
    "\n",
    "pd.testing.assert_frame_equal(\n",
    "    df_inner.reset_index(drop=True),\n",
    "    df_sol.reset_index(drop=True),\n",
    "    check_dtype=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 🎯 Exercise 2: Left merge\n",
    "\n",
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ List all employees and their assigned work laptops - if they are assigned one.\n",
    "- ✔️ If an employee has not been assigned a work laptop, leave `'laptop_id'` and `'model'` as np.NaN (or any other null-like value).\n",
    "- ✔️ In other words, merge `df_employees` and `df_laptop` using a left merge.\n",
    "- ✔️ Store the merged result to a new variable named `df_left`.\n",
    "\n",
    "#### 🔑 Expected Output\n",
    "\n",
    "|    |   emp_id | name   | laptop_id   | model         |\n",
    "|---:|---------:|:-------|:------------|:--------------|\n",
    "|  0 |        1 | Jasper | A           | Red Touchbook |\n",
    "|  1 |        2 | Gary   | B           | BlueGo        |\n",
    "|  2 |        3 | Sally  | NaN         | NaN           |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>emp_id</th>\n",
       "      <th>name</th>\n",
       "      <th>laptop_id</th>\n",
       "      <th>model</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Jasper</td>\n",
       "      <td>A</td>\n",
       "      <td>Red Touchbook</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Gary</td>\n",
       "      <td>B</td>\n",
       "      <td>BlueGo</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Sally</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   emp_id    name laptop_id          model\n",
       "0       1  Jasper         A  Red Touchbook\n",
       "1       2    Gary         B         BlueGo\n",
       "2       3   Sally       NaN            NaN"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "df_left = pd.merge(\n",
    "    left=df_employees,\n",
    "    right=df_laptops,\n",
    "    on='laptop_id',\n",
    "    how='left'\n",
    ")\n",
    "### END SOLUTION\n",
    "\n",
    "display(df_left)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check Your Work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "exercise-01",
     "locked": true,
     "points": "1",
     "solution": false
    }
   },
   "outputs": [],
   "source": [
    "# DO NOT CHANGE THE CODE IN THIS CELL\n",
    "df_jc = df_join_check\n",
    "df_sol = df_jc[df_jc['emp_id'].notna()].reset_index(drop=True)\n",
    "\n",
    "pd.testing.assert_frame_equal(\n",
    "    df_left.reset_index(drop=True),\n",
    "    df_sol.reset_index(drop=True),\n",
    "    check_dtype=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 🎯 Exercise 3: Right merge\n",
    "\n",
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ List all laptops and their associated owners - if they are assigned one.\n",
    "- ✔️ If a laptop has not been assigned to an employee, leave `'emp_id'` and `'name'` as np.NaN (or any other null-like value).\n",
    "- ✔️ In other words, merge `df_employees` and `df_laptop` using a right merge.\n",
    "- ✔️ Store the merged result to a new variable named `df_right`.\n",
    "\n",
    "#### 🔑 Expected Output\n",
    "\n",
    "|    |   emp_id | name   | laptop_id   | model         |\n",
    "|---:|---------:|:-------|:------------|:--------------|\n",
    "|  0 |        1 | Jasper | A           | Red Touchbook |\n",
    "|  1 |        2 | Gary   | B           | BlueGo        |\n",
    "|  2 |      NaN | NaN    | C           | Eco Green     |\n",
    "|  3 |      NaN | NaN    | D           | Hackbook Pro  |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>emp_id</th>\n",
       "      <th>name</th>\n",
       "      <th>laptop_id</th>\n",
       "      <th>model</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1.0</td>\n",
       "      <td>Jasper</td>\n",
       "      <td>A</td>\n",
       "      <td>Red Touchbook</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2.0</td>\n",
       "      <td>Gary</td>\n",
       "      <td>B</td>\n",
       "      <td>BlueGo</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>C</td>\n",
       "      <td>Eco Green</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>D</td>\n",
       "      <td>Hackbook Pro</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   emp_id    name laptop_id          model\n",
       "0     1.0  Jasper         A  Red Touchbook\n",
       "1     2.0    Gary         B         BlueGo\n",
       "2     NaN     NaN         C      Eco Green\n",
       "3     NaN     NaN         D   Hackbook Pro"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "df_right = pd.merge(\n",
    "    left=df_employees,\n",
    "    right=df_laptops,\n",
    "    on='laptop_id',\n",
    "    how='right'\n",
    ")\n",
    "### END SOLUTION\n",
    "\n",
    "display(df_right)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check Your Work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "exercise-01",
     "locked": true,
     "points": "1",
     "solution": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# DO NOT CHANGE THE CODE IN THIS CELL\n",
    "df_jc = df_join_check\n",
    "df_sol = df_jc[df_jc['laptop_id'].notna()].reset_index(drop=True)\n",
    "\n",
    "pd.testing.assert_frame_equal(\n",
    "    df_right.reset_index(drop=True),\n",
    "    df_sol.reset_index(drop=True),\n",
    "    check_dtype=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 🎯 Exercise 4: Full outer merge\n",
    "\n",
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ List all employees and all work laptops - regardless of whether they are associated with one another.\n",
    "- ✔️ If an employee has not been assigned a work laptop, leave `'laptop_id'` and `'model'` as np.NaN (or any other null-like value).\n",
    "- ✔️ If a laptop has not been assigned to an employee, leave `'emp_id'` and `'name'` as np.NaN (or any other null-like value).\n",
    "- ✔️ In other words, merge `df_employees` and `df_laptop` using an outer merge.\n",
    "- ✔️ Store the merged result to a new variable named `df_outer`.\n",
    "\n",
    "#### 🔑 Expected Output\n",
    "\n",
    "|    |   emp_id | name   | laptop_id   | model         |\n",
    "|---:|---------:|:-------|:------------|:--------------|\n",
    "|  0 |        1 | Jasper | A           | Red Touchbook |\n",
    "|  1 |        2 | Gary   | B           | BlueGo        |\n",
    "|  2 |        3 | Sally  | NaN         | NaN           |\n",
    "|  3 |      NaN | NaN    | C           | Eco Green     |\n",
    "|  4 |      NaN | NaN    | D           | Hackbook Pro  |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>emp_id</th>\n",
       "      <th>name</th>\n",
       "      <th>laptop_id</th>\n",
       "      <th>model</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1.0</td>\n",
       "      <td>Jasper</td>\n",
       "      <td>A</td>\n",
       "      <td>Red Touchbook</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2.0</td>\n",
       "      <td>Gary</td>\n",
       "      <td>B</td>\n",
       "      <td>BlueGo</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3.0</td>\n",
       "      <td>Sally</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>C</td>\n",
       "      <td>Eco Green</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>D</td>\n",
       "      <td>Hackbook Pro</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   emp_id    name laptop_id          model\n",
       "0     1.0  Jasper         A  Red Touchbook\n",
       "1     2.0    Gary         B         BlueGo\n",
       "2     3.0   Sally       NaN            NaN\n",
       "3     NaN     NaN         C      Eco Green\n",
       "4     NaN     NaN         D   Hackbook Pro"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "df_outer = pd.merge(\n",
    "    left=df_employees,\n",
    "    right=df_laptops,\n",
    "    on='laptop_id',\n",
    "    how='outer'\n",
    ")\n",
    "### END SOLUTION\n",
    "\n",
    "display(df_outer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check Your Work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "exercise-01",
     "locked": true,
     "points": "1",
     "solution": false
    }
   },
   "outputs": [],
   "source": [
    "# DO NOT CHANGE THE CODE IN THIS CELL\n",
    "df_jc = df_join_check\n",
    "df_sol = df_jc[df_jc['emp_id'].notna()].reset_index(drop=True)\n",
    "\n",
    "pd.testing.assert_frame_equal(\n",
    "    df_left.reset_index(drop=True),\n",
    "    df_sol.reset_index(drop=True),\n",
    "    check_dtype=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Pandas string methods\n",
    "\n",
    "### 📌 Load textual data\n",
    "\n",
    "▶️ Run the code cell below to create `df_libraries`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_libraries = pd.DataFrame({\n",
    "    'name': ['ACES (Funk)', 'Grainger', 'Law', 'Main'],\n",
    "    'amenities': [\n",
    "        'Rooms,Scanner,Printer',\n",
    "        'Rooms,Scanner,Printer,Cafe',\n",
    "        'Cafe',\n",
    "        'Rooms,Scanner,Printer,Cafe'\n",
    "    ],\n",
    "})\n",
    "\n",
    "# Used for 🧭 Check Your Work sections\n",
    "df_libraries_check = df_libraries.copy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "▶️ Run the code cell below to display `df_libraries`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>amenities</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ACES (Funk)</td>\n",
       "      <td>Rooms,Scanner,Printer</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Grainger</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Law</td>\n",
       "      <td>Cafe</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Main</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          name                   amenities\n",
       "0  ACES (Funk)       Rooms,Scanner,Printer\n",
       "1     Grainger  Rooms,Scanner,Printer,Cafe\n",
       "2          Law                        Cafe\n",
       "3         Main  Rooms,Scanner,Printer,Cafe"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_libraries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 🎯 Exercise 5: Length of library names\n",
    "\n",
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ Find the number of characters (i.e., string length) of each library.\n",
    "- ✔️ Store the result to a new column named `'name_length'` in `df_libraries`.\n",
    "\n",
    "#### 🔑 Expected Output\n",
    "\n",
    "|    | name        | amenities                  |   name_length |\n",
    "|---:|:------------|:---------------------------|--------------:|\n",
    "|  0 | ACES (Funk) | Rooms,Scanner,Printer      |            11 |\n",
    "|  1 | Grainger    | Rooms,Scanner,Printer,Cafe |             8 |\n",
    "|  2 | Law         | Cafe                       |             3 |\n",
    "|  3 | Main        | Rooms,Scanner,Printer,Cafe |             4 |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>amenities</th>\n",
       "      <th>name_length</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ACES (Funk)</td>\n",
       "      <td>Rooms,Scanner,Printer</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Grainger</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Law</td>\n",
       "      <td>Cafe</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Main</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          name                   amenities  name_length\n",
       "0  ACES (Funk)       Rooms,Scanner,Printer           11\n",
       "1     Grainger  Rooms,Scanner,Printer,Cafe            8\n",
       "2          Law                        Cafe            3\n",
       "3         Main  Rooms,Scanner,Printer,Cafe            4"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "df_libraries['name_length'] = df_libraries['name'].str.len()\n",
    "### END SOLUTION\n",
    "\n",
    "display(df_libraries)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check Your Work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "exercise-01",
     "locked": true,
     "points": "1",
     "solution": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# DO NOT CHANGE THE CODE IN THIS CELL\n",
    "df_lc = df_libraries_check\n",
    "df_lc['name_length'] = df_lc['name'].str.len()\n",
    "\n",
    "pd.testing.assert_frame_equal(\n",
    "    df_libraries.reset_index(drop=True),\n",
    "    df_lc.reset_index(drop=True),\n",
    "    check_dtype=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 🎯 Exercise 6: Uppercase library names\n",
    "\n",
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ Convert the library names to uppercase.\n",
    "- ✔️ Directly update the `'name'` column in `df_libraries`.\n",
    "\n",
    "#### 🔑 Expected Output\n",
    "\n",
    "|    | name        | amenities                  |   name_length |\n",
    "|---:|:------------|:---------------------------|--------------:|\n",
    "|  0 | ACES (FUNK) | Rooms,Scanner,Printer      |            11 |\n",
    "|  1 | GRAINGER    | Rooms,Scanner,Printer,Cafe |             8 |\n",
    "|  2 | LAW         | Cafe                       |             3 |\n",
    "|  3 | MAIN        | Rooms,Scanner,Printer,Cafe |             4 |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>amenities</th>\n",
       "      <th>name_length</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ACES (FUNK)</td>\n",
       "      <td>Rooms,Scanner,Printer</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>GRAINGER</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LAW</td>\n",
       "      <td>Cafe</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MAIN</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          name                   amenities  name_length\n",
       "0  ACES (FUNK)       Rooms,Scanner,Printer           11\n",
       "1     GRAINGER  Rooms,Scanner,Printer,Cafe            8\n",
       "2          LAW                        Cafe            3\n",
       "3         MAIN  Rooms,Scanner,Printer,Cafe            4"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "df_libraries['name'] = df_libraries['name'].str.upper()\n",
    "### END SOLUTION\n",
    "\n",
    "display(df_libraries)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check Your Work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "exercise-01",
     "locked": true,
     "points": "1",
     "solution": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# DO NOT CHANGE THE CODE IN THIS CELL\n",
    "df_lc = df_libraries_check\n",
    "df_lc['name_length'] = df_lc['name'].str.len()\n",
    "df_lc['name'] = df_lc['name'].str.upper()\n",
    "\n",
    "pd.testing.assert_frame_equal(\n",
    "    df_libraries.reset_index(drop=True),\n",
    "    df_lc.reset_index(drop=True),\n",
    "    check_dtype=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "### 🎯 Exercise 7: Split amenities into lists\n",
    "\n",
    "#### 👇 Tasks\n",
    "\n",
    "- ✔️ Split the items in the `'amenities'` column using the comma (`,`) as a delimiter.\n",
    "- ✔️ Store the splitted result to a new column named `'amenities_list'` in `df_libraries`.\n",
    "\n",
    "#### 🔑 Expected Output\n",
    "\n",
    "|    | name        | amenities                  |   name_length | amenities_list                          |\n",
    "|---:|:------------|:---------------------------|--------------:|:----------------------------------------|\n",
    "|  0 | ACES (FUNK) | Rooms,Scanner,Printer      |            11 | ['Rooms', 'Scanner', 'Printer']         |\n",
    "|  1 | GRAINGER    | Rooms,Scanner,Printer,Cafe |             8 | ['Rooms', 'Scanner', 'Printer', 'Cafe'] |\n",
    "|  2 | LAW         | Cafe                       |             3 | ['Cafe']                                |\n",
    "|  3 | MAIN        | Rooms,Scanner,Printer,Cafe |             4 | ['Rooms', 'Scanner', 'Printer', 'Cafe'] |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>amenities</th>\n",
       "      <th>name_length</th>\n",
       "      <th>amenities_list</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ACES (FUNK)</td>\n",
       "      <td>Rooms,Scanner,Printer</td>\n",
       "      <td>11</td>\n",
       "      <td>[Rooms, Scanner, Printer]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>GRAINGER</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "      <td>8</td>\n",
       "      <td>[Rooms, Scanner, Printer, Cafe]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>LAW</td>\n",
       "      <td>Cafe</td>\n",
       "      <td>3</td>\n",
       "      <td>[Cafe]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MAIN</td>\n",
       "      <td>Rooms,Scanner,Printer,Cafe</td>\n",
       "      <td>4</td>\n",
       "      <td>[Rooms, Scanner, Printer, Cafe]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          name                   amenities  name_length  \\\n",
       "0  ACES (FUNK)       Rooms,Scanner,Printer           11   \n",
       "1     GRAINGER  Rooms,Scanner,Printer,Cafe            8   \n",
       "2          LAW                        Cafe            3   \n",
       "3         MAIN  Rooms,Scanner,Printer,Cafe            4   \n",
       "\n",
       "                    amenities_list  \n",
       "0        [Rooms, Scanner, Printer]  \n",
       "1  [Rooms, Scanner, Printer, Cafe]  \n",
       "2                           [Cafe]  \n",
       "3  [Rooms, Scanner, Printer, Cafe]  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "df_libraries['amenities_list'] = df_libraries['amenities'].str.split(',')\n",
    "### END SOLUTION\n",
    "\n",
    "display(df_libraries)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 🧭 Check Your Work\n",
    "\n",
    "- Once you're done, run the code cell below to test correctness.\n",
    "- ✔️ If the code cell runs without an error, you're good to move on.\n",
    "- ❌ If the code cell throws an error, go back and fix incorrect parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "exercise-01",
     "locked": true,
     "points": "1",
     "solution": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# DO NOT CHANGE THE CODE IN THIS CELL\n",
    "df_lc = df_libraries_check\n",
    "df_lc['name_length'] = df_lc['name'].str.len()\n",
    "df_lc['name'] = df_lc['name'].str.upper()\n",
    "df_lc['amenities_list'] = df_lc['amenities'].str.split(',')\n",
    "\n",
    "pd.testing.assert_frame_equal(\n",
    "    df_libraries.reset_index(drop=True),\n",
    "    df_lc.reset_index(drop=True),\n",
    "    check_dtype=False\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}