{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Pandas Missing Values, Datetime, Aggregation, and Merging" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lecture Notes and in-class exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "▶️ First, run the code cell below to import `unittest`, a module used for **🧭 Check Your Work** sections and the autograder." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import unittest\n", "tc = unittest.TestCase()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 👇 Tasks\n", "\n", "- ✔️ Import the following Python packages.\n", " 1. `pandas`: Use alias `pd`.\n", " 2. `numpy`: Use alias `np`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "### BEGIN SOLUTION\n", "import pandas as pd\n", "import numpy as np\n", "### END SOLUTION" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "tc.assertTrue('pd' in globals(), 'Check whether you have correctly imported Pandas with an alias.')\n", "tc.assertTrue('np' in globals(), 'Check whether you have correctly imported NumPy with an alias.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## 👉 Working with Missing Values\n", "\n", "A dataset commonly contains one or more missing values. Pandas provides a flexible set of methods for working with missing values. Missing values can be represented in multiple ways. 
Here are a few common notations that denote a missing value.\n", "\n", "- `NaN`: NaN stands for \"not a number\".\n", "- `None`: Python's built-in type (equivalent to `null` in other programming languages).\n", "- `NA`: A native scalar to denote a missing value in `pandas`.\n", "- `NaT`: NaT stands for \"not a time\". " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "nan\n" ] } ], "source": [ "# NaN\n", "nan_example = np.nan\n", "print(nan_example)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "# None\n", "none_example = None\n", "print(none_example)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<NA>\n" ] } ], "source": [ "# NA\n", "na_example = pd.NA\n", "print(na_example)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "NaT\n" ] } ], "source": [ "# NaT\n", "nat_example = pd.NaT\n", "print(nat_example)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 📌 Load data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "▶️ Run the code cell below to create a new `DataFrame` named `df_people`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name major1 major2 city \\\n", "0 Dwight Boon Accountancy Marketing Fairfax \n", "1 Jaclyn Kay Marketing NaN Ploiesti \n", "2 Wade Kitchens Supply Chain Management Marketing Naperville \n", "3 Rajeev Frye Statistics NaN Chicago \n", "4 Rowland Waldo Nondegree NaN Grayslake \n", "\n", " fav_restaurant fav_movie has_iphone \n", "0 Chick-fil-a Ford vs Ferrari False \n", "1 NaN NaN False \n", "2 Noodles and Company The Mummy True \n", "3 Soho House The Shawshank Redemption False \n", "4 Chipotle Inception True " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# DO NOT CHANGE THE CODE IN THIS CELL\n", "df_people = pd.read_csv(\"https://raw.githubusercontent.com/UI-Deloitte-business-analytics-center/datasets/main/people-sample.csv\")\n", "\n", "# Used to keep a clean copy\n", "df_people_backup = df_people.copy()\n", "\n", "# head() displays the first 5 rows of a DataFrame\n", "df_people.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The table below describes each column in `df_people`." 
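, "\n", "\n",
"A quick way to see how many missing values each column contains is `isna().sum()`. A minimal sketch on a tiny made-up frame (not the real dataset):\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"df = pd.DataFrame({'major2': ['Marketing', np.nan, np.nan],\n",
"                   'fav_movie': ['Inception', np.nan, 'Rounders']})\n",
"\n",
"# isna() marks each missing cell as True; sum() then counts them per column\n",
"missing_per_column = df.isna().sum()\n",
"print(missing_per_column)\n",
"```"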
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "| Column Name | Description |\n", "|-------------------------|-----------------------------------------------------------|\n", "| name | Full name |\n", "| major1 | Major |\n", "| major2 | Second major OR minor (blank if no second major or minor) |\n", "| city | City the person is from |\n", "| fav_restaurant | Favorite restaurant (blank if no restaurant was given) |\n", "| fav_movie | Favorite movie (blank if no movie was given) |\n", "| has_iphone | Whether the person uses an iPhone |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 1: People from Grayslake" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 👇 Tasks\n", "\n", "- ✔️ Using `df_people`, filter rows where the person is from `\"Grayslake\"`.\n", " - Check whether the `city` column is equal to `\"Grayslake\"`.\n", " - Store the result to a new variable named `df_grayslake`.\n", "- ✔️ `df_people` should remain unaltered after your code." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name major1 major2 city fav_restaurant \\\n", "4 Rowland Waldo Nondegree NaN Grayslake Chipotle \n", "19 Maud Dickenson Accountancy NaN Grayslake Big Bowl \n", "\n", " fav_movie has_iphone \n", "4 Inception True \n", "19 The Trial of Chicago 7 True " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_grayslake = df_people[df_people[\"city\"] == \"Grayslake\"]\n", "### END SOLUTION\n", "\n", "df_grayslake" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-03", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "# df_people should remain unaltered\n", "pd.testing.assert_frame_equal(df_people, df_people_backup)\n", "\n", "pd.testing.assert_frame_equal(df_grayslake.sort_values(df_grayslake.columns.tolist()).reset_index(drop=True),\n", " df_people_backup.query(f'{\"cItY\".lower()} == \"{\"GraYsLakE\".capitalize()}\"')\n", " .sort_values(df_people_backup.columns.tolist()).reset_index(drop=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 2: Anyone with a non-missing `major2`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 👇 Tasks\n", "\n", "- ✔️ Using `df_people`, filter rows where the person has a second major or a minor.\n", " - You're looking for rows where `major2` is not `NaN`.\n", "- ✔️ `NaN` is a special value to denote a missing value. 
You must use `my_series.isna()` or `my_series.notna()` to check whether a row contains a missing value or not.\n", "- ✔️ Store the result to a new variable named `df_major2`.\n", "- ✔️ `df_people` should remain unaltered after your code." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🚀 Hints\n", "\n", "- `my_series.notna()` can be used to check whether a row contains a missing value or not.\n", "\n", "![notna](https://github.com/bdi475/images/blob/main/pandas/notna-series.png?raw=true)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name major1 major2 \\\n", "0 Dwight Boon Accountancy Marketing \n", "2 Wade Kitchens Supply Chain Management Marketing \n", "9 Eldred Bullard Econometrics & Quant Econ Computer Science \n", "10 Conrad Sargent Advertising Marketing \n", "11 Berenice Lewis Computer Science & Econom Business \n", "12 Monica Morris Statistics & Computer Sci Business \n", "14 Sara Janson Computer Science & Econom Statistics \n", "15 Inderjeet Joyner Accountancy Finance \n", "17 Tate Parent Accountancy Finance \n", "20 Flynn Woodrow Accountancy Supply Chain Management \n", "29 Bailey Farnham Computer Science Business \n", "\n", " city fav_restaurant fav_movie has_iphone \n", "0 Fairfax Chick-fil-a Ford vs Ferrari False \n", "2 Naperville Noodles and Company The Mummy True \n", "9 Evanston Cravings TRON: Legacy False \n", "10 Evanston Dominos NaN True \n", "11 Urbana Olive Garden NaN True \n", "12 Oak Park Chipotle Rounders True \n", "14 Chicago Texas Roadhouse Stepbrothers True \n", "15 Highland Park Sushi King Wolf of Wall Street True \n", "17 Urbana Pizzaria Antica NaN True \n", "20 Wuxi Haidilao Interstellar True \n", "29 Rockford Chipotle NaN True " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_major2 = df_people[df_people[\"major2\"].notna()]\n", "### END SOLUTION\n", "\n", "df_major2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." 
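, "\n", "\n",
"If the check fails, one common cause is comparing with `==` instead of using `notna()`. `NaN` compares unequal to everything, including itself, so an `==` comparison silently matches nothing. A standalone sketch:\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"s = pd.Series(['Marketing', np.nan])\n",
"\n",
"# == never matches NaN, so this counts zero matches\n",
"print((s == np.nan).sum())  # 0\n",
"\n",
"# notna() is the reliable way to find non-missing values\n",
"print(s.notna().sum())      # 1\n",
"```"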
] }, { "cell_type": "code", "execution_count": 12, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-04", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "# df_people should remain unaltered\n", "pd.testing.assert_frame_equal(df_people, df_people_backup)\n", "\n", "# NaN never equals itself, so the query 'major2 == major2' keeps only non-missing rows\n", "pd.testing.assert_frame_equal(df_major2.sort_values(df_major2.columns.tolist()).reset_index(drop=True),\n", " df_people_backup.query(f\"major2 == major2\")\n", " .sort_values(df_people_backup.columns.tolist()).reset_index(drop=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 3: Anyone without a favorite movie\n", "\n", "#### 👇 Tasks\n", "\n", "- ✔️ Using `df_people`, filter rows where the person's `fav_movie` is `NaN`.\n", "- ✔️ `NaN` is a special value to denote a missing value. You must use `my_series.isna()` or `my_series.notna()` to compare a `Series` with `NaN`.\n", "- ✔️ Store the result to a new variable named `df_no_fav_movie`.\n", "- ✔️ `df_people` should remain unaltered after your code.\n", "\n", "#### 🚀 Hints\n", "\n", "- `my_series.isna()` can be used to check whether a row contains a missing value." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name major1 major2 city \\\n", "1 Jaclyn Kay Marketing NaN Ploiesti \n", "10 Conrad Sargent Advertising Marketing Evanston \n", "11 Berenice Lewis Computer Science & Econom Business Urbana \n", "16 Kimberlee Gupta Computer Science NaN Naperville \n", "17 Tate Parent Accountancy Finance Urbana \n", "21 Erick Hollins Psychology NaN Orland Park \n", "22 Marsha Brett Accountancy NaN Lake Forest \n", "23 Chanda Neal Nondegree NaN Buffalo Grove \n", "24 Lei Tyler Mathematics NaN Vernon Hills \n", "26 Xinyi Cookson Chemical Engineering NaN Northbrook \n", "27 Layton Overton Information Sciences NaN Champaign \n", "28 Reshmi Backus Nondegree NaN Urbana \n", "29 Bailey Farnham Computer Science Business Rockford \n", "30 Prabhakar Ryley Econometrics & Quant Econ NaN Dubai \n", "31 Monday Clifford Economics NaN Fremont \n", "33 Miranda Steele Statistics NaN Champaign \n", "34 Viraj Tatham Information Sciences NaN Champaign \n", "36 Yuwen Xiang Statistics NaN Northbrook \n", "37 Eleanore Powers Accountancy NaN Vernon Hills \n", "38 Sachin Weston Computer Science & Econom NaN Naperville \n", "\n", " fav_restaurant fav_movie has_iphone \n", "1 NaN NaN False \n", "10 Dominos NaN True \n", "11 Olive Garden NaN True \n", "16 NaN NaN True \n", "17 Pizzaria Antica NaN True \n", "21 NaN NaN True \n", "22 NaN NaN True \n", "23 McDonalds NaN True \n", "24 NaN NaN True \n", "26 Burger King NaN False \n", "27 Noodles and Company NaN True \n", "28 Haidilao NaN True \n", "29 Chipotle NaN True \n", "30 California Pizza Kitchen NaN True \n", "31 Olive Garden NaN True \n", "33 NaN NaN True \n", "34 NaN NaN True \n", "36 NaN NaN True \n", "37 Chick-fil-a NaN False \n", "38 McDonalds NaN False " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_no_fav_movie = df_people[df_people[\"fav_movie\"].isna()]\n", "### END SOLUTION\n", "\n", "df_no_fav_movie" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-05", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "# df_people should remain unaltered\n", "pd.testing.assert_frame_equal(df_people, df_people_backup)\n", "\n", "# NaN != NaN, so the query 'fav_movie != fav_movie' keeps only missing rows\n", "pd.testing.assert_frame_equal(df_no_fav_movie.sort_values(df_no_fav_movie.columns.tolist()).reset_index(drop=True),\n", " df_people_backup.query(f\"fav_movie != fav_movie\")\n", " .sort_values(df_people_backup.columns.tolist()).reset_index(drop=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## 🗓️ Working with Datetime Values\n", "\n", "You will often see date-looking strings in your data. A few examples (all denoting March 15, 2021) are:\n", "\n", "- `20210315`\n", "- `Mar 15, 2021`\n", "- `2021-03-15`\n", "- `2021/3/15`\n", "\n", "In this part, we'll discuss how we can *parse* and utilize datetime values." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 📌 Load employees data\n", "\n", "▶️ Run the code cell below to create a new `DataFrame` named `df_emp`." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " emp_id name dept office_phone start_date salary\n", "0 30 Talal Finance (217)123-4500 2017-05-01 202000\n", "1 40 Josh Purchase NaN 2018-02-01 185000\n", "2 10 Anika Finance NaN 2020-08-01 240000\n", "3 20 Aishani Purchase (217)987-6600 2019-12-01 160500" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# DO NOT CHANGE THE CODE IN THIS CELL\n", "df_emp = pd.DataFrame({\n", " 'emp_id': [30, 40, 10, 20],\n", " 'name': ['Talal', 'Josh', 'Anika', 'Aishani'],\n", " 'dept': ['Finance', 'Purchase', 'Finance', 'Purchase'],\n", " 'office_phone': ['(217)123-4500', np.nan, np.nan, '(217)987-6600'],\n", " 'start_date': ['2017-05-01', '2018-02-01', '2020-08-01', '2019-12-01'],\n", " 'salary': [202000, 185000, 240000, 160500]\n", "})\n", "\n", "# Used for intermediate checks\n", "df_emp_backup = df_emp.copy()\n", "\n", "df_emp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question**: What is the data type of the `start_date` column?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "▶️ Run `str(df_emp['start_date'].dtype)` below to see the data type of the `start_date` column." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'object'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "str(df_emp['start_date'].dtype)\n", "### END SOLUTION" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While `object` can refer to many different types, you can safely assume that all `object` data types you see in this course refer to strings." 
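, "\n", "\n",
"You can verify this by checking the Python type of the stored values. A small sketch with a hypothetical series (not `df_emp`):\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"s = pd.Series(['2017-05-01', '2018-02-01'])\n",
"\n",
"# The column dtype is the generic 'object'...\n",
"print(str(s.dtype))  # object\n",
"\n",
"# ...but each stored value is a plain Python str\n",
"print(all(isinstance(v, str) for v in s))  # True\n",
"```"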
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 4: Parse a string column as datetime\n", "\n", "#### 👇 Tasks\n", "\n", "- ✔️ Parse `start_date` to a `datetime` data type.\n", "- ✔️ Store the result to a new column named `start_date_parsed`.\n", "\n", "#### 🚀 Hints\n", "\n", "The code below converts `date_str` column to a `datetime`-typed column and stores the converted result to a new column named `date_parsed`.\n", "\n", "```python\n", "my_dataframe['date_parsed'] = pd.to_datetime(my_dataframe['date_str'])\n", "```" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " emp_id name dept office_phone start_date salary \\\n", "0 30 Talal Finance (217)123-4500 2017-05-01 202000 \n", "1 40 Josh Purchase NaN 2018-02-01 185000 \n", "2 10 Anika Finance NaN 2020-08-01 240000 \n", "3 20 Aishani Purchase (217)987-6600 2019-12-01 160500 \n", "\n", " start_date_parsed \n", "0 2017-05-01 \n", "1 2018-02-01 \n", "2 2020-08-01 \n", "3 2019-12-01 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_emp['start_date_parsed'] = pd.to_datetime(df_emp['start_date'])\n", "### END SOLUTION\n", "\n", "df_emp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "tc.assertEqual(set(df_emp.columns), set(df_emp_backup.columns.tolist() + ['start_date_parsed']))\n", "pd.testing.assert_series_equal(df_emp['start_date_parsed'].reset_index(drop=True),\n", " pd.to_datetime(df_emp_backup['_'.join(['sTarT', 'DaTe']).lower()])\n", " .reset_index(drop=True),\n", " check_names=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "▶️ Run `str(df_emp['start_date_parsed'].dtype)` below to see the data type of the `start_date_parsed` column."
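, "\n", "\n",
"A related note: by default, `pd.to_datetime` raises an error on a string it cannot parse. Passing `errors='coerce'` turns such strings into `NaT` instead, which then behaves like any other missing value. A small sketch (not part of the exercise):\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"raw = pd.Series(['2021-03-15', 'not a date'])\n",
"parsed = pd.to_datetime(raw, errors='coerce')\n",
"\n",
"# The unparseable entry becomes NaT, and isna() treats it as missing\n",
"print(parsed.isna().tolist())  # [False, True]\n",
"```"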
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-06", "locked": true, "points": "1", "solution": false } }, "outputs": [ { "data": { "text/plain": [ "'datetime64[ns]'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "str(df_emp['start_date_parsed'].dtype)\n", "### END SOLUTION" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 5: Drop `start_date` column *in-place*\n", "\n", "We no longer need the `start_date` column. We'll work with the new `start_date_parsed` column from this point on.\n", "\n", "#### 👇 Tasks\n", "\n", "- ✔️ Drop the `start_date` column from `df_emp` *in-place*.\n", "\n", "#### 🚀 Hints\n", "\n", "The code below drops `col1` from `my_dataframe` *in-place* without creating a new variable.\n", "\n", "```python\n", "my_dataframe.drop(columns=['col1'], inplace=True)\n", "```" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " emp_id name dept office_phone salary start_date_parsed\n", "0 30 Talal Finance (217)123-4500 202000 2017-05-01\n", "1 40 Josh Purchase NaN 185000 2018-02-01\n", "2 10 Anika Finance NaN 240000 2020-08-01\n", "3 20 Aishani Purchase (217)987-6600 160500 2019-12-01" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_emp.drop(columns=['start_date'], inplace=True)\n", "### END SOLUTION\n", "\n", "df_emp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-07", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "df_check = df_emp_backup.copy()\n", "df_check['_'.join(['sTarT', 'DaTe', 'pArSeD']).lower()] = pd.to_datetime(df_check['start_date'])\n", "df_check = df_check.drop(columns=['start_date'])\n", "\n", "# Check result\n", "tc.assertEqual(set(df_emp.columns), set(['start_date_parsed', 'salary', 'office_phone', 'dept', 'name', 'emp_id']))\n", "pd.testing.assert_frame_equal(df_emp.sort_values(df_emp.columns.tolist()).reset_index(drop=True),\n", " df_check.sort_values(df_check.columns.tolist()).reset_index(drop=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 6: Rename `start_date_parsed` to `start_date`\n", "\n", "#### 👇 Tasks\n", "\n", "- ✔️ Rename `start_date_parsed` to `start_date` in `df_emp` *in-place*.\n", "\n", "#### 🚀 Hints\n", "\n", "The code below renames the `name_before` column to `name_after` in `my_dataframe` *in-place* without creating a new variable.\n", "\n", "```python\n", 
"my_dataframe.rename(columns={'name_before': 'name_after'}, inplace=True)\n", "```" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " emp_id name dept office_phone salary start_date\n", "0 30 Talal Finance (217)123-4500 202000 2017-05-01\n", "1 40 Josh Purchase NaN 185000 2018-02-01\n", "2 10 Anika Finance NaN 240000 2020-08-01\n", "3 20 Aishani Purchase (217)987-6600 160500 2019-12-01" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_emp.rename(columns={'start_date_parsed': 'start_date'}, inplace=True)\n", "### END SOLUTION\n", "\n", "df_emp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-08", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "df_check = df_emp_backup.copy()\n", "df_check['_'.join(['sTarT', 'DaTe', 'pArSeD']).lower()] = pd.to_datetime(df_check['start_date'])\n", "df_check = df_check.drop(columns=['start_date']).rename(columns={'start_date_parsed': 'start_date'})\n", "\n", "# Check result\n", "pd.testing.assert_frame_equal(df_emp.sort_values(df_emp.columns.tolist()).reset_index(drop=True),\n", " df_check.sort_values(df_check.columns.tolist()).reset_index(drop=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 7: Extract year from a datetime column\n", "\n", "#### 👇 Tasks\n", "\n", "- ✔️ Create a new column named `start_year` in `df_emp` that contains the starting years in integers (e.g., `2017`, `2018`).\n", "- ✔️ Extract values from `df_emp['start_date']`.\n", "\n", "#### 🚀 Hints\n", "\n", "The code extracts the year of a datetime column `my_date` and stores it to a new column named `my_year`.\n", "\n", "```python\n", 
"my_dataframe['my_year'] = my_dataframe['my_date'].dt.year\n", "```" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
emp_idnamedeptoffice_phonesalarystart_datestart_year
030TalalFinance(217)123-45002020002017-05-012017
140JoshPurchaseNaN1850002018-02-012018
210AnikaFinanceNaN2400002020-08-012020
320AishaniPurchase(217)987-66001605002019-12-012019
\n", "
" ], "text/plain": [ " emp_id name dept office_phone salary start_date start_year\n", "0 30 Talal Finance (217)123-4500 202000 2017-05-01 2017\n", "1 40 Josh Purchase NaN 185000 2018-02-01 2018\n", "2 10 Anika Finance NaN 240000 2020-08-01 2020\n", "3 20 Aishani Purchase (217)987-6600 160500 2019-12-01 2019" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_emp['start_year'] = df_emp['start_date'].dt.year\n", "### END SOLUTION\n", "\n", "df_emp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-09", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "df_check = df_emp_backup.copy()\n", "df_check['_'.join(['sTarT', 'DaTe', 'pArSeD']).lower()] = pd.to_datetime(df_check['start_date'])\n", "df_check = df_check.drop(columns=['start_date']).rename(columns={'start_date_parsed': 'start_date'})\n", "df_check['_'.join(['sTarT', 'yEaR']).lower()] = df_check['_'.join(['sTarT', 'dAtE']).lower()].dt.year\n", "\n", "df_emp_check = df_emp.sort_values(df_emp.columns.tolist()).reset_index(drop=True)\n", "df_check = df_check.sort_values(df_check.columns.tolist()).reset_index(drop=True)\n", "cols_to_check = ['emp_id', 'start_year']\n", "\n", "# Check result\n", "pd.testing.assert_frame_equal(df_emp_check[cols_to_check],\n", " df_check[cols_to_check])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 8: Extract month, day of month from a datetime column\n", "\n", "#### 👇 Tasks\n", "\n", "- ✔️ Create new columns named `start_month` and `start_day` in `df_emp` that contain 
the starting months and days of month in integers.\n", "- ✔️ Extract values from `df_emp['start_date']`.\n", "\n", "#### 🚀 Hints\n", "\n", "The code extracts the months and days of a datetime column `my_date` and stores it to two new columns.\n", "\n", "```python\n", "my_dataframe['my_month'] = my_dataframe['my_date'].dt.month\n", "my_dataframe['my_day'] = my_dataframe['my_date'].dt.day\n", "```" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
emp_idnamedeptoffice_phonesalarystart_datestart_yearstart_monthstart_day
030TalalFinance(217)123-45002020002017-05-01201751
140JoshPurchaseNaN1850002018-02-01201821
210AnikaFinanceNaN2400002020-08-01202081
320AishaniPurchase(217)987-66001605002019-12-012019121
\n", "
" ], "text/plain": [ " emp_id name dept office_phone salary start_date start_year \\\n", "0 30 Talal Finance (217)123-4500 202000 2017-05-01 2017 \n", "1 40 Josh Purchase NaN 185000 2018-02-01 2018 \n", "2 10 Anika Finance NaN 240000 2020-08-01 2020 \n", "3 20 Aishani Purchase (217)987-6600 160500 2019-12-01 2019 \n", "\n", " start_month start_day \n", "0 5 1 \n", "1 2 1 \n", "2 8 1 \n", "3 12 1 " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_emp['start_month'] = df_emp['start_date'].dt.month\n", "df_emp['start_day'] = df_emp['start_date'].dt.day\n", "### END SOLUTION\n", "\n", "df_emp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-10", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "df_check = df_emp_backup.copy()\n", "df_check['_'.join(['sTarT', 'DaTe', 'pArSeD']).lower()] = pd.to_datetime(df_check['start_date'])\n", "df_check = df_check.drop(columns=['start_date']).rename(columns={'start_date_parsed': 'start_date'})\n", "df_check['_'.join(['sTarT', 'mOnTh']).lower()] = df_check['_'.join(['sTarT', 'dAtE']).lower()].dt.month\n", "df_check['_'.join(['sTarT', 'dAy']).lower()] = df_check['_'.join(['sTarT', 'dAtE']).lower()].dt.day\n", "\n", "df_emp_check = df_emp.sort_values(df_emp.columns.tolist()).reset_index(drop=True)\n", "df_check = df_check.sort_values(df_check.columns.tolist()).reset_index(drop=True)\n", "cols_to_check = ['emp_id', 'start_month', 'start_day']\n", "\n", "# Check result\n", "pd.testing.assert_frame_equal(df_emp_check[cols_to_check],\n", " df_check[cols_to_check])" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "### 🎯 Challenge 9: Extract quarter, weekday from a datetime column\n", "\n", "#### 👇 Tasks\n", "\n", "- ✔️ Create new columns named `start_quarter` and `start_weekday` in `df_emp` that contain the starting quarters and weekdays in integers.\n", "- ✔️ Extract values from `df_emp['start_date']`.\n", "\n", "#### 🚀 Hints\n", "\n", "The code extracts the quarters and weekdays of a datetime column `my_date` and stores it to two new columns.\n", "\n", "```python\n", "my_dataframe['my_quarter'] = my_dataframe['my_date'].dt.quarter\n", "my_dataframe['my_weekday'] = my_dataframe['my_date'].dt.weekday\n", "```" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
emp_idnamedeptoffice_phonesalarystart_datestart_yearstart_monthstart_daystart_quarterstart_weekday
030TalalFinance(217)123-45002020002017-05-0120175120
140JoshPurchaseNaN1850002018-02-0120182113
210AnikaFinanceNaN2400002020-08-0120208135
320AishaniPurchase(217)987-66001605002019-12-01201912146
\n", "
" ], "text/plain": [ " emp_id name dept office_phone salary start_date start_year \\\n", "0 30 Talal Finance (217)123-4500 202000 2017-05-01 2017 \n", "1 40 Josh Purchase NaN 185000 2018-02-01 2018 \n", "2 10 Anika Finance NaN 240000 2020-08-01 2020 \n", "3 20 Aishani Purchase (217)987-6600 160500 2019-12-01 2019 \n", "\n", " start_month start_day start_quarter start_weekday \n", "0 5 1 2 0 \n", "1 2 1 1 3 \n", "2 8 1 3 5 \n", "3 12 1 4 6 " ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### BEGIN SOLUTION\n", "df_emp['start_quarter'] = df_emp['start_date'].dt.quarter\n", "df_emp['start_weekday'] = df_emp['start_date'].dt.weekday\n", "### END SOLUTION\n", "\n", "df_emp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 🧭 Check Your Work\n", "\n", "- Once you're done, run the code cell below to test correctness.\n", "- ✔️ If the code cell runs without an error, you're good to move on.\n", "- ❌ If the code cell throws an error, go back and fix incorrect parts." 
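, "\n", "ℹ️ Note: `.dt.weekday` numbers days Monday=0 through Sunday=6, which is why Talal's 2017-05-01 start date (a Monday) has `start_weekday` 0. A quick sanity check, assuming `pd` is already imported:\n", "\n", "```python\n", "print(pd.Timestamp('2017-05-01').weekday())  # 0 -> Monday\n", "```"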
] }, { "cell_type": "code", "execution_count": 29, "metadata": { "nbgrader": { "grade": true, "grade_id": "challenge-11", "locked": true, "points": "1", "solution": false } }, "outputs": [], "source": [ "df_check = df_emp_backup.copy()\n", "df_check['_'.join(['sTarT', 'DaTe', 'pArSeD']).lower()] = pd.to_datetime(df_check['start_date'])\n", "df_check = df_check.drop(columns=['start_date']).rename(columns={'start_date_parsed': 'start_date'})\n", "df_check['_'.join(['sTarT', 'qUarTer']).lower()] = df_check['_'.join(['sTarT', 'dAtE']).lower()].dt.quarter\n", "df_check['_'.join(['sTarT', 'wEeKDaY']).lower()] = df_check['_'.join(['sTarT', 'dAtE']).lower()].dt.weekday\n", "\n", "df_emp_check = df_emp.sort_values(df_emp.columns.tolist()).reset_index(drop=True)\n", "df_check = df_check.sort_values(df_check.columns.tolist()).reset_index(drop=True)\n", "cols_to_check = ['emp_id', 'start_quarter', 'start_weekday']\n", "\n", "# Check result\n", "pd.testing.assert_frame_equal(df_emp_check[cols_to_check],\n", " df_check[cols_to_check])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 4 }