{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "4a87b5ef",
   "metadata": {},
   "source": [
    "---   \n",
    " <img align=\"left\" width=\"75\" height=\"75\"  src=\"https://upload.wikimedia.org/wikipedia/en/c/c8/University_of_the_Punjab_logo.png\"> \n",
    "\n",
    "<h1 align=\"center\">Department of Data Science</h1>\n",
    "<h1 align=\"center\">Course: Tools and Techniques for Data Science</h1>\n",
    "\n",
    "---\n",
    "<h3><div align=\"right\">Instructor: Muhammad Arif Butt, Ph.D.</div></h3>    "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab0dc25c",
   "metadata": {},
   "source": [
    "<h1 align=\"center\">Lecture 3.13 (Pandas-05)</h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49aec3ca",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/arifpucit/data-science/blob/master/Section-3-Python-for-Data-Scientists/Lec-3.13(Pandas-05-Subsetting-Dataframes).ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19f82705",
   "metadata": {},
   "source": [
    "## _Indexing, Subsetting and Slicing Dataframes.ipynb_"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "328bd879",
   "metadata": {},
   "source": [
    "## Motivation:\n",
    "- The ability to select specific rows and columns to access and filter data based on specific conditions are two of the key features of Pandas.\n",
    "    - **Selection** allows you to access specific rows or columns (a subset) of the data by their index and/or location in the DataFrame\n",
    "        - In large datasets, you may be required to select the first/last N records\n",
    "        - In large datasets, you may be required to select a range (n to m) of records\n",
    "        - In large datasets, you may be required to select specific columns of your interest\n",
    "        - In large datasets, you may be required to select specific range and specific columns of your interest\n",
    "    - **Filtering** allows you to access specific rows or columns (a subset) of the data based on one or more conditions\n",
    "        - In a medical dataset, you may be required to filter record of all those patients who suffer with a specific disease, or who have a specific blood group\n",
    "        - In a medical dataset, you may be required to filter pregnant women who have anemia, and compare this subset to women who don’t have anemia.\n",
    "        - In a travel dataset, you may be required to filter hotels inside Lahore city, sorted by their minimum per day cost\n",
    "        - In a client dataset, you may be required filter the clients who use a Gmail account(may require a string filter)\n",
    "        - In a client dataset, you may be required to filter the clients who belong to a specific countries (may require use of .isin() function)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12db95e1",
   "metadata": {},
   "source": [
    "## Learning agenda of this notebook\n",
    "1. Understanding Indices of a Dataframe\n",
    "    - Understand the Dataset\n",
    "    - Changing the Column Indices of a Dataframe\n",
    "    - Changing the Row Indices of a Dataframe\n",
    "2. Selecting Row(s) and Column(s) of a Dataframe using `df[]` \n",
    "3. Selecting Rows and Columns using `iloc` Method\n",
    "4. Selecting Rows and Columns using `loc` Method\n",
    "5. Conditional Selection   \n",
    "6. Selecting columns of a specific data type\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9ea649a9",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "8f71cf0e",
   "metadata": {},
   "source": [
    "## 1. Understanding Indices of a Dataframe\n",
    "\n",
    "<img align=\"right\" width=\"300\" height=\"300\"  src=\"images/series-anatomy.png\"  >\n",
    "<img align=\"left\" width=\"500\" height=\"500\"  src=\"images/pandas.png\"  >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cae470d2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "1d2bba7a",
   "metadata": {},
   "source": [
    "###  a. Understand the Dataset\n",
    "- Let us first understand the dataframe on which we are going to work in today's notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "3a6a3d25",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>MS07</td>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>MS10</td>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group  gender  subj1  \\\n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C    Male   78.3   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A    Male   70.5   \n",
       "2     MS03    Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   \n",
       "3     MS04     Hadeed   20     Lahore        MOR  group A    Male   82.0   \n",
       "4     MS05       Zara   40   Peshawer        AFT  group D  Female   65.9   \n",
       "5     MS06      Mohid   16     Lahore    MORNING  group C  Female   69.3   \n",
       "6     MS07      Zobia   40    Sialkot        AFT  group B  Female   90.2   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D    Male   84.1   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C    Male   90.5   \n",
       "9     MS10     Shahid   38     Lahore  AFTERNOON  group D    Male   90.5   \n",
       "10    MS11    Khurram   35  Islamabad        MOR  group B    Male   90.5   \n",
       "11    MS12       Maaz   25    Karachi  AFTERNOON  group C    Male   90.5   \n",
       "12    MS13    Mujahid   18     Lahore    MORNING  group D    Male    NaN   \n",
       "13    MS14       Sara   28     Multan  AFTERNOON  group A  Female   84.1   \n",
       "14    MS15     Fatima   33    Sialkot        AFT  group C  Female   90.5   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "0    84.4       5000.0  \n",
       "1    60.5       6000.0  \n",
       "2    75.1       8500.0  \n",
       "3    84.3       4000.0  \n",
       "4    72.8       3500.0  \n",
       "5    78.6          NaN  \n",
       "6     NaN       4000.0  \n",
       "7    76.0       8000.0  \n",
       "8    81.3       3500.0  \n",
       "9    81.3       3800.0  \n",
       "10   81.3       6000.0  \n",
       "11   81.3          NaN  \n",
       "12   76.5       7000.0  \n",
       "13   76.0       8000.0  \n",
       "14   81.3       3500.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "e67be45b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(16, 10)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# `shape` attribute of a dataframe object return a two value tuple containing rows and columns\n",
    "# Note the rows count does not include the column labels and column count does not include the row index\n",
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "575dee99",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "8c00cc7d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=16, step=1)"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# `index` attribute of a dataframe object return the list of row indices and its datatype\n",
    "df.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9219f240",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "b8a7fc1b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['roll no', 'name', 'age', 'address', 'session', 'group', 'gender',\n",
       "       'subj1', 'subj2', 'scholarship'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# `columns` attribute of a dataframe object return the list of column labels and its datatype\n",
    "df.columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "782d6635",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "899d087f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "roll no         object\n",
       "name            object\n",
       "age              int64\n",
       "address         object\n",
       "session         object\n",
       "group           object\n",
       "gender          object\n",
       "subj1          float64\n",
       "subj2          float64\n",
       "scholarship    float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# `dtypes` attribute of a dataframe object return the data type of each column in the dataframe\n",
    "df.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "370b4796",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "edc86d22",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 16 entries, 0 to 15\n",
      "Data columns (total 10 columns):\n",
      " #   Column       Non-Null Count  Dtype  \n",
      "---  ------       --------------  -----  \n",
      " 0   roll no      16 non-null     object \n",
      " 1   name         16 non-null     object \n",
      " 2   age          16 non-null     int64  \n",
      " 3   address      16 non-null     object \n",
      " 4   session      16 non-null     object \n",
      " 5   group        16 non-null     object \n",
      " 6   gender       16 non-null     object \n",
      " 7   subj1        15 non-null     float64\n",
      " 8   subj2        15 non-null     float64\n",
      " 9   scholarship  14 non-null     float64\n",
      "dtypes: float64(3), int64(1), object(6)\n",
      "memory usage: 1.4+ KB\n"
     ]
    }
   ],
   "source": [
    "#This method prints information about a DataFrame including the row indices, column labels, \n",
    "# non-null values count in each column, datatype and memory usage\n",
    "df.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7262d057",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "1a1b03fb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>16</td>\n",
       "      <td>16</td>\n",
       "      <td>16.000000</td>\n",
       "      <td>16</td>\n",
       "      <td>16</td>\n",
       "      <td>16</td>\n",
       "      <td>16</td>\n",
       "      <td>15.000000</td>\n",
       "      <td>15.000000</td>\n",
       "      <td>14.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>unique</th>\n",
       "      <td>16</td>\n",
       "      <td>16</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>top</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>freq</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>10</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>36.062500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>82.153333</td>\n",
       "      <td>78.133333</td>\n",
       "      <td>5328.571429</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>12.228212</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>9.885261</td>\n",
       "      <td>5.964378</td>\n",
       "      <td>1888.600854</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>16.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>64.900000</td>\n",
       "      <td>60.500000</td>\n",
       "      <td>3500.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>27.250000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>74.400000</td>\n",
       "      <td>76.000000</td>\n",
       "      <td>3800.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>36.500000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>84.100000</td>\n",
       "      <td>81.300000</td>\n",
       "      <td>4500.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>44.250000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>90.500000</td>\n",
       "      <td>81.300000</td>\n",
       "      <td>6750.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>53.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>90.500000</td>\n",
       "      <td>84.400000</td>\n",
       "      <td>8500.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       roll no  name        age address session    group gender      subj1  \\\n",
       "count       16    16  16.000000      16      16       16     16  15.000000   \n",
       "unique      16    16        NaN       6       4        4      2        NaN   \n",
       "top       MS01  Rauf        NaN  Lahore     AFT  group C   Male        NaN   \n",
       "freq         1     1        NaN       5       5        5     10        NaN   \n",
       "mean       NaN   NaN  36.062500     NaN     NaN      NaN    NaN  82.153333   \n",
       "std        NaN   NaN  12.228212     NaN     NaN      NaN    NaN   9.885261   \n",
       "min        NaN   NaN  16.000000     NaN     NaN      NaN    NaN  64.900000   \n",
       "25%        NaN   NaN  27.250000     NaN     NaN      NaN    NaN  74.400000   \n",
       "50%        NaN   NaN  36.500000     NaN     NaN      NaN    NaN  84.100000   \n",
       "75%        NaN   NaN  44.250000     NaN     NaN      NaN    NaN  90.500000   \n",
       "max        NaN   NaN  53.000000     NaN     NaN      NaN    NaN  90.500000   \n",
       "\n",
       "            subj2  scholarship  \n",
       "count   15.000000    14.000000  \n",
       "unique        NaN          NaN  \n",
       "top           NaN          NaN  \n",
       "freq          NaN          NaN  \n",
       "mean    78.133333  5328.571429  \n",
       "std      5.964378  1888.600854  \n",
       "min     60.500000  3500.000000  \n",
       "25%     76.000000  3800.000000  \n",
       "50%     81.300000  4500.000000  \n",
       "75%     81.300000  6750.000000  \n",
       "max     84.400000  8500.000000  "
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# If no argument is passed, this method displays descriptive statistics about the numeric columns of the dataframe\n",
    "df.describe(include='all')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f685f94c",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "0de5cdd8",
   "metadata": {},
   "source": [
    "### b. Changing the Column Indices/Labels of a Datarame\n",
    "- Every dataframe has column labels associated with its columns\n",
    "- These by default are integer values from 0,1,2,3...\n",
    "- However, while creating a dataframe from scratch, or while reading them from a file you can set them to more meaningful string values.\n",
    "- While reading from csv file the first row in the file is taken as the column labels\n",
    "- We can change the column labels, if we want\n",
    "- Let us practically see this for better understanding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "id": "3e29d2a9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "0    MS01     Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "1    MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "2    MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "3    MS04   Hadeed   20     Lahore        MOR  group A    Male   82.0   84.3   \n",
       "4    MS05     Zara   40   Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "\n",
       "   scholarship  \n",
       "0       5000.0  \n",
       "1       6000.0  \n",
       "2       8500.0  \n",
       "3       4000.0  \n",
       "4       3500.0  "
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "b1224d79",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>MS01</th>\n",
       "      <th>Rauf</th>\n",
       "      <th>52</th>\n",
       "      <th>Lahore</th>\n",
       "      <th>MORNING</th>\n",
       "      <th>group C</th>\n",
       "      <th>Male</th>\n",
       "      <th>78.3</th>\n",
       "      <th>84.4</th>\n",
       "      <th>5000</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   MS01     Rauf  52     Lahore    MORNING  group C    Male  78.3  84.4  \\\n",
       "0  MS02     Arif  51  Islamabad        AFT  group A    Male  70.5  60.5   \n",
       "1  MS03  Shaista  35    Karachi  AFTERNOON  group B  Female  64.9  75.1   \n",
       "2  MS04   Hadeed  20     Lahore        MOR  group A    Male  82.0  84.3   \n",
       "3  MS05     Zara  40   Peshawer        AFT  group D  Female  65.9  72.8   \n",
       "4  MS06    Mohid  16     Lahore    MORNING  group C  Female  69.3  78.6   \n",
       "\n",
       "     5000  \n",
       "0  6000.0  \n",
       "1  8500.0  \n",
       "2  4000.0  \n",
       "3  3500.0  \n",
       "4     NaN  "
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv('datasets/groupdatawithoutcollables.csv')\n",
    "df.head()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "id": "0daf15a9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "      <th>8</th>\n",
       "      <th>9</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      0        1   2          3          4        5       6     7     8  \\\n",
       "0  MS01     Rauf  52     Lahore    MORNING  group C    Male  78.3  84.4   \n",
       "1  MS02     Arif  51  Islamabad        AFT  group A    Male  70.5  60.5   \n",
       "2  MS03  Shaista  35    Karachi  AFTERNOON  group B  Female  64.9  75.1   \n",
       "3  MS04   Hadeed  20     Lahore        MOR  group A    Male  82.0  84.3   \n",
       "4  MS05     Zara  40   Peshawer        AFT  group D  Female  65.9  72.8   \n",
       "\n",
       "        9  \n",
       "0  5000.0  \n",
       "1  6000.0  \n",
       "2  8500.0  \n",
       "3  4000.0  \n",
       "4  3500.0  "
      ]
     },
     "execution_count": 61,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# To read such files, you have to pass the parameter `header=None` to the `read_csv()` method\n",
    "df = pd.read_csv('datasets/groupdatawithoutcollables.csv', header=None)\n",
    "df.head()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3488b6fe",
   "metadata": {},
   "source": [
    ">Let us suppose we have above dataframe, in which the column indices are just integer values associated with the position of every column. We want to assign some meaningful names to the columns for better understanding. There are many options or ways to do that."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7d06d188",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "cd16b0a0",
   "metadata": {},
   "source": [
    "**Changing Column IndicesLabels:** Assign a list of column labels to the `columns` attribute of dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "ee7c1175",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "0    MS01     Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "1    MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "2    MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "3    MS04   Hadeed   20     Lahore        MOR  group A    Male   82.0   84.3   \n",
       "4    MS05     Zara   40   Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "\n",
       "   scholarship  \n",
       "0       5000.0  \n",
       "1       6000.0  \n",
       "2       8500.0  \n",
       "3       4000.0  \n",
       "4       3500.0  "
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "col_names = ['roll no', 'name', 'age', 'address', 'session', 'group', 'gender', 'subj1', 'subj2', 'scholarship']\n",
    "df.columns = col_names\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f5e22ca",
   "metadata": {},
   "source": [
    ">Note that in the above dataframe, first column name has a space, which is a bit difficult to use sometimes, so if you want to change value of a specific column label, you can use the `df.rename()` method"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "44c77040",
   "metadata": {},
   "outputs": [],
   "source": [
    "# You pass a dictionary object to the columns argument to rename() method\n",
    "# The key is the old column name, while the value is the new column name\n",
    "df1 = df.rename(columns={'roll no': 'rollno'}, inplace=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "id": "d4a2bad3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>rollno</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  rollno     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "0   MS01     Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "1   MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "2   MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "\n",
       "   scholarship  \n",
       "0       5000.0  \n",
       "1       6000.0  \n",
       "2       8500.0  "
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d365bf55",
   "metadata": {},
   "source": [
    ">Last but not the least, another way is to assign appropriate column labels to your dataframes by passing a list of column names to the `names` argument of the `df.read_csv()` method. Do it at your own :)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84643aca",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "2c3c30ba",
   "metadata": {},
   "source": [
    "### c. Changing the Row Indices/Labels of a  Dataframe\n",
    "- Every dataframe has row index associated with its rows\n",
    "- These by default are integer values from 0,1,2,3...\n",
    "- However, while creating a dataframe from scratch you may set them to some meaningful string values (seldom required).\n",
    "- We have already seen this in our previous session\n",
    "- Today, we will see two methods that work on row indices of a Pandas Dataframe named `df.set_index()` and `df.reset_index()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "id": "320a1817",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "0    MS01     Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "1    MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "2    MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "\n",
       "   scholarship  \n",
       "0       5000.0  \n",
       "1       6000.0  \n",
       "2       8500.0  "
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us load the dataset again into dataframe\n",
    "import pandas as pd\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "id": "7ccd48bb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=16, step=1)"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.index"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34a875f9",
   "metadata": {},
   "source": [
    ">Let us suppose we have above dataframe, in which the row indices are just integer values associated with the position of every row. We want to assign some meaningful indices to the rows for better understanding. Suppose, we want to set values of the column rollno as index of this dataframe. So we donot want positional indices rather want some meaningful string indices, which are roll numbers of students in this case. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b8df0c60",
   "metadata": {},
   "source": [
    "**Changing Row Indices:** The `df.set_index()` method can be used to change row index of a dataframe using an existing column(s)\n",
    "`df.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)`\n",
    "\n",
    "- Where\n",
    "    - `keys` is the column label \n",
    "    - `drop=True`, to drop the column from data part of dataframe that is being used as column index now\n",
    "    - `append=False`, Set it to True if you want to append columns to existing index\n",
    "    - `inplace=False`, Set it to True to make changes in the original dataframe, i.e., do not create a new object\n",
    "    - `verify_integrity=False`, Set it to True to check the new index for duplicates. Default value of False will improve the performance of this method.\n",
    "\n",
    "Returns Dataframe if `inplace=False` or None if `inplace=True`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "id": "0c6413f7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>roll no</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>MS01</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS02</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS03</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        roll no     name  age    address    session    group  gender  subj1  \\\n",
       "roll no                                                                       \n",
       "MS01       MS01     Rauf   52     Lahore    MORNING  group C    Male   78.3   \n",
       "MS02       MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   \n",
       "MS03       MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   \n",
       "\n",
       "         subj2  scholarship  \n",
       "roll no                      \n",
       "MS01      84.4       5000.0  \n",
       "MS02      60.5       6000.0  \n",
       "MS03      75.1       8500.0  "
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = df.set_index(keys='roll no', drop=False)\n",
    "df1.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "435b624c",
   "metadata": {},
   "source": [
    "Note: The rollno column still exist as part of the dataframe. To drop it set `drop=True`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e3c2ba53",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "id": "ab6d2fe3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>roll no</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>MS01</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS02</th>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS03</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            name  age    address    session    group  gender  subj1  subj2  \\\n",
       "roll no                                                                      \n",
       "MS01        Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "MS02        Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "MS03     Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "\n",
       "         scholarship  \n",
       "roll no               \n",
       "MS01          5000.0  \n",
       "MS02          6000.0  \n",
       "MS03          8500.0  "
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = df.set_index(keys='roll no', drop=True)\n",
    "df1.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "042e1a5f",
   "metadata": {},
   "source": [
    "Another point to note is that no change has been made to the original dataframe since `inplace` arguement is by default False. Let us verify this"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "id": "7a7e055a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "0    MS01     Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "1    MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "2    MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "3    MS04   Hadeed   20     Lahore        MOR  group A    Male   82.0   84.3   \n",
       "4    MS05     Zara   40   Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "\n",
       "   scholarship  \n",
       "0       5000.0  \n",
       "1       6000.0  \n",
       "2       8500.0  \n",
       "3       4000.0  \n",
       "4       3500.0  "
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7089ff61",
   "metadata": {},
   "source": [
    "Instead of returning a dataframe, the `df.set_index()` method can change the column index inplace. Let us do that now"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "id": "6c3de2ed",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>roll no</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>MS01</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS02</th>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS03</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS04</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS05</th>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            name  age    address    session    group  gender  subj1  subj2  \\\n",
       "roll no                                                                      \n",
       "MS01        Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "MS02        Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "MS03     Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "MS04      Hadeed   20     Lahore        MOR  group A    Male   82.0   84.3   \n",
       "MS05        Zara   40   Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "\n",
       "         scholarship  \n",
       "roll no               \n",
       "MS01          5000.0  \n",
       "MS02          6000.0  \n",
       "MS03          8500.0  \n",
       "MS04          4000.0  \n",
       "MS05          3500.0  "
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.set_index(keys='roll no', drop=True, inplace=True)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "79628642",
   "metadata": {},
   "source": [
    ">Note that the roll no column has become the index now, it is no more data of the dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "id": "951244e1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['MS01', 'MS02', 'MS03', 'MS04', 'MS05', 'MS06', 'MS07', 'MS08', 'MS09',\n",
       "       'MS10', 'MS11', 'MS12', 'MS13', 'MS14', 'MS15', 'MS16'],\n",
       "      dtype='object', name='roll no')"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e5463f8f",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "6a6e75bc",
   "metadata": {},
   "source": [
    "**Changing Row Indices Back to Positional:** The `df.reset_index()` method is used to reset the row index of the dataframe back to positional integer indices\n",
    "\n",
    "`df.reset_index(drop=False, inplace=False)`\n",
    "Where\n",
    "- `drop=False`, Do not try to insert index into dataframe columns. This resets the index to the default integer index.\n",
    "- `inplace=False`,  Modify the DataFrame in place (do not create a new object).\n",
    "\n",
    "Returns Dataframe if `inplace=False` or None if `inplace=True`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "id": "e210fbb9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>roll no</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>MS01</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS02</th>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS03</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS04</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MS05</th>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            name  age    address    session    group  gender  subj1  subj2  \\\n",
       "roll no                                                                      \n",
       "MS01        Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "MS02        Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "MS03     Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "MS04      Hadeed   20     Lahore        MOR  group A    Male   82.0   84.3   \n",
       "MS05        Zara   40   Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "\n",
       "         scholarship  \n",
       "roll no               \n",
       "MS01          5000.0  \n",
       "MS02          6000.0  \n",
       "MS03          8500.0  \n",
       "MS04          4000.0  \n",
       "MS05          3500.0  "
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "id": "f7b5e6d4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(16, 9)"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "id": "8aee9dd3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=16, step=1)"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# reset the index\n",
    "df2 = df.reset_index()\n",
    "df2.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "id": "4704e4f6",
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "054SCKpufWir",
    "outputId": "050ced56-1fa7-4c30-cbe7-f2dadbe2d710"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "0    MS01     Rauf   52     Lahore    MORNING  group C    Male   78.3   84.4   \n",
       "1    MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "2    MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "\n",
       "   scholarship  \n",
       "0       5000.0  \n",
       "1       6000.0  \n",
       "2       8500.0  "
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "0db2f2f3",
   "metadata": {},
   "outputs": [],
   "source": [
    "df3 = df2.reset_index()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "b7d73dee",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>index</th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5</td>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>6</td>\n",
       "      <td>MS07</td>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>7</td>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>8</td>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>9</td>\n",
       "      <td>MS10</td>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>10</td>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>11</td>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>12</td>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>13</td>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>14</td>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>15</td>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    index roll no       name  age    address    session    group  gender  \\\n",
       "0       0    MS01       Rauf   52     Lahore    MORNING  group C    Male   \n",
       "1       1    MS02       Arif   51  Islamabad        AFT  group A    Male   \n",
       "2       2    MS03    Shaista   35    Karachi  AFTERNOON  group B  Female   \n",
       "3       3    MS04     Hadeed   20     Lahore        MOR  group A    Male   \n",
       "4       4    MS05       Zara   40   Peshawer        AFT  group D  Female   \n",
       "5       5    MS06      Mohid   16     Lahore    MORNING  group C  Female   \n",
       "6       6    MS07      Zobia   40    Sialkot        AFT  group B  Female   \n",
       "7       7    MS08     Idrees   51     Multan    MORNING  group D    Male   \n",
       "8       8    MS09      Jamil   53    Karachi        AFT  group C    Male   \n",
       "9       9    MS10     Shahid   38     Lahore  AFTERNOON  group D    Male   \n",
       "10     10    MS11    Khurram   35  Islamabad        MOR  group B    Male   \n",
       "11     11    MS12       Maaz   25    Karachi  AFTERNOON  group C    Male   \n",
       "12     12    MS13    Mujahid   18     Lahore    MORNING  group D    Male   \n",
       "13     13    MS14       Sara   28     Multan  AFTERNOON  group A  Female   \n",
       "14     14    MS15     Fatima   33    Sialkot        AFT  group C  Female   \n",
       "15     15    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   \n",
       "\n",
       "    subj1  subj2  scholarship  \n",
       "0    78.3   84.4       5000.0  \n",
       "1    70.5   60.5       6000.0  \n",
       "2    64.9   75.1       8500.0  \n",
       "3    82.0   84.3       4000.0  \n",
       "4    65.9   72.8       3500.0  \n",
       "5    69.3   78.6          NaN  \n",
       "6    90.2    NaN       4000.0  \n",
       "7    84.1   76.0       8000.0  \n",
       "8    90.5   81.3       3500.0  \n",
       "9    90.5   81.3       3800.0  \n",
       "10   90.5   81.3       6000.0  \n",
       "11   90.5   81.3          NaN  \n",
       "12    NaN   76.5       7000.0  \n",
       "13   84.1   76.0       8000.0  \n",
       "14   90.5   81.3       3500.0  \n",
       "15   90.5   81.3       3800.0  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "36a4b115",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "bdac5b44",
   "metadata": {},
   "source": [
    "## 2. Selecting Column(s) and Row(s) of a Dataframe using `df[]` \n",
    "\n",
    "<img align=\"right\" width=\"400\" height=\"400\"  src=\"images/groupdata.png\"  >\n",
    "\n",
    "- Consider this dataframe which is sorted by ages. Note the row indices has been randomized and they do not correspond to positional indices (which we normally visualize as 0,1,2,3...)\n",
    "- To access column(s) of a dataframe:\n",
    "    - To access single column, mention the column index/label inside `[]`, which in this case are strings, however, can be integer values as well. It will return a new Series object.\n",
    "    - To access multiple columns, pass column indices/labels as a list inside `[]`. It will return a new Dataframe object.\n",
    "- To access rows(s) of a dataframe:    \n",
    "    - Mention the **positional** row indices as a slice object `[start:stop:step]`, (In this case the positional indices do not match with the actual row indices). It will return a new Dataframe object.\n",
    "        - `start`: specifies from where the slicing should start, inclusive (default is 0) \n",
    "        - `stop`: specifies where it has to stop, exclusive (default is end of the array) \n",
    "        - `step`:  is by-default 1\n",
    "    \n",
    "    \n",
    "**Note:** \n",
    "- You cannot use two subscript operators `df[2][5]` to access a specific element of a dataframe as in case of numPy arrays.\n",
    "- You cannot get the subset of a dataframe w.r.t rows and columns at the same time using `df[]`, it can either return a subset of columns only or rows only.\n",
    "- We will soon see the `df.loc[]` and `df.iloc[]` methods that provides simpler, elegant and powerful way to subset a dataframe as compared to `df[]` syntax."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "563052be",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>MS10</td>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>MS07</td>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group  gender  subj1  \\\n",
       "5     MS06      Mohid   16     Lahore    MORNING  group C  Female   69.3   \n",
       "12    MS13    Mujahid   18     Lahore    MORNING  group D    Male    NaN   \n",
       "3     MS04     Hadeed   20     Lahore        MOR  group A    Male   82.0   \n",
       "11    MS12       Maaz   25    Karachi  AFTERNOON  group C    Male   90.5   \n",
       "13    MS14       Sara   28     Multan  AFTERNOON  group A  Female   84.1   \n",
       "14    MS15     Fatima   33    Sialkot        AFT  group C  Female   90.5   \n",
       "2     MS03    Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   \n",
       "10    MS11    Khurram   35  Islamabad        MOR  group B    Male   90.5   \n",
       "9     MS10     Shahid   38     Lahore  AFTERNOON  group D    Male   90.5   \n",
       "4     MS05       Zara   40   Peshawer        AFT  group D  Female   65.9   \n",
       "6     MS07      Zobia   40    Sialkot        AFT  group B  Female   90.2   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   90.5   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A    Male   70.5   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D    Male   84.1   \n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C    Male   78.3   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C    Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "5    78.6          NaN  \n",
       "12   76.5       7000.0  \n",
       "3    84.3       4000.0  \n",
       "11   81.3          NaN  \n",
       "13   76.0       8000.0  \n",
       "14   81.3       3500.0  \n",
       "2    75.1       8500.0  \n",
       "10   81.3       6000.0  \n",
       "9    81.3       3800.0  \n",
       "4    72.8       3500.0  \n",
       "6     NaN       4000.0  \n",
       "15   81.3       3800.0  \n",
       "1    60.5       6000.0  \n",
       "7    76.0       8000.0  \n",
       "0    84.4       5000.0  \n",
       "8    81.3       3500.0  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "df_sorted"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "3b45abcc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age  address    session    group  gender  subj1  subj2  \\\n",
       "5     MS06    Mohid   16   Lahore    MORNING  group C  Female   69.3   78.6   \n",
       "12    MS13  Mujahid   18   Lahore    MORNING  group D    Male    NaN   76.5   \n",
       "3     MS04   Hadeed   20   Lahore        MOR  group A    Male   82.0   84.3   \n",
       "11    MS12     Maaz   25  Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "13    MS14     Sara   28   Multan  AFTERNOON  group A  Female   84.1   76.0   \n",
       "\n",
       "    scholarship  \n",
       "5           NaN  \n",
       "12       7000.0  \n",
       "3        4000.0  \n",
       "11          NaN  \n",
       "13       8000.0  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us read a fresh dataframe and sort it by age column to have a clear understanding about indexing\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df_sorted = df.sort_values('age')\n",
    "df_sorted.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "4db05f3f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age address  session    group gender  subj1  subj2  \\\n",
       "12    MS13  Mujahid   18  Lahore  MORNING  group D   Male    NaN   76.5   \n",
       "\n",
       "    scholarship  \n",
       "12       7000.0  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted[1:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b0cf706f",
   "metadata": {},
   "source": [
    "**Example 1:** Select the data under the column `name`. Since the column labels are strings, so we mention it in single quotes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "786fe173",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5         Mohid\n",
      "12      Mujahid\n",
      "3        Hadeed\n",
      "11         Maaz\n",
      "13         Sara\n",
      "14       Fatima\n",
      "2       Shaista\n",
      "10      Khurram\n",
      "9        Shahid\n",
      "4          Zara\n",
      "6         Zobia\n",
      "15    Kakamanna\n",
      "1          Arif\n",
      "7        Idrees\n",
      "0          Rauf\n",
      "8         Jamil\n",
      "Name: name, dtype: object\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "pandas.core.series.Series"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s1 = df_sorted['name']\n",
    "print(s1)\n",
    "type(s1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bcd12999",
   "metadata": {},
   "source": [
    "> The result is a new Series object. Since this is a series, so you can use many of the methods that you can use on Pandas Series and Dataframes, in a chained format, as shown below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "9dd6895e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0       Rauf\n",
       "1       Arif\n",
       "2    Shaista\n",
       "3     Hadeed\n",
       "4       Zara\n",
       "Name: name, dtype: object"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s1.head()\n",
    "# or\n",
    "df['name'].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c5bdf554",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "99e4dfe1",
   "metadata": {},
   "source": [
    "**Example 2:** To select multiple columns of a dataframe, pass a list of column names. The result is a new DataFrame object with the selected columns. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "f417bcd1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>gender</th>\n",
       "      <th>age</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Female</td>\n",
       "      <td>16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Male</td>\n",
       "      <td>18</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Male</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Male</td>\n",
       "      <td>25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Female</td>\n",
       "      <td>28</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no  gender  age\n",
       "5     MS06  Female   16\n",
       "12    MS13    Male   18\n",
       "3     MS04    Male   20\n",
       "11    MS12    Male   25\n",
       "13    MS14  Female   28"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Just get the first five rows for the columns `roll no`, `gender` and `age`\n",
    "d1 = df_sorted[['roll no', 'gender', 'age']].head()\n",
    "d1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a91ae4e2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "5936f688",
   "metadata": {},
   "source": [
    "**Example 3:** Select the data of a single row at position 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "072691e5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age address  session    group gender  subj1  subj2  \\\n",
       "12    MS13  Mujahid   18  Lahore  MORNING  group D   Male    NaN   76.5   \n",
       "\n",
       "    scholarship  \n",
       "12       7000.0  "
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted[1:2:1]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "08dc3331",
   "metadata": {},
   "source": [
    ">Point to note that in `df_sorted` dataframe at position 1, we have record of Muhahid having dataframe index 12"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "129853ae",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "18289301",
   "metadata": {},
   "source": [
    "**Example 4:** Select the rows from positional index 2 to 3."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "a8a281b9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no    name  age  address    session    group gender  subj1  subj2  \\\n",
       "3     MS04  Hadeed   20   Lahore        MOR  group A   Male   82.0   84.3   \n",
       "11    MS12    Maaz   25  Karachi  AFTERNOON  group C   Male   90.5   81.3   \n",
       "\n",
       "    scholarship  \n",
       "3        4000.0  \n",
       "11          NaN  "
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted[2:4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0b20029b",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "3399a250",
   "metadata": {},
   "source": [
    "**Example 5:** Select the rows from positional index 0, 5, 10, and 15"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "a1d73eb4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5      Mohid\n",
       "14    Fatima\n",
       "6      Zobia\n",
       "8      Jamil\n",
       "Name: name, dtype: object"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted[::5]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "926ae0a4",
   "metadata": {},
   "source": [
    ">Note that the output dataframe contains a subset of original dataframe. However, the index or row labels stays with the rows. It is not renumbered. So this means that every row is identified by a row label, which remain associated wih the row or record until you decide to reset the index"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36db08b2",
   "metadata": {},
   "source": [
    "#### Resetting the Index of Subset of a Dataframe\n",
    "- When we slice data from a datafeame, the row index of resulting dataframe may not be contiguous values.\n",
    "- You can reset it using the `df.reset_index()` method discussed above as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "c5c9fba2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group  gender  subj1  \\\n",
       "14    MS15     Fatima   33    Sialkot        AFT  group C  Female   90.5   \n",
       "10    MS11    Khurram   35  Islamabad        MOR  group B    Male   90.5   \n",
       "4     MS05       Zara   40   Peshawer        AFT  group D  Female   65.9   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "14   81.3       3500.0  \n",
       "10   81.3       6000.0  \n",
       "4    72.8       3500.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = df_sorted[5:12:2]\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "9d7555f8",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>index</th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>14</td>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>10</td>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>4</td>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>15</td>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   index roll no       name  age    address    session    group  gender  \\\n",
       "0     14    MS15     Fatima   33    Sialkot        AFT  group C  Female   \n",
       "1     10    MS11    Khurram   35  Islamabad        MOR  group B    Male   \n",
       "2      4    MS05       Zara   40   Peshawer        AFT  group D  Female   \n",
       "3     15    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   \n",
       "\n",
       "   subj1  subj2  scholarship  \n",
       "0   90.5   81.3       3500.0  \n",
       "1   90.5   81.3       6000.0  \n",
       "2   65.9   72.8       3500.0  \n",
       "3   90.5   81.3       3800.0  "
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3 = df2.reset_index()\n",
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8a48dbb0",
   "metadata": {},
   "source": [
    ">Note that the index has been reset, however, the old index is now added as a column in the dataframe. Mostly this is not required, so pass the `drop=True` argument to `reset_index()` method to avoid this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "9f2ae7eb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  roll no       name  age    address    session    group  gender  subj1  \\\n",
       "0    MS15     Fatima   33    Sialkot        AFT  group C  Female   90.5   \n",
       "1    MS11    Khurram   35  Islamabad        MOR  group B    Male   90.5   \n",
       "2    MS05       Zara   40   Peshawer        AFT  group D  Female   65.9   \n",
       "3    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   90.5   \n",
       "\n",
       "   subj2  scholarship  \n",
       "0   81.3       3500.0  \n",
       "1   81.3       6000.0  \n",
       "2   72.8       3500.0  \n",
       "3   81.3       3800.0  "
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df4 = df2.reset_index(drop=True)\n",
    "df4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d687a580",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "f45dc3cb",
   "metadata": {},
   "source": [
    "## 3. Selecting Rows and Columns using `iloc` Method\n",
    "<img align=\"right\" width=\"400\" height=\"400\"  src=\"images/groupdata.png\"  >\n",
    "\n",
    "- The **`df.iloc[]`** is more powerful than the **`df[]`**, as it allows to filter rows as well as columns of user choice at the same time.\n",
    "- It is used for filtering rows and selecting columns by **integer position** (0 to n-1) (neither by row index value/label nor by column index value/label). So you cannot mention the column names like `age` rather you need to give its positional index and that is 2.\n",
    "```\n",
    "df.iloc[rowstoselect, colstoselect]\n",
    "```\n",
    "- You can place a collon in any of the two arguments to select all rows or all columns.\n",
    "- Another point to keep in mind is that the indices are by position (0 to n-1) and not by actual values of row and column indices. \n",
    "- Allowed inputs within `[ , ]` are:\n",
    "     - A single integer, e.g. ``5`` (note that ``5`` is interpreted as an integer position along the index).\n",
    "     - A list or array of integers, e.g.  `[9, 2, 7]`.\n",
    "     - A slice object with integers, e.g. ``2:9``.\n",
    "     - Note that as with usual Python slices, **stop** index is not included\n",
    "     - **Note:** ``.iloc`` will raise ``IndexError`` if a requested indexer is out-of-bounds, except *slice* indexers which allow out-of-bounds indexing (this conforms with python/numpy *slice* semantics)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e54421d1",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13ed9469",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6afb21df",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "82a89ef2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4e0a0aa8",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "373e2265",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "8df50cd2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age  address    session    group  gender  subj1  subj2  \\\n",
       "5     MS06    Mohid   16   Lahore    MORNING  group C  Female   69.3   78.6   \n",
       "12    MS13  Mujahid   18   Lahore    MORNING  group D    Male    NaN   76.5   \n",
       "3     MS04   Hadeed   20   Lahore        MOR  group A    Male   82.0   84.3   \n",
       "11    MS12     Maaz   25  Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "13    MS14     Sara   28   Multan  AFTERNOON  group A  Female   84.1   76.0   \n",
       "\n",
       "    scholarship  \n",
       "5           NaN  \n",
       "12       7000.0  \n",
       "3        4000.0  \n",
       "11          NaN  \n",
       "13       8000.0  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us read a fresh dataframe and sort it by age column to have a clear understanding about indexing\n",
    "import pandas as pd\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df_sorted = df.sort_values('age')\n",
    "df_sorted.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1d3dc0f",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "7474d97c",
   "metadata": {},
   "source": [
    "### a. Selection of Rows Only"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db4e6a54",
   "metadata": {},
   "source": [
    "**Example 1:** Select a single row with positional index  2 and all the columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "f39ad79f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "roll no           MS04\n",
       "name            Hadeed\n",
       "age                 20\n",
       "address         Lahore\n",
       "session            MOR\n",
       "group          group A\n",
       "gender            Male\n",
       "subj1             82.0\n",
       "subj2             84.3\n",
       "scholarship     4000.0\n",
       "Name: 3, dtype: object"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[2,:]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "607f9b3c",
   "metadata": {},
   "source": [
    "Note that the integer values are interpreted as row# (positional index) of the dataframe. Moreover a Series object is returned"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "671ba668",
   "metadata": {},
   "source": [
    "**Example 2:** Select rows with positional indices 2, 4 and 1 and all the columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "dc9b15be",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age address    session    group  gender  subj1  subj2  \\\n",
       "3     MS04   Hadeed   20  Lahore        MOR  group A    Male   82.0   84.3   \n",
       "13    MS14     Sara   28  Multan  AFTERNOON  group A  Female   84.1   76.0   \n",
       "12    MS13  Mujahid   18  Lahore    MORNING  group D    Male    NaN   76.5   \n",
       "\n",
       "    scholarship  \n",
       "3        4000.0  \n",
       "13       8000.0  \n",
       "12       7000.0  "
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[[2,4,1], :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f31eae33",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "f0ced7aa",
   "metadata": {},
   "source": [
    "**Example 3:** Select rows with positional indices from 3 to 5 (stop value is not inclusive) and all the columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "0f356dae",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no  name  age  address    session    group  gender  subj1  subj2  \\\n",
       "11    MS12  Maaz   25  Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "13    MS14  Sara   28   Multan  AFTERNOON  group A  Female   84.1   76.0   \n",
       "\n",
       "    scholarship  \n",
       "11          NaN  \n",
       "13       8000.0  "
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[3:5, :]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ade46ddf",
   "metadata": {},
   "source": [
    ">**In all above examples, if you omit comma and collon for the columns part, Pandas assumes it. I strongly recommend using the above style for clarity of code.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e068a471",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "7b55926e",
   "metadata": {},
   "source": [
    "### b. Selection of Columns Only"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "987fdfcb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['roll no', 'name', 'age', 'address', 'session', 'group', 'gender',\n",
       "       'subj1', 'subj2', 'scholarship'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "406ecb3c",
   "metadata": {},
   "source": [
    "**Example 1:** Select all the row values under the column at positional index 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "c4ea93a0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5        Lahore\n",
       "12       Lahore\n",
       "3        Lahore\n",
       "11      Karachi\n",
       "13       Multan\n",
       "14      Sialkot\n",
       "2       Karachi\n",
       "10    Islamabad\n",
       "9        Lahore\n",
       "4      Peshawer\n",
       "6       Sialkot\n",
       "15       Multan\n",
       "1     Islamabad\n",
       "7        Multan\n",
       "0        Lahore\n",
       "8       Karachi\n",
       "Name: address, dtype: object"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[:, 3]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "25d445b9",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "d18d10de",
   "metadata": {},
   "source": [
    "**Example 2:** Select all the row values under the column at positional index 1, 4, and 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "bcef93b7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Mohid</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Mujahid</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Maaz</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Sara</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>Fatima</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Khurram</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Shahid</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group D</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Zara</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Zobia</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Arif</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Idrees</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Jamil</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         name    session    group\n",
       "5       Mohid    MORNING  group C\n",
       "12    Mujahid    MORNING  group D\n",
       "3      Hadeed        MOR  group A\n",
       "11       Maaz  AFTERNOON  group C\n",
       "13       Sara  AFTERNOON  group A\n",
       "14     Fatima        AFT  group C\n",
       "2     Shaista  AFTERNOON  group B\n",
       "10    Khurram        MOR  group B\n",
       "9      Shahid  AFTERNOON  group D\n",
       "4        Zara        AFT  group D\n",
       "6       Zobia        AFT  group B\n",
       "15  Kakamanna  AFTERNOON  group A\n",
       "1        Arif        AFT  group A\n",
       "7      Idrees    MORNING  group D\n",
       "0        Rauf    MORNING  group C\n",
       "8       Jamil        AFT  group C"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[:, [1,4,5]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "522d549c",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "2751b609",
   "metadata": {},
   "source": [
    "**Example 3:** Select all the row values under the columns from position 2 to 4 (Note that the stop index is not inclusive)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "9f62f365",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    age    address    session\n",
       "5    16     Lahore    MORNING\n",
       "12   18     Lahore    MORNING\n",
       "3    20     Lahore        MOR\n",
       "11   25    Karachi  AFTERNOON\n",
       "13   28     Multan  AFTERNOON\n",
       "14   33    Sialkot        AFT\n",
       "2    35    Karachi  AFTERNOON\n",
       "10   35  Islamabad        MOR\n",
       "9    38     Lahore  AFTERNOON\n",
       "4    40   Peshawer        AFT\n",
       "6    40    Sialkot        AFT\n",
       "15   42     Multan  AFTERNOON\n",
       "1    51  Islamabad        AFT\n",
       "7    51     Multan    MORNING\n",
       "0    52     Lahore    MORNING\n",
       "8    53    Karachi        AFT"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[:, 2:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6cfd4f30",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "04645731",
   "metadata": {},
   "source": [
    "### c. Selection of Rows + Columns\n",
    "```\n",
    "df.iloc[whatrowsIwant, whatcolumnsIwant]\n",
    "```\n",
    "- You can use a single value, a list of multiple values, or a slice object for selecting rows\n",
    "- You can use a single value, a list of multiple values, or a slice object for selecting columns"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d23145c7",
   "metadata": {},
   "source": [
    "**Example 1:** Select only the rows at positional index 3 and 0, and from those two rows select only columns at positional index 1 and 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "240720c4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>group</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Maaz</td>\n",
       "      <td>group C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Mohid</td>\n",
       "      <td>group C</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     name    group\n",
       "11   Maaz  group C\n",
       "5   Mohid  group C"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[[3, 0], [1, 5]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "608e11f0",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "540dae0e",
   "metadata": {},
   "source": [
    "**Example 2:** Select only the rows at positional index 0 to 4 (stop index is not inclusive), and from those two rows select only columns at positional index 2 and 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "a707b416",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    age  address\n",
       "5    16   Lahore\n",
       "12   18   Lahore\n",
       "3    20   Lahore\n",
       "11   25  Karachi\n",
       "13   28   Multan"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.iloc[0:5, 2:4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7a0d4f6",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "d92a8982",
   "metadata": {},
   "source": [
    "## 4. Selecting Rows and Columns using `df.loc[]` Method\n",
    "<img align=\"right\" width=\"400\" height=\"400\"  src=\"images/groupdata.png\"  >\n",
    "\n",
    "- The **`df.loc[]`** is also used for filtering rows and selecting columns but by row index value/label or by column index value/label (NOT by position). \n",
    "```\n",
    "df.loc[rowstoselect, colstoselect]\n",
    "```\n",
    "- You can place a collon in any of the two arguments to select all rows or all columns.\n",
    "- Another point to keep in mind is that the indices are NOT by position, rather by actual values of row and column indices. \n",
    "- Allowed inputs within `[ , ]` are:\n",
    "     - A single label, e.g. `5` or `'a'`, (note that `5` is interpreted as actual index/label **NOT** as an integer position along the index).\n",
    "     - A list or array of labels, e.g. `[9, 2, 7]` or `['ms07', 'ms02', 'ms08']`.\n",
    "     - A slice object with labels, e.g. `[3:6:2]` or `['ms05':'ms09']`.\n",
    "     - **Warning:** Note that contrary to usual Python slices, **both** the start and the stop are included"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "9211ffa3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age  address    session    group  gender  subj1  subj2  \\\n",
       "5     MS06    Mohid   16   Lahore    MORNING  group C  Female   69.3   78.6   \n",
       "12    MS13  Mujahid   18   Lahore    MORNING  group D    Male    NaN   76.5   \n",
       "3     MS04   Hadeed   20   Lahore        MOR  group A    Male   82.0   84.3   \n",
       "11    MS12     Maaz   25  Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "13    MS14     Sara   28   Multan  AFTERNOON  group A  Female   84.1   76.0   \n",
       "\n",
       "    scholarship  \n",
       "5           NaN  \n",
       "12       7000.0  \n",
       "3        4000.0  \n",
       "11          NaN  \n",
       "13       8000.0  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us read a fresh dataframe and sort it by age column to have a clear understanding about indexing\n",
    "import pandas as pd\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df_sorted = df.sort_values('age')\n",
    "df_sorted.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "31cc0e4c",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "4a1a7d13",
   "metadata": {},
   "source": [
    "### a. Selection of Rows Only"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "885622be",
   "metadata": {},
   "source": [
    "**Example 1:** Select a single row with row index 2 and all the columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "5d806857",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "roll no             MS03\n",
       "name             Shaista\n",
       "age                   35\n",
       "address          Karachi\n",
       "session        AFTERNOON\n",
       "group            group B\n",
       "gender            Female\n",
       "subj1               64.9\n",
       "subj2               75.1\n",
       "scholarship       8500.0\n",
       "Name: 2, dtype: object"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[2,:]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4cc79e65",
   "metadata": {},
   "source": [
    "**Example 2:** Select rows with row indices 2, 4 and 1 and all the columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "531547c5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "2    MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "4    MS05     Zara   40   Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "1    MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "\n",
       "   scholarship  \n",
       "2       8500.0  \n",
       "4       3500.0  \n",
       "1       6000.0  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[[2,4,1], :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5994ef6f",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "4f4e34fb",
   "metadata": {},
   "source": [
    "**Example 3:** Select rows with row indices from 5 to 3 (stop value is inclusive) and all the columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "c342f771",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age  address    session    group  gender  subj1  subj2  \\\n",
       "5     MS06    Mohid   16   Lahore    MORNING  group C  Female   69.3   78.6   \n",
       "12    MS13  Mujahid   18   Lahore    MORNING  group D    Male    NaN   76.5   \n",
       "3     MS04   Hadeed   20   Lahore        MOR  group A    Male   82.0   84.3   \n",
       "11    MS12     Maaz   25  Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "13    MS14     Sara   28   Multan  AFTERNOON  group A  Female   84.1   76.0   \n",
       "14    MS15   Fatima   33  Sialkot        AFT  group C  Female   90.5   81.3   \n",
       "2     MS03  Shaista   35  Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "\n",
       "    scholarship  \n",
       "5           NaN  \n",
       "12       7000.0  \n",
       "3        4000.0  \n",
       "11          NaN  \n",
       "13       8000.0  \n",
       "14       3500.0  \n",
       "2        8500.0  "
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[5:2, :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "e7e7cf2d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age address  session    group  gender  subj1  subj2  \\\n",
       "5     MS06    Mohid   16  Lahore  MORNING  group C  Female   69.3   78.6   \n",
       "12    MS13  Mujahid   18  Lahore  MORNING  group D    Male    NaN   76.5   \n",
       "3     MS04   Hadeed   20  Lahore      MOR  group A    Male   82.0   84.3   \n",
       "\n",
       "    scholarship  \n",
       "5           NaN  \n",
       "12       7000.0  \n",
       "3        4000.0  "
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[5:3, :]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d81adedd",
   "metadata": {},
   "source": [
    "**Question:** If you give the slice as `3:5`, it will select no row. Can anyone guess why it is so?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "43189731",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3     group A\n",
       "12    group D\n",
       "5     group C\n",
       "Name: group, dtype: object"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[3:5:-1, :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "80ac8b1e",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "f3319062",
   "metadata": {},
   "source": [
    ">**In all above examples, if you omit comma and collon for the columns part, Pandas assumes it. I strongly recommend using the above style for clarity of code.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "91e4d7b0",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "6390dab3",
   "metadata": {},
   "source": [
    "### b. Selection of Columns Only"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "522af9aa",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['roll no', 'name', 'age', 'address', 'session', 'group', 'gender',\n",
       "       'subj1', 'subj2', 'scholarship'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "931e08a7",
   "metadata": {},
   "source": [
    "**Example 1:** Select all the row values under the column with label `name`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "id": "2cc48ff8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5         Mohid\n",
       "12      Mujahid\n",
       "3        Hadeed\n",
       "11         Maaz\n",
       "13         Sara\n",
       "14       Fatima\n",
       "2       Shaista\n",
       "10      Khurram\n",
       "9        Shahid\n",
       "4          Zara\n",
       "6         Zobia\n",
       "15    Kakamanna\n",
       "1          Arif\n",
       "7        Idrees\n",
       "0          Rauf\n",
       "8         Jamil\n",
       "Name: name, dtype: object"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[:, 'name']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a6018a75",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "57f887bf",
   "metadata": {},
   "source": [
    "**Example 2:** Select all the row values under the columns with labels `name`, `address`, and `scholarship`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "53d61516",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>address</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Mohid</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Mujahid</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Maaz</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Sara</td>\n",
       "      <td>Multan</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>Fatima</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Khurram</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Shahid</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Zara</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Zobia</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>Multan</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Arif</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Idrees</td>\n",
       "      <td>Multan</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Jamil</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         name    address  scholarship\n",
       "5       Mohid     Lahore          NaN\n",
       "12    Mujahid     Lahore       7000.0\n",
       "3      Hadeed     Lahore       4000.0\n",
       "11       Maaz    Karachi          NaN\n",
       "13       Sara     Multan       8000.0\n",
       "14     Fatima    Sialkot       3500.0\n",
       "2     Shaista    Karachi       8500.0\n",
       "10    Khurram  Islamabad       6000.0\n",
       "9      Shahid     Lahore       3800.0\n",
       "4        Zara   Peshawer       3500.0\n",
       "6       Zobia    Sialkot       4000.0\n",
       "15  Kakamanna     Multan       3800.0\n",
       "1        Arif  Islamabad       6000.0\n",
       "7      Idrees     Multan       8000.0\n",
       "0        Rauf     Lahore       5000.0\n",
       "8       Jamil    Karachi       3500.0"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[:, ['name', 'address', 'scholarship']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a09667a2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "89016cac",
   "metadata": {},
   "source": [
    "**Example 3:** Select all the row values under the columns from a column range given as column labels (Note that the stop index is inclusive)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "c119c42e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Empty DataFrame\n",
       "Columns: []\n",
       "Index: [5, 12, 3, 11, 13, 14, 2, 10, 9, 4, 6, 15, 1, 7, 0, 8]"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[:, 'address':'name']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "f54ab064",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         name  age    address\n",
       "5       Mohid   16     Lahore\n",
       "12    Mujahid   18     Lahore\n",
       "3      Hadeed   20     Lahore\n",
       "11       Maaz   25    Karachi\n",
       "13       Sara   28     Multan\n",
       "14     Fatima   33    Sialkot\n",
       "2     Shaista   35    Karachi\n",
       "10    Khurram   35  Islamabad\n",
       "9      Shahid   38     Lahore\n",
       "4        Zara   40   Peshawer\n",
       "6       Zobia   40    Sialkot\n",
       "15  Kakamanna   42     Multan\n",
       "1        Arif   51  Islamabad\n",
       "7      Idrees   51     Multan\n",
       "0        Rauf   52     Lahore\n",
       "8       Jamil   53    Karachi"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[:, 'name':'address']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "eca0e6fa",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Mohid</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Mujahid</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Maaz</td>\n",
       "      <td>Karachi</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Sara</td>\n",
       "      <td>Multan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>Fatima</td>\n",
       "      <td>Sialkot</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>Karachi</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Khurram</td>\n",
       "      <td>Islamabad</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Shahid</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Zara</td>\n",
       "      <td>Peshawer</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Zobia</td>\n",
       "      <td>Sialkot</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>Multan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Arif</td>\n",
       "      <td>Islamabad</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Idrees</td>\n",
       "      <td>Multan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Jamil</td>\n",
       "      <td>Karachi</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         name    address\n",
       "5       Mohid     Lahore\n",
       "12    Mujahid     Lahore\n",
       "3      Hadeed     Lahore\n",
       "11       Maaz    Karachi\n",
       "13       Sara     Multan\n",
       "14     Fatima    Sialkot\n",
       "2     Shaista    Karachi\n",
       "10    Khurram  Islamabad\n",
       "9      Shahid     Lahore\n",
       "4        Zara   Peshawer\n",
       "6       Zobia    Sialkot\n",
       "15  Kakamanna     Multan\n",
       "1        Arif  Islamabad\n",
       "7      Idrees     Multan\n",
       "0        Rauf     Lahore\n",
       "8       Jamil    Karachi"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[:, 'name':'address':2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4dae4af5",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "205220fc",
   "metadata": {},
   "source": [
    "### c. Selection of Rows + Columns\n",
    "```\n",
    "df.iloc[whatrowsIwant, whatcolumnsIwant]\n",
    "```\n",
    "- You can use a single value, a list of multiple values, or a slice object for selecting rows\n",
    "- You can use a single value, a list of multiple values, or a slice object for selecting columns"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d75e875a",
   "metadata": {},
   "source": [
    "**Example 1:** Select rows with row indices 3 and 0, and from those two rows select only columns at column labels `name` and `address`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "7b356699",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>address</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>Lahore</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     name address\n",
       "3  Hadeed  Lahore\n",
       "0    Rauf  Lahore"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[[3, 0], ['name', 'address']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13174faa",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "44d94e18",
   "metadata": {},
   "source": [
    "**Example 2:** Select rows with row indices 0 to 5 (Stop index is inclusive), and from those six rows select columns `name`, `age` and `session`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "80df8045",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>session</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>MOR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      name  age    session\n",
       "3   Hadeed   20        MOR\n",
       "11    Maaz   25  AFTERNOON\n",
       "13    Sara   28  AFTERNOON"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[3:13, ['name', 'age', 'session']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4487c360",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "02d3ea27",
   "metadata": {},
   "source": [
    "**Question:** You might be expecting `3:13` will return 11 rows, but it has returned only three. Can anyone guess why it is so?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "4958d945",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>session</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>MOR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>MOR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>AFTERNOON</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>MORNING</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>AFT</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         name  age    session\n",
       "5       Mohid   16    MORNING\n",
       "12    Mujahid   18    MORNING\n",
       "3      Hadeed   20        MOR\n",
       "11       Maaz   25  AFTERNOON\n",
       "13       Sara   28  AFTERNOON\n",
       "14     Fatima   33        AFT\n",
       "2     Shaista   35  AFTERNOON\n",
       "10    Khurram   35        MOR\n",
       "9      Shahid   38  AFTERNOON\n",
       "4        Zara   40        AFT\n",
       "6       Zobia   40        AFT\n",
       "15  Kakamanna   42  AFTERNOON\n",
       "1        Arif   51        AFT\n",
       "7      Idrees   51    MORNING\n",
       "0        Rauf   52    MORNING\n",
       "8       Jamil   53        AFT"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sorted.loc[5:8, ['name', 'age', 'session']]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0e73e5a6",
   "metadata": {},
   "source": [
    "The range `5:8` has returned all the 16 rows in the dataframe :)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85d0c4a9",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "5cf91bb8",
   "metadata": {},
   "source": [
    "# After having understood all of this my recommendation is:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "83e8e058",
   "metadata": {},
   "source": [
    ">**Always keep the row indices of your dataframe as 0, 1, 2, 3, 4, ... and the column indices as meaningful labels. If after a slicing or sorting operation, the row indices are a bit disturbed, use `df.reset_index()` method to adjust your row indices match the positional indices.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a6fa1808",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "0e123a4b",
   "metadata": {},
   "source": [
    "## 5. Conditional Selection\n",
    "- Suppose we want to select only those rows where the age value is greater than 40. Note this time the dataframe has row indices that match with the positional indices.\n",
    "\n",
    "<img align=\"left\" width=\"400\" height=\"400\"  src=\"images/groupdata2.png\"  >\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "id": "2f853b18",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>MS07</td>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>MS10</td>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group  gender  subj1  \\\n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C    Male   78.3   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A    Male   70.5   \n",
       "2     MS03    Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   \n",
       "3     MS04     Hadeed   20     Lahore        MOR  group A    Male   82.0   \n",
       "4     MS05       Zara   40   Peshawer        AFT  group D  Female   65.9   \n",
       "5     MS06      Mohid   16     Lahore    MORNING  group C  Female   69.3   \n",
       "6     MS07      Zobia   40    Sialkot        AFT  group B  Female   90.2   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D    Male   84.1   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C    Male   90.5   \n",
       "9     MS10     Shahid   38     Lahore  AFTERNOON  group D    Male   90.5   \n",
       "10    MS11    Khurram   35  Islamabad        MOR  group B    Male   90.5   \n",
       "11    MS12       Maaz   25    Karachi  AFTERNOON  group C    Male   90.5   \n",
       "12    MS13    Mujahid   18     Lahore    MORNING  group D    Male    NaN   \n",
       "13    MS14       Sara   28     Multan  AFTERNOON  group A  Female   84.1   \n",
       "14    MS15     Fatima   33    Sialkot        AFT  group C  Female   90.5   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "0    84.4       5000.0  \n",
       "1    60.5       6000.0  \n",
       "2    75.1       8500.0  \n",
       "3    84.3       4000.0  \n",
       "4    72.8       3500.0  \n",
       "5    78.6          NaN  \n",
       "6     NaN       4000.0  \n",
       "7    76.0       8000.0  \n",
       "8    81.3       3500.0  \n",
       "9    81.3       3800.0  \n",
       "10   81.3       6000.0  \n",
       "11   81.3          NaN  \n",
       "12   76.5       7000.0  \n",
       "13   76.0       8000.0  \n",
       "14   81.3       3500.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us read a fresh dataframe and sort it by age column to have a clear understanding about indexing\n",
    "import pandas as pd\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df7b948d",
   "metadata": {},
   "source": [
    "### a. Option 1:\n",
    "- Create a Python list having Boolean values of exact same length as the rows of the dataframe \n",
    "- The value in the list need to be True for the row which we want to select\n",
    "- Convert the Python list to a Pandas series\n",
    "- Finally pass that series to the dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "id": "f2786c0a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     52\n",
       "1     51\n",
       "2     35\n",
       "3     20\n",
       "4     40\n",
       "5     16\n",
       "6     40\n",
       "7     51\n",
       "8     53\n",
       "9     38\n",
       "10    35\n",
       "11    25\n",
       "12    18\n",
       "13    28\n",
       "14    33\n",
       "15    42\n",
       "Name: age, dtype: int64"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.age"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "id": "2ac3b39e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[True, True, False, False, False, False, False, True, True, False, False, False, False, False, False, True]\n"
     ]
    }
   ],
   "source": [
    "list1 = []\n",
    "for length in df.age:\n",
    "    if length > 40:\n",
    "        list1.append(True)\n",
    "    else:\n",
    "        list1.append(False)\n",
    "print(list1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "id": "b705b8ae",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group gender  subj1  \\\n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C   Male   78.3   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A   Male   70.5   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D   Male   84.1   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C   Male   90.5   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A   Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "0    84.4       5000.0  \n",
       "1    60.5       6000.0  \n",
       "7    76.0       8000.0  \n",
       "8    81.3       3500.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[list1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a251de2d",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "924aa546",
   "metadata": {},
   "source": [
    "### b. Option 2:\n",
    "- Instead of creating a Boolean list using the loop, use the condition inside the `df[cond]` operator, that will automatically generate the Boolean list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "id": "1e7b82e6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group gender  subj1  \\\n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C   Male   78.3   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A   Male   70.5   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D   Male   84.1   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C   Male   90.5   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A   Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "0    84.4       5000.0  \n",
       "1    60.5       6000.0  \n",
       "7    76.0       8000.0  \n",
       "8    81.3       3500.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['age']>40]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "70fcfd32",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group gender  subj1  \\\n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C   Male   78.3   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A   Male   70.5   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D   Male   84.1   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C   Male   90.5   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A   Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "0    84.4       5000.0  \n",
       "1    60.5       6000.0  \n",
       "7    76.0       8000.0  \n",
       "8    81.3       3500.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df.age >40]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f60a6f70",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "5c503e91",
   "metadata": {},
   "source": [
    "### c. Option 3:\n",
    "- Best way is to use the **`df.loc[cond]`** method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "2f5e1006",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group gender  subj1  \\\n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C   Male   78.3   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A   Male   70.5   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D   Male   84.1   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C   Male   90.5   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A   Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "0    84.4       5000.0  \n",
       "1    60.5       6000.0  \n",
       "7    76.0       8000.0  \n",
       "8    81.3       3500.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[df.age > 40]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "id": "3936aad0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         name  age\n",
       "0        Rauf   52\n",
       "1        Arif   51\n",
       "7      Idrees   51\n",
       "8       Jamil   53\n",
       "15  Kakamanna   42"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Using 'loc' gives you the facility to slice the required columns as well\n",
    "df.loc[df.age > 40, ['name', 'age']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7bc2cd79",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "481d208f",
   "metadata": {},
   "source": [
    "### d. Conditional Selection based on  Multiple Conditions\n",
    "- Suppose we want to get all the records of the dataframe where the age value is greater than 40 and belong to Multan\n",
    "- For this use multiple conditions inside parenthesis and use logical operators (`&`, `|`) in between\n",
    "```\n",
    "df[(condition1) op (condition2) op (condition3)]\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "55c0a5b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "id": "b8a3f9d4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>MS10</td>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "2     MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "3     MS04   Hadeed   20     Lahore        MOR  group A    Male   82.0   84.3   \n",
       "5     MS06    Mohid   16     Lahore    MORNING  group C  Female   69.3   78.6   \n",
       "9     MS10   Shahid   38     Lahore  AFTERNOON  group D    Male   90.5   81.3   \n",
       "10    MS11  Khurram   35  Islamabad        MOR  group B    Male   90.5   81.3   \n",
       "11    MS12     Maaz   25    Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "12    MS13  Mujahid   18     Lahore    MORNING  group D    Male    NaN   76.5   \n",
       "14    MS15   Fatima   33    Sialkot        AFT  group C  Female   90.5   81.3   \n",
       "\n",
       "    scholarship  \n",
       "2        8500.0  \n",
       "3        4000.0  \n",
       "5           NaN  \n",
       "9        3800.0  \n",
       "10       6000.0  \n",
       "11          NaN  \n",
       "12       7000.0  \n",
       "14       3500.0  "
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[(df.age < 40) & (df.address != 'Multan')]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "799206fa",
   "metadata": {},
   "outputs": [],
   "source": [
    "df[(df.age > 40) & (df.address == 'Multan')]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9ee9ada0",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "id": "4821330d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group gender  subj1  \\\n",
       "1     MS02       Arif   51  Islamabad        AFT  group A   Male   70.5   \n",
       "3     MS04     Hadeed   20     Lahore        MOR  group A   Male   82.0   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A   Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "1    60.5       6000.0  \n",
       "3    84.3       4000.0  \n",
       "15   81.3       3800.0  "
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select records of group A male students only\n",
    "df1 = df[(df.group == 'group A') & (df.gender == 'Male')]  \n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "69f8f258",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "id": "bbb05754",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>MS07</td>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age  address    session    group  gender  subj1  subj2  \\\n",
       "2     MS03  Shaista   35  Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "6     MS07    Zobia   40  Sialkot        AFT  group B  Female   90.2    NaN   \n",
       "8     MS09    Jamil   53  Karachi        AFT  group C    Male   90.5   81.3   \n",
       "11    MS12     Maaz   25  Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "14    MS15   Fatima   33  Sialkot        AFT  group C  Female   90.5   81.3   \n",
       "\n",
       "    scholarship  \n",
       "2        8500.0  \n",
       "6        4000.0  \n",
       "8        3500.0  \n",
       "11          NaN  \n",
       "14       3500.0  "
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select the records of students who belong to Sialkot or Karachi\n",
    "df1 = df[(df.address == 'Sialkot') | (df.address == 'Karachi')]\n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24114c02",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "id": "75d9fc93",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no    name  age   address    session    group  gender  subj1  subj2  \\\n",
       "4     MS05    Zara   40  Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "7     MS08  Idrees   51    Multan    MORNING  group D    Male   84.1   76.0   \n",
       "13    MS14    Sara   28    Multan  AFTERNOON  group A  Female   84.1   76.0   \n",
       "\n",
       "    scholarship  \n",
       "4        3500.0  \n",
       "7        8000.0  \n",
       "13       8000.0  "
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select records who lives outside Karachi and earn a scholarship of greater than 7000, or lives in Peshawer\n",
    "out = df[(df.address != 'Karachi') & (df.scholarship > 7000) | (df.address == 'Peshawer')]\n",
    "out\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5e557480",
   "metadata": {},
   "source": [
    "**If there are many conditions connected with or operator, you can simplify it using the `series.isin()` method as shown below:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "id": "4f94f4c1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no     name  age    address    session    group  gender  subj1  subj2  \\\n",
       "1     MS02     Arif   51  Islamabad        AFT  group A    Male   70.5   60.5   \n",
       "2     MS03  Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   75.1   \n",
       "4     MS05     Zara   40   Peshawer        AFT  group D  Female   65.9   72.8   \n",
       "8     MS09    Jamil   53    Karachi        AFT  group C    Male   90.5   81.3   \n",
       "10    MS11  Khurram   35  Islamabad        MOR  group B    Male   90.5   81.3   \n",
       "11    MS12     Maaz   25    Karachi  AFTERNOON  group C    Male   90.5   81.3   \n",
       "\n",
       "    scholarship  \n",
       "1        6000.0  \n",
       "2        8500.0  \n",
       "4        3500.0  \n",
       "8        3500.0  \n",
       "10       6000.0  \n",
       "11          NaN  "
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df.address.isin(['Karachi', 'Peshawer', 'Islamabad'])]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a968651d",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "787bd353",
   "metadata": {},
   "source": [
    "## 6. Selecting columns of a specific data type\n",
    "- The `df.select_dtypes()` method is used to get a subset of the dataframe to select columns of a specific datatype(s) \n",
    "```\n",
    "df.select_dtypes(include, exclude)\n",
    "```\n",
    "\n",
    "- `include` and `exclude` arguments can be scalar or list-like\n",
    "- Atleast one of these parameters must be supplied"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "id": "ed488fd1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "roll no         object\n",
       "name            object\n",
       "age              int64\n",
       "address         object\n",
       "session         object\n",
       "group           object\n",
       "gender          object\n",
       "subj1          float64\n",
       "subj2          float64\n",
       "scholarship    float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us first check the data types of each column\n",
    "df.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "id": "077b0967",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>90.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    subj1  subj2  scholarship\n",
       "0    78.3   84.4       5000.0\n",
       "1    70.5   60.5       6000.0\n",
       "2    64.9   75.1       8500.0\n",
       "3    82.0   84.3       4000.0\n",
       "4    65.9   72.8       3500.0\n",
       "5    69.3   78.6          NaN\n",
       "6    90.2    NaN       4000.0\n",
       "7    84.1   76.0       8000.0\n",
       "8    90.5   81.3       3500.0\n",
       "9    90.5   81.3       3800.0\n",
       "10   90.5   81.3       6000.0\n",
       "11   90.5   81.3          NaN\n",
       "12    NaN   76.5       7000.0\n",
       "13   84.1   76.0       8000.0\n",
       "14   90.5   81.3       3500.0\n",
       "15   90.5   81.3       3800.0"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select the columns with object data type (categorical variables) only`\n",
    "df.select_dtypes(include='float64')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "id": "bd895191",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>52</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>51</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>35</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>20</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>40</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   age  subj1  subj2  scholarship\n",
       "0   52   78.3   84.4       5000.0\n",
       "1   51   70.5   60.5       6000.0\n",
       "2   35   64.9   75.1       8500.0\n",
       "3   20   82.0   84.3       4000.0\n",
       "4   40   65.9   72.8       3500.0"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select the columns with int64 datatype\n",
    "df.select_dtypes(include=['int64', 'float64']).head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "953eb361",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "36c7b823",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fe1c9368",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0a04543",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9718eb42",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a0fbd2c6",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "e183e117",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>roll no</th>\n",
       "      <th>name</th>\n",
       "      <th>age</th>\n",
       "      <th>address</th>\n",
       "      <th>session</th>\n",
       "      <th>group</th>\n",
       "      <th>gender</th>\n",
       "      <th>subj1</th>\n",
       "      <th>subj2</th>\n",
       "      <th>scholarship</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>MS06</td>\n",
       "      <td>Mohid</td>\n",
       "      <td>16</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>69.3</td>\n",
       "      <td>78.6</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>MS13</td>\n",
       "      <td>Mujahid</td>\n",
       "      <td>18</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76.5</td>\n",
       "      <td>7000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>MS04</td>\n",
       "      <td>Hadeed</td>\n",
       "      <td>20</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>82.0</td>\n",
       "      <td>84.3</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>MS12</td>\n",
       "      <td>Maaz</td>\n",
       "      <td>25</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>MS14</td>\n",
       "      <td>Sara</td>\n",
       "      <td>28</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Female</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>MS15</td>\n",
       "      <td>Fatima</td>\n",
       "      <td>33</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>MS03</td>\n",
       "      <td>Shaista</td>\n",
       "      <td>35</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>64.9</td>\n",
       "      <td>75.1</td>\n",
       "      <td>8500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>MS11</td>\n",
       "      <td>Khurram</td>\n",
       "      <td>35</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>MOR</td>\n",
       "      <td>group B</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>MS10</td>\n",
       "      <td>Shahid</td>\n",
       "      <td>38</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>MS05</td>\n",
       "      <td>Zara</td>\n",
       "      <td>40</td>\n",
       "      <td>Peshawer</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group D</td>\n",
       "      <td>Female</td>\n",
       "      <td>65.9</td>\n",
       "      <td>72.8</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>MS07</td>\n",
       "      <td>Zobia</td>\n",
       "      <td>40</td>\n",
       "      <td>Sialkot</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group B</td>\n",
       "      <td>Female</td>\n",
       "      <td>90.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>MS16</td>\n",
       "      <td>Kakamanna</td>\n",
       "      <td>42</td>\n",
       "      <td>Multan</td>\n",
       "      <td>AFTERNOON</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3800.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>MS02</td>\n",
       "      <td>Arif</td>\n",
       "      <td>51</td>\n",
       "      <td>Islamabad</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group A</td>\n",
       "      <td>Male</td>\n",
       "      <td>70.5</td>\n",
       "      <td>60.5</td>\n",
       "      <td>6000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>MS08</td>\n",
       "      <td>Idrees</td>\n",
       "      <td>51</td>\n",
       "      <td>Multan</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group D</td>\n",
       "      <td>Male</td>\n",
       "      <td>84.1</td>\n",
       "      <td>76.0</td>\n",
       "      <td>8000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>MS01</td>\n",
       "      <td>Rauf</td>\n",
       "      <td>52</td>\n",
       "      <td>Lahore</td>\n",
       "      <td>MORNING</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>78.3</td>\n",
       "      <td>84.4</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>MS09</td>\n",
       "      <td>Jamil</td>\n",
       "      <td>53</td>\n",
       "      <td>Karachi</td>\n",
       "      <td>AFT</td>\n",
       "      <td>group C</td>\n",
       "      <td>Male</td>\n",
       "      <td>90.5</td>\n",
       "      <td>81.3</td>\n",
       "      <td>3500.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   roll no       name  age    address    session    group  gender  subj1  \\\n",
       "5     MS06      Mohid   16     Lahore    MORNING  group C  Female   69.3   \n",
       "12    MS13    Mujahid   18     Lahore    MORNING  group D    Male    NaN   \n",
       "3     MS04     Hadeed   20     Lahore        MOR  group A    Male   82.0   \n",
       "11    MS12       Maaz   25    Karachi  AFTERNOON  group C    Male   90.5   \n",
       "13    MS14       Sara   28     Multan  AFTERNOON  group A  Female   84.1   \n",
       "14    MS15     Fatima   33    Sialkot        AFT  group C  Female   90.5   \n",
       "2     MS03    Shaista   35    Karachi  AFTERNOON  group B  Female   64.9   \n",
       "10    MS11    Khurram   35  Islamabad        MOR  group B    Male   90.5   \n",
       "9     MS10     Shahid   38     Lahore  AFTERNOON  group D    Male   90.5   \n",
       "4     MS05       Zara   40   Peshawer        AFT  group D  Female   65.9   \n",
       "6     MS07      Zobia   40    Sialkot        AFT  group B  Female   90.2   \n",
       "15    MS16  Kakamanna   42     Multan  AFTERNOON  group A    Male   90.5   \n",
       "1     MS02       Arif   51  Islamabad        AFT  group A    Male   70.5   \n",
       "7     MS08     Idrees   51     Multan    MORNING  group D    Male   84.1   \n",
       "0     MS01       Rauf   52     Lahore    MORNING  group C    Male   78.3   \n",
       "8     MS09      Jamil   53    Karachi        AFT  group C    Male   90.5   \n",
       "\n",
       "    subj2  scholarship  \n",
       "5    78.6          NaN  \n",
       "12   76.5       7000.0  \n",
       "3    84.3       4000.0  \n",
       "11   81.3          NaN  \n",
       "13   76.0       8000.0  \n",
       "14   81.3       3500.0  \n",
       "2    75.1       8500.0  \n",
       "10   81.3       6000.0  \n",
       "9    81.3       3800.0  \n",
       "4    72.8       3500.0  \n",
       "6     NaN       4000.0  \n",
       "15   81.3       3800.0  \n",
       "1    60.5       6000.0  \n",
       "7    76.0       8000.0  \n",
       "0    84.4       5000.0  \n",
       "8    81.3       3500.0  "
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us read a fresh dataframe and sort it by age column to have a clear understanding about indexing\n",
    "import pandas as pd\n",
    "df = pd.read_csv('datasets/groupdata.csv')\n",
    "df = df.sort_values('age')\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "5b82d058",
   "metadata": {},
   "outputs": [],
   "source": [
    "dfgb = df.groupby('address')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "0bf78074",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3800.0"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('address').get_group('Multan').scholarship.min()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c718dae8",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "022bc741",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(16, 7)"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# It will delete the Scholarship column\n",
    "df_sorted.drop(columns=df.columns[df.apply(lambda col: col.isnull().sum() >= 1)], axis=1, inplace=True)\n",
    "df_sorted.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d879e8b4",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}