{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "4a87b5ef",
   "metadata": {},
   "source": [
    "---   \n",
    " <img align=\"left\" width=\"75\" height=\"75\"  src=\"https://upload.wikimedia.org/wikipedia/en/c/c8/University_of_the_Punjab_logo.png\"> \n",
    "\n",
    "<h1 align=\"center\">Department of Data Science</h1>\n",
    "<h1 align=\"center\">Course: Tools and Techniques for Data Science</h1>\n",
    "\n",
    "---\n",
    "<h3><div align=\"right\">Instructor: Muhammad Arif Butt, Ph.D.</div></h3>    "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab0dc25c",
   "metadata": {},
   "source": [
    "<h1 align=\"center\">Lecture 3.18 (Pandas-10)</h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19f82705",
   "metadata": {},
   "source": [
    "## _Merging, Joining, Concatenating and Appending Dataframes_\n",
    "<img align=\"right\" width=\"400\" height=\"400\"  src=\"images/pandas-apps.png\"  >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "68b37b21",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9dffd4a",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2a90b6a5",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "12db95e1",
   "metadata": {},
   "source": [
    "## Learning agenda of this notebook\n",
    "\n",
    "**PART-I: (Merging and Joining)**\n",
    "1. Merging DataFrames using `pd.merge()` method\n",
    "   - Perform **Inner Join** (which is default)\n",
    "   - Peform **Outer**/**Full Outer Join**\n",
    "   - Perform **Left Outer Join**\n",
    "   - Perform **Right Outer Join**<br><br>\n",
    "2. Additional Parameters to `pd.merge()` Method  \n",
    "   - Use of `indicator` parameter to indicate the df to which the value belong\n",
    "   - Use of `suffixes` parameter to differentiate between common column labels\n",
    "   - Use of `validate` parameter to check for duplicate keys\n",
    "   \n",
    "**PART-II: (Concatenating and Appending)**    \n",
    "\n",
    "3. Row wise Concatenation using `pd.concat()`\n",
    "\n",
    "4. Column wise Concatenation using `pd.concat()`\n",
    "\n",
    "5. Adding a Single Row/Column in a Dataframe using `pd.concat()`\n",
    "\n",
    "6. Appending Dataframes using `df.append()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4272d642",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "932255a2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "33c142c4",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "b820b858",
   "metadata": {},
   "source": [
    "# Part-I (Merging and Joining)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b919365a",
   "metadata": {},
   "source": [
    "## 1. Merging DataFrames using `pd.merge()` Method\n",
    "Pandas `pd.merge()` is a versatile method to perform all standard database join operations between DataFrame or named Series objects.\n",
    "\n",
    "```\n",
    "pd.merge(left, right, how=\"inner\", indicator=False, on=None, suffixes=(\"_x\", \"_y\"), validate=None)\n",
    "```\n",
    "Where,\n",
    "- **`left`:** A DataFrame or named Series object.\n",
    "- **`right`:** Another DataFrame or named Series object.\n",
    "- **`how`:** specifies the type of join {`inner`, `outer`, `left`, `right`} (default is `inner`)\n",
    "- **`on`:** Column or index level names to join on. Must be found in both the left and right DataFrame and/or Series objects. \n",
    "- **`indicator`:** If set to True, adds a column to the output DataFrame called **`_merge`** with information on the source of each row {`left_only` means, this element is present only in left Dataframe, `right_only` means this is present only in right dataframe, `both` means they are present in both\n",
    "- **`suffixes`:** A tuple of string suffixes to apply to overlapping columns. Defaults to ('_x', '_y').\n",
    "- **`validate`:** If specified, checks for uniqueness of keys. This parameter can take following four values (default is None):\n",
    "    - “one_to_one” or “1:1”: checks if merge keys are unique in both left and right datasets.\n",
    "    - “one_to_many” or “1:m”: checks if merge keys are unique in left dataset.\n",
    "    - “many_to_one” or “m:1”: checks if merge keys are unique in right dataset.\n",
    "    - “many_to_many” or “m:m”: allowed, but does not result in checks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "73dafdca",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2f0c9c87",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9cff27b2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "6946785c",
   "metadata": {},
   "source": [
    "### a. Inner  Join:\n",
    "\n",
    "It is the most common type of join you’ll be working with. It returns a dataframe with only those rows that have common characteristics.\n",
    "An inner join requires each row in the two joined dataframes to have matching column values. This is similar to the intersection of two sets.\n",
    "\n",
    "<img align=\"center\" width=\"900\" height=\"600\"  src=\"images/join-inner.png\"  >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "573706e5",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "d880f298",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Muree</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Sialkot</td>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature\n",
       "0    Lahore           39\n",
       "1     Muree           14\n",
       "2  Peshawer           29\n",
       "3   Sialkot           32"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let us create a simple data frame\n",
    "import pandas as pd\n",
    "\n",
    "# This dataframe doesn't have entry for Lahore\n",
    "df_temp = pd.DataFrame({\n",
    "    'city': ['Lahore', 'Muree', 'Peshawer', 'Sialkot'],\n",
    "    'temperature' : [39, 14, 29, 32]\n",
    "})\n",
    "df_temp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "bea9cf5d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  humidity\n",
       "0   Karachi        76\n",
       "1    Lahore        95\n",
       "2  Peshawer        72\n",
       "3    Lahore        70\n",
       "4     Muree        75"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# This Dataframe has an extra entry for Multan\n",
    "df_hum = pd.DataFrame({\n",
    "    'city': [ 'Karachi', 'Lahore', 'Peshawer', 'Lahore', 'Muree'],\n",
    "    'humidity' : [76, 95, 72, 70, 75]\n",
    "})\n",
    "df_hum"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ed2fa36",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8120fb57",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a4de6a9",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "7e1325d0",
   "metadata": {},
   "source": [
    "**Note the column `city` on which we want to perform an inner join, in the two dataframes has only four cities in common. So the resulting dataframe will have only four rows that are common in both dataframes**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "934ce070",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Muree</td>\n",
       "      <td>14</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>29</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature  humidity\n",
       "0    Lahore           39        95\n",
       "1    Lahore           39        70\n",
       "2     Muree           14        75\n",
       "3  Peshawer           29        72"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d1 = pd.merge(df_temp, df_hum, how='inner')\n",
    "d1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "0c2e41af",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "      <th>_merge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39</td>\n",
       "      <td>70</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Muree</td>\n",
       "      <td>14</td>\n",
       "      <td>75</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>29</td>\n",
       "      <td>72</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature  humidity _merge\n",
       "0    Lahore           39        95   both\n",
       "1    Lahore           39        70   both\n",
       "2     Muree           14        75   both\n",
       "3  Peshawer           29        72   both"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# merge will perform only for those cities that are common in both, which means it by-default performs inner-join\n",
    "d1 = pd.merge(df_temp, df_hum, on='city', how = 'inner', indicator=True)\n",
    "d1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "dbb58d7d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>humidity</th>\n",
       "      <th>temperature</th>\n",
       "      <th>_merge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>95</td>\n",
       "      <td>39</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>70</td>\n",
       "      <td>39</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>72</td>\n",
       "      <td>29</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Muree</td>\n",
       "      <td>75</td>\n",
       "      <td>14</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  humidity  temperature _merge\n",
       "0    Lahore        95           39   both\n",
       "1    Lahore        70           39   both\n",
       "2  Peshawer        72           29   both\n",
       "3     Muree        75           14   both"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Note only the sequence of o/p dataframe changes once we change the order in case of inner join\n",
    "d1 = pd.merge(df_hum, df_temp, on='city', how = 'inner', indicator=True)\n",
    "d1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "247e167a",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6ce3fa49",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70c6ee02",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d378c092",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "965cd00c",
   "metadata": {},
   "source": [
    "### b. Full Join:\n",
    "Also known as Full Outer Join, returns all those records which either have a match in the left or right dataframe. This is similar to the union of two sets.\n",
    "\n",
    "<img align=\"center\" width=\"900\" height=\"600\"  src=\"images/join-fullouter.png\"  >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ca0bcc3",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "42016dbd",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b0033b72",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ae1c49c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_temp = pd.DataFrame({\n",
    "    'city': ['Lahore', 'Muree', 'Peshawer', 'Sialkot'],\n",
    "    'temperature' : [39, 14, 29, 32],\n",
    "})\n",
    "df_temp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56d45d38",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_hum = pd.DataFrame({\n",
    "    'city': [ 'Karachi', 'Lahore', 'Peshawer', 'Lahore', 'Muree'],\n",
    "    'humidity' : [76, 95, 72, 70, 75],\n",
    "})\n",
    "df_hum"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "015bf262",
   "metadata": {},
   "source": [
    "**Note the column `city` on which we want to perform a full outer join, in the two dataframes has a union of  seven cities. So the resulting dataframe will have seven rows**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ff6531e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "d2 = pd.merge(df_temp, df_hum, on='city', how='outer', indicator=True)\n",
    "d2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a594b233",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Note only the sequence of o/p dataframe changes once we change the order in case of inner join\n",
    "d3 = pd.merge(df_hum, df_temp, on='city', how='outer', indicator=True)\n",
    "d3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4f35e06",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8e1d58bd",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b47a7599",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2dfad95",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "117ae3fb",
   "metadata": {},
   "source": [
    "###  c. Left Join\n",
    "Also known as Left outer join. It is simply performs an inner join plus all the non-matching rows of the left dataframe are taken as it is filled with NaN for columns of the right dataframe.\n",
    "\n",
    "<img align=\"center\" width=\"900\" height=\"600\"  src=\"images/join-leftouter.png\"  >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "37dfc333",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2a77e4a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_temp = pd.DataFrame({\n",
    "    'city': ['Lahore', 'Muree', 'Peshawer', 'Sialkot'],\n",
    "    'temperature' : [39, 14, 29, 32],\n",
    "})\n",
    "df_temp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a99c5976",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_hum = pd.DataFrame({\n",
    "    'city': [ 'Karachi', 'Lahore', 'Peshawer', 'Lahore', 'Muree'],\n",
    "    'humidity' : [76, 95, 72, 70, 75],\n",
    "})\n",
    "df_hum"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1565198",
   "metadata": {},
   "source": [
    "**Note the column `city` on which we want to perform a left outer join, in the two dataframes has an intersection of four rows. Other than these four rows, record of city Sialkot from left dataframe will also be included in the resulting dataframe being a left outer join**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39e6bedf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# In left outer join, it takes all the rows from left dataframe and only common rows from right dataframe\n",
    "d3 = pd.merge(df_temp, df_hum, on='city', how='left', indicator=True)\n",
    "d3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "516c5e5a",
   "metadata": {},
   "source": [
    "**Left and right outer join also depend on the order of Dataframes that are passed to merge() function. Let us change the order and understand this**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a1050bad",
   "metadata": {},
   "outputs": [],
   "source": [
    "d4 = pd.merge(df_hum, df_temp, on='city', how='left', indicator=True)\n",
    "d4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c26ba60",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8ccf530e",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "01a0b8cc",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1770311c",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "e6d366a5",
   "metadata": {},
   "source": [
    "### d. Right  Join\n",
    "Also known as Right outer join. It is simply performs an inner join plus  all the non-matching rows of the right dataframe are taken as it is filled with NaN for columns of the left dataframe.\n",
    "\n",
    "<img align=\"center\" width=\"900\" height=\"600\"  src=\"images/join-rightouter.png\"  >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dce2e9be",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Muree</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Sialkot</td>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature\n",
       "0    Lahore           39\n",
       "1     Muree           14\n",
       "2  Peshawer           29\n",
       "3   Sialkot           32"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "df_temp = pd.DataFrame({\n",
    "    'city': ['Lahore', 'Muree', 'Peshawer', 'Sialkot'],\n",
    "    'temperature' : [39, 14, 29, 32],\n",
    "})\n",
    "df_temp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "e332bc9b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  humidity\n",
       "0   Karachi        76\n",
       "1    Lahore        95\n",
       "2  Peshawer        72\n",
       "3    Lahore        70\n",
       "4     Muree        75"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_hum = pd.DataFrame({\n",
    "    'city': [ 'Karachi', 'Lahore', 'Peshawer', 'Lahore', 'Muree'],\n",
    "    'humidity' : [76, 95, 72, 70, 75],\n",
    "})\n",
    "df_hum"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f2fad33",
   "metadata": {},
   "source": [
    "**Note the column `city` on which we want to perform a right outer join, in the two dataframes has an intersection of four rows. Other than these four rows, record of city Karachi from right dataframe will also be included in the resulting dataframe being a right outer join**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "378b81da",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "      <th>_merge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>NaN</td>\n",
       "      <td>76</td>\n",
       "      <td>right_only</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39.0</td>\n",
       "      <td>95</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>29.0</td>\n",
       "      <td>72</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>39.0</td>\n",
       "      <td>70</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>14.0</td>\n",
       "      <td>75</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature  humidity      _merge\n",
       "0   Karachi          NaN        76  right_only\n",
       "1    Lahore         39.0        95        both\n",
       "2  Peshawer         29.0        72        both\n",
       "3    Lahore         39.0        70        both\n",
       "4     Muree         14.0        75        both"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# In Right outer join, it takes all the rows from Right dataframe and only common rows from left dataframe\n",
    "df3 = pd.merge(df_temp, df_hum, on='city', how='right', indicator=True)\n",
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a86b2254",
   "metadata": {},
   "source": [
    "**Left and right outer join also depend on the order of Dataframes that are passed to merge() function. Let us change the order and understand this**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "ff470642",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>humidity</th>\n",
       "      <th>temperature</th>\n",
       "      <th>_merge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>95.0</td>\n",
       "      <td>39</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>70.0</td>\n",
       "      <td>39</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Muree</td>\n",
       "      <td>75.0</td>\n",
       "      <td>14</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>72.0</td>\n",
       "      <td>29</td>\n",
       "      <td>both</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Sialkot</td>\n",
       "      <td>NaN</td>\n",
       "      <td>32</td>\n",
       "      <td>right_only</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  humidity  temperature      _merge\n",
       "0    Lahore      95.0           39        both\n",
       "1    Lahore      70.0           39        both\n",
       "2     Muree      75.0           14        both\n",
       "3  Peshawer      72.0           29        both\n",
       "4   Sialkot       NaN           32  right_only"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df4 = pd.merge(df_hum, df_temp, on='city', how='right', indicator=True)\n",
    "df4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1e9d89c",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "02431004",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "824a1129",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "ad5a5458",
   "metadata": {},
   "source": [
    "## 2. Additional Parameters to `pd.merge()` Method"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6ff418a6",
   "metadata": {},
   "source": [
    "####  Use of `suffixes` Parameter\n",
    "- When you merge dataframes having columns with same labels, other than the one on which you are joining ('city`)\n",
    "- The resulting dataframe will have appended suffixes (`_x`, `_y`) with column labels to differentiate b/w columns of both dataframes\n",
    "- For better understanding you can pass `suffixes`.....\n",
    "- Let us understand this by example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e367b100",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature  humidity\n",
       "0    Lahore           35        76\n",
       "1   Karachi           39        95\n",
       "2  Peshawer           33        72"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer'],\n",
    "    'temperature' : [35, 39, 33],\n",
    "    'humidity' : [76, 95, 72]\n",
    "})\n",
    "\n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "d5092f29",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0    Karachi           41        88\n",
       "1   Peshawer           44        99\n",
       "2  Islamabad           47        79"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.DataFrame({\n",
    "    'city': [ 'Karachi', 'Peshawer', 'Islamabad'],\n",
    "    'temperature' : [41, 44, 47],\n",
    "    'humidity' : [88, 99, 79]\n",
    "})\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "b729d62b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature_x</th>\n",
       "      <th>humidity_x</th>\n",
       "      <th>temperature_y</th>\n",
       "      <th>humidity_y</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature_x  humidity_x  temperature_y  humidity_y\n",
       "0   Karachi             39          95             41          88\n",
       "1  Peshawer             33          72             44          99"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3 = pd.merge(df1, df2, on='city', how='inner')\n",
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c2d02e8",
   "metadata": {},
   "source": [
    "- **Note that `merge` has automatically appended suffixes with column labels to differentiate b/w columns of both dataframes**\n",
    "- **You can use the `suffixes` parameter to `pd.merge()` method to specify the suffixes other than `_x` and `_y` to something more meaningful.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "5861db25",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature_left</th>\n",
       "      <th>humidity_left</th>\n",
       "      <th>temperature_right</th>\n",
       "      <th>humidity_right</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       city  temperature_left  humidity_left  temperature_right  \\\n",
       "0   Karachi                39             95                 41   \n",
       "1  Peshawer                33             72                 44   \n",
       "\n",
       "   humidity_right  \n",
       "0              88  \n",
       "1              99  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d3 = pd.merge(df1, df2, on='city', how='inner', suffixes=('_left','_right'))\n",
    "d3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6fd64dd8",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ea2b5ad",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd5940c8",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "30693470",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3cf67c74",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "40ee79f2",
   "metadata": {},
   "source": [
    "####  Use `validate` Parameter to Check for Duplicate Keys\n",
    "- We can use the `validate` parameter to the `pd.merge()` method to check for uniqueness of keys. This parameter can take following four values (default is None):\n",
    "    - `one_to_one` or `1:1`: checks if merge keys are unique in both left and right datasets.\n",
    "    - `one_to_many` or `1:m`: checks if merge keys are unique in left dataset.\n",
    "    - `many_to_one` or `m:1`: checks if merge keys are unique in right dataset.\n",
    "    - `many_to_many` or `m:m`: allowed, but does not result in checks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "40e9ff4a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      city  temperature\n",
       "0   Lahore           35\n",
       "1  Karachi           39\n",
       "2    Muree           15"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Muree'],\n",
    "    'temperature' : [35, 39, 15],\n",
    "})\n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "53191111",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  humidity\n",
       "0     Lahore        76\n",
       "1    Karachi        95\n",
       "2  Islamabad        72\n",
       "3     Lahore        76"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Islamabad', 'Lahore'],\n",
    "    'humidity' : [76, 95, 72, 76],\n",
    "})\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5dcfaf7a",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6df3cec",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "d0bd6257",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Muree</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         city  humidity\n",
       "0      Lahore       NaN\n",
       "1     Karachi       NaN\n",
       "2       Muree       NaN\n",
       "3      Lahore       NaN\n",
       "4     Karachi       NaN\n",
       "5   Islamabad       NaN\n",
       "6      Lahore       NaN\n",
       "7      Lahore      76.0\n",
       "8     Karachi      95.0\n",
       "9   Islamabad      72.0\n",
       "10     Lahore      76.0"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.concat([df1, df2] , ignore_index=True)\n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "91661194",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Muree</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>Muree</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>Muree</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>38</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>40</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>41</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>42</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>43</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>44</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>46</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>47</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>49</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>51</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>52</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         city  humidity\n",
       "0      Lahore       NaN\n",
       "1     Karachi       NaN\n",
       "2       Muree       NaN\n",
       "3      Lahore      76.0\n",
       "4     Karachi      95.0\n",
       "5   Islamabad      72.0\n",
       "6      Lahore      76.0\n",
       "7      Lahore      76.0\n",
       "8     Karachi      95.0\n",
       "9   Islamabad      72.0\n",
       "10     Lahore      76.0\n",
       "11     Lahore      76.0\n",
       "12    Karachi      95.0\n",
       "13  Islamabad      72.0\n",
       "14     Lahore      76.0\n",
       "15     Lahore       NaN\n",
       "16    Karachi       NaN\n",
       "17      Muree       NaN\n",
       "18     Lahore      76.0\n",
       "19    Karachi      95.0\n",
       "20  Islamabad      72.0\n",
       "21     Lahore      76.0\n",
       "22     Lahore      76.0\n",
       "23    Karachi      95.0\n",
       "24  Islamabad      72.0\n",
       "25     Lahore      76.0\n",
       "26     Lahore      76.0\n",
       "27    Karachi      95.0\n",
       "28  Islamabad      72.0\n",
       "29     Lahore      76.0\n",
       "30     Lahore      76.0\n",
       "31    Karachi      95.0\n",
       "32  Islamabad      72.0\n",
       "33     Lahore      76.0\n",
       "34     Lahore       NaN\n",
       "35    Karachi       NaN\n",
       "36      Muree       NaN\n",
       "37     Lahore      76.0\n",
       "38    Karachi      95.0\n",
       "39  Islamabad      72.0\n",
       "40     Lahore      76.0\n",
       "41     Lahore      76.0\n",
       "42    Karachi      95.0\n",
       "43  Islamabad      72.0\n",
       "44     Lahore      76.0\n",
       "45     Lahore      76.0\n",
       "46    Karachi      95.0\n",
       "47  Islamabad      72.0\n",
       "48     Lahore      76.0\n",
       "49     Lahore      76.0\n",
       "50    Karachi      95.0\n",
       "51  Islamabad      72.0\n",
       "52     Lahore      76.0"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.concat([df1, df2], join='outer', ignore_index=True )\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb6095db",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "30f82f5f",
   "metadata": {},
   "source": [
    ">**`one_to_one` or `1:1`: checks if merge keys are unique in both left and right dataframes, if not then throw exception**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "548b7339",
   "metadata": {},
   "outputs": [],
   "source": [
    "df1, df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "ae9bb499",
   "metadata": {},
   "outputs": [],
   "source": [
    "#pd.merge(df1, df2, on='city', how='outer', validate='one_to_one')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3ff2dec",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "52715b67",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "7b4c42a0",
   "metadata": {},
   "source": [
    ">**`one_to_many` or `1:m`: checks if merge keys are unique in left dataframe, if not then throw exception**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "033124a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "df1, df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "47fd4d90",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35.0</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35.0</td>\n",
       "      <td>76.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39.0</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>NaN</td>\n",
       "      <td>72.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore         35.0      76.0\n",
       "1     Lahore         35.0      76.0\n",
       "2    Karachi         39.0      95.0\n",
       "3      Muree         15.0       NaN\n",
       "4  Islamabad          NaN      72.0"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.merge(df1, df2, on='city', how='outer', validate='one_to_many')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2cfb94bb",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3839c38e",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "c4cab63d",
   "metadata": {},
   "source": [
    ">**`many_to_one` or `m:1`: checks if merge keys are unique in right dataframe, if not then throw exception**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "44ed9a0c",
   "metadata": {},
   "outputs": [],
   "source": [
    "df1, df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "588d2d3f",
   "metadata": {},
   "outputs": [],
   "source": [
    "#pd.merge(df1, df2, on='city', how='outer', validate='many_to_one')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34cef7f2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e56a98b",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "86e7f5c9",
   "metadata": {},
   "source": [
    ">**`many_to_many` or `m:m`: No checks are performed on keys uniqueness**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d9f59fb",
   "metadata": {},
   "outputs": [],
   "source": [
    "df1, df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "181fd043",
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.merge(df1, df2, on='city', how='outer', validate='many_to_many')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b77d3f96",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "096fb729",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "084a804d",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9968408",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca0391c3",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bcdeaa8f",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "80d035c8",
   "metadata": {},
   "source": [
    "# Part-II (Concatenating and Appending)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b826751f",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "e26ad476",
   "metadata": {},
   "source": [
    "## Concatenation of  DataFrames (Row Wise + Column Wise)\n",
    "\n",
    "<img align=\"left\" width=\"350\" height=\"90\"  src=\"images/row.png\"  >\n",
    "<img align=\"right\" width=\"490\" height=\"100\"  src=\"images/concat_2.png\" >\n",
    "\n",
    "<br><br><br><br><br><br><br><br><br><br>\n",
    "\n",
    "\n",
    "- The `pd.concat()` method is used to concat pandas objects along a particular axis with optional set logic along the other axes. \n",
    "```\n",
    "pd.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, verify_integrity=False)\n",
    "```\n",
    "\n",
    "Where,\n",
    "- `objs`: a sequence or mapping of Series or DataFrame objects\n",
    "- `axis`: The axis to concatenate along. {0/’index’, 1/’columns’}, default 0\n",
    "- `join`{‘inner’, ‘outer’}, Default is `outer` for union. If `inner` that means intersection\n",
    "- `ignore_index`: If True, the resulting axis will be labeled 0, …, n - 1. This is useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information. (default is False)\n",
    "- `keys`: sequence, default None (Construct hierarchical index using the passed keys as the outermost level.)\n",
    "- `verify_integrity` : boolean, default False. Check whether the new concatenated axis contains duplicates. This can be very expensive relative to the actual data concatenation.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "feecd6b4",
   "metadata": {},
   "source": [
    "## 3. Row-Wise Concatenation\n",
    "<img align=\"left\" width=\"350\" height=\"90\"  src=\"images/row.png\"  >"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "1626679f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore           35        76\n",
       "1    Karachi           39        95\n",
       "2   Peshawer           33        72\n",
       "3  Islamabad           29        81\n",
       "4      Muree           15        70"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "Pak_Weather = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [35, 39, 33, 29, 15],\n",
    "    'humidity' : [76, 95, 72, 81, 70],\n",
    "})\n",
    "Pak_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "adb91c56",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0      Dubai           41        88\n",
       "1     Sharja           44        99\n",
       "2      Ajman           47        79\n",
       "3  Abu Dhabi           45        86"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "UAE_Weather = pd.DataFrame({\n",
    "    'city': [ 'Dubai', 'Sharja', 'Ajman', 'Abu Dhabi'],\n",
    "    'temperature' : [41, 44, 47, 45],\n",
    "    'humidity' : [88, 99, 79, 86],\n",
    "})\n",
    "UAE_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "e90c0d88",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore           35        76\n",
       "1    Karachi           39        95\n",
       "2   Peshawer           33        72\n",
       "3  Islamabad           29        81\n",
       "4      Muree           15        70\n",
       "0      Dubai           41        88\n",
       "1     Sharja           44        99\n",
       "2      Ajman           47        79\n",
       "3  Abu Dhabi           45        86"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.concat([Pak_Weather, UAE_Weather], join='outer' )\n",
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3473f4e0",
   "metadata": {},
   "source": [
    "#### Concatenate Dataframes (row-wise)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "cf0a1462",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore           35        76\n",
       "1    Karachi           39        95\n",
       "2   Peshawer           33        72\n",
       "3  Islamabad           29        81\n",
       "4      Muree           15        70\n",
       "0      Dubai           41        88\n",
       "1     Sharja           44        99\n",
       "2      Ajman           47        79\n",
       "3  Abu Dhabi           45        86"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.concat([Pak_Weather, UAE_Weather])\n",
    "df1 = pd.concat([Pak_Weather, UAE_Weather], axis=0)\n",
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e545d4f",
   "metadata": {},
   "source": [
    "- Notice the index is also concatenated as such\n",
    "- To handle this pass `ignore_index` parameter a value of `True`, so that the resulting axis is be labeled 0, …, n - 1. \n",
    "- Useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information.\n",
    "- Note the index values on the other axes (i.e., columns) have still respected in the join."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "4550a39e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore           35        76\n",
       "1    Karachi           39        95\n",
       "2   Peshawer           33        72\n",
       "3  Islamabad           29        81\n",
       "4      Muree           15        70\n",
       "5      Dubai           41        88\n",
       "6     Sharja           44        99\n",
       "7      Ajman           47        79\n",
       "8  Abu Dhabi           45        86"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.concat([Pak_Weather,UAE_Weather], axis=0, ignore_index=True)\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4389cd0",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "45a45239",
   "metadata": {},
   "source": [
    "- Other than the numeric index, if you want to have an additional index for your sub groups, you can use the `keys` argument to `pd.concat()` method\n",
    "- It provides multi-indexing\n",
    "- Remember this will work only if the `ignore_index` argument is `False` which is the default"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "9faa0c04",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"5\" valign=\"top\">city</th>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             city  temperature  humidity\n",
       "city 0     Lahore           35        76\n",
       "     1    Karachi           39        95\n",
       "     2   Peshawer           33        72\n",
       "     3  Islamabad           29        81\n",
       "     4      Muree           15        70"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3 = pd.concat([Pak_Weather, UAE_Weather], axis=0, keys=[\"city\",])\n",
    "df3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "c0424d4a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"5\" valign=\"top\">Pak</th>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"4\" valign=\"top\">UAE</th>\n",
       "      <th>0</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            city  temperature  humidity\n",
       "Pak 0     Lahore           35        76\n",
       "    1    Karachi           39        95\n",
       "    2   Peshawer           33        72\n",
       "    3  Islamabad           29        81\n",
       "    4      Muree           15        70\n",
       "UAE 0      Dubai           41        88\n",
       "    1     Sharja           44        99\n",
       "    2      Ajman           47        79\n",
       "    3  Abu Dhabi           45        86"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3 = pd.concat([Pak_Weather, UAE_Weather], axis=0, keys=[\"Pak\", \"UAE\"])\n",
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc7f40f1",
   "metadata": {},
   "source": [
    "- The advantage of doing this is you can use `df.loc` to get a subset of your dataframe\n",
    "- So, after getting a big dataframe if you want to get the dataframe from which it was created keys arg is useful"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "99a6d084",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore           35        76\n",
       "1    Karachi           39        95\n",
       "2   Peshawer           33        72\n",
       "3  Islamabad           29        81\n",
       "4      Muree           15        70"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3.loc['Pak', :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "fb637708",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0      Dubai           41        88\n",
       "1     Sharja           44        99\n",
       "2      Ajman           47        79\n",
       "3  Abu Dhabi           45        86"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3.loc['UAE', :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de96c379",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "98968442",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed1be4fb",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "6138a715",
   "metadata": {},
   "source": [
    "#### What will Happen if one of the Dataframe has an Additional Column\n",
    "- If you combine two Dataframe objects which do not have all the same columns, then the columns outside the intersection will be filled with NaN values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "d83fb7a5",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature\n",
       "0     Lahore           35\n",
       "1    Karachi           39\n",
       "2   Peshawer           33\n",
       "3  Islamabad           29\n",
       "4      Muree           15"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Pak_Weather = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [35, 39, 33, 29, 15],\n",
    "    \n",
    "})\n",
    "Pak_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "25490022",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0      Dubai           41        88\n",
       "1     Sharja           44        99\n",
       "2      Ajman           47        79\n",
       "3  Abu Dhabi           45        86"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "UAE_Weather = pd.DataFrame({\n",
    "    'city': [ 'Dubai', 'Sharja', 'Ajman', 'Abu Dhabi'],\n",
    "    'temperature' : [41, 44, 47, 45],\n",
    "    'humidity' : [88, 99, 79, 86],\n",
    "})\n",
    "UAE_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "6a3a2da6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Dubai</td>\n",
       "      <td>41</td>\n",
       "      <td>88.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Sharja</td>\n",
       "      <td>44</td>\n",
       "      <td>99.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Ajman</td>\n",
       "      <td>47</td>\n",
       "      <td>79.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Abu Dhabi</td>\n",
       "      <td>45</td>\n",
       "      <td>86.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore           35       NaN\n",
       "1    Karachi           39       NaN\n",
       "2   Peshawer           33       NaN\n",
       "3  Islamabad           29       NaN\n",
       "4      Muree           15       NaN\n",
       "5      Dubai           41      88.0\n",
       "6     Sharja           44      99.0\n",
       "7      Ajman           47      79.0\n",
       "8  Abu Dhabi           45      86.0"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# NaN will be placed where values are missing\n",
    "df = pd.concat([Pak_Weather,UAE_Weather], axis=0, ignore_index=True)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3e303b79",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85ff090a",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2756791b",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "214dbb0b",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "fa0954ac",
   "metadata": {},
   "source": [
    "## 4. Column Wise Concatenation\n",
    "- It is not advised to concatenate dataframes column wise. If you want to then you need to take care of some checks like:\n",
    "    - the number of rows must be same in both dataframes, and\n",
    "    - Indexes of both dataframes are sorted\n",
    "- If you are done with all the checks then you can simply use `axis=1` to do the job.\n",
    "\n",
    "<img align=\"left\" width=\"490\" height=\"100\"  src=\"images/concat_2.png\"  >"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "916d7dcf",
   "metadata": {},
   "source": [
    "### a. Creating a two Simple Dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f961d7e",
   "metadata": {},
   "outputs": [],
   "source": [
    "temp_df = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [35, 39, 33, 29, 15],\n",
    "})\n",
    "temp_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa8344bc",
   "metadata": {},
   "outputs": [],
   "source": [
    "wind_df = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'wind speed' : [9, 12, 7, 13, 18],\n",
    "})\n",
    "wind_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e2288f3",
   "metadata": {},
   "source": [
    "### b. Concatenate Dataframes (column-wise)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "175448a6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We have to use the argument axis=1\n",
    "df = pd.concat([temp_df,wind_df], axis=1)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1fd40a60",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "e46f0d9d",
   "metadata": {},
   "source": [
    "### c. What will happen if we have missing data in our dataframes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "02ce2467",
   "metadata": {},
   "outputs": [],
   "source": [
    "# This dataframe do not have the temperature for Lahore\n",
    "temp_df = pd.DataFrame({\n",
    "    'city': [ 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [39, 33, 29, 15],\n",
    "})\n",
    "temp_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78b39323",
   "metadata": {},
   "outputs": [],
   "source": [
    "#This dataframe do not have the windspeed of Islamabad\n",
    "wind_df = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Muree'],\n",
    "    'wind speed' : [9, 12, 7, 18],\n",
    "})\n",
    "wind_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0469a354",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "df1 = pd.concat([temp_df,wind_df], axis=1)\n",
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ec7b9b6",
   "metadata": {},
   "source": [
    "**This doesnot look correct**\n",
    "- We have missing data in the resulting dataframe, i.e., it does not contain record for Lahore, which was there in the second dataframe but not in the first\n",
    "- Solution is while creating the dataframe you pass it the index\n",
    "- In Pandas, while creating a DataFrame, you can pass the index argument with appropriate related indices, which is a way to align rows from different dataframes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8700fe49",
   "metadata": {},
   "outputs": [],
   "source": [
    "# This dataframe do not have the temperature for Lahore\n",
    "temp_df = pd.DataFrame({\n",
    "    'city': [ 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [39, 33, 29, 15],\n",
    "},index=[0,1,2,3])\n",
    "temp_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3e1dc50",
   "metadata": {},
   "outputs": [],
   "source": [
    "#This dataframe do not have the windspeed of Islamabad\n",
    "# Note the indices in wind_df are related to indices of temp_df\n",
    "wind_df = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Muree'],\n",
    "    'wind speed' : [9, 12, 7, 18],\n",
    "}, index=[4,0,1,3])\n",
    "wind_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f25ca5b9",
   "metadata": {},
   "source": [
    "---\n",
    "#### Note the indexes in above two dataframes match. Now concatenation will be OK\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5fcd61ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.concat([temp_df,wind_df], axis=1)\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e3306da",
   "metadata": {},
   "source": [
    ">- Concatenating Dataframes along axis = 1 adds one Dataframe along the other. It is like a full outer join. Placing NaN for non-matching rows in the left as well as right Dataframes.\n",
    ">- By default, a concatenation results in a set union, where all data is preserved."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f339166b",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "ea8276dd",
   "metadata": {},
   "source": [
    "## 5. Adding a Single Row/Column in a Dataframe\n",
    "- Now let us see how we can concat a single row or a single column to a dataframe using the `pd.concat()` method."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa3e80c9",
   "metadata": {},
   "source": [
    "### a. Adding a Row in a Dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "969be5e0",
   "metadata": {},
   "outputs": [],
   "source": [
    "df1 = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [35, 39, 33, 29, 15],\n",
    "    'humidity' : [76, 95, 72, 81, 70],\n",
    "})\n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e88f7016",
   "metadata": {},
   "outputs": [],
   "source": [
    "df2 = pd.DataFrame({\"city\": \"Multan\", \"temperature\": 45, \"humidity\": 75}, index=[5])\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a4ca835d",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "acbed946",
   "metadata": {},
   "outputs": [],
   "source": [
    "df3 = pd.concat([df1, df2], ignore_index=True, axis = 0)\n",
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eaee8c26",
   "metadata": {},
   "source": [
    ">**You can place the new row at your desired location using slicing operator, as shown below**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3578ad25",
   "metadata": {},
   "outputs": [],
   "source": [
    "df3 = pd.concat([df1[:2], df2, df1[2:]], ignore_index = True)\n",
    "df3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21fcac05",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de0bb80d",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b58a3ec8",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "a4b7432b",
   "metadata": {},
   "source": [
    "### b. Adding a Column in a Dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "06be80a5",
   "metadata": {},
   "outputs": [],
   "source": [
    "Pak_Weather = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [35, 39, 33, 29, 15],\n",
    "    'humidity' : [76, 95, 72, 81, 70],\n",
    "})\n",
    "Pak_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "18946847",
   "metadata": {},
   "outputs": [],
   "source": [
    "s = pd.Series([\"Humid\", 'Dry', 'Rainy', 'Humid', 'Rainy'], name=\"event\")\n",
    "s"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cdddbf58",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.concat([Pak_Weather, s], axis=1)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a864c0a5",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f99f919f",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab483930",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "9af9231d",
   "metadata": {},
   "source": [
    "<img align=\"right\" width=\"310\" height=\"100\"  src=\"images/append.png\"  >\n",
    "\n",
    "## 6. Appending DataFrames\n",
    "- The `df1.append(df2)` method is used to concat the second dataframe’s records at the end of first dataframe (along axis=0). Columns not present in the first DataFrame are added as new columns\n",
    "- The `df1.append(df2)` method considers the calling dataframe as main object and adds rows to that dataframe from the dataframes that are passed to the function as argument.\n",
    "- It returns a new dataframe object consisting of the rows of caller and the rows of `other`. The dataframe that called the `append()` method,  remain unchanged.\n",
    "```\n",
    "df.append(other, ignore_index=False, verify_integrity=False, sort=False)\n",
    "```\n",
    "\n",
    "    - `other`: DataFrame or Series/dict-like object, or list of these (The data to append.)\n",
    "    - `ignore_index`: If True, the resulting axis will be labeled 0, 1, …, n - 1 (default is False)\n",
    "    - `verify_integrity`: If True, raise ValueError on creating index with duplicates (default is False)\n",
    "    - `sort`: Sort columns if the columns of `self` and `other` are not aligned (default is False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be5cc5d3",
   "metadata": {},
   "source": [
    "### a. Append Two DataFrames"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1023fe8f",
   "metadata": {},
   "outputs": [],
   "source": [
    "Pak_Weather = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [35, 39, 33, 29, 15],\n",
    "    'humidity' : [76, 95, 72, 81, 70],\n",
    "})\n",
    "Pak_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "40ae4f44",
   "metadata": {},
   "outputs": [],
   "source": [
    "UAE_Weather = pd.DataFrame({\n",
    "    'city': [ 'Dubai', 'Sharja', 'Ajman', 'Abu Dhabi'],\n",
    "    'temperature' : [41, 44, 47, 45],\n",
    "    'humidity' : [88, 99, 79, 86],\n",
    "})\n",
    "UAE_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2ba1f307",
   "metadata": {},
   "outputs": [],
   "source": [
    "# append Dataframe\n",
    "df2 =  Pak_Weather.append(UAE_Weather)\n",
    "df2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bdbd5966",
   "metadata": {},
   "source": [
    "- Notice the index is also concatenated as such\n",
    "- To handle this pass `ignore_index` parameter a value of `True`, so that the resulting axis is be labeled 0, …, n - 1. \n",
    "- Useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e17da324",
   "metadata": {},
   "outputs": [],
   "source": [
    "# set the ignore_index to true\n",
    "df2 =  Pak_Weather.append(UAE_Weather, ignore_index=True)\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c30e92d2",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9fc7cb85",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a1126bfc",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "ac0b6a94",
   "metadata": {},
   "source": [
    "### b. Append a Row in DataFrame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "11741128",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Lahore</td>\n",
       "      <td>35</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Karachi</td>\n",
       "      <td>39</td>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Peshawer</td>\n",
       "      <td>33</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Islamabad</td>\n",
       "      <td>29</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Muree</td>\n",
       "      <td>15</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        city  temperature  humidity\n",
       "0     Lahore           35        76\n",
       "1    Karachi           39        95\n",
       "2   Peshawer           33        72\n",
       "3  Islamabad           29        81\n",
       "4      Muree           15        70"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Pak_Weather = pd.DataFrame({\n",
    "    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],\n",
    "    'temperature' : [35, 39, 33, 29, 15],\n",
    "    'humidity' : [76, 95, 72, 81, 70],\n",
    "})\n",
    "Pak_Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "b67aafdc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>temperature</th>\n",
       "      <th>humidity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Multan</td>\n",
       "      <td>45</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     city  temperature  humidity\n",
       "5  Multan           45        75"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Creating a row to be appended\n",
    "d1 = pd.DataFrame({\"city\": \"Multan\", \"temperature\": 45, \"humidity\": 75}, index=[5])\n",
    "d1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7731c2ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Append this dataframe having single row to Pak_Weather dataframe\n",
    "df3 =  Pak_Weather.append(d1)\n",
    "df3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "89e23b8e",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6f8a148f",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "a463c667",
   "metadata": {},
   "source": [
    "**Columns of passed/other dataframe that are not in the caller are added as new columns**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "61338b44",
   "metadata": {},
   "outputs": [],
   "source": [
    "d1 = pd.DataFrame({\"city\": \"Sialkot\", \"temperature\": 45, \"humidity\": 75, \"newcol\": 66}, index=[5])\n",
    "d1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "77f1795a",
   "metadata": {},
   "outputs": [],
   "source": [
    "df3 =  Pak_Weather.append(d1)\n",
    "df3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1bf283c7",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39898602",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7580b7e4",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}