{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Accessing a DataFrame\n",
    "There are many [different choices for indexing](https://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) DataFrames.\n",
    "\n",
    "Let's explore!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Setup\n",
    "import pandas as pd\n",
    "\n",
    "test_user_data = {\n",
    "    'first_name': ['Craig', 'Treasure', 'Ashley', 'Guil'],\n",
    "    'last_name': ['Dennis', 'Porth', 'Boucher', 'Hernandez'],\n",
    "    'balance': [42.42, 25.00, 2.02, 87.00]\n",
    "}\n",
    "test_user_names = ['craigsdennis', 'treasure', 'lindsay2000', 'guil']\n",
    "\n",
    "users = pd.DataFrame(test_user_data, index=test_user_names)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Retrieve a specific Series\n",
    "\n",
    "### By Column Name\n",
    "Each column in a `DataFrame` is actually a `Series`. The `DataFrame` provides access to each of these `Series` by a column name index.\n",
    "\n",
    "For instance, to get the **`balance`** `Series`, you could just use that for the index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "craigsdennis    42.42\n",
       "treasure        25.00\n",
       "lindsay2000      2.02\n",
       "guil            87.00\n",
       "Name: balance, dtype: float64"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "balances = users['balance']\n",
    "balances"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the returned `Series` has a property of `name` set to the column label."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'balance'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "balances.name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### By Label\n",
    "You can retrieve a row from a `DataFrame` by using the `loc` property and supply the label. \n",
    "\n",
    "Note how the returned `Series` is labelled by the existing column labels of the **`users`** `DataFrame` and the `name` property is set to the label."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "first_name         Guil\n",
       "last_name     Hernandez\n",
       "balance              87\n",
       "Name: guil, dtype: object"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "users.loc['guil']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### By Position\n",
    "Normal list like indices are also available to get a specific row, by using the `iloc` ( `i`ndex `loc`ation) property and the appropriate index.\n",
    "\n",
    "For instance the second row can be accessed by using the index `1`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "first_name    Treasure\n",
       "last_name        Porth\n",
       "balance             25\n",
       "Name: treasure, dtype: object"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "users.iloc[1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Retrieve a Specific Value\n",
    "\n",
    "### By Chaining\n",
    "You can definitely [chain lookups together](https://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy), but there's a better way to do it!\n",
    "\n",
    "Note that this is really running multiple separate lookups. You don't want to force this extra work to happen. However, I do want you to see that you can (and you will see this!)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Craig'"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# CAREFUL: This first retrieves the column `Series`\n",
    "#  and then uses the label.\n",
    "users['first_name']['craigsdennis']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Craig'"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# CAREFUL: This retrieves the row `Series` \n",
    "#  and then does a lookup for first_name\n",
    "users.loc['craigsdennis']['first_name']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### By Row and Column\n",
    "The DataFrame allows access in a single operation to a specific value by specifying a row and a column. There are a few options here too.\n",
    "\n",
    "#### Using [`DataFrame.loc`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html)\n",
    "\n",
    "\n",
    "\n",
    "The `loc` property on `DataFrame` allows you to work on both axes, rows and columns, by indexing with a tuple (remember a comma creates a tuple). The first element of the tuple is the row access, second is the column.\n",
    "\n",
    "Using a single label and single column name will retrieve that specific scalar value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Craig'"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "users.loc['craigsdennis', 'first_name']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Using [`DataFrame.at`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.at.html#pandas.DataFrame.at)\n",
    "\n",
    "To be even more explicit (💪🏾) that you are trying to just get at a single scalar value, you should use the `DataFrame.at` property."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Craig'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "users.at['craigsdennis', 'first_name']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Retrieve a Specific DataFrame Through Slicing\n",
    "\n",
    "Using the `loc` and `iloc` properties you can slice an existing `DataFrame` into a new one.\n",
    "\n",
    "In the example below we use `:` in the rows axis to select all rows, and we specify which columns we want back using a list in the columns axis, ala NumPy Fancy Indexing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>balance</th>\n",
       "      <th>last_name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>craigsdennis</th>\n",
       "      <td>42.42</td>\n",
       "      <td>Dennis</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>treasure</th>\n",
       "      <td>25.00</td>\n",
       "      <td>Porth</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>lindsay2000</th>\n",
       "      <td>2.02</td>\n",
       "      <td>Boucher</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>guil</th>\n",
       "      <td>87.00</td>\n",
       "      <td>Hernandez</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              balance  last_name\n",
       "craigsdennis    42.42     Dennis\n",
       "treasure        25.00      Porth\n",
       "lindsay2000      2.02    Boucher\n",
       "guil            87.00  Hernandez"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# All rows and the following ordered columns as a list (fancy, eh?)\n",
    "users.loc[:, ['balance', 'last_name']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When using a slice with `loc` the results are inclusive, meaning they include the right side."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>first_name</th>\n",
       "      <th>last_name</th>\n",
       "      <th>balance</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>treasure</th>\n",
       "      <td>Treasure</td>\n",
       "      <td>Porth</td>\n",
       "      <td>25.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>lindsay2000</th>\n",
       "      <td>Ashley</td>\n",
       "      <td>Boucher</td>\n",
       "      <td>2.02</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            first_name last_name  balance\n",
       "treasure      Treasure     Porth    25.00\n",
       "lindsay2000     Ashley   Boucher     2.02"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# All rows starting with 'treasure' up to and including 'lindsay2000'\n",
    "#  and all the columns\n",
    "users.loc['treasure':'lindsay2000', :]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When using a slice with `iloc` the results are exclusive, just like standard Python list slices "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>last_name</th>\n",
       "      <th>balance</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>treasure</th>\n",
       "      <td>Porth</td>\n",
       "      <td>25.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         last_name  balance\n",
       "treasure     Porth     25.0"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# All rows starting with the 2nd up until but not including the 3rd\n",
    "#  and all columns starting at the second until the end\n",
    "users.iloc[1:2, 1:]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}