{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Chapter 05 -- Understanding Indexes\n",
    "## Topics Covered\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#Indicies\"> Indicies </a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#.iloc-Indexer\"> .iloc indexer </a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#Setting-and-resetting-Indicies\">Setting and Reseting Indicies</a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#.loc-Indexer\"> .loc indexer </a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#Mixing-.loc-Indexer-with-Boolean-Operators\"> Mixing .loc indexer with Boolean Operations\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#Altering-DataFrame-values-using-the-.loc-indexer\"> Altering DataFrame values using the .loc indexer </a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#Conditionally-Apply-Values-Based-on-Another-Column-Value\">Conditionally Apply Values Based on Another Column Value</a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#.ix-Indexer\"> .ix indexer </a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#Indexing-Issues\"> Indexing Issues </a>\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2005%20--%20Understanding%20Indexes.ipynb#Resources\"> Resources </a>\n",
    "\n",
    "SAS users tend to think of indexing SAS data sets as a means to improve query performance.  Another use case for using SAS indices is in performing non-sequential reads for table look-ups.\n",
    "\n",
    "Indexing for DataFrames is used to provide direct access to data.  Many analytical techniques take advantage of indexes.  Indexing is also used as key values to selecting and subseting data.  \n",
    "\n",
    "It took me a bit of time to understand how indexing for Series and DataFrames actually work.  \n",
    "\n",
    "Many of the working examples I found, while mostly useful, used synthetic data which is typically ordered neatly.  Here we examine issues like overlapping date ranges or index values in non-alphabetical or in non-sequential order and so on.\n",
    "\n",
    "That is why you will find some errors in the examples below.  By examining the pits I have fallen into, hopefully you can avoid them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from numpy.random import randn\n",
    "from pandas import Series, DataFrame, Index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider the creation of the DataFrame df in the cell below.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df = pd.DataFrame([['a', 'cold','slow', np.nan, 2., 6., 3.], \n",
    "                    ['b', 'warm', 'medium', 4, 5, 7, 9],\n",
    "                    ['c', 'hot', 'fast', 9, 4, np.nan, 6],\n",
    "                    ['d', 'cool', None, np.nan, np.nan, 17, 89],\n",
    "                    ['e', 'cool', 'medium', 16, 44, 21, 13],\n",
    "                    ['f', 'cold', 'slow', np.nan, 29, 33, 17]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the 'df' DataFrame created above we did not specify a row index or column names resulting in the RangeIndex object used for row labels.  Default column labels are created as well using another RangeIndex object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d</td>\n",
       "      <td>cool</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>f</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>29.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>17.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1       2     3     4     5     6\n",
       "0  a  cold    slow   NaN   2.0   6.0   3.0\n",
       "1  b  warm  medium   4.0   5.0   7.0   9.0\n",
       "2  c   hot    fast   9.0   4.0   NaN   6.0\n",
       "3  d  cool    None   NaN   NaN  17.0  89.0\n",
       "4  e  cool  medium  16.0  44.0  21.0  13.0\n",
       "5  f  cold    slow   NaN  29.0  33.0  17.0"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Indices\n",
    "\n",
    "The .index attribute returns the DataFrame's index structure.  We did not explicitly set an index.  As a result a default index object is created.  The RangeIndex object has a start position at 0 and the end position set to len(df) - 1."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Return the row index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=6, step=1)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Return the column labels.  Since no labels were specificed, a RangeIndex object is used to identify columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=7, step=1)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For observations SAS uses the automatic variable \\_N\\_ in the Data Step and FIRSTOBS and OBS in PROC step for its row indexing. \n",
    "\n",
    "SAS also has a similar construct (SAS variable list) allowing the use of column name 'ranges' (col1-colN) described <a href=\"http://support.sas.com/documentation/cdl/en/lrcon/69852/HTML/default/viewer.htm#p0wphcpsfgx6o7n1sjtqzizp1n39.htm\"> here</a>."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The SAS example below creates a data set with same data used to create the DataFrame df in cell #2 above."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_create_analog_df_dataframe.sas                 */\n",
    "    /******************************************************/\n",
    "    11      data df;\n",
    "    12      infile cards dlm=',';\n",
    "    13      \n",
    "    14      input id   $\n",
    "    15            col1 $\n",
    "    16            col2 $\n",
    "    17            col3\n",
    "    18            col4\n",
    "    19            col5\n",
    "    20            col6 ;\n",
    "    21      \n",
    "    22      datalines;\n",
    "    23      a, cold, slow, ., 2, 6, 3\n",
    "    24      b, warm, medium, 4, 5, 7, 9\n",
    "    25      c, hot, fast, 9, 4, ., 6\n",
    "    26      d, cool, , ., ., 17, 89\n",
    "    27      e, cool,  medium, 16, 44, 21, 13\n",
    "    28      f, cold, slow, . ,29, 33, 17\n",
    "    29      ;;;;\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Data Step with the SET options NOBS= is an example of implicit indexing used by SAS.  The END= parameter on the SET statement is initialized to 0 and set to one when the last observation is read.  The automatic variable \\_N\\_ can be used as the observation (row) index."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_sas_row_index1.sas                             */\n",
    "    /******************************************************/\n",
    "    31      data _null_;\n",
    "    32      set df nobs=obs end=end_of_file;\n",
    "    33      \n",
    "    34      put _n_ = ;\n",
    "    35      \n",
    "    36      if end_of_file then\n",
    "    37         put 'Data set contains: ' obs ' observations' /;\n",
    "\n",
    "    _N_=1\n",
    "    _N_=2\n",
    "    _N_=3\n",
    "    _N_=4\n",
    "    _N_=5\n",
    "    _N_=6\n",
    "    Data set contains: 6  observations\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the DataFrame indices we can select specific rows and columns.  DataFrames provide indexers to accomplish these tasks.  They are:\n",
    "\n",
    "    1. .iloc() method which is mainly an integer-based method\n",
    "    2. .loc() method used to select ranges by labels (either column or row)\n",
    "    3. .ix() method which supports a combination of the loc() and iloc() methods\n",
    "    \n",
    "We also illsutrate altering values in a DataFrame using the .loc().  This is equivalent to the SAS update method."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## .iloc Indexer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .iloc indexer uses an integer-based method for locating row and column positions.  It is a very robust indexer, however, it is limited by the fact that humans are better at remembering labels than numbers.  This is analogous to locating an observations by \\_n\\_ or locating a SAS variable using 'colN' from a variable list. \n",
    "\n",
    "The syntax for the .iloc indexer is:\n",
    "\n",
    "    df.iloc[row selection, column selection]\n",
    "    \n",
    "A comma (,) is used to separate the request of rows for the request for columns.  A colon (:) is used to request a range of cells.\n",
    "\n",
    "The absence of a either a column or row selection is an implicit request for all columns or rows, respectively.  \n",
    "\n",
    "Details for the .iloc indexer are located <a href=\"http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-integer\"> here</a>."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ".iloc[0] is useful since it returns the first scalar from a Series of the first column from a DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0       a\n",
       "1    cold\n",
       "2    slow\n",
       "3     NaN\n",
       "4       2\n",
       "5       6\n",
       "6       3\n",
       "Name: 0, dtype: object"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[0] "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The point= option on the SET statement behaves similarly to return the first row in the data set.  Note the SET statement inside a DO loop and the STOP statement.  STOP is needed because the POINT= option indicates a non-sequnetial access pattern and thus the end of data set indicator is not available."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_sas_row_index2.sas                             */\n",
    "    /******************************************************/\n",
    "    52      data _null_;\n",
    "    53      \n",
    "    54      do iloc = 1 to 1;\n",
    "    55         set df point=iloc;\n",
    "    56      \n",
    "    57       put _all_ ;\n",
    "    58       end;\n",
    "    59      stop;\n",
    "\n",
    "    _N_=1 _ERROR_=0 iloc=1 id=a col1=cold col2=slow col3=. col4=2 col5=6 col6=3\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the example below, you might expect three rows returned, rather than two. The range request for .iloc[] includes the start of the range and does not include the last item in the range value. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d</td>\n",
       "      <td>cool</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1     2    3    4     5     6\n",
       "2  c   hot  fast  9.0  4.0   NaN   6.0\n",
       "3  d  cool  None  NaN  NaN  17.0  89.0"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[2:4]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The SAS analog example for cell #6 is below."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_sas_row_index3.sas                             */\n",
    "    /******************************************************/\n",
    "    94       data _null_ ;\n",
    "    95       \n",
    "    96       do iloc = 3 to 4;\n",
    "    97           set df point=iloc;\n",
    "    98          put _all_ ;\n",
    "    99       end;\n",
    "    100      stop;\n",
    "\n",
    "    _N_=1 _ERROR_=0 iloc=3 id=c col1=hot col2=fast col3=9 col4=4 col5=. col6=6\n",
    "    _N_=1 _ERROR_=0 iloc=4 id=d col1=cool col2=  col3=. col4=. col5=17 col6=89\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similar to the indexer for string slicing, the index position \n",
    "\n",
    "    iloc[0]\n",
    "    \n",
    "returns the first row in a DataFrame and \n",
    "\n",
    "    iloc[-1]\n",
    "    \n",
    "returns the last row in the DataFrame.  This is analogous to the END= option for the SET statement (assuming a sequential access pattern)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .iloc indexer is mainly used to locate first or last row in the DataFrame.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0       f\n",
       "1    cold\n",
       "2    slow\n",
       "3     NaN\n",
       "4      29\n",
       "5      33\n",
       "6      17\n",
       "Name: 5, dtype: object"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[-1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .iloc indexer in cell #8 below returns rows 2 and 3 using (2:4) as the row selector and columns 0 to 6 using (0:6) as the column selctor."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d</td>\n",
       "      <td>cool</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1     2    3    4     5\n",
       "2  c   hot  fast  9.0  4.0   NaN\n",
       "3  d  cool  None  NaN  NaN  17.0"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[2:4, 0:6]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The analog SAS program for returning the same sub-set is below.  FIRSTOBS=3 OBS=4 is the equivalent row selector and keep = id -- col5 is the equivalent column selector."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_firstobs_3_obs_4.sas                           */\n",
    "    /******************************************************/\n",
    "    60      data df;\n",
    "    61          set df(keep = id -- col5\n",
    "    62                 firstobs=3 obs=4);\n",
    "    63       put _all_ ;\n",
    "\n",
    "    _N_=1 _ERROR_=0 id=c col1=hot col2=fast col3=9 col4=4 col5=.\n",
    "    _N_=2 _ERROR_=0 id=d col1=cool col2=  col3=. col4=. col5=17\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .iloc idexer illustrating multi-row and multi-column requests.  Note the double square brackets ([]) syntax."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>2</th>\n",
       "      <th>4</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>medium</td>\n",
       "      <td>5.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>89.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>slow</td>\n",
       "      <td>29.0</td>\n",
       "      <td>17.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        2     4     6\n",
       "1  medium   5.0   9.0\n",
       "3    None   NaN  89.0\n",
       "5    slow  29.0  17.0"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[[1,3,5], [2, 4, 6]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## .loc Indexer "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .loc indexer is similar to .iloc and allows access to rows and columns by labels.  A good analogy is a cell reference in Excel, eg. C:31.\n",
    "\n",
    "The syntax for the .loc indexer is:\n",
    "\n",
    "    df.loc[row selection, column selection]\n",
    "    \n",
    "For both the row and column selection, a comma (,) is used to request a list of multiple cells.  A colon (:) is used to request a range of cells.\n",
    "    \n",
    "Similiar to the .iloc indexer you can select combinations of rows and columns.  The doc details are located <a href=\"http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-label\"> here</a>."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider the DataFrame df2 created below in cell #10.  It contains the new columns 'id' and 'date'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df2 = pd.DataFrame([['a', 'cold','slow', np.nan, 2., 6., 3., '08/01/16'], \n",
    "                    ['b', 'warm', 'medium', 4, 5, 7, 9, '03/15/16'],\n",
    "                    ['c', 'hot', 'fast', 9, 4, np.nan, 6, '04/30/16'],\n",
    "                    ['d', 'None', 'fast', np.nan, np.nan, 17, 89, '05/31/16'],\n",
    "                    ['e', 'cool', 'medium', 16, 44, 21, 13, '07/04/16'],\n",
    "                    ['f', 'cold', 'slow', np.nan, 29, 33, 17, '01/01/16']],\n",
    "                    columns=['id', 'col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'date'])\n",
    "                  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Executing just the name for the DataFrame is the equivalent of:\n",
    "    \n",
    "    print(df2)\n",
    "    \n",
    "The print() method for a DataFrame returns the output without the cell outlines, however."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>08/01/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "      <td>03/15/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "      <td>04/30/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d</td>\n",
       "      <td>None</td>\n",
       "      <td>fast</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>05/31/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>07/04/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>f</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>29.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>17.0</td>\n",
       "      <td>01/01/16</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  id  col1    col2  col3  col4  col5  col6      date\n",
       "0  a  cold    slow   NaN   2.0   6.0   3.0  08/01/16\n",
       "1  b  warm  medium   4.0   5.0   7.0   9.0  03/15/16\n",
       "2  c   hot    fast   9.0   4.0   NaN   6.0  04/30/16\n",
       "3  d  None    fast   NaN   NaN  17.0  89.0  05/31/16\n",
       "4  e  cool  medium  16.0  44.0  21.0  13.0  07/04/16\n",
       "5  f  cold    slow   NaN  29.0  33.0  17.0  01/01/16"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting and resetting Indicies\n",
    "\n",
    "In SAS the construction of an index for a data set creates an external file used in a deterministic fashion by SAS.  In contrast, the construction of a panda index physically alters either the DataFrame, or a copy of it, depending on argument values to the set_index method.  The doc is located <a href=\"http://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing\"> here </a>."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Start by setting the index to 'id', to access rows by a row value or a range of rows values.  By default, the column is dropped when it becomes the index.  You may not want this behavior, in which case you set the drop= argument value to False. \n",
    "    \n",
    "    drop=False\n",
    "    \n",
    "This is useful if you have multiple occasions setting and resetting the index.  Otherwise, a re-read of the DataFrame is required. Below, use the set_index() method to set the index to the 'id' column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df2.set_index('id', inplace=True, drop=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .set_index attribute execution is silent when the inplace=argument value is utilized.  Validate the index using the .index attribute."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['a', 'b', 'c', 'd', 'e', 'f'], dtype='object', name='id')"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To reset the index, use the reset_index() method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df2 = df2.reset_index(drop=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to have the remainder of the examples for the .loc indexer to work, we set the index again."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df2.set_index('id', inplace=True, drop=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Return the row labeled 'e'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id             e\n",
       "col1        cool\n",
       "col2      medium\n",
       "col3          16\n",
       "col4          44\n",
       "col5          21\n",
       "col6          13\n",
       "date    07/04/16\n",
       "Name: e, dtype: object"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc['e', ]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Return rows in the range of 'b' to 'f' inclusive.  'b':'f' denotes a row range.  The absence of a column request is an implicit request for all of them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>b</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "      <td>03/15/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>c</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "      <td>04/30/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>d</td>\n",
       "      <td>None</td>\n",
       "      <td>fast</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>05/31/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>e</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>07/04/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>f</th>\n",
       "      <td>f</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>29.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>17.0</td>\n",
       "      <td>01/01/16</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   id  col1    col2  col3  col4  col5  col6      date\n",
       "id                                                   \n",
       "b   b  warm  medium   4.0   5.0   7.0   9.0  03/15/16\n",
       "c   c   hot    fast   9.0   4.0   NaN   6.0  04/30/16\n",
       "d   d  None    fast   NaN   NaN  17.0  89.0  05/31/16\n",
       "e   e  cool  medium  16.0  44.0  21.0  13.0  07/04/16\n",
       "f   f  cold    slow   NaN  29.0  33.0  17.0  01/01/16"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc['b':'f' ,]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Return the rows between range of 'd' to 'f' inclusive.  'col6' and 'col2' is a request for columns by label."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col6</th>\n",
       "      <th>col2</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>89.0</td>\n",
       "      <td>fast</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>e</th>\n",
       "      <td>13.0</td>\n",
       "      <td>medium</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>f</th>\n",
       "      <td>17.0</td>\n",
       "      <td>slow</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    col6    col2\n",
       "id              \n",
       "d   89.0    fast\n",
       "e   13.0  medium\n",
       "f   17.0    slow"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc['d':'f',['col6','col2']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Return the 'date' column by label."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id\n",
       "a    08/01/16\n",
       "b    03/15/16\n",
       "c    04/30/16\n",
       "d    05/31/16\n",
       "e    07/04/16\n",
       "f    01/01/16\n",
       "Name: date, dtype: object"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc[: , 'date']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Change the DataFrame df2 index from 'id' to 'date'.  The inplace=True argument does not make a copy of the DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df2.set_index('date', inplace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Validate the index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['08/01/16', '03/15/16', '04/30/16', '05/31/16', '07/04/16', '01/01/16'], dtype='object', name='date')"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Request a row by label."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id         d\n",
       "col1    None\n",
       "col2    fast\n",
       "col3     NaN\n",
       "col4     NaN\n",
       "col5      17\n",
       "col6      89\n",
       "Name: 05/31/16, dtype: object"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc['05/31/16']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Request arbitrary rows.  Notice the double square braces ([])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>03/15/16</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>07/04/16</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         id  col1    col2  col3  col4  col5  col6\n",
       "date                                             \n",
       "03/15/16  b  warm  medium   4.0   5.0   7.0   9.0\n",
       "07/04/16  e  cool  medium  16.0  44.0  21.0  13.0"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc[['03/15/16', '07/04/16']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Request a range of rows.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col2</th>\n",
       "      <th>col1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>04/30/16</th>\n",
       "      <td>fast</td>\n",
       "      <td>hot</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>05/31/16</th>\n",
       "      <td>fast</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>07/04/16</th>\n",
       "      <td>medium</td>\n",
       "      <td>cool</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            col2  col1\n",
       "date                  \n",
       "04/30/16    fast   hot\n",
       "05/31/16    fast  None\n",
       "07/04/16  medium  cool"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc['04/30/16':'07/04/16',['col2','col1']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The SAS program below is equivalent to cell #25 above.  It uses character values representing dates."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_select_dates_01Feb16_and_31Jul16.sas           */\n",
    "    /******************************************************/\n",
    "    25       data df2(keep = col2 col1);\n",
    "    26          set df(where=(date between '04/30/16' and '07/04/16'));\n",
    "    27       put _all_;\n",
    "\n",
    "    _N_=1 _ERROR_=0 id=c col1=hot col2=fast col3=9 col4=4 col5=. col6=6 date=04/30/16\n",
    "    _N_=2 _ERROR_=0 id=d col1=cool col2=  col3=. col4=. col5=17 col6=89 date=05/31/16\n",
    "    _N_=3 _ERROR_=0 id=e col1=cool col2=medium col3=16 col4=44 col5=21 col6=13 date=07/04/16\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In cell #26 below we hit a snag.  The issue begins with cell #21 above using the set_index attribute for the 'df2' DataFrame.  Examine cell #22 above to observe how the df2['date'] column dtype is 'object'.  \n",
    "\n",
    "This indicates we are working with string literals and not datetime objects.  Cells #24 and #25 work because these specific labels are values found in the df2['date'] index.\n",
    "\n",
    "Cell #26, below does not work, since the range request contains the value '07/31/16 as the range end-point which is not a value in the index.  \n",
    "\n",
    "The remedy, shown below in cell #29 is to use the pd.to_datetime() method to convert the df2['date'] string values into a datetime object.  The obvious analogy for SAS users is converting a string variable to a numeric variable which has an associated date format.\n",
    "\n",
    "Converting string literals to datetime values is covered in the section, String Literal Mapped to datetime timestamp located<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2008%20--%20Date%2C%20Time%2C%20and%20%20Timedelta%20Objects.ipynb#String-Literal-Mapped-to-datetime-timestamp\"> here</a>."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "'07/31/16'",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_slice_bound\u001b[1;34m(self, label, side, kind)\u001b[0m\n\u001b[0;32m   2909\u001b[0m             \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2910\u001b[1;33m                 \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_searchsorted_monotonic\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlabel\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mside\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2911\u001b[0m             \u001b[1;32mexcept\u001b[0m \u001b[0mValueError\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36m_searchsorted_monotonic\u001b[1;34m(self, label, side)\u001b[0m\n\u001b[0;32m   2875\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2876\u001b[1;33m         \u001b[1;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'index must be monotonic increasing or decreasing'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2877\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mValueError\u001b[0m: index must be monotonic increasing or decreasing",
      "\nDuring handling of the above exception, another exception occurred:\n",
      "\u001b[1;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-26-e06fa69430df>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mdf2\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mloc\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'01/01/16'\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;34m'07/31/16'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m__getitem__\u001b[1;34m(self, key)\u001b[0m\n\u001b[0;32m   1294\u001b[0m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getitem_tuple\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1295\u001b[0m         \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1296\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getitem_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1297\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1298\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0m_getitem_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_getitem_axis\u001b[1;34m(self, key, axis)\u001b[0m\n\u001b[0;32m   1435\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1436\u001b[0m             \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_has_valid_type\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1437\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_slice_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1438\u001b[0m         \u001b[1;32melif\u001b[0m \u001b[0mis_bool_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1439\u001b[0m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getbool_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_get_slice_axis\u001b[1;34m(self, slice_obj, axis)\u001b[0m\n\u001b[0;32m   1316\u001b[0m         \u001b[0mlabels\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mobj\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1317\u001b[0m         indexer = labels.slice_indexer(slice_obj.start, slice_obj.stop,\n\u001b[1;32m-> 1318\u001b[1;33m                                        slice_obj.step, kind=self.name)\n\u001b[0m\u001b[0;32m   1319\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1320\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mindexer\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mslice_indexer\u001b[1;34m(self, start, end, step, kind)\u001b[0m\n\u001b[0;32m   2783\u001b[0m         \"\"\"\n\u001b[0;32m   2784\u001b[0m         start_slice, end_slice = self.slice_locs(start, end, step=step,\n\u001b[1;32m-> 2785\u001b[1;33m                                                  kind=kind)\n\u001b[0m\u001b[0;32m   2786\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2787\u001b[0m         \u001b[1;31m# return a slice\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mslice_locs\u001b[1;34m(self, start, end, step, kind)\u001b[0m\n\u001b[0;32m   2968\u001b[0m         \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2969\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mend\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2970\u001b[1;33m             \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_slice_bound\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mend\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'right'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2971\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mend_slice\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2972\u001b[0m             \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_slice_bound\u001b[1;34m(self, label, side, kind)\u001b[0m\n\u001b[0;32m   2911\u001b[0m             \u001b[1;32mexcept\u001b[0m \u001b[0mValueError\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2912\u001b[0m                 \u001b[1;31m# raise the original KeyError\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2913\u001b[1;33m                 \u001b[1;32mraise\u001b[0m \u001b[0merr\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2914\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2915\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mslc\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mndarray\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_slice_bound\u001b[1;34m(self, label, side, kind)\u001b[0m\n\u001b[0;32m   2905\u001b[0m         \u001b[1;31m# we need to look up the label\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2906\u001b[0m         \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2907\u001b[1;33m             \u001b[0mslc\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlabel\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2908\u001b[0m         \u001b[1;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0merr\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2909\u001b[0m             \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_loc\u001b[1;34m(self, key, method, tolerance)\u001b[0m\n\u001b[0;32m   1945\u001b[0m                 \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1946\u001b[0m             \u001b[1;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1947\u001b[1;33m                 \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_maybe_cast_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1948\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1949\u001b[0m         \u001b[0mindexer\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mmethod\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mmethod\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtolerance\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mtolerance\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mpandas\\index.pyx\u001b[0m in \u001b[0;36mpandas.index.IndexEngine.get_loc (pandas\\index.c:4154)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;32mpandas\\index.pyx\u001b[0m in \u001b[0;36mpandas.index.IndexEngine.get_loc (pandas\\index.c:4018)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;32mpandas\\hashtable.pyx\u001b[0m in \u001b[0;36mpandas.hashtable.PyObjectHashTable.get_item (pandas\\hashtable.c:12368)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;32mpandas\\hashtable.pyx\u001b[0m in \u001b[0;36mpandas.hashtable.PyObjectHashTable.get_item (pandas\\hashtable.c:12322)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;31mKeyError\u001b[0m: '07/31/16'"
     ]
    }
   ],
   "source": [
    "df2.loc['01/01/16':'07/31/16']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reset the index for the DataFrame 'df2' to the default RangeIndex object.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "df2.reset_index(inplace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Validate the index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=6, step=1)"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cast the df2['date'] column from dtype='object' (strings) to dtype=datetime. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df2['date'] = pd.to_datetime(df2.date)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set the df2['date'] column as the index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "df2.set_index('date', inplace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Validate the index.  Observe the dytpe is now datetime64--a datetime stamp.  See <a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/blob/master/Chapter%2008%20--%20Date%2C%20Time%2C%20and%20%20Timedelta%20Objects.ipynb\"> Chapter 8--Date, Time, and Timedelta Objects</a> for more details on datetime arithmetic, shifting time intervals, and determining durations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2016-08-01', '2016-03-15', '2016-04-30', '2016-05-31',\n",
       "               '2016-07-04', '2016-01-01'],\n",
       "              dtype='datetime64[ns]', name='date', freq=None)"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the 'ds2['date'] column values converted to a datetime values, re-do the statement in cell #26 above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2016-03-15</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-04-30</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-05-31</th>\n",
       "      <td>d</td>\n",
       "      <td>None</td>\n",
       "      <td>fast</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-07-04</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           id  col1    col2  col3  col4  col5  col6\n",
       "date                                               \n",
       "2016-03-15  b  warm  medium   4.0   5.0   7.0   9.0\n",
       "2016-04-30  c   hot    fast   9.0   4.0   NaN   6.0\n",
       "2016-05-31  d  None    fast   NaN   NaN  17.0  89.0\n",
       "2016-07-04  e  cool  medium  16.0  44.0  21.0  13.0"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc['02/01/16':'07/31/16']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The SAS example below illustrates a similiar set of steps:\n",
    "    \n",
    "    1.  Read the original 'date' variable which is character and rename it to 'str_date'\n",
    "    2.  Use the input funtion to 'read' the 'str_date' values and assign them to the 'date' variable using the mmddyy10. informat\n",
    "    3.  Print the date value without any formatting showing it is now a SAS datetime value\n",
    "    4.  Print the SAS datetime value using the mmddyy10. date format\n",
    "   "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_alter_string_to_datetime_variable.sas          */\n",
    "    /******************************************************/\n",
    "    4      data df2(drop = str_date);\n",
    "    5         set df(rename=(date=str_date));\n",
    "    6      date=input(str_date,mmddyy10.);\n",
    "    7      \n",
    "    8      if _n_ = 1 then put date= /\n",
    "    9         date mmddyy10.;\n",
    "\n",
    "    date=20667\n",
    "    08/01/2016\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Mixing .loc Indexer with Boolean Operators\n",
    "\n",
    "This approach works by creating either a Series or array of boolean values (True or False).  This Series or array is then used by the .loc indexer to return all of the values that evaluate to True.  Using the DataFrame df2 created in the cell above, consider the following.\n",
    "\n",
    "We want to return all rows where 'col2' is not equal to 'fast. This is expressed as:\n",
    "\n",
    "    df2['col2'] != 'fast'\n",
    "    \n",
    "A Series is returned with the True/False values not equal to 'fast' for fd2['col2'] shown below.  The df2['date'] column is returned since it remains as the index for the DataFrame.  The second print() method displays this object as being derived from the class: Series."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "date\n",
      "2016-08-01     True\n",
      "2016-03-15     True\n",
      "2016-04-30    False\n",
      "2016-05-31    False\n",
      "2016-07-04     True\n",
      "2016-01-01     True\n",
      "Name: col2, dtype: bool\n",
      "<class 'pandas.core.series.Series'>\n"
     ]
    }
   ],
   "source": [
    "print(df2['col2'] != 'fast')\n",
    "print(type(df2['col2'] != 'fast'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Passing the boolean Series:\n",
    "    \n",
    "    df2['col2'] != 'fast' \n",
    "    \n",
    "to the .loc indexer to retrieve those rows with a boolean value of True.  Also request 'col1' and 'col2', which a request by label."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2016-08-01</th>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-03-15</th>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-07-04</th>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-01-01</th>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            col1    col2\n",
       "date                    \n",
       "2016-08-01  cold    slow\n",
       "2016-03-15  warm  medium\n",
       "2016-07-04  cool  medium\n",
       "2016-01-01  cold    slow"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc[df2['col2'] != 'fast', 'col1':'col2']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can combine any number of boolean operation together.  Boolean value comparison operators are documented <a href=\"https://docs.python.org/3/reference/expressions.html#value-comparisons\"> here</a>."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2016-07-04</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           id  col1    col2  col3  col4  col5  col6\n",
       "date                                               \n",
       "2016-07-04  e  cool  medium  16.0  44.0  21.0  13.0"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc[(df2.col3 >=  9) & (df2.col1 == 'cool'), ]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .isin() method returns a boolean vector similar to the behavior described in cell #34 above.  In this example, the .isin list of elements evaluates True if the listed elements are found by the .loc indexer in 'col6'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2016-03-15</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-04-30</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-07-04</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           id  col1    col2  col3  col4  col5  col6\n",
       "date                                               \n",
       "2016-03-15  b  warm  medium   4.0   5.0   7.0   9.0\n",
       "2016-04-30  c   hot    fast   9.0   4.0   NaN   6.0\n",
       "2016-07-04  e  cool  medium  16.0  44.0  21.0  13.0"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc[df2.col6.isin([6, 9, 13])]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So far, the .loc indexers have resulted in an output stream.  All of the indexers can be used to sub-set a DataFrame using assignment syntax shown in the cell #33 below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2016-03-15</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-04-30</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016-07-04</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           id  col1    col2  col3  col4  col5  col6\n",
       "date                                               \n",
       "2016-03-15  b  warm  medium   4.0   5.0   7.0   9.0\n",
       "2016-04-30  c   hot    fast   9.0   4.0   NaN   6.0\n",
       "2016-07-04  e  cool  medium  16.0  44.0  21.0  13.0"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3 = df2.loc[df2.col6.isin([6, 9, 13])]\n",
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generally, in these types of sub-setting operations, the shape of the extracted DataFrame will be smaller than the input DataFrame.\n",
    "\n",
    "To return a DataFrame of the same shape as the original, use the where() method.  Details for the pandas where() method is described <a href=\"http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.where.html\"> here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Shape for df2 is (6, 7)\n",
      "Shape for df3 is (3, 7)\n"
     ]
    }
   ],
   "source": [
    "print('Shape for df2 is', df2.shape)\n",
    "print('Shape for df3 is', df3.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The example SAS program below uses the WHERE IN (list) syntax to subset a data set analogous to the example in cell #32 above."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````\n",
    "    /******************************************************/\n",
    "    /* c05_where_in.sas                                   */\n",
    "    /******************************************************/\n",
    "    NOTE: Data set \"WORK.df2\" has 6 observation(s) and 8 variable(s)\n",
    "\n",
    "    27      data df3;\n",
    "    28         set df2(where=(col6 in (6 9 13)));\n",
    "\n",
    "    NOTE: 3 observations were read from \"WORK.df2\"\n",
    "    NOTE: Data set \"WORK.df3\" has 3 observation(s) and 8 variable(s)\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice how the SAS variable count and DataFrame column count differ by 1.  That is because the DataFrame .shape() method does\n",
    "not include the index as part of its column count.  By reseting the index, the SAS variable count and DataFrame columns count agree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Shape for df3 is (3, 8)\n"
     ]
    }
   ],
   "source": [
    "df3.reset_index(inplace=True)\n",
    "print('Shape for df3 is', df3.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Altering DataFrame values using the .loc indexer\n",
    "The .loc indexer can also be used to do an in-place update of values.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find the values for df2['col2'] column before updating."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "date\n",
       "2016-08-01      slow\n",
       "2016-03-15    medium\n",
       "2016-04-30      fast\n",
       "2016-05-31      fast\n",
       "2016-07-04    medium\n",
       "2016-01-01      slow\n",
       "Name: col2, dtype: object"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc[: , 'col2']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conditionaly Apply Values Based on Another Column Value"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "df2.loc[df2['col6'] > 50, \"col2\"] = \"VERY FAST\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The values for df2['col2'] column after the update."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "date\n",
       "2016-08-01         slow\n",
       "2016-03-15       medium\n",
       "2016-04-30         fast\n",
       "2016-05-31    VERY FAST\n",
       "2016-07-04       medium\n",
       "2016-01-01         slow\n",
       "Name: col2, dtype: object"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc[: , 'col2']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## .ix Indexer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .ix indexer combines characteristics of the .loc and .iloc indexers.  This means you can select rows and columns by labels and integers.  \n",
    "\n",
    "The syntax for the .ix indexer is:\n",
    "\n",
    "    df.ix[row selection, column selection]\n",
    "    \n",
    "For both the row and column selection, a comma (,) is used to request a list of multiple cells.  A colon (:) is used to request a range of cells.\n",
    "    \n",
    "Similar to the .loc indexer you can select combinations of rows and columns.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .ix indexer is sometimes tricky to use.  A good rule of thumb is if you are indexing using labels or indexing using integers, use the .loc and .iloc to avoid unexpected results.  The documentation details are found <a href=\"http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.ix.html\"> here</a>.\n",
    "\n",
    "Consider the creation of the DataFrame 'df4' constructed in the cell below.  It is similar to DataFrame df2 created in cell #10 above.   The differences are the addition of another column and  columns being identified with labels as well as integers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "df4 = pd.DataFrame([['a', 'cold','slow', np.nan, 2., 6., 3., 17, '08/01/16'], \n",
    "                    ['b', 'warm', 'medium', 4, 5, 7, 9, 21, '03/15/16'],\n",
    "                    ['c', 'hot', 'fast', 9, 4, np.nan, 6, 10, '04/30/16'],\n",
    "                    ['d', 'None', 'fast', np.nan, np.nan, 17, 89, 44, '05/31/16'],\n",
    "                    ['e', 'cool', 'medium', 16, 44, 21, 13, 99, '07/04/16'],\n",
    "                    ['f', 'cold', 'slow', np.nan, 29, 33, 17, 11,'01/01/16']],\n",
    "                    columns=['id', 'col1', 'col2', 'col3', 'col4', 4, 5, 6, 'date'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>17</td>\n",
       "      <td>08/01/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "      <td>21</td>\n",
       "      <td>03/15/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "      <td>10</td>\n",
       "      <td>04/30/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d</td>\n",
       "      <td>None</td>\n",
       "      <td>fast</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>44</td>\n",
       "      <td>05/31/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>99</td>\n",
       "      <td>07/04/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>f</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>29.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>17.0</td>\n",
       "      <td>11</td>\n",
       "      <td>01/01/16</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  id  col1    col2  col3  col4     4     5   6      date\n",
       "0  a  cold    slow   NaN   2.0   6.0   3.0  17  08/01/16\n",
       "1  b  warm  medium   4.0   5.0   7.0   9.0  21  03/15/16\n",
       "2  c   hot    fast   9.0   4.0   NaN   6.0  10  04/30/16\n",
       "3  d  None    fast   NaN   NaN  17.0  89.0  44  05/31/16\n",
       "4  e  cool  medium  16.0  44.0  21.0  13.0  99  07/04/16\n",
       "5  f  cold    slow   NaN  29.0  33.0  17.0  11  01/01/16"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set the index to column df4['id']."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "df4.set_index('id', inplace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .ix indexer allows slicing by labels and integer poitions along the index.  And look closely at the results since the request for columns is based on the integer-position of the column.  The column requestor of 6 is not a column label, but its position."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>6</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>b</th>\n",
       "      <td>21</td>\n",
       "      <td>03/15/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>c</th>\n",
       "      <td>10</td>\n",
       "      <td>04/30/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>44</td>\n",
       "      <td>05/31/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>e</th>\n",
       "      <td>99</td>\n",
       "      <td>07/04/16</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     6      date\n",
       "id              \n",
       "b   21  03/15/16\n",
       "c   10  04/30/16\n",
       "d   44  05/31/16\n",
       "e   99  07/04/16"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df4.ix['b':'e', 6:8]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### A Review:\n",
    "\n",
    "    .iloc uses the integer position in the index and only accepts integers\n",
    "    .loc uses the labels in the index\n",
    "    .ix generally behaves like the .loc\n",
    "\n",
    "Finally, to appreciate the differences, consider the following DataFrame.  Also notice the un-Pythonic style of using a semi-colon at the end of the DataFrame definition."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d</td>\n",
       "      <td>cool</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>e</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>f</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>29.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>17.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1       2     3     4     5     6\n",
       "6  a  cold    slow   NaN   2.0   6.0   3.0\n",
       "8  b  warm  medium   4.0   5.0   7.0   9.0\n",
       "2  c   hot    fast   9.0   4.0   NaN   6.0\n",
       "3  d  cool    None   NaN   NaN  17.0  89.0\n",
       "4  e  cool  medium  16.0  44.0  21.0  13.0\n",
       "5  f  cold    slow   NaN  29.0  33.0  17.0"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5 = pd.DataFrame([['a', 'cold','slow', np.nan, 2., 6., 3.], \n",
    "                    ['b', 'warm', 'medium', 4, 5, 7, 9],\n",
    "                    ['c', 'hot', 'fast', 9, 4, np.nan, 6],\n",
    "                    ['d', 'cool', None, np.nan, np.nan, 17, 89],\n",
    "                    ['e', 'cool', 'medium', 16, 44, 21, 13],\n",
    "                    ['f', 'cold', 'slow', np.nan, 29, 33, 17]],\n",
    "                    index = [6, 8, 2, 3, 4, 5,]);\n",
    "df5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .iloc indexer returns the first two rows since it looks at positions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1       2    3    4    5    6\n",
       "6  a  cold    slow  NaN  2.0  6.0  3.0\n",
       "8  b  warm  medium  4.0  5.0  7.0  9.0"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.iloc[:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .loc indexer returns 3 rows since it looks at the labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1       2    3    4    5    6\n",
       "6  a  cold    slow  NaN  2.0  6.0  3.0\n",
       "8  b  warm  medium  4.0  5.0  7.0  9.0\n",
       "2  c   hot    fast  9.0  4.0  NaN  6.0"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.loc[:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .ix indexer returns the same number of rows as the .loc indexer since its behavior is to first use labels before looking by position.  Looking by position with an integer-based index can lead to unexpected results.  This illustrated in cell #49 below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>b</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>c</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1       2    3    4    5    6\n",
       "6  a  cold    slow  NaN  2.0  6.0  3.0\n",
       "8  b  warm  medium  4.0  5.0  7.0  9.0\n",
       "2  c   hot    fast  9.0  4.0  NaN  6.0"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.ix[:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the next two examples, review the DataFrame df5 index structure in the cell below.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Int64Index([6, 8, 2, 3, 4, 5], dtype='int64')"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .iloc example in the cell below returns the first row.  That's because it is looking by position."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>a</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0     1     2   3    4    5    6\n",
       "6  a  cold  slow NaN  2.0  6.0  3.0"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.iloc[:1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .ix example in the cell below raises a KeyError since 1 is not found in the index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "1",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_slice_bound\u001b[1;34m(self, label, side, kind)\u001b[0m\n\u001b[0;32m   2909\u001b[0m             \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2910\u001b[1;33m                 \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_searchsorted_monotonic\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlabel\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mside\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2911\u001b[0m             \u001b[1;32mexcept\u001b[0m \u001b[0mValueError\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36m_searchsorted_monotonic\u001b[1;34m(self, label, side)\u001b[0m\n\u001b[0;32m   2875\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2876\u001b[1;33m         \u001b[1;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'index must be monotonic increasing or decreasing'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2877\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mValueError\u001b[0m: index must be monotonic increasing or decreasing",
      "\nDuring handling of the above exception, another exception occurred:\n",
      "\u001b[1;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-53-bf1757e6e2b5>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mdf5\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mix\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m__getitem__\u001b[1;34m(self, key)\u001b[0m\n\u001b[0;32m     76\u001b[0m         \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     77\u001b[0m             \u001b[0mkey\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mcom\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply_if_callable\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 78\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getitem_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     79\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     80\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0m_get_label\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlabel\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_getitem_axis\u001b[1;34m(self, key, axis)\u001b[0m\n\u001b[0;32m    990\u001b[0m         \u001b[0mlabels\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    991\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 992\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_slice_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    993\u001b[0m         elif (is_list_like_indexer(key) and\n\u001b[0;32m    994\u001b[0m               not (isinstance(key, tuple) and\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_get_slice_axis\u001b[1;34m(self, slice_obj, axis)\u001b[0m\n\u001b[0;32m   1235\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0mneed_slice\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mslice_obj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1236\u001b[0m             \u001b[1;32mreturn\u001b[0m \u001b[0mobj\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1237\u001b[1;33m         \u001b[0mindexer\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_convert_slice_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mslice_obj\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1238\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1239\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mindexer\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_convert_slice_indexer\u001b[1;34m(self, key, axis)\u001b[0m\n\u001b[0;32m    190\u001b[0m         \u001b[1;31m# if we are accessing via lowered dim, use the last dim\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    191\u001b[0m         \u001b[0max\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmin\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mndim\u001b[0m \u001b[1;33m-\u001b[0m \u001b[1;36m1\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 192\u001b[1;33m         \u001b[1;32mreturn\u001b[0m \u001b[0max\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_convert_slice_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mname\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    193\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    194\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0m_has_valid_setitem_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mindexer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36m_convert_slice_indexer\u001b[1;34m(self, key, kind)\u001b[0m\n\u001b[0;32m   1077\u001b[0m         \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1078\u001b[0m             \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1079\u001b[1;33m                 \u001b[0mindexer\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mslice_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mstart\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstop\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstep\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mkind\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1080\u001b[0m             \u001b[1;32mexcept\u001b[0m \u001b[0mException\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1081\u001b[0m                 \u001b[1;32mif\u001b[0m \u001b[0mis_index_slice\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mslice_indexer\u001b[1;34m(self, start, end, step, kind)\u001b[0m\n\u001b[0;32m   2783\u001b[0m         \"\"\"\n\u001b[0;32m   2784\u001b[0m         start_slice, end_slice = self.slice_locs(start, end, step=step,\n\u001b[1;32m-> 2785\u001b[1;33m                                                  kind=kind)\n\u001b[0m\u001b[0;32m   2786\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2787\u001b[0m         \u001b[1;31m# return a slice\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mslice_locs\u001b[1;34m(self, start, end, step, kind)\u001b[0m\n\u001b[0;32m   2968\u001b[0m         \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2969\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mend\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2970\u001b[1;33m             \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_slice_bound\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mend\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'right'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2971\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mend_slice\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2972\u001b[0m             \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_slice_bound\u001b[1;34m(self, label, side, kind)\u001b[0m\n\u001b[0;32m   2911\u001b[0m             \u001b[1;32mexcept\u001b[0m \u001b[0mValueError\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2912\u001b[0m                 \u001b[1;31m# raise the original KeyError\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2913\u001b[1;33m                 \u001b[1;32mraise\u001b[0m \u001b[0merr\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2914\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2915\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mslc\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mndarray\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_slice_bound\u001b[1;34m(self, label, side, kind)\u001b[0m\n\u001b[0;32m   2905\u001b[0m         \u001b[1;31m# we need to look up the label\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2906\u001b[0m         \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2907\u001b[1;33m             \u001b[0mslc\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlabel\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2908\u001b[0m         \u001b[1;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0merr\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2909\u001b[0m             \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_loc\u001b[1;34m(self, key, method, tolerance)\u001b[0m\n\u001b[0;32m   1945\u001b[0m                 \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1946\u001b[0m             \u001b[1;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1947\u001b[1;33m                 \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_maybe_cast_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1948\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1949\u001b[0m         \u001b[0mindexer\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mmethod\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mmethod\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtolerance\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mtolerance\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mpandas\\index.pyx\u001b[0m in \u001b[0;36mpandas.index.IndexEngine.get_loc (pandas\\index.c:4154)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;32mpandas\\index.pyx\u001b[0m in \u001b[0;36mpandas.index.IndexEngine.get_loc (pandas\\index.c:4018)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;32mpandas\\hashtable.pyx\u001b[0m in \u001b[0;36mpandas.hashtable.Int64HashTable.get_item (pandas\\hashtable.c:6610)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;32mpandas\\hashtable.pyx\u001b[0m in \u001b[0;36mpandas.hashtable.Int64HashTable.get_item (pandas\\hashtable.c:6554)\u001b[1;34m()\u001b[0m\n",
      "\u001b[1;31mKeyError\u001b[0m: 1"
     ]
    }
   ],
   "source": [
    "df5.ix[:1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Indexing Issues\n",
    "So far, so good.  We have a basic understanding of how indexes can be established, utilized, and reset.  We can use the .iloc, .loc, and .ix indexers to retrieve subsets of columns and rows.  But what about real-world scenarios where data is rarely, if ever tidy?\n",
    "\n",
    "The synthetic examples above work (except the intentional errors of course) since they rely on constructing the DataFrames in an orderly manner, like having 'id' columns in alphabetical order or dates in chronological order.  \n",
    "\n",
    "Consider the DataFrame 'df5' created below.  It is similar to DataFrame 'df2' in cell #10 above with the exception of the df5['id'] column containing non-unique values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>b</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>01/01/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>c</td>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "      <td>03/15/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>a</td>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "      <td>04/30/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d</td>\n",
       "      <td>cool</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>05/31/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>c</td>\n",
       "      <td>cool</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>44.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>07/04/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>e</td>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>29.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>17.0</td>\n",
       "      <td>08/30/16</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  id  col1    col2  col3  col4  col5  col6      date\n",
       "0  b  cold    slow   NaN   2.0   6.0   3.0  01/01/16\n",
       "1  c  warm  medium   4.0   5.0   7.0   9.0  03/15/16\n",
       "2  a   hot    fast   9.0   4.0   NaN   6.0  04/30/16\n",
       "3  d  cool    None   NaN   NaN  17.0  89.0  05/31/16\n",
       "4  c  cool  medium  16.0  44.0  21.0  13.0  07/04/16\n",
       "5  e  cold    slow   NaN  29.0  33.0  17.0  08/30/16"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5 = pd.DataFrame([['b', 'cold','slow', np.nan, 2., 6., 3., '01/01/16'], \n",
    "                    ['c', 'warm', 'medium', 4, 5, 7, 9, '03/15/16'],\n",
    "                    ['a', 'hot', 'fast', 9, 4, np.nan, 6, '04/30/16'],\n",
    "                    ['d', 'cool', None, np.nan, np.nan, 17, 89, '05/31/16'],\n",
    "                    ['c', 'cool', 'medium', 16, 44, 21, 13, '07/04/16'],\n",
    "                    ['e', 'cold', 'slow', np.nan, 29, 33, 17, '08/30/16']],\n",
    "                    columns=['id', 'col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'date']);\n",
    "df5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set the index for DataFrame df5 to the df5['id'] column. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "df5.set_index('id', inplace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Validate the index for DataFrame df5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['b', 'c', 'a', 'd', 'c', 'e'], dtype='object', name='id')"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use the .loc indexer to request the rows in the range of 'b' through 'd'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>col4</th>\n",
       "      <th>col5</th>\n",
       "      <th>col6</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>b</th>\n",
       "      <td>cold</td>\n",
       "      <td>slow</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>01/01/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>c</th>\n",
       "      <td>warm</td>\n",
       "      <td>medium</td>\n",
       "      <td>4.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>9.0</td>\n",
       "      <td>03/15/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>a</th>\n",
       "      <td>hot</td>\n",
       "      <td>fast</td>\n",
       "      <td>9.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.0</td>\n",
       "      <td>04/30/16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>cool</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>05/31/16</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    col1    col2  col3  col4  col5  col6      date\n",
       "id                                                \n",
       "b   cold    slow   NaN   2.0   6.0   3.0  01/01/16\n",
       "c   warm  medium   4.0   5.0   7.0   9.0  03/15/16\n",
       "a    hot    fast   9.0   4.0   NaN   6.0  04/30/16\n",
       "d   cool    None   NaN   NaN  17.0  89.0  05/31/16"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.loc['b':'d', :]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you look closely at the results from the example above, you will find the first occurence of the row 'id' label 'c' was returned, but not the second row labeled 'c'.  'id' label 'c' is obviously non-unique.  And that is only part of the issue.  Consider futher the use of a non-unique label for the row range selection in the example below.  \n",
    "\n",
    "This issue is described in sparse prose <a href=\"http://pandas.pydata.org/pandas-docs/stable/gotchas.html#non-monotonic-indexes-require-exact-matches\"> here.</a>\n",
    "\n",
    "If we should want the row label range of 'b' to 'c' with all the columns we raise the error:\n",
    "\n",
    "    \"Cannot get right slice bound for non-unique label: 'c'\"\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "\"Cannot get right slice bound for non-unique label: 'c'\"",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-58-64fc99b408b0>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mdf5\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mloc\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'b'\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;34m'c'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m:\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m__getitem__\u001b[1;34m(self, key)\u001b[0m\n\u001b[0;32m   1292\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1293\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mtype\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32mis\u001b[0m \u001b[0mtuple\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1294\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getitem_tuple\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1295\u001b[0m         \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1296\u001b[0m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getitem_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_getitem_tuple\u001b[1;34m(self, tup)\u001b[0m\n\u001b[0;32m    802\u001b[0m                 \u001b[1;32mcontinue\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    803\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 804\u001b[1;33m             \u001b[0mretval\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mgetattr\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mretval\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mname\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getitem_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mi\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    805\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    806\u001b[0m         \u001b[1;32mreturn\u001b[0m \u001b[0mretval\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_getitem_axis\u001b[1;34m(self, key, axis)\u001b[0m\n\u001b[0;32m   1435\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1436\u001b[0m             \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_has_valid_type\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1437\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_slice_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   1438\u001b[0m         \u001b[1;32melif\u001b[0m \u001b[0mis_bool_indexer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1439\u001b[0m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getbool_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_get_slice_axis\u001b[1;34m(self, slice_obj, axis)\u001b[0m\n\u001b[0;32m   1316\u001b[0m         \u001b[0mlabels\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mobj\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1317\u001b[0m         indexer = labels.slice_indexer(slice_obj.start, slice_obj.stop,\n\u001b[1;32m-> 1318\u001b[1;33m                                        slice_obj.step, kind=self.name)\n\u001b[0m\u001b[0;32m   1319\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1320\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mindexer\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mslice_indexer\u001b[1;34m(self, start, end, step, kind)\u001b[0m\n\u001b[0;32m   2783\u001b[0m         \"\"\"\n\u001b[0;32m   2784\u001b[0m         start_slice, end_slice = self.slice_locs(start, end, step=step,\n\u001b[1;32m-> 2785\u001b[1;33m                                                  kind=kind)\n\u001b[0m\u001b[0;32m   2786\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2787\u001b[0m         \u001b[1;31m# return a slice\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mslice_locs\u001b[1;34m(self, start, end, step, kind)\u001b[0m\n\u001b[0;32m   2968\u001b[0m         \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2969\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mend\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2970\u001b[1;33m             \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget_slice_bound\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mend\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'right'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m   2971\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mend_slice\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2972\u001b[0m             \u001b[0mend_slice\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Users\\randy\\Anaconda3\\lib\\site-packages\\pandas\\indexes\\base.py\u001b[0m in \u001b[0;36mget_slice_bound\u001b[1;34m(self, label, side, kind)\u001b[0m\n\u001b[0;32m   2922\u001b[0m             \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mslc\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mndarray\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2923\u001b[0m                 raise KeyError(\"Cannot get %s slice bound for non-unique \"\n\u001b[1;32m-> 2924\u001b[1;33m                                \"label: %r\" % (side, original_label))\n\u001b[0m\u001b[0;32m   2925\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   2926\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mslc\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mKeyError\u001b[0m: \"Cannot get right slice bound for non-unique label: 'c'\""
     ]
    }
   ],
   "source": [
    "df5.loc['b':'c', :]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What is going on here is the values in 'id' column are non-unique.  To enable detection of this condition the attribute .index.is_montonic_increasing and .index_montonic_decreasing return a boolean to test for this non-uniqueness property."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Applied to the df5 DataFrame created in cell #54 above returns False."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5.index.is_monotonic_increasing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The .is_monotonic_increasing attribute applied to the first DataFrame created above, 'df' returns true."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.index.is_monotonic_increasing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "While not spelled out in any documentation I found, the moral of the story is when using indices with non-unique values, be wary."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Resources\n",
    "\n",
    "<a href=\"http://pandas.pydata.org/pandas-docs/stable/indexing.html\"> Indexing and Selecting Data </a> pandas 0.19.0 doucmentation\n",
    "\n",
    "<a href=\"http://www.shanelynn.ie/select-pandas-dataframe-rows-and-columns-using-iloc-loc-and-ix/\"> Selecting DataFrame rows and columns using iloc, loc, and ix in Pandas </a> by Shane Lynn\n",
    "\n",
    "<a href=\"http://chris.friedline.net/2015-12-15-rutgers/lessons/python2/02-index-slice-subset.html\"> Indexing, Slicing and Subsetting DataFrames in Python </a> by Chris Friedline.\n",
    "\n",
    "<a href=\"http://pandas.pydata.org/pandas-docs/stable/gotchas.html#non-monotonic-indexes-require-exact-matches\"> Non-monotonic indexes require exact matches </a> pandas 0.19.0 documentation\n",
    "\n",
    "<a href=\"http://www.swegler.com/becky/blog/2014/08/06/useful-pandas-snippets/\"> Useful pandas Snippets </a> by Becky Swegler, Computers are for People."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Navigation\n",
    "\n",
    "<a href=\"http://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/tree/master/\"> Return to Chapter List </a>    "
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [Root]",
   "language": "python",
   "name": "Python [Root]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}