In [37]:
pricing = get_pricing(['AAPL', 'MSFT'], 
                      start_date='2014-01-01', 
                      end_date='2014-01-07', 
                      frequency='minute',
                      fields='price')

`pricing` is a `DataFrame` with the same structure as the return value of `history` on Quantopian.

In [38]:
pricing.head(10)

Unnamed: 0,Security(24 [AAPL]),Security(5061 [MSFT])
2014-01-02 14:31:00+00:00,79.446,37.34
2014-01-02 14:32:00+00:00,79.424,37.375
2014-01-02 14:33:00+00:00,79.49,37.26
2014-01-02 14:34:00+00:00,79.502,37.26
2014-01-02 14:35:00+00:00,79.252,37.28
2014-01-02 14:36:00+00:00,79.184,37.283
2014-01-02 14:37:00+00:00,79.26,37.27
2014-01-02 14:38:00+00:00,79.3,37.3
2014-01-02 14:39:00+00:00,79.259,37.3
2014-01-02 14:40:00+00:00,79.222,37.28


Pandas' built-in `groupby` and `apply` operations let us split a frame into groups, run a function on each group, and combine the results.  For more information on these features, see http://pandas.pydata.org/pandas-docs/stable/groupby.html.

In [39]:
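def my_grouper(ts):
    "Function to apply to the index of the DataFrame to break it into groups."
    # Returns midnight of the supplied timestamp.  `Timestamp.normalize()` is used
    # here because the `pandas.tseries.tools` module that provided
    # `normalize_date` cannot be imported in recent pandas releases.
    return ts.normalize()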


def first_thirty_minutes(frame):
    "Function to apply to the resulting groups."
    return frame.iloc[:30]
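
For comparison, the same split can be sketched without a custom grouper function or `apply`. This is only an illustration under stated assumptions: `toy` is a small synthetic frame with made-up prices standing in for `pricing`, and it keeps three bars per day rather than thirty. It groups on the normalized index directly and uses `GroupBy.head`, which returns the selected rows under their original timestamps.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `pricing` (made-up prices): two sessions of
# ten one-minute bars each, indexed by tz-aware UTC timestamps.
index = pd.date_range("2014-01-02 14:31", periods=10, freq="min", tz="UTC").append(
    pd.date_range("2014-01-03 14:31", periods=10, freq="min", tz="UTC")
)
toy = pd.DataFrame(
    {"AAPL": np.linspace(79.0, 80.0, len(index)),
     "MSFT": np.linspace(37.0, 38.0, len(index))},
    index=index,
)

# Midnight of every timestamp, computed for the whole index at once.
day_keys = toy.index.normalize()

# Keep the first three bars of each day.
first_bars = toy.groupby(day_keys).head(3)
print(first_bars)
```

Grouping on a precomputed key avoids calling a Python function once per timestamp, which may matter on long minute-level histories.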

The result of a `groupby` computation is a [Hierarchically-Indexed DataFrame](http://pandas.pydata.org/pandas-docs/stable/advanced.html) where the outermost layer of the index is the groupby key, and the secondary layers are the values from the frame's original index.

In [40]:
data = pricing.groupby(my_grouper).apply(first_thirty_minutes)
data.head(40)

Unnamed: 0,Unnamed: 1,Security(24 [AAPL]),Security(5061 [MSFT])
2014-01-02 00:00:00+00:00,2014-01-02 14:31:00+00:00,79.446,37.34
2014-01-02 00:00:00+00:00,2014-01-02 14:32:00+00:00,79.424,37.375
2014-01-02 00:00:00+00:00,2014-01-02 14:33:00+00:00,79.49,37.26
2014-01-02 00:00:00+00:00,2014-01-02 14:34:00+00:00,79.502,37.26
2014-01-02 00:00:00+00:00,2014-01-02 14:35:00+00:00,79.252,37.28
2014-01-02 00:00:00+00:00,2014-01-02 14:36:00+00:00,79.184,37.283
2014-01-02 00:00:00+00:00,2014-01-02 14:37:00+00:00,79.26,37.27
2014-01-02 00:00:00+00:00,2014-01-02 14:38:00+00:00,79.3,37.3
2014-01-02 00:00:00+00:00,2014-01-02 14:39:00+00:00,79.259,37.3
2014-01-02 00:00:00+00:00,2014-01-02 14:40:00+00:00,79.222,37.28
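
To inspect the two index layers directly, here is a minimal sketch that runs outside the research environment. It assumes a synthetic frame (`toy`, made-up prices) in place of `pricing`; `by_day` and `first_three` are hypothetical stand-ins for `my_grouper` and `first_thirty_minutes`, shortened to three bars per day.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `pricing` (made-up prices): two sessions of
# ten one-minute bars each, indexed by tz-aware UTC timestamps.
index = pd.date_range("2014-01-02 14:31", periods=10, freq="min", tz="UTC").append(
    pd.date_range("2014-01-03 14:31", periods=10, freq="min", tz="UTC")
)
toy = pd.DataFrame(
    {"AAPL": np.linspace(79.0, 80.0, len(index)),
     "MSFT": np.linspace(37.0, 38.0, len(index))},
    index=index,
)

def by_day(ts):
    # Midnight of the timestamp; plays the role of `my_grouper`.
    return ts.normalize()

def first_three(frame):
    # Plays the role of `first_thirty_minutes`, shortened for the toy data.
    return frame.iloc[:3]

result = toy.groupby(by_day).apply(first_three)

# Level 0 holds the groupby keys, level 1 the original timestamps.
outer = result.index.get_level_values(0)
inner = result.index.get_level_values(1)
print(result.index.nlevels)
print(outer.unique())
print(inner[:3])
```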


Because our `DataFrame` is Hierarchically-Indexed, we can query it by our groupby keys.

In [41]:
from pandas import Timestamp
# This gives us the first thirty minutes of January 3rd.
data.loc[Timestamp('2014-01-03', tz='UTC')]

Unnamed: 0,Security(24 [AAPL]),Security(5061 [MSFT])
2014-01-03 14:31:00+00:00,78.937,37.18
2014-01-03 14:32:00+00:00,78.702,37.17
2014-01-03 14:33:00+00:00,78.756,37.1301
2014-01-03 14:34:00+00:00,78.552,37.145
2014-01-03 14:35:00+00:00,78.573,37.15
2014-01-03 14:36:00+00:00,78.616,37.14
2014-01-03 14:37:00+00:00,78.693,37.11
2014-01-03 14:38:00+00:00,78.63,37.125
2014-01-03 14:39:00+00:00,78.589,37.09
2014-01-03 14:40:00+00:00,78.543,37.104
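
The outer-level lookup can be tried on a hand-built frame as well. This sketch assumes a tiny two-level index constructed with `MultiIndex.from_arrays` and made-up prices. The level names `day` and `minute` are illustrative only, since the index levels in `data` are unnamed.

```python
import pandas as pd

# Four made-up bars across two days, with a (day, minute) two-level index.
minutes = pd.DatetimeIndex(
    ["2014-01-02 14:31", "2014-01-02 14:32", "2014-01-03 14:31", "2014-01-03 14:32"],
    tz="UTC",
)
days = minutes.normalize()
toy = pd.DataFrame(
    {"AAPL": [79.4, 79.5, 78.9, 78.7], "MSFT": [37.3, 37.4, 37.2, 37.1]},
    index=pd.MultiIndex.from_arrays([days, minutes], names=["day", "minute"]),
)

# A single key selects on the outermost level and drops that level from the result.
jan3 = toy.loc[pd.Timestamp("2014-01-03", tz="UTC")]
print(jan3)

# A tuple with one key per level selects a single row, returned as a Series.
row = toy.loc[(pd.Timestamp("2014-01-03", tz="UTC"),
               pd.Timestamp("2014-01-03 14:32", tz="UTC"))]
print(row)
```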


If we want to query on the second layer of the index, we have to use `.xs` with a level argument instead of `.loc`.  

Note that `level=1` means the **second** level of the index, because the levels start at index 0.

In [42]:
data.xs(Timestamp('2014-01-03 14:58:00', tz='UTC'), level=1)

Unnamed: 0,Security(24 [AAPL]),Security(5061 [MSFT])
2014-01-03 00:00:00+00:00,78.252,37.0301
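
A small sketch of the `.xs` variants, again on a hand-built frame with made-up prices. The level names `day` and `minute` are assumptions for illustration (the levels in `data` are unnamed, so only the positional form applies there). `drop_level=False` is an optional `.xs` argument that keeps the queried level in the result.

```python
import pandas as pd

# Four made-up bars across two days, with a (day, minute) two-level index.
minutes = pd.DatetimeIndex(
    ["2014-01-02 14:31", "2014-01-02 14:32", "2014-01-03 14:31", "2014-01-03 14:32"],
    tz="UTC",
)
days = minutes.normalize()
toy = pd.DataFrame(
    {"AAPL": [79.4, 79.5, 78.9, 78.7], "MSFT": [37.3, 37.4, 37.2, 37.1]},
    index=pd.MultiIndex.from_arrays([days, minutes], names=["day", "minute"]),
)

minute = pd.Timestamp("2014-01-03 14:32", tz="UTC")

# Select on the second level by position (levels are numbered from 0) ...
by_position = toy.xs(minute, level=1)
# ... or by name, when the levels are named.
by_name = toy.xs(minute, level="minute")
# Keep both index levels in the result instead of dropping the queried one.
kept = toy.xs(minute, level=1, drop_level=False)

print(by_position)
print(kept)
```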


If we just want to work with the original index values, we can drop the extra level from our index.

In [43]:
data_copy = data.copy()
data_copy.index = data_copy.index.droplevel(0)
data_copy.head()

Unnamed: 0,Security(24 [AAPL]),Security(5061 [MSFT])
2014-01-02 14:31:00+00:00,79.446,37.34
2014-01-02 14:32:00+00:00,79.424,37.375
2014-01-02 14:33:00+00:00,79.49,37.26
2014-01-02 14:34:00+00:00,79.502,37.26
2014-01-02 14:35:00+00:00,79.252,37.28
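
There are a few other ways to end up with a flat index, sketched below on synthetic data (`toy`, made-up prices; `by_day` and `first_three` are hypothetical stand-ins for the functions above). The sketch assumes a reasonably recent pandas release, where `DataFrame.droplevel` is available; the version in the research environment may differ.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `pricing` (made-up prices): two sessions of
# ten one-minute bars each, indexed by tz-aware UTC timestamps.
index = pd.date_range("2014-01-02 14:31", periods=10, freq="min", tz="UTC").append(
    pd.date_range("2014-01-03 14:31", periods=10, freq="min", tz="UTC")
)
toy = pd.DataFrame(
    {"AAPL": np.linspace(79.0, 80.0, len(index)),
     "MSFT": np.linspace(37.0, 38.0, len(index))},
    index=index,
)

def by_day(ts):
    return ts.normalize()

def first_three(frame):
    return frame.iloc[:3]

nested = toy.groupby(by_day).apply(first_three)

# Option 1: droplevel returns a new frame and leaves `nested` untouched.
flat_a = nested.droplevel(0)
# Option 2: reset_index with drop=True discards the outer level.
flat_b = nested.reset_index(level=0, drop=True)
# Option 3: ask groupby not to add the group keys in the first place.
flat_c = toy.groupby(by_day, group_keys=False).apply(first_three)

print(flat_a.head())
```

Because none of these options assigns to `.index` in place, no explicit `copy()` is needed to protect the original frame.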
