{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Pandas MultiIndex Tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What's Pandas?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas is one of the most-used open source libraries for importing and analyzing data available in Python today. It provides convenient ways to import, view, split, apply, and combine array-like data. And not just convenient, but efficient, too. For example, Pandas' read_csv and to_csv functions are so efficient that the library is often imported just for this task instead of relying on the standard library alternative!\n", "\n", "The core value of the library, however, comes through several data structure options, primarily Series (for labeled, homogenously-typed, one-dimensional arrays) and DataFrames (for labeled, potentially heterogeneously-typed, two-dimensional arrays)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What's a DataFrame?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "DataFrames are two-dimensional, labeled data structures, with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects. It is generally the most commonly used Pandas object. Like Series, DataFrame accepts many different kinds of input, including dicts, lists, lists of lists, Series, numpy arrays, other DataFrames, external data from CSVs, etc.\n", "\n", "The DataFrame has two core parts: an index (row labels; look like columns, but aren't) and columns (data with headers)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What's an index?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The index of a DataFrame is a set (i.e. each element is only represented once) that consists of a label for each row. To be helpful, those labels should be not just unique, but also meaningful. By default, if no index is provided, the index will be a numbered range, starting from 0 (known as a range index).\n", "\n", "A more meaningful index, however, would be something that uniquely describes each row of your data in a way that will help you look things up. For example, in a list of transactions, the date-time might be most useful. Alternatively, in a grade book for a math class the name of the student might be most useful.\n", "\n", "The most confusing thing about understanding about Pandas' indexes at first (in my opinion) is how to interact with them. While they look exactly like columns, they're not referenced in the same way (learn more [here](https://pandas.pydata.org/pandas-docs/stable/indexing.html)). Understanding that, I personally find it most useful to think of an index in Pandas like a column that's in time-out and that just can't play with all the other columns.\n", " - Note: Another common way in which Pandas' indexes are misunderstood at first is by thinking of them in SQL-like terms. While that can be helpful if that's what you're familiar with, in practice (and for performance) Pandas' and SQL's indexes are quite different (see this [SO answer](https://stackoverflow.com/questions/42641018/pandas-column-indexing-for-searching) and [Pandas Under The Hood by Jeff Tratner](http://www.jeffreytratner.com/slides/pandas-under-the-hood-pydata-seattle-2015.pdf))." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What is a MultiIndex DataFrame?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas' multiindex DataFrames extend the DataFrames described above by enabling effective storage and manipulation of arbitrarily high dimension data in a 2-dimensional tabular structure. ((If that sentence doesn't make sense yet, don't worry - it should by the end of this tutorial.))\n", "\n", "While the displayed version of a multiindexed DataFrame doesn't appear to be much more than a prettily-organized regular DataFrame, it's actually a pretty powerful structure if the data warrants its use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# When should you use one?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. When a single column’s value isn’t enough to uniquely identify a row (e.g. multiple records on the same date means date alone isn’t a good index).\n", "2. When data is logically hierarchical - meaning that it has multiple dimensions or “levels.”\n", "\n", "Besides structure, multiindexes offer relatively easy in-memory retreival of complex data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Realistic Demo Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this tutorial we will work with a realistic case of when multiindex DataFrames can come in handy - a grocer's retail transactions." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Pandas version: 0.22.0'" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Normally put all your import up top, but this cell was ugly in the midst of the intro paragraphs\n", "from typing import Any\n", "\n", "import numpy as np\n", "import pandas as pd\n", "\n", "display(f\"Pandas version: {pd.__version__}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that this notebook is using the Pandas version above. There have been many changes to MultiIndex methods since 0.19, including major bug fixes. **It is STRONGLY recommended to use the latest version of Pandas, but at least version 0.21 is required for all of the techniques in this notebook to work as presented.**" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# # Creates random mock data.\n", "# # Based on the UPCs found in upc_meta_data.csv and saves to data.csv\n", "# %run mocker.py" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DateStoreCategorySubcategoryUPC EANDescriptionDollarsUnits
202018-07-14Store 3BeerLagers702770081011Omission - Gluten Free Lager - 6 Pack128.6412
1012018-07-10Store 3AlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL Bottle128.305
682018-07-14Store 1WineRed492130008399Yellow Tail - Merlot - 750 mL Bottle32.106
122018-07-16Store 2BeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack99.4010
582018-07-16Store 1WineRed89744402585Trapiche - Malbec - 750 mL Bottle126.1414
12018-07-11Store 2BeerAles736920111112Goose Island - Honkers Ale - 6 Pack166.7414
1002018-07-10Store 2AlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL Bottle231.309
592018-07-16Store 3WineRed89744402585Trapiche - Malbec - 750 mL Bottle66.168
1162018-07-11Store 3AlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL Bottle102.044
772018-07-15Store 3WineRed492130008399Yellow Tail - Merlot - 750 mL Bottle58.2012
\n", "
" ], "text/plain": [ " Date Store Category Subcategory UPC EAN \\\n", "20 2018-07-14 Store 3 Beer Lagers 702770081011 \n", "101 2018-07-10 Store 3 Alcohol Liqour 80480280024 \n", "68 2018-07-14 Store 1 Wine Red 492130008399 \n", "12 2018-07-16 Store 2 Beer Ales 94922755711 \n", "58 2018-07-16 Store 1 Wine Red 89744402585 \n", "1 2018-07-11 Store 2 Beer Ales 736920111112 \n", "100 2018-07-10 Store 2 Alcohol Liqour 80480280024 \n", "59 2018-07-16 Store 3 Wine Red 89744402585 \n", "116 2018-07-11 Store 3 Alcohol Liquor 674545000001 \n", "77 2018-07-15 Store 3 Wine Red 492130008399 \n", "\n", " Description Dollars Units \n", "20 Omission - Gluten Free Lager - 6 Pack 128.64 12 \n", "101 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 5 \n", "68 Yellow Tail - Merlot - 750 mL Bottle 32.10 6 \n", "12 Bowser Beer - Beefy Brown Ale - 6 Pack 99.40 10 \n", "58 Trapiche - Malbec - 750 mL Bottle 126.14 14 \n", "1 Goose Island - Honkers Ale - 6 Pack 166.74 14 \n", "100 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 9 \n", "59 Trapiche - Malbec - 750 mL Bottle 66.16 8 \n", "116 Don Julio - Tequila Blanco - 750 mL Bottle 102.04 4 \n", "77 Yellow Tail - Merlot - 750 mL Bottle 58.20 12 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('data.csv', parse_dates=['Date'])\n", "df.sample(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each row of data represents a sale of an item, but how do we set our index meaningfully? A numeric ID for each transaction would be fine, but it wouldn't tell us terribly much. Even the date of a transaction isn't useful by itself, since it's common for such a company to have many transactions on a given date (even at the same time). \n", "\n", "Instead, we will have to look at a combination of the available metadata to create a unique and meaningful index. At first glance, a combination of the date, the store, and the hierarchy of each item (Category > Subcategory > UPC) look like they might be enough to get us a unique index." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Setting and Manipulating MultiIndexes\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So let's take a look at how we can create our multiindex from our regular ol' DataFrame. We'll walk through the basics of setting, reordering, and resetting indexes, along with some useful tips/tricks. Then we can begin investigating our transaction data to learn about our sales and trends." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
UPC EANDollarsUnits
DateStoreCategorySubcategoryDescription
2018-07-11Store 3BeerAlesGoose Island - Honkers Ale - 6 Pack73692011111235.684
Store 2BeerAlesGoose Island - Honkers Ale - 6 Pack736920111112166.7414
2018-07-10Store 3BeerAlesGoose Island - Honkers Ale - 6 Pack73692011111267.368
\n", "
" ], "text/plain": [ " UPC EAN \\\n", "Date Store Category Subcategory Description \n", "2018-07-11 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 \n", " Store 2 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 \n", "2018-07-10 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 \n", "\n", " Dollars \\\n", "Date Store Category Subcategory Description \n", "2018-07-11 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 35.68 \n", " Store 2 Beer Ales Goose Island - Honkers Ale - 6 Pack 166.74 \n", "2018-07-10 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 67.36 \n", "\n", " Units \n", "Date Store Category Subcategory Description \n", "2018-07-11 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 4 \n", " Store 2 Beer Ales Goose Island - Honkers Ale - 6 Pack 14 \n", "2018-07-10 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 8 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Set just like the index for a DataFrame...\n", "# ...except we give a list of column names instead of a single string column name\n", "df.set_index(['Date', 'Store', 'Category', 'Subcategory', 'Description'], inplace=True)\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Uh oh - it looks like we forgot to add the 'UPC EAN' column to our index, but don't worry - Pandas has us covered with extra set_index parameters for MultiIndexes:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
DateStoreCategorySubcategoryDescriptionUPC EAN
2018-07-11Store 3BeerAlesGoose Island - Honkers Ale - 6 Pack73692011111235.684
Store 2BeerAlesGoose Island - Honkers Ale - 6 Pack736920111112166.7414
2018-07-10Store 3BeerAlesGoose Island - Honkers Ale - 6 Pack73692011111267.368
\n", "
" ], "text/plain": [ " Dollars \\\n", "Date Store Category Subcategory Description UPC EAN \n", "2018-07-11 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 35.68 \n", " Store 2 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 166.74 \n", "2018-07-10 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 67.36 \n", "\n", " Units \n", "Date Store Category Subcategory Description UPC EAN \n", "2018-07-11 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 4 \n", " Store 2 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 14 \n", "2018-07-10 Store 3 Beer Ales Goose Island - Honkers Ale - 6 Pack 736920111112 8 " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We can append a column to our existing index\n", "df.set_index('UPC EAN', append=True, inplace=True)\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's almost right, but we'd actually like 'Description' to show up after 'UPC EAN'. We have a couple of options to get things in the right order:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
DateStoreCategorySubcategoryUPC EANDescription
2018-07-11Store 3BeerAles736920111112Goose Island - Honkers Ale - 6 Pack35.684
Store 2BeerAles736920111112Goose Island - Honkers Ale - 6 Pack166.7414
2018-07-10Store 3BeerAles736920111112Goose Island - Honkers Ale - 6 Pack67.368
\n", "
" ], "text/plain": [ " Dollars \\\n", "Date Store Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 35.68 \n", " Store 2 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 166.74 \n", "2018-07-10 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", "\n", " Units \n", "Date Store Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 4 \n", " Store 2 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 14 \n", "2018-07-10 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 8 " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Option 1 is the generalized solution to reorder the index levels\n", "# Note: We're not making an inplace change in this cell,\n", "# but it's worth noting that this method doesn't have an inplace parameter.\n", "df.reorder_levels(order=['Date', 'Store', 'Category', 'Subcategory', 'UPC EAN', 'Description']).head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "reorder_levels() is useful, but it was a pain to have to type all five levels just two switch two. In cases like this we have a second, less verbose option:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
DateStoreCategorySubcategoryUPC EANDescription
2018-07-11Store 3BeerAles736920111112Goose Island - Honkers Ale - 6 Pack35.684
Store 2BeerAles736920111112Goose Island - Honkers Ale - 6 Pack166.7414
2018-07-10Store 3BeerAles736920111112Goose Island - Honkers Ale - 6 Pack67.368
\n", "
" ], "text/plain": [ " Dollars \\\n", "Date Store Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 35.68 \n", " Store 2 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 166.74 \n", "2018-07-10 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", "\n", " Units \n", "Date Store Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 4 \n", " Store 2 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 14 \n", "2018-07-10 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 8 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Option 2 just switches two index levels (a more common need than you'd think)\n", "# Note: This time we're doing an inplace change, but there's no parameter for this method either.\n", "df = df.swaplevel('Description', 'UPC EAN')\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just when we thought we were done, it turns our we forgot to add the highest level of the product hierarchy - the Department - not just to our index, but to our DataFrame altogether. Luckily all of our records belong in the same Department, so here's a neat trick to add a new column with all the same values as a level in an existing index:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
DateStoreDepartmentCategorySubcategoryUPC EANDescription
2018-07-11Store 3BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 Pack35.684
Store 2BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 Pack166.7414
2018-07-10Store 3BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 Pack67.368
\n", "
" ], "text/plain": [ " Dollars \\\n", "Date Store Department Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 35.68 \n", " Store 2 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 166.74 \n", "2018-07-10 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", "\n", " Units \n", "Date Store Department Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 4 \n", " Store 2 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 14 \n", "2018-07-10 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 8 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# A handy function to keep around for projects\n", "def add_constant_index_level(df: pd.DataFrame, value: Any, level_name: str):\n", " \"\"\"Add a new level to an existing index where every row has the same, given value.\n", " \n", " Args:\n", " df: Any existing pd.DataFrame.\n", " value: Value to be placed in every row of the new index level.\n", " level_name: Title of the new index level.\n", " \n", " Returns:\n", " df with an additional, prepended index level.\n", " \"\"\"\n", " return pd.concat([df], keys=[value], names=[level_name])\n", "\n", "df = add_constant_index_level(df, \"Booooze\", \"Department\")\n", "df = df.reorder_levels(order=['Date', 'Store', 'Department', 'Category', 'Subcategory', 'UPC EAN', 'Description'])\n", "df.head(3)\n", "\n", "# # If we wanted to later drop that level\n", "# df.index = df.index.droplevel(level='Department')\n", "# df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that our index is set the way we want it, what if we want to interact with those index levels? Here are a few helpful code snippets:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['Ales', 'Lagers', 'Stouts', 'Malts', 'Red', 'White', 'Rose', 'Liqour',\n", " 'Liquor'],\n", " dtype='object', name='Subcategory')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# checking out their unique values, for a single level \n", "df.index.get_level_values('Subcategory').unique()\n", "\n", "# checking out their unique values, for combinations of multiple levels\n", "# See answer at https://stackoverflow.com/questions/39080555/pandas-get-level-values-for-multiple-columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the typo in \"Liquor\" above. Good thing we checked out unique values! Maybe someone can submit a pull request to fix this for me :)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
DateStoreDepartmentCategorySubcategoryUPC EANDescription
2018-07-11Store 3BoooozeBeerAles736920111112We changed this35.684
Store 2BoooozeBeerAles736920111112We changed this166.7414
2018-07-10Store 3BoooozeBeerAles736920111112We changed this67.368
Store 2BoooozeBeerAles736920111112We changed this80.019
2018-07-12Store 3BoooozeBeerAles736920111112We changed this78.3010
Store 2BoooozeBeerAles736920111112We changed this37.755
2018-07-14Store 3BoooozeBeerAles736920111112We changed this113.4913
Store 2BoooozeBeerAles736920111112We changed this45.055
2018-07-10Store 2BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack81.768
Store 3BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack62.488
2018-07-11Store 2BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack104.9113
Store 3BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack72.729
2018-07-16Store 2BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack99.4010
Store 3BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack46.655
2018-07-14Store 2BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 Pack124.5211
\n", "
" ], "text/plain": [ " Dollars \\\n", "Date Store Department Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Booooze Beer Ales 736920111112 We changed this 35.68 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 166.74 \n", "2018-07-10 Store 3 Booooze Beer Ales 736920111112 We changed this 67.36 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 80.01 \n", "2018-07-12 Store 3 Booooze Beer Ales 736920111112 We changed this 78.30 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 37.75 \n", "2018-07-14 Store 3 Booooze Beer Ales 736920111112 We changed this 113.49 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 45.05 \n", "2018-07-10 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 81.76 \n", " Store 3 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 62.48 \n", "2018-07-11 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 104.91 \n", " Store 3 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 72.72 \n", "2018-07-16 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 99.40 \n", " Store 3 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 46.65 \n", "2018-07-14 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 124.52 \n", "\n", " Units \n", "Date Store Department Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Booooze Beer Ales 736920111112 We changed this 4 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 14 \n", "2018-07-10 Store 3 Booooze Beer Ales 736920111112 We changed this 8 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 9 \n", "2018-07-12 Store 3 Booooze Beer Ales 736920111112 We changed this 10 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 5 \n", "2018-07-14 Store 3 Booooze Beer Ales 736920111112 We changed this 13 \n", " Store 2 Booooze Beer Ales 736920111112 We changed this 5 \n", "2018-07-10 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8 \n", " Store 3 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8 \n", "2018-07-11 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 13 \n", " Store 3 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 9 \n", "2018-07-16 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 10 \n", " Store 3 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 5 \n", "2018-07-14 Store 2 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 11 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Replace level values using rename\n", "# Note that this can be done using set_levels as well, but it's a pain\n", "df.rename(index={'Goose Island - Honkers Ale - 6 Pack': 'We changed this'}).head(15)\n", "\n", "# Replace np.nan level values\n", "# df.rename(index={np.nan: \"''\"}, inplace=True)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
DateStoreDepartmentCategorySubcategoryUPC EANDesc.
2018-07-11Store 3BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 Pack35.684
Store 2BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 Pack166.7414
2018-07-10Store 3BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 Pack67.368
\n", "
" ], "text/plain": [ " Dollars \\\n", "Date Store Department Category Subcategory UPC EAN Desc. \n", "2018-07-11 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 35.68 \n", " Store 2 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 166.74 \n", "2018-07-10 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", "\n", " Units \n", "Date Store Department Category Subcategory UPC EAN Desc. \n", "2018-07-11 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 4 \n", " Store 2 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 14 \n", "2018-07-10 Store 3 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 8 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Rename index levels\n", "temp_df = df.copy()\n", "temp_df.index = df.index.set_names('Desc.', level='Description')\n", "temp_df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Understanding the MultiIndex Object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Why is this section all the way down here? Because the MultiIndex object is scary looking if you're new to using them. Many guides to hierarchical data analysis using multiindex DataFrames start with DataFrame creation and manipulation using MultiIndex objects, which I think both hinders adoption and is not reflective of how a lot of DataFrames get created in practice. As a result, my explanation of MultiIndex objects is very basic, because there are lots of other great resources out there if you want to learn more. Here are my top two:\n", " * [Official guide](https://pandas.pydata.org/pandas-docs/stable/advanced.html?highlight=indexslice#hierarchical-indexing-multiindex)\n", " * [Python Data Science Handbook by Jake Vanderplas](https://jakevdp.github.io/PythonDataScienceHandbook/03.05-hierarchical-indexing.html#Methods-of-MultiIndex-Creation)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex(levels=[[2018-07-10 00:00:00, 2018-07-11 00:00:00, 2018-07-12 00:00:00, 2018-07-13 00:00:00, 2018-07-14 00:00:00, 2018-07-15 00:00:00, 2018-07-16 00:00:00], ['Store 1', 'Store 2', 'Store 3'], ['Booooze'], ['Alcohol', 'Beer', 'Wine'], ['Ales', 'Lagers', 'Liqour', 'Liquor', 'Malts', 'Red', 'Rose', 'Stouts', 'White'], [71250000140, 80432400630, 80480280024, 81248501095, 83783575217, 85000024218, 89744402585, 94922755711, 492130008399, 674545000001, 702770081011, 702770082018, 736920111112], ['Bowser Beer - Beefy Brown Ale - 6 Pack', 'Colt 45 - Malt Liquor - 6 Pack', 'Curious Beasts - Chardonnay - 750 mL Bottle', 'Dark Horse - S. Blanc Rose - 750 mL Bottle', 'Don Julio - Tequila Blanco - 750 mL Bottle', 'Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle', 'Goose Island - Honkers Ale - 6 Pack', 'Grey Goose - Imported Vodka - 750 mL Bottle', 'Omission - Gluten Free Lager - 6 Pack', 'Omission Pale Ale - Gluten Free Pale Ale - 6 Pack', 'Sierra Nevada Brewing Co. - Stout - 6 Pack', 'Trapiche - Malbec - 750 mL Bottle', 'Yellow Tail - Merlot - 750 mL Bottle']],\n", " labels=[[1, 1, 0, 0, 2, 2, 4, 4, 0, 0, 1, 1, 6, 6, 4, 4, 5, 5, 2, 2, 4, 4, 0, 0, 6, 6, 2, 2, 4, 4, 3, 3, 0, 0, 1, 1, 5, 5, 0, 0, 1, 1, 6, 6, 5, 5, 2, 2, 6, 6, 1, 1, 5, 5, 3, 3, 2, 2, 6, 6, 5, 5, 0, 0, 1, 1, 4, 4, 4, 4, 3, 3, 0, 0, 6, 6, 5, 5, 2, 2, 3, 3, 2, 2, 0, 0, 1, 1, 4, 4, 6, 6, 0, 0, 3, 3, 2, 2, 1, 1, 0, 0, 5, 5, 2, 2, 1, 1, 6, 6, 2, 2, 6, 6, 0, 0, 1, 1, 5, 5, 2, 2, 3, 3, 1, 1, 4, 4, 5, 5, 0, 0], [2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 0, 2, 0, 2, 0, 2, 0, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 6, 6, 6, 6, 6, 6, 6, 6, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3], [12, 12, 12, 12, 12, 12, 12, 12, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 10, 10, 10, 10, 10, 10, 10, 10, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [6, 6, 6, 6, 6, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 1, 1, 1, 1, 1, 1, 1, 1, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]],\n", " names=['Date', 'Store', 'Department', 'Category', 'Subcategory', 'UPC EAN', 'Description'])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Well that's gross looking...but don't be scared - it's actually not that hard to understand.\n", "\n", "**'levels'** is a list of lists, where each sublist represents all possible values in that index level. In other words, the 'levels' parameter reflects all possible unique values by level. For example, our first index level ('Date') has the possible values \\['2018-07-10', '2018-07-11', '2018-07-12', ...\\].\n", "\n", "* **Important Note:** When talking about a multiindex DataFrame (not the parameter for the MultiIndex object), we talk about the \"levels\" as the index \"columns.\" For example, the 'levels' of our df in a more general sense are 'Date', 'Store', 'Department', etc. Levels in this sense (and elsewhere in code) can also be referenced by number (e.g. 'Date' = 0 \\[read as 'level 0'\\], 'Store' = 1, 'Department' = 2, etc.).\n", "\n", "**'labels'** is also a list of lists, but here each sublist reflects all of the values that appear in the row of that index. In other words, each sublist in our labels is of the same length as the entire dataframe, and the value of each row is one of the possible values defined in our associated level (above). Looking again at our first index level ('Date'), we see values like \\[1, 1, 0, 0, 2, 2, 4, 4 ...\\]. There are just an enumerated representation of the options defined in our level, so 0 = '2018-07-10', 1 = '2018-07-11', 2 = '2018-07-12', 3 = '2018-07-13', etc.\n", "\n", "**'names'** is a list of the actual titles of each index level, in order of appearance from left to right." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With that fresh understanding of the 'anatomy' of a MultiIndex, we can look at..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Other Methods of Multiindex DataFrame Creation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the most part, the two references listed in the section above cover this topic well; however, a common use case that isn't covered in those guides is creating a multiindex DataFrame while reading from a csv:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
DateStoreCategorySubcategoryUPC EANDescription
2018-07-11Store 3BeerAles736920111112Goose Island - Honkers Ale - 6 Pack35.684
Store 2BeerAles736920111112Goose Island - Honkers Ale - 6 Pack166.7414
2018-07-10Store 3BeerAles736920111112Goose Island - Honkers Ale - 6 Pack67.368
\n", "
" ], "text/plain": [ " Dollars \\\n", "Date Store Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 35.68 \n", " Store 2 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 166.74 \n", "2018-07-10 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", "\n", " Units \n", "Date Store Category Subcategory UPC EAN Description \n", "2018-07-11 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 4 \n", " Store 2 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 14 \n", "2018-07-10 Store 3 Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 8 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We can set a MultiIndex while reading a csv by referencing columns to be used in the index by number\n", "pd.read_csv(\"data.csv\", index_col=[0, 1, 2, 3, 4, 5], skipinitialspace=True, parse_dates=['Date']).head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll review more advanced importing/exporting methods below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# MultiIndex Columns (Multiple Column Levels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a different view we can also create hierarchical column levels. We'll do this by introducting a new method: unstack(). This function \"pivots\" an index level to a new level of column labels whose inner-most level consists of the pivoted index labels. **Stack/unstack is one of the biggest reasons to use a MultiIndex, so it's work supplementing the examples here by checking out the [official docs on reshaping](https://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-by-stacking-and-unstacking).**\n", "\n", "With this new technique we can start actually investigating our data. For example, let's say we want to more easily compare sales of a product by store by day, we can unstack our 'Store' index level:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-16BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN221.04126.66NaN9.06.0
Liquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN161.15345.07NaN5.011.0
2018-07-10BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
2018-07-11BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN201.72318.80NaN6.08.0
2018-07-16BoooozeBeerLagers702770081011Omission - Gluten Free Lager - 6 Pack174.72NaN104.8614.0NaN7.0
2018-07-12BoooozeWineRed492130008399Yellow Tail - Merlot - 750 mL Bottle95.16NaN75.6012.0NaN10.0
2018-07-10BoooozeBeerAles702770082018Omission Pale Ale - Gluten Free Pale Ale - 6 PackNaN89.76135.72NaN6.013.0
2018-07-15BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN415.68476.64NaN12.012.0
2018-07-13BoooozeBeerStouts83783575217Sierra Nevada Brewing Co. - Stout - 6 PackNaN82.6775.74NaN7.07.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-16 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-16 Booooze Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 174.72 \n", "2018-07-12 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle 95.16 \n", "2018-07-10 Booooze Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", "2018-07-15 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-13 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-16 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 221.04 \n", " Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 161.15 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 201.72 \n", "2018-07-16 Booooze Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", "2018-07-12 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "2018-07-10 Booooze Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 89.76 \n", "2018-07-15 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 415.68 \n", "2018-07-13 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 82.67 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-16 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 126.66 \n", " Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 345.07 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 318.80 \n", "2018-07-16 Booooze Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 104.86 \n", "2018-07-12 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle 75.60 \n", "2018-07-10 Booooze Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 135.72 \n", "2018-07-15 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 476.64 \n", "2018-07-13 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 75.74 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-16 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-16 Booooze Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 14.0 \n", "2018-07-12 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle 12.0 \n", "2018-07-10 Booooze Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", "2018-07-15 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-13 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-16 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 5.0 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 6.0 \n", "2018-07-16 Booooze Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", "2018-07-12 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "2018-07-10 Booooze Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 6.0 \n", "2018-07-15 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 12.0 \n", "2018-07-13 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 7.0 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-16 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 6.0 \n", " Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 11.0 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", "2018-07-16 Booooze Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 7.0 \n", "2018-07-12 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", "2018-07-10 Booooze Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 13.0 \n", "2018-07-15 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 12.0 \n", "2018-07-13 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 7.0 " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "multi_col_lvl_df = df.unstack('Store')\n", "multi_col_lvl_df.sample(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The new view makes our comparison easier, but now it's a bit cluttered. Internally our multi-level columns are stored as tuples of the name values for each level, so we can easily fix the clutter by flattening the columns into a single level:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Dollars | Store 1Dollars | Store 2Dollars | Store 3Units | Store 1Units | Store 2Units | Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
\n", "
" ], "text/plain": [ " Dollars | Store 1 \\\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " Dollars | Store 2 \\\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "\n", " Dollars | Store 3 \\\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "\n", " Units | Store 1 \\\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " Units | Store 2 \\\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "\n", " Units | Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def flatten_cols(df: pd.DataFrame, delim: str = \"\"):\n", " \"\"\"Flatten multiple column levels of the DataFrame into a one column level.\n", "\n", " Args:\n", " delim: the delimiter between the column values.\n", "\n", " Returns:\n", " A copy of the dataframe with the new column names.\n", "\n", " \"\"\"\n", " new_cols = [delim.join((col_lev for col_lev in tup if col_lev))\n", " for tup in df.columns.values]\n", " ndf = df.copy()\n", " ndf.columns = new_cols\n", "\n", " return ndf\n", "\n", "flattened_multi_col_df = flatten_cols(multi_col_lvl_df, \" | \").head(3)\n", "flattened_multi_col_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If later we want to undo that flattening it's just as simple:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
Store 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
\n", "
" ], "text/plain": [ " Dollars \\\n", " Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", " Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "\n", " \\\n", " Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "\n", " Units \\\n", " Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", " Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "\n", " \n", " Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def unflatten_cols(df: pd.DataFrame, delim: str = \"\"):\n", " \"\"\"Unflatten a single column level into multiple column levels.\n", "\n", " Args:\n", " delim: the delimiter to split on to identify the multiple column values.\n", "\n", " Returns:\n", " A copy of the dataframe with the new column levels.\n", "\n", " \"\"\"\n", " new_cols = pd.MultiIndex.from_tuples([tuple(col.split(delim)) for col in df.columns])\n", " ndf = df.copy()\n", " ndf.columns = new_cols\n", "\n", " return ndf\n", "\n", "unflatten_cols(flattened_multi_col_df, \" | \")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Importing/Exporting MultiIndex DataFrames" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We've already seen an example of reading a csv, but what if we want to save our multiindex DataFrame and then be able to reread it? How the stored files need to be accessed and by whom they need to be accessed will determine a lot. If everyone who needs access to the data is Python/Pandas savy, [pickling](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.to_pickle.html) is fast and easy. If, however, other, not-tech-savy people will need to access the data, CSVs are a versitile storage medium. That said, it complicates our read_csv() a bit if we want to just dump out our multiindex. Revisiting our multi_col_lvl_df, where we have a multiindex DataFrame that has both multiple index levels and column levels, creates a difficult situation that requires getting to know as many as all 48 (yes, 48!) of read_csv's parameters." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Write our multi-column-level df\n", "multi_col_lvl_df.to_csv('multi_col_lvl_output.csv')\n", "\n", "# Reading it back in requires the header parameter\n", "read_multi_df = pd.read_csv('multi_col_lvl_output.csv', header=[0, 1], index_col=[0, 1, 2, 3, 4, 5],\n", " skipinitialspace=True, parse_dates=[0]).head(3)\n", "\n", "read_multi_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By adding a header parameter to deal with our multiple column levels, on top of index_col for the setting of the index, this import looks good, but it required inspecting the csv for the column and row numbers (that's right - names won't work. Just to be sure everything worked, let's also check our dtypes:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Date: datetime64[ns]',\n", " 'Department: object',\n", " 'Category: object',\n", " 'Subcategory: object',\n", " 'UPC EAN: int64',\n", " 'Description: object']" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# A function to check our index level dtypes to aid this example\n", "def index_level_dtypes(df):\n", " return [f\"{df.index.names[i]}: {df.index.get_level_values(n).dtype}\"\n", " for i, n in enumerate(df.index.names)]\n", "\n", "index_level_dtypes(read_multi_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looks good. However let's take a step back and look at what would have happened if we'd not parsed the dates and wanted to later change the dtype:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "['Date: object',\n", " 'Department: object',\n", " 'Category: object',\n", " 'Subcategory: object',\n", " 'UPC EAN: int64',\n", " 'Description: object']" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Reading it back iwithout parse_dates\n", "bad_dtype_df = pd.read_csv('multi_col_lvl_output.csv', header=[0, 1], index_col=[0, 1, 2, 3, 4, 5],\n", " skipinitialspace=True).head(3)\n", "\n", "display(bad_dtype_df)\n", "display(index_level_dtypes(bad_dtype_df))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Updating the dtypes of our index columns isn't going to be so simple, because our MultiIndex levels are immutable. To make any changes to the levels, we actually have to recreate the levels:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Date: datetime64[ns]',\n", " 'Department: object',\n", " 'Category: object',\n", " 'Subcategory: object',\n", " 'UPC EAN: int64',\n", " 'Description: object']" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bad_dtype_df.index.set_levels([pd.to_datetime(bad_dtype_df.index.levels[0]), bad_dtype_df.index.levels[1],\n", " bad_dtype_df.index.levels[2], bad_dtype_df.index.levels[3],\n", " bad_dtype_df.index.levels[4], bad_dtype_df.index.levels[5]],\n", " inplace=True)\n", "index_level_dtypes(bad_dtype_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively we could reset just the 'Date' level, update its dtype, add it back to our index, and finally reorder our index:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Date: datetime64[ns]',\n", " 'Department: object',\n", " 'Category: object',\n", " 'Subcategory: object',\n", " 'UPC EAN: int64',\n", " 'Description: object']" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bad_dtype_df2 = bad_dtype_df.reset_index(level='Date')\n", "bad_dtype_df2['Date'] = pd.to_datetime(bad_dtype_df2['Date'],infer_datetime_format=True)\n", "bad_dtype_df2.set_index('Date', append=True, inplace=True)\n", "bad_dtype_df2 = (bad_dtype_df2.swaplevel('Date', 'Description')\n", " .swaplevel('Date', 'UPC EAN')\n", " .swaplevel('Date', 'Subcategory')\n", " .swaplevel('Date', 'Category')\n", " .swaplevel('Date', 'Department'))\n", "index_level_dtypes(bad_dtype_df2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The moral of the story is, if you want to export and import complex multiindex DataFrames, learn read_csv's parameters well! The second moral is that, either of the above situations results in an awful lot of work.\n", "\n", "One alternative if the readability of the CSV by other users is important, is to simply reset the column levels and index levels before we write to a CSV and recreate them when we import. While it requires more steps and information is lost while in CSV form, it makes the code manipulations a lot easier:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateCategorySubcategoryUPC EANDescription
2018-07-10AlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Category Subcategory UPC EAN Description \n", "2018-07-10 Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Category Subcategory UPC EAN Description \n", "2018-07-10 Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "\n", " \\\n", "Store Store 3 \n", "Date Category Subcategory UPC EAN Description \n", "2018-07-10 Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Category Subcategory UPC EAN Description \n", "2018-07-10 Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Category Subcategory UPC EAN Description \n", "2018-07-10 Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "\n", " \n", "Store Store 3 \n", "Date Category Subcategory UPC EAN Description \n", "2018-07-10 Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# a) restack the column levels,\n", "# b) drop any blanks the unstacking created, and then\n", "# c) reset the index so everything is a flat column again\n", "# d) output to csv\n", "multi_col_lvl_df.stack().dropna().reset_index().to_csv('index_removed_output.csv')\n", "\n", "# Reading it back in will require\n", "read_df = pd.read_csv(\"data.csv\", index_col=[0, 1, 2, 3, 4, 5], skipinitialspace=True, parse_dates=['Date'])\n", "read_df.unstack('Store').head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Still takes some work, but the code's a lot more straightforward." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To this point we've really just been learning the basics of setting up our multiindex DataFrames, with a few neat tricks along the way. Now comes the fun part - actually interacting with those DataFrames to analyze our data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Multiindex Math" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Math operations are as nearly as easy with multiindex DataFrames as with regular ones, but a whole lot more powerful. Let's say we want to calculate dollars per unit for each of our stores. That's as simple as:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StoreStore 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-11BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN33.6239.85
2018-07-16BoooozeAlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN32.2331.37
2018-07-10BoooozeAlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN28.0129.16
WineWhite81248501095Curious Beasts - Chardonnay - 750 mL BottleNaN20.9419.90
Red89744402585Trapiche - Malbec - 750 mL Bottle7.11NaN6.30
BeerAles702770082018Omission Pale Ale - Gluten Free Pale Ale - 6 PackNaN14.9610.44
AlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN25.7025.66
2018-07-16BoooozeWineRed492130008399Yellow Tail - Merlot - 750 mL Bottle4.89NaN7.76
2018-07-12BoooozeBeerStouts83783575217Sierra Nevada Brewing Co. - Stout - 6 PackNaN10.617.81
2018-07-16BoooozeWineRed89744402585Trapiche - Malbec - 750 mL Bottle9.01NaN8.27
\n", "
" ], "text/plain": [ "Store Store 1 \\\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Wine White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", " Red 89744402585 Trapiche - Malbec - 750 mL Bottle 7.11 \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", "2018-07-16 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle 4.89 \n", "2018-07-12 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", "2018-07-16 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 9.01 \n", "\n", "Store Store 2 \\\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 33.62 \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 32.23 \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 28.01 \n", " Wine White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 20.94 \n", " Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 14.96 \n", " Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 25.70 \n", "2018-07-16 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "2018-07-12 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 10.61 \n", "2018-07-16 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", "\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 39.85 \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 31.37 \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 29.16 \n", " Wine White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 19.90 \n", " Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.30 \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 10.44 \n", " Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 25.66 \n", "2018-07-16 Booooze Wine Red 492130008399 Yellow Tail - Merlot - 750 mL Bottle 7.76 \n", "2018-07-12 Booooze Beer Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 7.81 \n", "2018-07-16 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.27 " ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dollars_per_unit = multi_col_lvl_df['Dollars'] / multi_col_lvl_df['Units']\n", "dollars_per_unit.sample(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We just operated on all the pairwise sub column levels at the same time! Now we'd like this answer as columns back in our original DataFrame:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnitsDollars per Unit
StoreStore 1Store 2Store 3Store 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0NaN25.7025.66
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0NaN34.4034.85
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0NaN28.0129.16
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 \n", "\n", " Dollars per Unit \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 25.70 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 34.40 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 28.01 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 25.66 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 34.85 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 29.16 " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Add a column level for our new measure\n", "dollars_per_unit.columns = pd.MultiIndex.from_product([['Dollars per Unit'], dollars_per_unit.columns])\n", "\n", "# Concat it with our original data\n", "pd.concat([multi_col_lvl_df, dollars_per_unit], axis='columns').head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can similarly apply functions to our multiindex DataFrame:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN0.23k0.13kNaN0.01k0.01k
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN0.17k0.28kNaN0.01k0.01k
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN0.17k0.2kNaN0.01k0.01k
BeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN0.08k0.06kNaN0.01k0.01k
702770082018Omission Pale Ale - Gluten Free Pale Ale - 6 PackNaN0.09k0.14kNaN0.01k0.01k
736920111112Goose Island - Honkers Ale - 6 PackNaN0.08k0.07kNaN0.01k0.01k
Lagers702770081011Omission - Gluten Free Lager - 6 Pack0.14kNaN0.18k0.01kNaN0.01k
Stouts83783575217Sierra Nevada Brewing Co. - Stout - 6 PackNaN0.08k0.12kNaN0.01k0.01k
WineRed89744402585Trapiche - Malbec - 750 mL Bottle0.06kNaN0.04k0.01kNaN0.01k
492130008399Yellow Tail - Merlot - 750 mL Bottle0.05kNaN0.03k0.01kNaN0.01k
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 0.14k \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 0.06k \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 0.05k \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 0.23k \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 0.17k \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 0.17k \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 0.08k \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 0.09k \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 0.08k \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 0.08k \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 0.13k \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 0.28k \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 0.2k \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 0.06k \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 0.14k \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 0.07k \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 0.18k \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 0.12k \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 0.04k \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 0.03k \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 0.01k \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 0.01k \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 0.01k \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 0.01k \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 0.01k \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 0.01k \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 0.01k \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 0.01k \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 0.01k \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 0.01k \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 0.01k \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 0.01k \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 0.01k \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 0.01k \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 0.01k \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 0.01k \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 0.01k \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 0.01k \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 0.01k \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 0.01k " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Change our units to 000s for funsies\n", "multi_col_lvl_df.applymap(lambda x: np.nan if np.isnan(x) else str(round(x/1000, 2)) + \"k\").head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Sorting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we start slicing and filtering, it's important to sort out data to drastically improve efficiency. In some cases pandas will begin sorting for you by default in the latest versions, but other times you'll get an ugly and confusing lexsort warning (referencing the default lexographic sorting order).\n", "\n", "Sorting indexes works the exactly the same way for multiindex DataFrames as with regular DataFrames, but with some extra parameters to decide on:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Sort by Date, then Department, then Category, etc.:\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
BeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN81.7662.48NaN8.08.0
702770082018Omission Pale Ale - Gluten Free Pale Ale - 6 PackNaN89.76135.72NaN6.013.0
736920111112Goose Island - Honkers Ale - 6 PackNaN80.0167.36NaN9.08.0
Lagers702770081011Omission - Gluten Free Lager - 6 Pack137.67NaN176.9613.0NaN14.0
Stouts83783575217Sierra Nevada Brewing Co. - Stout - 6 PackNaN78.40119.90NaN7.010.0
WineRed89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.30NaN34.5610.0NaN6.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 137.67 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.30 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 81.76 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 89.76 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 80.01 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 78.40 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 62.48 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 135.72 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 176.96 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 119.90 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 13.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8.0 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 6.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 9.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 7.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8.0 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 13.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 8.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 14.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 10.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Sort by Category, then Subcategory, then UPC EAN, then Description, then Date, then Deparment:\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
2018-07-11BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN248.80296.01NaN10.011.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN201.72318.80NaN6.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN338.10102.04NaN10.04.0
2018-07-12BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN279.36228.08NaN12.08.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN250.25455.13NaN7.013.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN410.88268.56NaN12.09.0
2018-07-13BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN432.74185.75NaN11.05.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-12 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "2018-07-11 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 248.80 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 201.72 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 338.10 \n", "2018-07-12 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 279.36 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 250.25 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 410.88 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 432.74 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "2018-07-11 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 296.01 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 318.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 102.04 \n", "2018-07-12 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 228.08 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 455.13 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 268.56 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 185.75 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-12 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "2018-07-11 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 10.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 6.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 10.0 \n", "2018-07-12 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 12.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 7.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 12.0 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 11.0 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 \n", "2018-07-11 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 11.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 4.0 \n", "2018-07-12 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 8.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 13.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 9.0 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Sort by Category, then Date only:\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
2018-07-11BoooozeAlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN338.10102.04NaN10.04.0
2018-07-12BoooozeAlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN410.88268.56NaN12.09.0
2018-07-15BoooozeAlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN177.00430.50NaN6.014.0
2018-07-16BoooozeAlcoholLiquor674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN161.15345.07NaN5.011.0
2018-07-10BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
2018-07-11BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN201.72318.80NaN6.08.0
2018-07-12BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN250.25455.13NaN7.013.0
2018-07-13BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN432.74185.75NaN11.05.0
2018-07-14BoooozeAlcoholLiquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN227.82268.52NaN6.07.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-12 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-15 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-12 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-14 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", "2018-07-11 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 338.10 \n", "2018-07-12 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 410.88 \n", "2018-07-15 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 177.00 \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 161.15 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 201.72 \n", "2018-07-12 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 250.25 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 432.74 \n", "2018-07-14 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 227.82 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", "2018-07-11 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 102.04 \n", "2018-07-12 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 268.56 \n", "2018-07-15 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 430.50 \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 345.07 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 318.80 \n", "2018-07-12 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 455.13 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 185.75 \n", "2018-07-14 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 268.52 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-12 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-15 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-12 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "2018-07-14 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "2018-07-11 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 10.0 \n", "2018-07-12 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 12.0 \n", "2018-07-15 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 5.0 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 6.0 \n", "2018-07-12 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 7.0 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 11.0 \n", "2018-07-14 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 6.0 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 \n", "2018-07-11 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 4.0 \n", "2018-07-12 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 9.0 \n", "2018-07-15 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 14.0 \n", "2018-07-16 Booooze Alcohol Liquor 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 11.0 \n", "2018-07-10 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", "2018-07-11 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", "2018-07-12 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 13.0 \n", "2018-07-13 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", "2018-07-14 Booooze Alcohol Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 7.0 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# By default sort_index will sort all levels of the index,\n", "# first sorting the first index level, then secondarily sorting the second index level, and so on...\n", "sort1 = multi_col_lvl_df.copy()\n", "sort1.sort_index(inplace=True)\n", "print(\"\\nSort by Date, then Department, then Category, etc.:\")\n", "display(sort1.head(10))\n", "\n", "# ...but you can choose to starting from a different level...\n", "sort2 = multi_col_lvl_df.copy()\n", "sort2.sort_index(level='Category', inplace=True)\n", "print(\"\\nSort by Category, then Subcategory, then UPC EAN, then Description, then Date, then Deparment:\")\n", "display(sort2.head(10))\n", "\n", "# ...or even to only sort on specifc levels.\n", "sort3 = multi_col_lvl_df.copy()\n", "sort3.sort_index(level=['Category', 'Date'], sort_remaining=False, inplace=True)\n", "print(\"\\nSort by Category, then Date only:\")\n", "display(sort3.head(10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Slicing, Filtering, and Querying" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the sorting out of the way we can begin searching our data by specific criteria. There is a fantastic guide about [understanding SettingwithCopyWarnings in Pandas](https://www.dataquest.io/blog/settingwithcopywarning/), so if you're unclear on the difference between a view and a copy, I suggest giving that a read first.\n", "\n", "Readers who have used Pandas for slicing previously will know that .loc is generally the preferred method of referencing cells in a DataFrame. The same type of syntax exists for multiindex Dataframes:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "View only rows in the 'Dollars' columns where the Category is 'Beer':\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Dollars
StoreStore 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN81.7662.48
2018-07-11BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN104.9172.72
2018-07-14BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN124.52113.68
2018-07-15BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN88.77119.47
2018-07-16BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN99.4046.65
2018-07-11BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN101.79107.58
2018-07-13BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN103.59142.44
2018-07-15BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN139.2340.50
2018-07-16BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN47.55108.48
2018-07-10BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 PackNaN80.0167.36
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", "2018-07-11 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", "2018-07-14 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", "2018-07-15 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", "2018-07-16 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", "2018-07-11 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN \n", "2018-07-13 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN \n", "2018-07-15 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN \n", "2018-07-16 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN \n", "2018-07-10 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 81.76 \n", "2018-07-11 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 104.91 \n", "2018-07-14 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 124.52 \n", "2018-07-15 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 88.77 \n", "2018-07-16 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 99.40 \n", "2018-07-11 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 101.79 \n", "2018-07-13 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 103.59 \n", "2018-07-15 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 139.23 \n", "2018-07-16 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 47.55 \n", "2018-07-10 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 80.01 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 62.48 \n", "2018-07-11 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 72.72 \n", "2018-07-14 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 113.68 \n", "2018-07-15 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 119.47 \n", "2018-07-16 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 46.65 \n", "2018-07-11 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 107.58 \n", "2018-07-13 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 142.44 \n", "2018-07-15 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 40.50 \n", "2018-07-16 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack 108.48 \n", "2018-07-10 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "View only rows in the 'Dollars' AND 'Store 1' column where the Category is 'Beer':\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Dollars
StoreStore 1
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN
2018-07-11BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN
2018-07-14BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN
2018-07-15BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN
2018-07-16BoooozeBeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackNaN
2018-07-11BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN
2018-07-13BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN
2018-07-15BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN
2018-07-16BoooozeBeerMalts71250000140Colt 45 - Malt Liquor - 6 PackNaN
2018-07-10BoooozeBeerAles736920111112Goose Island - Honkers Ale - 6 PackNaN
\n", "
" ], "text/plain": [ " Dollars\n", "Store Store 1\n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN\n", "2018-07-11 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN\n", "2018-07-14 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN\n", "2018-07-15 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN\n", "2018-07-16 Booooze Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN\n", "2018-07-11 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN\n", "2018-07-13 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN\n", "2018-07-15 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN\n", "2018-07-16 Booooze Beer Malts 71250000140 Colt 45 - Malt Liquor - 6 Pack NaN\n", "2018-07-10 Booooze Beer Ales 736920111112 Goose Island - Honkers Ale - 6 Pack NaN" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# A slicing helper. Works similarly to slicing in Python (e.g. list slicing),\n", "# but is inclusive of both the start and stop values.\n", "idx = pd.IndexSlice\n", "\n", "# View rows with a Category of Beer, but any Date, Department, Subcategory, UPC EAN, or Description\n", "# Only looking at Dollars columns\n", "print(\"View only rows in the 'Dollars' columns where the Category is 'Beer':\")\n", "display(sort3.loc[idx[:, :, 'Beer'], 'Dollars':'Dollars'].head(10))\n", "\n", "# If we just want to look at the Store 1 sub-column\n", "print(\"\\nView only rows in the 'Dollars' AND 'Store 1' column where the Category is 'Beer':\")\n", "display(sort3.loc[idx[:, :, 'Beer'], idx['Dollars', 'Store 1':'Store 1']].head(10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unfortunately the .loc syntax doesn't scale well to having many index levels. In order to select a values from a single level, .loc requires specifying ':' for every level that comes before. For the column names to show up for a single column you also need to use : syntax just to specify a single column. This can be incredibly annoying.\n", " \n", "Instead, it is more common (and practical) to use df.xs (cross sections). df.xs allows us to filter our DataFrame to only those rows that match levels of our index we specify. We can choose an individual level or multiple levels easily:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DepartmentCategorySubcategoryUPC EANDescription
BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN279.36228.08NaN12.08.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN250.25455.13NaN7.013.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN410.88268.56NaN12.09.0
BeerAles702770082018Omission Pale Ale - Gluten Free Pale Ale - 6 PackNaN109.2056.76NaN8.06.0
736920111112Goose Island - Honkers Ale - 6 PackNaN37.7578.30NaN5.010.0
Lagers702770081011Omission - Gluten Free Lager - 6 Pack40.12NaN93.104.0NaN10.0
Stouts83783575217Sierra Nevada Brewing Co. - Stout - 6 PackNaN116.71109.34NaN11.014.0
WineRed89744402585Trapiche - Malbec - 750 mL Bottle60.24NaN69.298.0NaN13.0
492130008399Yellow Tail - Merlot - 750 mL Bottle95.16NaN75.6012.0NaN10.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle93.9483.82NaN7.06.0NaN
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Department Category Subcategory UPC EAN Description \n", "Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 40.12 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 60.24 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 95.16 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 93.94 \n", "\n", " \\\n", "Store Store 2 \n", "Department Category Subcategory UPC EAN Description \n", "Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 279.36 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 250.25 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 410.88 \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 109.20 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 37.75 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 116.71 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 83.82 \n", "\n", " \\\n", "Store Store 3 \n", "Department Category Subcategory UPC EAN Description \n", "Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 228.08 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 455.13 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 268.56 \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 56.76 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 78.30 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 93.10 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 109.34 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 69.29 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 75.60 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", "\n", " Units \\\n", "Store Store 1 \n", "Department Category Subcategory UPC EAN Description \n", "Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 4.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 12.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 7.0 \n", "\n", " \\\n", "Store Store 2 \n", "Department Category Subcategory UPC EAN Description \n", "Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 12.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 7.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 12.0 \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 8.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 5.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 11.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 6.0 \n", "\n", " \n", "Store Store 3 \n", "Department Category Subcategory UPC EAN Description \n", "Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 8.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 13.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 9.0 \n", " Beer Ales 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 6.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 10.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 10.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 14.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 13.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentSubcategoryUPC EANDescription
2018-07-10BoooozeRed89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.30NaN34.5610.0NaN6.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle148.61100.38NaN11.07.0NaN
White81248501095Curious Beasts - Chardonnay - 750 mL BottleNaN293.16278.60NaN14.014.0
2018-07-11BoooozeRed89744402585Trapiche - Malbec - 750 mL Bottle50.40NaN47.256.0NaN7.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle52.24173.03NaN4.013.0NaN
White81248501095Curious Beasts - Chardonnay - 750 mL BottleNaN237.16247.52NaN11.013.0
2018-07-12BoooozeRed89744402585Trapiche - Malbec - 750 mL Bottle60.24NaN69.298.0NaN13.0
492130008399Yellow Tail - Merlot - 750 mL Bottle95.16NaN75.6012.0NaN10.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle93.9483.82NaN7.06.0NaN
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Subcategory UPC EAN Description \n", "2018-07-10 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.30 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 148.61 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "2018-07-11 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 50.40 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 52.24 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "2018-07-12 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 60.24 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 95.16 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 93.94 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Subcategory UPC EAN Description \n", "2018-07-10 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 100.38 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 293.16 \n", "2018-07-11 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 173.03 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 237.16 \n", "2018-07-12 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 83.82 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Subcategory UPC EAN Description \n", "2018-07-10 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 278.60 \n", "2018-07-11 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 47.25 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 247.52 \n", "2018-07-12 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 69.29 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 75.60 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Subcategory UPC EAN Description \n", "2018-07-10 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 11.0 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "2018-07-11 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 4.0 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "2018-07-12 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 12.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 7.0 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Subcategory UPC EAN Description \n", "2018-07-10 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 7.0 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 \n", "2018-07-11 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 13.0 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 11.0 \n", "2018-07-12 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 6.0 \n", "\n", " \n", "Store Store 3 \n", "Date Department Subcategory UPC EAN Description \n", "2018-07-10 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 \n", "2018-07-11 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 7.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 13.0 \n", "2018-07-12 Booooze Red 89744402585 Trapiche - Malbec - 750 mL Bottle 13.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
SubcategoryUPC EANDescription
Red89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.30NaN34.5610.0NaN6.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle148.61100.38NaN11.07.0NaN
White81248501095Curious Beasts - Chardonnay - 750 mL BottleNaN293.16278.60NaN14.014.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.30 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 148.61 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 100.38 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 293.16 \n", "\n", " \\\n", "Store Store 3 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 278.60 \n", "\n", " Units \\\n", "Store Store 1 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 11.0 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 7.0 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 \n", "\n", " \n", "Store Store 3 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
SubcategoryUPC EANDescription
Red89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.30NaN34.5610.0NaN6.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle148.61100.38NaN11.07.0NaN
White81248501095Curious Beasts - Chardonnay - 750 mL BottleNaN293.16278.60NaN14.014.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.30 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 148.61 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 100.38 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 293.16 \n", "\n", " \\\n", "Store Store 3 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 278.60 \n", "\n", " Units \\\n", "Store Store 1 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 11.0 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 7.0 \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 \n", "\n", " \n", "Store Store 3 \n", "Subcategory UPC EAN Description \n", "Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 \n", "Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", "White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
CategorySubcategoryUPC EANDescription
BeerLagers702770081011Omission - Gluten Free Lager - 6 Pack137.67NaN176.9613.0NaN14.0
WineRed89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.30NaN34.5610.0NaN6.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle148.61100.38NaN11.07.0NaN
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Category Subcategory UPC EAN Description \n", "Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 137.67 \n", "Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.30 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 148.61 \n", "\n", " \\\n", "Store Store 2 \n", "Category Subcategory UPC EAN Description \n", "Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", "Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 100.38 \n", "\n", " \\\n", "Store Store 3 \n", "Category Subcategory UPC EAN Description \n", "Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 176.96 \n", "Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", "\n", " Units \\\n", "Store Store 1 \n", "Category Subcategory UPC EAN Description \n", "Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 13.0 \n", "Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 11.0 \n", "\n", " \\\n", "Store Store 2 \n", "Category Subcategory UPC EAN Description \n", "Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", "Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 7.0 \n", "\n", " \n", "Store Store 3 \n", "Category Subcategory UPC EAN Description \n", "Beer Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 14.0 \n", "Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# We provide the level value, which by default searches the first index level ('Date')\n", "display(multi_col_lvl_df.xs('2018-07-12').head(10))\n", "\n", "# Here we're more specific because we want to search 'Category instead'\n", "display(multi_col_lvl_df.xs('Wine', level='Category').head(10))\n", "\n", "# To search for rows that match all of our level value requirements, use tuples\n", "display(multi_col_lvl_df.xs(('2018-07-10', 'Booooze', 'Wine'), level=['Date', 'Department', 'Category']).head(10))\n", "\n", "# Note that chaining .xs works, but is significantly less efficient\n", "display(multi_col_lvl_df.xs('2018-07-10', level='Date')\n", " .xs('Booooze', level='Department')\n", " .xs('Wine', level='Category')\n", " .head(10))\n", "\n", "# We can then use the column value filtering we're used to to filter rows as well\n", "cross_section_df = multi_col_lvl_df.xs(('2018-07-10', 'Booooze'), level=['Date', 'Department'])\n", "display(cross_section_df[cross_section_df['Dollars']['Store 1'] > 0].head(10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's a whole lot nicer of syntax than the .loc method, but it's still relatively verbose and requires separating out or index and column value filtering. Multiindex DataFrames also offer a lesser known method, df.query. Query is powerful for a few reasons:\n", "1. It can search both index levels and columns in the same query,\n", "2. Syntactically it's nicer, since it lets use write our query as short expressions,\n", "3. It's very efficient (depending on the chosen engine)\n", "\n", "The documentation has excellent [notes about the method](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.query.html) and a nice [usage guide](https://pandas.pydata.org/pandas-docs/stable/indexing.html#multiindex-query-syntax) that are definitely worth a read but a couple of examples using our dataset:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeWineRed89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.30NaN34.5610.0NaN6.0
Rose85000024218Dark Horse - S. Blanc Rose - 750 mL Bottle148.61100.38NaN11.07.0NaN
White81248501095Curious Beasts - Chardonnay - 750 mL BottleNaN293.16278.60NaN14.014.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.30 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 148.61 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 100.38 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 293.16 \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 278.60 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 11.0 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle 7.0 \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 \n", " Rose 85000024218 Dark Horse - S. Blanc Rose - 750 mL Bottle NaN \n", " White 81248501095 Curious Beasts - Chardonnay - 750 mL Bottle 14.0 " ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "multi_col_lvl_df.query(\"Date == '2018-07-10' and Category == 'Wine'\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So why doesn't everyone always use df.query? It has two big drawbacks: \n", "1. It doesn't allow index or column names with spaces in the name. \n", "2. It can't deal with multiple column levels well.\n", "\n", "However, if you're used to working with SQL databases and/or willing to work within the above limitations, this is a powerful and fast option." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Updating Column Values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we've learned a few ways to slice, filter, and query our data, what if we want to alter values in our dataframe? We can take advantage of the .loc method to replace values in place without creating a copy or, if temporarily copying our data isn't an issue (i.e. too big for memory) we can use the df.xs method with df.update:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
BeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackWe changed this81.7662.48NaN8.08.0
702770082018Omission Pale Ale - Gluten Free Pale Ale - 6 PackWe changed this89.76135.72NaN6.013.0
736920111112Goose Island - Honkers Ale - 6 PackWe changed this80.0167.36NaN9.08.0
Lagers702770081011Omission - Gluten Free Lager - 6 PackWe changed thisNaN176.9613.0NaN14.0
Stouts83783575217Sierra Nevada Brewing Co. - Stout - 6 PackWe changed this78.40119.90NaN7.010.0
WineRed89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.3NaN34.5610.0NaN6.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack We changed this \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack We changed this \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack We changed this \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack We changed this \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack We changed this \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.3 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 81.76 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 89.76 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 80.01 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 78.40 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 62.48 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 135.72 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 176.96 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 119.90 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 13.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8.0 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 6.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 9.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 7.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8.0 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 13.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 8.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 14.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 10.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "updated_df = multi_col_lvl_df.copy()\n", "updated_df.loc[idx[:, :, 'Beer'], idx['Dollars', 'Store 1':'Store 1']] = \"We changed this\"\n", "updated_df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above syntax works well for immutable values, but you'll get an error if you try the same thing for a mutable value, like a list. To my knowledge the only way to set values in a DataFrame to a mutable object is one cell at a time, so doing multiple replacements requires iterrows:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "# # Note: If the column we're changing's type isn't already object,\n", "# # we need to change it or the value relpacements below will error.\n", "# updated_df['Dollars']['Store 1'] = updated_df['Dollars']['Store 1'].astype(object)\n", "\n", "# # Loop through rows, replacing single values\n", "# # Only necessary if the new assigned value is mutable\n", "# # Code below currently not working when there are multiple column levels, but works with one column level\n", "# for index, row in updated_df.loc[idx[:, :, 'Beer'], idx['Dollars', 'Store 1':'Store 1']].iterrows():\n", "# updated_df.at[index, idx['Dollars', 'Store 1':'Store 1']] = [\"We\", \"changed\", \"this\"]\n", "\n", "# updated_df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, using df.xs and then df.update instead of the full .loc is a little bit more understandable in my option. For example, the following code does the same as the code two cells above (except that it requires a copy):" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DollarsUnits
StoreStore 1Store 2Store 3Store 1Store 2Store 3
DateDepartmentCategorySubcategoryUPC EANDescription
2018-07-10BoooozeAlcoholLiqour80480280024Grey Goose - Imported Vodka - 750 mL BottleNaN231.30128.30NaN9.05.0
Liquor80432400630Glenlivet - 12 Year Scotch Whisky - 750 mL BottleNaN172.00278.80NaN5.08.0
674545000001Don Julio - Tequila Blanco - 750 mL BottleNaN168.06204.12NaN6.07.0
BeerAles94922755711Bowser Beer - Beefy Brown Ale - 6 PackWe ALSO changed this81.7662.48NaN8.08.0
702770082018Omission Pale Ale - Gluten Free Pale Ale - 6 PackWe ALSO changed this89.76135.72NaN6.013.0
736920111112Goose Island - Honkers Ale - 6 PackWe ALSO changed this80.0167.36NaN9.08.0
Lagers702770081011Omission - Gluten Free Lager - 6 PackWe ALSO changed thisNaN176.9613.0NaN14.0
Stouts83783575217Sierra Nevada Brewing Co. - Stout - 6 PackWe ALSO changed this78.40119.90NaN7.010.0
WineRed89744402585Trapiche - Malbec - 750 mL Bottle56.88NaN37.808.0NaN6.0
492130008399Yellow Tail - Merlot - 750 mL Bottle46.3NaN34.5610.0NaN6.0
\n", "
" ], "text/plain": [ " Dollars \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack We ALSO changed this \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack We ALSO changed this \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack We ALSO changed this \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack We ALSO changed this \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack We ALSO changed this \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 56.88 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 46.3 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 231.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 172.00 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 168.06 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 81.76 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 89.76 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 80.01 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 78.40 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \\\n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 128.30 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 278.80 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 204.12 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 62.48 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 135.72 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 67.36 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 176.96 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 119.90 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 37.80 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 34.56 \n", "\n", " Units \\\n", "Store Store 1 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle NaN \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle NaN \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle NaN \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack NaN \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack NaN \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack NaN \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 13.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack NaN \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 8.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 10.0 \n", "\n", " \\\n", "Store Store 2 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 9.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 5.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 6.0 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8.0 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 6.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 9.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack NaN \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 7.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle NaN \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle NaN \n", "\n", " \n", "Store Store 3 \n", "Date Department Category Subcategory UPC EAN Description \n", "2018-07-10 Booooze Alcohol Liqour 80480280024 Grey Goose - Imported Vodka - 750 mL Bottle 5.0 \n", " Liquor 80432400630 Glenlivet - 12 Year Scotch Whisky - 750 mL Bottle 8.0 \n", " 674545000001 Don Julio - Tequila Blanco - 750 mL Bottle 7.0 \n", " Beer Ales 94922755711 Bowser Beer - Beefy Brown Ale - 6 Pack 8.0 \n", " 702770082018 Omission Pale Ale - Gluten Free Pale Ale - 6 Pack 13.0 \n", " 736920111112 Goose Island - Honkers Ale - 6 Pack 8.0 \n", " Lagers 702770081011 Omission - Gluten Free Lager - 6 Pack 14.0 \n", " Stouts 83783575217 Sierra Nevada Brewing Co. - Stout - 6 Pack 10.0 \n", " Wine Red 89744402585 Trapiche - Malbec - 750 mL Bottle 6.0 \n", " 492130008399 Yellow Tail - Merlot - 750 mL Bottle 6.0 " ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "updated_df = multi_col_lvl_df.copy()\n", "\n", "df2 = updated_df.xs('Beer', level='Category', drop_level=False).copy() # .copy() is to avoid SettingwithCopyWarning\n", "df2[idx['Dollars', 'Store 1']] = \"We ALSO changed this\"\n", "\n", "updated_df.update(df2, join=\"left\", overwrite=True, filter_func=None, raise_conflict=False)\n", "updated_df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Display options" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "pd.set_option('display.multi_sparse', True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# That's all, folks!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hopefully this guide has been a gentle and practical tutorial on multiindex DataFrames in Pandas. If you find any errors, material ommissions, or topics that could just be explained more clearly, please submit leave a comment or better yet, a pull request!\n", "\n", "To continue your learning, find a list of my top picks for Pandas resources - generally and specifically for MultiIndexes. Of particular note is the [Official list of available methods for MultiIndexes](https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.MultiIndex.html), which has a full listing of all the available methods (many not listed here). **This link is specifically for Pandas version 0.22, because, for some reason, the list in future versions excludes a lot of still valuable methods!** What's more, this list provides the ONLY explanation for some methods, since many MultiIndex methods lack even the most basic of docstrings." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Resources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Official MultiIndex References**\n", " * [Official MultiIndex / Advanced Indexing Tutorial](https://pandas.pydata.org/pandas-docs/stable/advanced.html?highlight=indexslice#hierarchical-indexing-multiindex)\n", " * [Official list of available methods for MultiIndexes](https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.MultiIndex.html)\n", " \n", "**Other MultiIndex Tutorials**\n", " * [Official Reshaping by Stacking and Unstacking Tutorial](https://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-by-stacking-and-unstacking)\n", " * [Hierarchical indices, groupby and pandas by DataCamp](https://www.datacamp.com/community/tutorials/pandas-multi-index)\n", " * [Python Data Science Handbook by Jake Vanderplas](https://jakevdp.github.io/PythonDataScienceHandbook/03.05-hierarchical-indexing.html#Methods-of-MultiIndex-Creation)\n", " * [MultiIndex slicing by Nelson](https://www.somebits.com/~nelson/pandas-multiindex-slice-demo.html)\n", " * [Pandas examples and cookbook by Eric Neilsen, Jr.](http://ehneilsen.net/notebook/pandasExamples/pandas_examples.html)\n", " \n", "**General Pandas Tutorials**\n", " * [Official 10 Minutes to Pandas Tutorial](https://pandas.pydata.org/pandas-docs/stable/10min.html)\n", " * [Official Pandas Cheat Sheet](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf)\n", " * [Official Indexing and Selecting Data Tutorial](https://pandas.pydata.org/pandas-docs/stable/indexing.html)\n", " * [Understanding SettingwithCopyWarning in pandas by DataQuest](https://www.dataquest.io/blog/settingwithcopywarning/)\n", " \n", "**Pandas Operations and Efficiency**\n", " * [A Beginner’s Guide to Optimizing Pandas Code for Speed by Sofia Heisler](https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6)\n", " * [Using pandas with large data by DataQuest](https://www.dataquest.io/blog/pandas-big-data/)\n", " * [Pandas Under The Hood by Jeff Tratner](http://www.jeffreytratner.com/slides/pandas-under-the-hood-pydata-seattle-2015.pdf)\n", " * [Pandas 2.0 Internals: Data structure changes by Wes McKinney](https://pandas-dev.github.io/pandas2/internal-architecture.html)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }