{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

Table of Contents

\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# code for loading the format for the notebook\n", "import os\n", "\n", "# path : store the current path to convert back to it later\n", "path = os.getcwd()\n", "os.chdir(os.path.join('..', '..', 'notebook_format'))\n", "\n", "from formats import load_style\n", "load_style(plot_style = False)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Ethen 2017-07-12 15:42:46 \n", "\n", "CPython 3.5.2\n", "IPython 5.4.1\n", "\n", "numpy 1.13.1\n", "pandas 0.20.2\n" ] } ], "source": [ "os.chdir(path)\n", "import numpy as np\n", "import pandas as pd\n", "\n", "# 1. magic to print version\n", "# 2. magic so that the notebook will reload external python modules\n", "%load_ext watermark\n", "%load_ext autoreload \n", "%autoreload 2\n", "\n", "%watermark -a 'Ethen' -d -t -v -p numpy,pandas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Pandas's Pivot Table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Following the tutorial from the following link. [Blog: Pandas pivot table explained](http://pbpython.com/pandas-pivot-table-explained.html).\n", "\n", "The General rule of thumb is that once you use multiple grouby you should evaluate whether a pivot table is a useful approach.\n", "\n", "One of the challenges with using the panda’s pivot_table is making sure you understand your data and what questions you are trying to answer with the pivot table. It is a seemingly simple function but can produce very powerful analysis very quickly. In this scenario, we'll be tracking a sales pipeline (also called funnel). The basic problem is that some sales cycles are very long (e.g. enterprise software, capital equipment, etc.) and the managemer wants to understand it in more detail throughout the year. Typical questions include:\n", "\n", "- How much revenue is in the pipeline?\n", "- What products are in the pipeline?\n", "- Who has what products at what stage?\n", "- How likely are we to close deals by year end?\n", "\n", "Many companies will have CRM tools or other software that sales uses to track the process, while they may be useful tools for analyzing the data, inevitably someone will export the data to Excel and use a PivotTable to summarize the data. Using a panda’s pivot table can be a good alternative because it is:\n", "\n", "- Quicker (once it is set up)\n", "- Self documenting (look at the code and you know what it does)\n", "- Easy to use to generate a report or email\n", "- More flexible because you can define custome aggregation functions" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AccountNameRepManagerProductQuantityPriceStatus
0714466Trantow-BarrowsCraig BookerDebra HenleyCPU130000presented
1714466Trantow-BarrowsCraig BookerDebra HenleySoftware110000presented
2714466Trantow-BarrowsCraig BookerDebra HenleyMaintenance25000pending
3737550Fritsch, Russel and AndersonCraig BookerDebra HenleyCPU135000declined
4146832Kiehn-SpinkaDaniel HiltonDebra HenleyCPU265000won
\n", "
" ], "text/plain": [ " Account Name Rep Manager \\\n", "0 714466 Trantow-Barrows Craig Booker Debra Henley \n", "1 714466 Trantow-Barrows Craig Booker Debra Henley \n", "2 714466 Trantow-Barrows Craig Booker Debra Henley \n", "3 737550 Fritsch, Russel and Anderson Craig Booker Debra Henley \n", "4 146832 Kiehn-Spinka Daniel Hilton Debra Henley \n", "\n", " Product Quantity Price Status \n", "0 CPU 1 30000 presented \n", "1 Software 1 10000 presented \n", "2 Maintenance 2 5000 pending \n", "3 CPU 1 35000 declined \n", "4 CPU 2 65000 won " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_excel('sales-funnel.xlsx')\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pivot the Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we build up the pivot table, it's probably easiest to take one step at a time. Add items and check each step to verify you are getting the results you expect.\n", "\n", "The simplest pivot table must have a dataframe and an `index`, which stands for the column that the data will be aggregated upon and `values`, which are the aggregated value." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Price
ManagerRep
Debra HenleyCraig Booker20000.000000
Daniel Hilton38333.333333
John Smith20000.000000
Fred AndersonCedric Moss27500.000000
Wendy Yule44250.000000
\n", "
" ], "text/plain": [ " Price\n", "Manager Rep \n", "Debra Henley Craig Booker 20000.000000\n", " Daniel Hilton 38333.333333\n", " John Smith 20000.000000\n", "Fred Anderson Cedric Moss 27500.000000\n", " Wendy Yule 44250.000000" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_table(index = ['Manager', 'Rep'], values = ['Price'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, the values will be averaged, but we can do a count or a sum by providing the `aggfun` parameter." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
meanlen
PricePrice
ManagerRep
Debra HenleyCraig Booker20000.0000004
Daniel Hilton38333.3333333
John Smith20000.0000002
Fred AndersonCedric Moss27500.0000004
Wendy Yule44250.0000004
\n", "
" ], "text/plain": [ " mean len\n", " Price Price\n", "Manager Rep \n", "Debra Henley Craig Booker 20000.000000 4\n", " Daniel Hilton 38333.333333 3\n", " John Smith 20000.000000 2\n", "Fred Anderson Cedric Moss 27500.000000 4\n", " Wendy Yule 44250.000000 4" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# you can provide multiple arguments to almost every argument of the pivot_table function\n", "df.pivot_table(index = ['Manager', 'Rep'], values = ['Price'], aggfunc = [np.mean, len])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to see sales broken down by the products, the columns variable allows us to define one or more columns. **Note:** The confusing points with the pivot_table is the use of `columns` and `values`. `Columns` are optional - they provide an additional way to segment the actual values you care about. The aggregation functions are applied to the `values` you've listed." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sum
Price
ProductCPUMaintenanceMonitorSoftware
ManagerRep
Debra HenleyCraig Booker65000.05000.0NaN10000.0
Daniel Hilton105000.0NaNNaN10000.0
John Smith35000.05000.0NaNNaN
Fred AndersonCedric Moss95000.05000.0NaN10000.0
Wendy Yule165000.07000.05000.0NaN
\n", "
" ], "text/plain": [ " sum \n", " Price \n", "Product CPU Maintenance Monitor Software\n", "Manager Rep \n", "Debra Henley Craig Booker 65000.0 5000.0 NaN 10000.0\n", " Daniel Hilton 105000.0 NaN NaN 10000.0\n", " John Smith 35000.0 5000.0 NaN NaN\n", "Fred Anderson Cedric Moss 95000.0 5000.0 NaN 10000.0\n", " Wendy Yule 165000.0 7000.0 5000.0 NaN" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_table(index = ['Manager','Rep'], values = ['Price'],\n", " columns = ['Product'], aggfunc = [np.sum])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The NaNs are a bit distracting. If we want to remove them, we could use `fill_value` to set them to 0." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sum
PriceQuantity
ProductCPUMaintenanceMonitorSoftwareCPUMaintenanceMonitorSoftware
ManagerRep
Debra HenleyCraig Booker6500050000100002201
Daniel Hilton10500000100004001
John Smith350005000001200
Fred AndersonCedric Moss9500050000100003101
Wendy Yule1650007000500007320
\n", "
" ], "text/plain": [ " sum \\\n", " Price Quantity \n", "Product CPU Maintenance Monitor Software CPU \n", "Manager Rep \n", "Debra Henley Craig Booker 65000 5000 0 10000 2 \n", " Daniel Hilton 105000 0 0 10000 4 \n", " John Smith 35000 5000 0 0 1 \n", "Fred Anderson Cedric Moss 95000 5000 0 10000 3 \n", " Wendy Yule 165000 7000 5000 0 7 \n", "\n", " \n", " \n", "Product Maintenance Monitor Software \n", "Manager Rep \n", "Debra Henley Craig Booker 2 0 1 \n", " Daniel Hilton 0 0 1 \n", " John Smith 2 0 0 \n", "Fred Anderson Cedric Moss 1 0 1 \n", " Wendy Yule 3 2 0 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_table(index = ['Manager', 'Rep'], values = ['Price', 'Quantity'],\n", " columns = ['Product'], aggfunc = [np.sum], fill_value = 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can move items to the index to get a different visual representation. The following code chunk removes Product from the `columns` and add it to the `index` and also uses the `margins` = True parameter to add totals to the pivot table. " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sum
PriceQuantity
ManagerRepProduct
Debra HenleyCraig BookerCPU65000.02.0
Maintenance5000.02.0
Software10000.01.0
Daniel HiltonCPU105000.04.0
Software10000.01.0
John SmithCPU35000.01.0
Maintenance5000.02.0
Fred AndersonCedric MossCPU95000.03.0
Maintenance5000.01.0
Software10000.01.0
Wendy YuleCPU165000.07.0
Maintenance7000.03.0
Monitor5000.02.0
All522000.030.0
\n", "
" ], "text/plain": [ " sum \n", " Price Quantity\n", "Manager Rep Product \n", "Debra Henley Craig Booker CPU 65000.0 2.0\n", " Maintenance 5000.0 2.0\n", " Software 10000.0 1.0\n", " Daniel Hilton CPU 105000.0 4.0\n", " Software 10000.0 1.0\n", " John Smith CPU 35000.0 1.0\n", " Maintenance 5000.0 2.0\n", "Fred Anderson Cedric Moss CPU 95000.0 3.0\n", " Maintenance 5000.0 1.0\n", " Software 10000.0 1.0\n", " Wendy Yule CPU 165000.0 7.0\n", " Maintenance 7000.0 3.0\n", " Monitor 5000.0 2.0\n", "All 522000.0 30.0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_table(index = ['Manager', 'Rep', 'Product'],\n", " values = ['Price', 'Quantity'], aggfunc = [np.sum], margins = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can define the status column as a category and set the order we want in the pivot table." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sum
Price
ManagerStatus
Debra Henleywon65000.0
pending50000.0
presented50000.0
declined70000.0
Fred Andersonwon172000.0
pending5000.0
presented45000.0
declined65000.0
All522000.0
\n", "
" ], "text/plain": [ " sum\n", " Price\n", "Manager Status \n", "Debra Henley won 65000.0\n", " pending 50000.0\n", " presented 50000.0\n", " declined 70000.0\n", "Fred Anderson won 172000.0\n", " pending 5000.0\n", " presented 45000.0\n", " declined 65000.0\n", "All 522000.0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Status'] = df['Status'].astype('category')\n", "df['Status'] = df['Status'].cat.set_categories(['won', 'pending', 'presented', 'declined'])\n", "df.pivot_table(index = ['Manager', 'Status'], values = ['Price'],\n", " aggfunc = [np.sum], fill_value = 0, margins = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A really handy feature is the ability to pass a dictionary to the `aggfunc` so you can perform different functions on each of the values you select. This has a side-effect of making the labels a little cleaner." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PriceQuantity
meansumlen
ProductCPUMaintenanceMonitorSoftwareCPUMaintenanceMonitorSoftwareCPUMaintenanceMonitorSoftware
ManagerStatus
Debra Henleywon65000000650000001000
pending400005000004000010000001200
presented3000000100003000000200001002
declined35000000700000002000
Fred Andersonwon825007000001650007000002100
pending050000005000000100
presented3000005000100003000005000100001011
declined65000000650000001000
\n", "
" ], "text/plain": [ " Price \\\n", " mean sum \n", "Product CPU Maintenance Monitor Software CPU \n", "Manager Status \n", "Debra Henley won 65000 0 0 0 65000 \n", " pending 40000 5000 0 0 40000 \n", " presented 30000 0 0 10000 30000 \n", " declined 35000 0 0 0 70000 \n", "Fred Anderson won 82500 7000 0 0 165000 \n", " pending 0 5000 0 0 0 \n", " presented 30000 0 5000 10000 30000 \n", " declined 65000 0 0 0 65000 \n", "\n", " Quantity \\\n", " len \n", "Product Maintenance Monitor Software CPU Maintenance \n", "Manager Status \n", "Debra Henley won 0 0 0 1 0 \n", " pending 10000 0 0 1 2 \n", " presented 0 0 20000 1 0 \n", " declined 0 0 0 2 0 \n", "Fred Anderson won 7000 0 0 2 1 \n", " pending 5000 0 0 0 1 \n", " presented 0 5000 10000 1 0 \n", " declined 0 0 0 1 0 \n", "\n", " \n", " \n", "Product Monitor Software \n", "Manager Status \n", "Debra Henley won 0 0 \n", " pending 0 0 \n", " presented 0 2 \n", " declined 0 0 \n", "Fred Anderson won 0 0 \n", " pending 0 0 \n", " presented 1 1 \n", " declined 0 0 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "table = df.pivot_table(index = ['Manager','Status'], \n", " columns = ['Product'], \n", " values = ['Quantity','Price'],\n", " aggfunc = {'Quantity': len, 'Price': [np.sum, np.mean]}, \n", " fill_value = 0)\n", "table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once you have generated your data, it is in a DataFrame so you can filter on it using your standard DataFrame functions. e.g. We can look at all of our pending and won deals." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PriceQuantity
meansumlen
ProductCPUMaintenanceMonitorSoftwareCPUMaintenanceMonitorSoftwareCPUMaintenanceMonitorSoftware
ManagerStatus
Debra Henleywon65000000650000001000
pending400005000004000010000001200
Fred Andersonwon825007000001650007000002100
pending050000005000000100
\n", "
" ], "text/plain": [ " Price \\\n", " mean sum \n", "Product CPU Maintenance Monitor Software CPU Maintenance \n", "Manager Status \n", "Debra Henley won 65000 0 0 0 65000 0 \n", " pending 40000 5000 0 0 40000 10000 \n", "Fred Anderson won 82500 7000 0 0 165000 7000 \n", " pending 0 5000 0 0 0 5000 \n", "\n", " Quantity \n", " len \n", "Product Monitor Software CPU Maintenance Monitor Software \n", "Manager Status \n", "Debra Henley won 0 0 1 0 0 0 \n", " pending 0 0 1 2 0 0 \n", "Fred Anderson won 0 0 2 1 0 0 \n", " pending 0 0 0 1 0 0 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# .query uses strings for boolean indexing and we don't have to \n", "# specify the dataframe that the Status is comming from\n", "table.query(\"Status == ['pending','won']\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Reference \n", "\n", "- [Blog: Pandas pivot table explained](http://pbpython.com/pandas-pivot-table-explained.html)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" }, "toc": { "nav_menu": { "height": "126px", "width": "252px" }, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "303px" }, "toc_section_display": "block", "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }