{ "metadata": { "name": "", "signature": "sha256:e482f4f5ab07b9d93a023397296727c05dbdf34fb020c3778470927e6b4cddd7" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# Lets Create a Dataframe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Lets Create a Dataframe\n", "The code below creates a pandas dataframe that we can use to pivot out some data. You will have to download the notebook and run it to see the final result as NBConvert cannot render it and it has to run against an open kernel. You can download the notebook at [here](https://raw.githubusercontent.com/Tooblippe/insightstack-blog/gh-pages/downloads/notebooks/pandas_pivot.ipynb)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from pandas import Series, DataFrame, pivot_table\n", "import numpy as np\n", "import numpy\n", "\n", "d = { 'Class' : Series( ['a', 'b', 'b','a','a', 'b', 'b','a','a', 'b', 'b','a','a','b','b','b']),\n", " 'Area' : Series( ['North','East', 'South', 'West','North','East', 'South', 'West','North','East', 'South', 'West','South', 'West','South', 'West']),\n", " 'Type' : Series( ['square', 'round','square', 'round', 'round', 'square', 'round', 'square', 'round', 'square','round', 'square',]),\n", " 'Web' : Series( ['Y','N','N','Y','Y','N','N','Y','Y','N','N','Y','Y','N','N','Y']),\n", " 'Agent' : Series( ['Mike', 'John', 'Pete','Mike', 'John', 'Pete','Mike', 'John', 'Pete','Mike', 'John', 'Pete','John', 'Pete','John', 'Pete']),\n", " 'Income' : Series( [20., 40., 90., 20.]),\n", " 'Profit' : Series( [1., 2., 3., 4.,1., 2., 3., 4.,1., 2., 3., 4.,1., 2., 3., 4.]),\n", " 'Stock' : Series( [20., 23., 33., 43.,12., 21., 310., 41.,11., 21., 31., 41.,11., 22., 34., 54.] )\n", " }\n", "df = DataFrame(d)\n", "print df\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " Agent Area Class Income Profit Stock Type Web\n", "0 Mike North a 20 1 20 square Y\n", "1 John East b 40 2 23 round N\n", "2 Pete South b 90 3 33 square N\n", "3 Mike West a 20 4 43 round Y\n", "4 John North a NaN 1 12 round Y\n", "5 Pete East b NaN 2 21 square N\n", "6 Mike South b NaN 3 310 round N\n", "7 John West a NaN 4 41 square Y\n", "8 Pete North a NaN 1 11 round Y\n", "9 Mike East b NaN 2 21 square N\n", "10 John South b NaN 3 31 round N\n", "11 Pete West a NaN 4 41 square Y\n", "12 John South a NaN 1 11 NaN Y\n", "13 Pete West b NaN 2 22 NaN N\n", "14 John South b NaN 3 34 NaN N\n", "15 Pete West b NaN 4 54 NaN Y\n", "\n", "[16 rows x 8 columns]\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now lets just create a nice pivot to see if everything is working. We call the sum function on the values for 'Profit' and 'Stock' for the 'Area' and 'Class' rows." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pivot_table( df, rows=['Area','Class'], values = ['Profit','Stock'], aggfunc=numpy.sum)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", " | \n", " | Profit | \n", "Stock | \n", "
---|---|---|---|
Area | \n", "Class | \n", "\n", " | \n", " |
East | \n", "b | \n", "6 | \n", "65 | \n", "
North | \n", "a | \n", "3 | \n", "43 | \n", "
South | \n", "a | \n", "1 | \n", "11 | \n", "
b | \n", "12 | \n", "408 | \n", "|
West | \n", "a | \n", "12 | \n", "125 | \n", "
b | \n", "6 | \n", "76 | \n", "
6 rows \u00d7 2 columns
\n", "