{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using Sumatra with Pandas in IPython\n", "\n", "This notebook demonstrates how to use\n", "[Sumatra](http://neuralensemble.org/sumatra/) to capture simulation\n", "input data and meta data and then\n", "export these records into a Pandas data frame. Sumatra has a\n", "standalone web interface built with Django which allows users to view the\n", "data. Data can also be imported into Python, but it takes a lot of\n", "code to manipulate and display in useful custom formats. Pandas seems\n", "like the ideal solution for manipulating Sumatra's data. In particular,\n", "the ability to easily and quickly combine input data, meta data, and\n", "output data into custom data frames is really powerful for\n", "data analysis, reproducibility and sharing.\n", "\n", "The first step in using Sumatra is to set up a simulation. Here the\n", "simulation just runs a diffusion problem using FiPy and outputs the\n", "time taken for a time step. The goal of the work is to test FiPy's parallel\n", "speed up based on different input parameters.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Set Up the Simulations" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%matplotlib inline\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sumatra requires a file that specifies the parameters."
] }, { "cell_type": "code", "collapsed": false, "input": [ "import json\n", "\n", "params = {'N': 10, 'suite': 'trilinos', 'iterations': 100}\n", "\n", "with open('params.json', 'w') as fp:\n", "    json.dump(params, fp)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The script file for running the simulation is `fipy_timing.py`. It reads the JSON file, runs the simulation and then stores the run times in `data.txt`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%writefile fipy_timing.py\n", "\n", "\"\"\"\n", "Usage: fipy_timing.py [<jsonfile>]\n", "\n", "\"\"\"\n", "\n", "from docopt import docopt\n", "import json\n", "import timeit\n", "import numpy as np\n", "import fipy as fp\n", "import os\n", "\n", "arguments = docopt(__doc__, version='Run FiPy timing')\n", "jsonfile = arguments['<jsonfile>']\n", "\n", "# Read the parameters from the optional JSON file; fall back to defaults otherwise.\n", "if jsonfile:\n", "    with open(jsonfile, 'rb') as ff:\n", "        params = json.load(ff)\n", "else:\n", "    params = dict()\n", "\n", "N = params.get('N', 10)\n", "iterations = params.get('iterations', 100)\n", "suite = params.get('suite', 'trilinos')\n", "sumatra_label = params.get('sumatra_label', '')\n", "\n", "attempts = 3\n", "\n", "setup_str = '''\n", "import fipy as fp\n", "import numpy as np\n", "np.random.seed(1)\n", "L = 1.\n", "N = {N:d}\n", "m = fp.GmshGrid3D(nx=N, ny=N, nz=N, dx=L / N, dy=L / N, dz=L / N)\n", "v0 = np.random.random(m.numberOfCells)\n", "v = fp.CellVariable(mesh=m)\n", "v0 = np.resize(v0, len(v))  # Gmsh doesn't always give us the correct sized grid!\n", "eqn = fp.TransientTerm(1e-3) == fp.DiffusionTerm()\n", "v[:] = v0.copy()\n", "\n", "import fipy.solvers.{suite} as solvers\n", "solver = solvers.linearPCGSolver.LinearPCGSolver(precon=None, iterations={iterations}, tolerance=1e-100)\n", "\n", "eqn.solve(v, dt=1., solver=solver)\n", "v[:] = v0.copy()\n", "'''\n", "\n", "timeit_str = '''\n", "eqn.solve(v, dt=1., solver=solver)\n",
"fp.parallelComm.Barrier()\n", "'''\n", "\n", "timer = timeit.Timer(timeit_str, setup=setup_str.format(N=N, suite=suite, iterations=iterations))\n", "times = timer.repeat(attempts, 1)\n", "\n", "if fp.parallelComm.procID == 0:\n", " filepath = os.path.join('Data', sumatra_label)\n", " filename = 'data.txt'\n", " np.savetxt(os.path.join(filepath, filename), times)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Overwriting fipy_timing.py\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Without using Sumatra and in serial this is run with" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!python fipy_timing.py params.json" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 145 }, { "cell_type": "markdown", "metadata": {}, "source": [ "and the output data file is" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!more Data/data.txt" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1.253199577331542969e-02\r\n", "1.225900650024414062e-02\r\n", "1.175403594970703125e-02\r\n" ] } ], "prompt_number": 146 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Create a Git Repository" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this demo, I'm assuming that the working directory is a Git repository set up with\n", "\n", " $ git init\n", "$ git add fipy_timing.py\n", " $ git ci -m \"Add timing script.\"\n", " \n", "Sumatra requires that the script is sitting in the a working copy of a repository." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "!git log -1" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "commit 6a830dac2ea45ea090ec91a4a0f5263be10e95f3\r\n", "Author: Daniel Wheeler \r\n", "Date: Wed Feb 26 13:50:21 2014 -0500\r\n", "\r\n", " Fix README.\r\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Configure Sumatra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the repository is set up, the Sumatra project can be configured. Here we use the `distributed` launch mode because we want Sumatra to launch and\n", "record parallel jobs." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "\n", "\\rm -rf .smt\n", "smt init smt-demo\n", "smt configure --executable=python --main=fipy_timing.py\n", "smt configure --launch_mode=distributed\n", "smt configure -g uuid\n", "smt configure -c store-diff\n", "smt configure --addlabel=parameters" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Sumatra project successfully set up\n", "Multiple versions found, using /home/wd15/anaconda/bin/python. If you wish to use a different version, please specify it explicitly\n", "Multiple versions found, using /home/wd15/anaconda/bin/mpirun. If you wish to use a different version, please specify it explicitly\n" ] } ], "prompt_number": 148 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sumatra requires that a `Data/` directory exists in the working copy."
] }, { "cell_type": "code", "collapsed": false, "input": [ "!mkdir Data" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we were not using Sumatra, we would launch the job with\n", "\n", "    $ mpirun -n 2 python fipy_timing.py params.json\n", "\n", "The equivalent command using Sumatra is\n", "\n", "    $ smt run -n 2 params.json\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Run Simulations\n", "\n", "In the following cell we just run a batch of simulations with varying parameters." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import itertools\n", "\n", "nprocs = (1, 2, 4, 8)\n", "iterations_ = (100,)\n", "Ns = (10, 40)\n", "suites = ('trilinos',)\n", "tag = 'demo4'\n", "\n", "for nproc, iterations, N, suite in itertools.product(nprocs, iterations_, Ns, suites):\n", "    !smt run --tag=$tag -n $nproc params.json N=$N iterations=$iterations suite=$suite" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "/home/wd15/anaconda/lib/python2.7/site-packages/matplotlib/__init__.py:740: UserWarning: Found matplotlib configuration in ~/.matplotlib/. To conform with the XDG base directory standard, this configuration location has been deprecated on Linux, and the new location is now '/home/wd15/.config'/matplotlib/. Please move your configuration there to ensure that matplotlib will continue to find it in the future.\r\n", " _get_xdg_config_dir())\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "/home/wd15/hg/sumatra/sumatra/launch.py:263: UserWarning: mpi4py is not available, so Sumatra is not able to obtain platform information for remote nodes.\r\n", " warnings.warn(\"mpi4py is not available, so Sumatra is not able to obtain platform information for remote nodes.\")\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Data keys are [data.txt(5a633ee751a2043deb828d28d4daefc0372c5b63)]\r\n", "Data keys are [data.txt(f66b8ea1a596f8f06321d389fc412e4809d50697)]\r\n", "Data keys are [data.txt(1d0effa7003d8e75bf998cce5a4f71a72dff7025)]\r\n", "Data keys are [data.txt(38d19ef93141666ffbabcb30c700028a87631ced)]\r\n", "Data keys are [data.txt(3fa95ba72c140b64dc07d4d2a7a9d0af38da06c8)]\r\n", "Data keys are [data.txt(2c702e292ec4da225cdfc414d29f80e9df2ccfd7)]\r\n", "Data keys are [data.txt(c63d4c44770cdc4329b13dc863762c8b5069dd2e)]\r\n", "Data keys are [data.txt(c4e026e885b64122b4b578c334ef9f078eac3df1)]\r\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Import into Pandas Dataframe\n", "\n", "The important part of this story is how to import the data into a Pandas data frame. This is straightforward because Sumatra's default export format is a JSON file containing all the records."
] }, { "cell_type": "code", "collapsed": false, "input": [ "import json\n", "import pandas\n", "\n", "!smt export\n", "with open('.smt/records_export.json') as ff:\n", " data = json.load(ff)\n", "\n", "df = pandas.DataFrame(data)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Sumatra data is now in a Pandas data frame, albeit a touch raw." ] }, { "cell_type": "code", "collapsed": false, "input": [ "print df" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Int64Index: 18 entries, 0 to 17\n", "Data columns (total 23 columns):\n", "datastore 18 non-null values\n", "dependencies 18 non-null values\n", "diff 18 non-null values\n", "duration 18 non-null values\n", "executable 18 non-null values\n", "input_data 18 non-null values\n", "input_datastore 18 non-null values\n", "label 18 non-null values\n", "launch_mode 18 non-null values\n", "main_file 18 non-null values\n", "outcome 18 non-null values\n", "output_data 18 non-null values\n", "parameters 18 non-null values\n", "platforms 18 non-null values\n", "reason 18 non-null values\n", "repeats 0 non-null values\n", "repository 18 non-null values\n", "script_arguments 18 non-null values\n", "stdout_stderr 18 non-null values\n", "tags 18 non-null values\n", "timestamp 18 non-null values\n", "user 18 non-null values\n", "version 18 non-null values\n", "dtypes: float64(1), object(22)\n" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "print df[['label', 'duration']]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " label duration\n", "0 ac32b9fc6df4 32.574247\n", "1 f372719a0648 6.860735\n", "2 9695c2529109 24.854690\n", "3 bf710e5339ff 3.592334\n", "4 179099003946 28.189878\n", "5 eeffa50a08bd 3.995513\n", "6 180b9c889f94 34.474531\n", "7 0732f6d89fc4 4.214891\n", "8 
f2073ab41bd7 34.305814\n", "9 27bb809fa8ad 24.290209\n", "10 179247440765 28.439126\n", "11 0731f5a8e231 32.452093\n", "12 0330697ac505 2.569647\n", "13 a04a49a2107b 1.873670\n", "14 6b3d5ac075a6 1.283991\n", "15 1b124fc57ced 5.125001\n", "16 6b04488b14ed 5.207692\n", "17 5cc0546270c9 5.027596\n" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Reformat the Raw Imported Dataframe\n", "\n", "While all the meta data is important, often we want the input and\n", "output data combined into a data frame in a digestible\n", "form. Typically, we want a graph of reduced input versus reduced\n", "output.\n", "\n", "The first step is to introduce columns in the data frame for each of the input parameters (input data). The input data is buried in the `launch_mode` and `parameters` columns of the raw data frame." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import json\n", "df = df.copy()\n", "df['nproc'] = df.launch_mode.map(lambda x: x['parameters']['n'])\n", "for p in 'N', 'iterations', 'suite':\n", " df[p] = df.parameters.map(lambda x: json.loads(x['content'])[p])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now have the input data exposed as columns in the data frame." 
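, "\n", "\n", "The extraction above relies on how each exported record is nested. As a minimal sketch, assuming a record shaped the way the lookups above imply (the values below are made up for illustration and are not taken from a real export), the same steps can be tried on a toy data frame:\n", "\n", "```python\n", "import json\n", "import pandas\n", "\n", "# Hypothetical record; only the keys used by the lookups above are included.\n", "record = {\n", "    'label': 'example',\n", "    'launch_mode': {'parameters': {'n': 2}},\n", "    'parameters': {'content': json.dumps({'N': 10, 'iterations': 100, 'suite': 'trilinos'})},\n", "}\n", "toy = pandas.DataFrame([record])\n", "toy['nproc'] = toy.launch_mode.map(lambda x: x['parameters']['n'])\n", "for p in 'N', 'iterations', 'suite':\n", "    toy[p] = toy.parameters.map(lambda x: json.loads(x['content'])[p])\n", "print(toy[['label', 'nproc', 'N', 'iterations', 'suite']])\n", "```\n", "\n", "The parameter set appears to be stored as a JSON string under `content`, which is why it has to be decoded with `json.loads` before a key can be read."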
] }, { "cell_type": "code", "collapsed": false, "input": [ "columns = ['label', 'nproc', 'N', 'iterations', 'suite', 'tags']\n", "print df[columns].sort('nproc')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " label nproc N iterations suite tags\n", "15 1b124fc57ced 1 10 100 trilinos [demo2]\n", "6 180b9c889f94 1 40 100 trilinos [demo4]\n", "7 0732f6d89fc4 1 10 100 trilinos [demo4]\n", "11 0731f5a8e231 1 40 100 trilinos [demo3]\n", "17 5cc0546270c9 2 10 100 trilinos []\n", "4 179099003946 2 40 100 trilinos [demo4]\n", "5 eeffa50a08bd 2 10 100 trilinos [demo4]\n", "16 6b04488b14ed 2 10 100 trilinos [test]\n", "10 179247440765 2 40 100 trilinos [demo3]\n", "14 6b3d5ac075a6 2 10 100 trilinos [demo2]\n", "2 9695c2529109 4 40 100 trilinos [demo4]\n", "3 bf710e5339ff 4 10 100 trilinos [demo4]\n", "9 27bb809fa8ad 4 40 100 trilinos [demo3]\n", "13 a04a49a2107b 4 10 100 trilinos [demo2]\n", "0 ac32b9fc6df4 8 40 100 trilinos [demo4]\n", "1 f372719a0648 8 10 100 trilinos [demo4]\n", "12 0330697ac505 8 10 100 trilinos [demo2]\n", "8 f2073ab41bd7 8 40 100 trilinos [demo3]\n" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following pulls out the run times stored in the output files from each simulation into a `run_time` column." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "\n", "datafiles = df['output_data'].map(lambda x: x[0]['path'])\n", "datapaths = df['datastore'].map(lambda x: x['parameters']['root'])\n", "data = [np.loadtxt(os.path.join(x, y)) for x, y in zip(datapaths, datafiles)]\n", "# Use the fastest of the repeated timings as the run time.\n", "df['run_time'] = [min(d) for d in data]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "columns.append('run_time')\n", "print df[columns].sort('nproc')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " label nproc N iterations suite tags run_time\n", "15 1b124fc57ced 1 10 100 trilinos [demo2] 0.012017\n", "6 180b9c889f94 1 40 100 trilinos [demo4] 0.419316\n", "7 0732f6d89fc4 1 10 100 trilinos [demo4] 0.012037\n", "11 0731f5a8e231 1 40 100 trilinos [demo3] 0.402522\n", "17 5cc0546270c9 2 10 100 trilinos [] 0.011014\n", "4 179099003946 2 40 100 trilinos [demo4] 0.252318\n", "5 eeffa50a08bd 2 10 100 trilinos [demo4] 0.011214\n", "16 6b04488b14ed 2 10 100 trilinos [test] 0.010802\n", "10 179247440765 2 40 100 trilinos [demo3] 0.253387\n", "14 6b3d5ac075a6 2 10 100 trilinos [demo2] 0.011340\n", "2 9695c2529109 4 40 100 trilinos [demo4] 0.173505\n", "3 bf710e5339ff 4 10 100 trilinos [demo4] 0.010188\n", "9 27bb809fa8ad 4 40 100 trilinos [demo3] 0.179195\n", "13 a04a49a2107b 4 10 100 trilinos [demo2] 0.010196\n", "0 ac32b9fc6df4 8 40 100 trilinos [demo4] 0.178471\n", "1 f372719a0648 8 10 100 trilinos [demo4] 0.016224\n", "12 0330697ac505 8 10 100 trilinos [demo2] 0.016702\n", "8 f2073ab41bd7 8 40 100 trilinos [demo3] 0.184142\n" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select the simulation records tagged `demo4` and create masks for `N = 10` and `N = 40`. We want to plot these two sets of results as different curves on the same graph."
] }, { "cell_type": "code", "collapsed": false, "input": [ "tag_mask = df.tags.map(lambda x: 'demo4' in x)\n", "df_tmp = df[tag_mask]\n", "m10 = df_tmp.N.map(lambda x: x == 10)\n", "m40 = df_tmp.N.map(lambda x: x == 40)\n", "df_N10 = df_tmp[m10]\n", "df_N40 = df_tmp[m40]\n", "print df_N10[columns].sort('nproc')\n", "print df_N40[columns].sort('nproc')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " label nproc N iterations suite tags run_time\n", "7 0732f6d89fc4 1 10 100 trilinos [demo4] 0.012037\n", "5 eeffa50a08bd 2 10 100 trilinos [demo4] 0.011214\n", "3 bf710e5339ff 4 10 100 trilinos [demo4] 0.010188\n", "1 f372719a0648 8 10 100 trilinos [demo4] 0.016224\n", " label nproc N iterations suite tags run_time\n", "6 180b9c889f94 1 40 100 trilinos [demo4] 0.419316\n", "4 179099003946 2 40 100 trilinos [demo4] 0.252318\n", "2 9695c2529109 4 40 100 trilinos [demo4] 0.173505\n", "0 ac32b9fc6df4 8 40 100 trilinos [demo4] 0.178471\n" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can plot the results we're interested in. Larger system size gives better parallel speed up." 
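, "\n", "\n", "As a rough check of that claim, the speed-up relative to a single process can be computed from the `run_time` values printed above. The snippet below is only a sketch with the `demo4` numbers copied in by hand; the names `speedup`, `times_N10` and `times_N40` are introduced here for illustration.\n", "\n", "```python\n", "nprocs = [1, 2, 4, 8]\n", "# run_time values for the demo4 records, copied from the table above\n", "times_N10 = [0.012037, 0.011214, 0.010188, 0.016224]\n", "times_N40 = [0.419316, 0.252318, 0.173505, 0.178471]\n", "\n", "def speedup(times):\n", "    # ratio of the one-process time to each run time\n", "    return [times[0] / t for t in times]\n", "\n", "for n, s10, s40 in zip(nprocs, speedup(times_N10), speedup(times_N40)):\n", "    print('nproc=%d  N=10: %.2f  N=40: %.2f' % (n, s10, s40))\n", "```\n", "\n", "With these numbers the `N = 40` case reaches a speed-up of roughly 2.4 on four processes, while the `N = 10` case never exceeds about 1.2 and is slower on eight processes than on one."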
] }, { "cell_type": "code", "collapsed": false, "input": [ "ax = df_N10.plot('nproc', 'run_time', label='N={0}'.format(df_N10.N.iat[0]))\n", "df_N40.plot('nproc', 'run_time', ylim=0, ax=ax, label='N={0}'.format(df_N40.N.iat[0]))\n", "plt.ylabel('Run Time (s)')\n", "plt.xlabel('Number of Processes')\n", "plt.legend()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ "" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAYMAAAENCAYAAADt3gm6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xt0G+WdN/CvnARSWmpZCuzhpQu2rJRwXV8ELS19MZHt\nQE+h3TixaWlf3AX58rJsYImxDdvy0pZNYlEInGUdSdA1dNuzli32dHfpQjJDTEuBEkUKy6W0DqM0\nlEsBXUyBskA87x9aTa1Ytka2RqPH/n7O0Tmeq79xkvl5nmfmeSyqqqogIqJlrcLsAEREZD4WAyIi\nYjEgIiIWAyIigsHFIBQKQZZlBAKBeffzer3a1/39/QCQ9xgiIioew4pBJBIBALjdbgBANBrNuZ8k\nSdizZ4+2HAgEsHbtWtTW1hoVjYiIjmJYMQgGg6iqqgIAOBwOSJKUcz+LxZK1HAgEMDk5ifXr1xsV\njYiIjmJYMUilUrDZbNpyPB6ftU80GtXuHDISiQRkWc5qOiIiImMZ2meQ7322RCIxa53H44Hb7UY8\nHocsy0ZFIyKiGQwrBlarVbvYJ5NJ2O32rO257gr8fj9CoRAAwG63Q1EUo+IREdEMK406cUdHB8Lh\nMNxuN2KxGFpaWgCkm4+sVisURYGiKIjH40gkEohGo6itrYXL5QKQblbKHJNx8skn49VXXzUqMhHR\nklRbW4uDBw/Ou49hdwb19fUAAFmWYbVaUVdXBwBobm4GALS1taGtrQ0WiwVTU1OwWCxwu92QJAmh\nUAhr1qzRjsl49dVXoaqqsJ9bbrnF9AzMb36O5Zhf5OxLIf9LL72U95pt2J0BkG7/B5DVHBQOh2ft\nk9kPSBeJperQoUNmR1gU5jeXyPlFzg6In18PvoFMREQsBqXU2dlpdoRFYX5ziZxf5OyA+Pn1sKiq\nKsx8BhaLBQLFJSIqC3qunbwzKKGJiQmzIywK85tL5PzFym6z2WCxWPiZ4zPzRd9CGdqBTERUTMlk\nkq0D8zh6eJ+CjmUzERGJgteA+c3182EzERER6cJiUEIit/kCzG82kfOLnH25YDEgIlqkVCqFioqK\nWZNyjY+Po7W1taBztbS0IBaLZa2LRCJobGyEzWZDT0/PovPmwmJQQk1NTWZHWBTmN5fI+UXOXoiB\ngQFMTU0t6FhJktDd3Q1Zlmd1BLvdbvT29iIWiyEcDhsyEySLARFRkXR1dWUNr1OIaDSa82kgSZJg\nt9tx9dVXo7KyEjt27IDP51ts1FlYDEpI9HZT5jeXyPlFzl6Ibdu2QZKkBc3F0tfXh127dsFqtWat\nVxQFDQ0N2nJjY6Mhw/sLVwze/u+3zY5ARGXKYinOZzECgQC6u7uL8wcCMDU1NetlslQqVbTzZwhX\nDL73s++ZHWHBRG83ZX5ziZy/VNlVtTifxWhra4PD4YDX681q9hkfH4fNZsv62O12PPjgg/Oeb+ZE\nYUYS7g3
kH0R/AE+DB2vta82OQkSUk8/nQ2NjI7q6urR1mzZtwqZNmwo+l8PhyGoWCofDWc1GxSLc\nncGNn78RN+y+wewYCyJ6uynzm0vk/CJnX4iamhoMDg5iaGhIuzvw+/2z7gxsNlveOwO32w1FUbR+\niB07duDyyy8vemZDi0EoFIIsy3kfg/J6vbqP2fKZLXjhzRfwyMFHipqViGgxjn4SqK+vDw6HQ1vu\n6upCIpGY9dm4ceO85wHSM0Z2d3fDZrPB6XRi69atxc9v1NhEkUgEsVgMbW1tCAQCcLlc2lSYM0mS\nhKGhIezevTvvMZnxNX7y4k8wKA/imZ5nsGrFKiPiE1EZ4thE8yvLsYmCwSCqqqoApNu8JEnKud/M\nKjg6OqrrmMtOuwwnf/Jk7ArvKnJqIqLlybBikEqlsh6Hisfjs/aJRqNZ8yMf/QhVrmOAdAG5c8Od\n+O7Pvou33nuriKmNJXq7KfObS+T8ImdfLgztM8h3W5LrcSm9t4BnnXgWOs7swC17b1lQNiIi+hPD\nisHMZ2OTySTsdnvW9qPvCvQcc7RbL7oVYy+M4dnfP1vE5MYR+TlxgPnNJnJ+kbMvF4a9Z9DR0YFw\nOAy3241YLIaWlhYA6eYjq9UKRVGgKAri8TgSiQSi0eicx8zU2dmJ6upqAOni8dVPfBVbHt4C+f/I\neOyxxwD86R9e5taUy1zm8tJZpvwmJiYwMjICANr1Mi/VQH6/X5UkSfX7/dq6xsbGWfs4nU41Go3O\neUxGrrgfHvlQPfOeM9UHX3iwyOmLb+/evWZHWBTmN5fI+YuV3eBLlvDm+vno+bkZ+gZyZvS+mc1B\n4XB41j4zR/nLdcx8VlasxM6Ld6L7P7pxydpLsHrl6sXGJiJadoR7AzmXZkczzj7xbOx8aqfZUeYl\nersp85tL5PwiZ9ejmJPbRCKRWQPTcXKbAtzeejtuf+J2vPqHV82OQkTL1GImt8nweDyzzsHJbQrg\ntDlxdcPVuEm+yewocxK9A4z5zSVyfpGzF2Ixk9sAwNDQEGpra7MesefkNgtw8xduxu6XduPpV542\nOwoRLUOLmdxGURT4/X7s2LFj1vpSTG4j3BDW8zn+2ONx2/rbsOXhLXjir57IOeCTmURvN2V+c4mc\nv1TZLbcW5/+8esvCxz/KTG5z8ODBgo5rb29HIBDQhuTJKNXkNkuqGADAlXVX4p599+DHz/4YV5xz\nhdlxiKiEFnMRL5a2tjb4fD54vd6sUUvHx8ez5jcA0kPr+P1+xONxuFwuXHTRRUgmk1n7lGpymyXV\nTAQAFZYK3H3J3eiX+vHOB++YHSeL6O2mzG8ukfOLnH0hfD4ftm3bhn379mnrNm3aNGv46ng8jra2\nNkiSpM13UFtbCwCw2+04dOgQJ7dZjM/9+edwYfWF2PH4jvw7ExEVWaGT29x7771IpVI4dOiQ1t8Q\ni8VQXV1dssltDJvPwAiFjGX+8tTLqPPVYX/XflRbq40NRkQlUa7zGaRSKdjtdhw5ciRrvdPpRG1t\nLR55RP9kXLnOFY1GsXnzZiQSCXR0dGB4eDjnsYuZz2DJFgMA+M5j38FzbzyH4OaggamIqFTKtRiU\ni7Kc3KYcbP3cVvzylV/isUOPmR0FgPjtpsxvLpHzi5x9uVjSxeC4VcfB2+LFloe34Mj0kfwHEBEt\nU0u6mQhIT5Zz4ciF+Po5X0dXY1f+A4iobLGZaH7sM8gj+loUl/zoErz41y/CutpqQDIiKgUWg/mx\nzyCP+pPqcemnL8V3HvuOqTlEbzdlfnOJnF/k7MvFsigGAPC99d/DA888gF+/9WuzoxARlZ1l0UyU\n8f0nvo9HDz2Kh772UBFTEVGp2Gy2WcM10J9UVVXlHLrC9GaiUCgEWZbnHHt7fHwcsixnTdbQ398P\nAIaM133tZ67FZHwSP538adHPTUTGSyQSUFWVnzk+ixnDyLBiEIlEAPxp+
spoNJq1XZZlyLKsvWp9\n4MABAOkisHbtWm18jmI6ZsUxuHPDnfjbR/4WHxz5oOjnz0f0dlPmN5fI+UXODoifXw/DikEwGNSG\nYnU4HJAkKWu72+3WXqlOJBKoq6sDkC4Gk5OTWL9+vSG5vrj2i6i2VuOep+8x5PxERCIyrBikUqms\nMbjj8fisfaampuD1ejE4OKitSyQSkGUZXq/XkFwWiwV3brgTf//43+PNd9805HvMReTx6AHmN5vI\n+UXODoifXw9D+wzydVhUVlair68PPp8PsVgMQHr+T7fbjXg8vqDZgvQ4/YTTccXZV+Bbe79lyPmJ\niERj2OQ2MydkSCaTsNvtWdsjkQgsFgvq6+vR0NCA8fFxWK1W2Gw2tLW1wW63Q1EUrc8ho7OzE9XV\n1dr3qKur06p2pl1Pz/ItF94Cx9864PrAhas3Xl3w8QtZ3rlz54LzlsMy8zP/QpdntrmXQ56lnn9i\nYgIjIyMAoF0v81INEolEVL/fr6qqqg4NDanRaFRVVVVNJpPaOkmSVFVV1e7ubjUUCqmSJKmpVEpV\nVVXt7+/XjskodtzhfcPqhf90oTo9PV3U885l7969Jfk+RmF+c4mcX+Tsqip+fj3XTkPfMwgEAtos\nPR6PBwDgcrkQDocxNTWFYDA9tLSiKNi2bRuA9OOoQHpih61bt2adr9ivoh+ZPoIGfwO+9b+/hU1n\nbCraeYmIygnHJtJhb2wvvvmTb+JX1/wKH1v1saKem4ioHJj+0pkILqq5CI3/qxF3PHmH4d9rZruj\niJjfXCLnFzk7IH5+PZZ9MQAAb4sXdzx1B155+xWzoxARmWLZNxNl3CzfjMNvH8YP//KHhpyfiMgs\nbCYqwOAXBvFo7FE8+fKTZkchIio5FoP/8YljPoFt7m3Y8vAWTKvThnwP0dsdmd9cIucXOTsgfn49\nWAxm+Po5X4fFYsE//9c/mx2FiKik2GdwlKd+9xTagm148ZoXcfyxxxv6vYiISoF9Bgvw2U99Fu4a\nN7Y9vs3sKEREJcNikMM29zb49vugJJWinlf0dkfmN5fI+UXODoifXw8WgxxO/uTJuOH8G7B199b8\nOxMRLQHsM5jD+x+9j9PvOR33XXYf1tcYM9EOEVEpsM9gEVavXI3bW27HdQ9fh4+mPzI7DhGRoVgM\n5rHx9I2wH2dHYH+gKOcTvd2R+c0lcn6RswPi59eDxWAeFosFOzfsxP977P8h8ceE2XGIiAzDPgMd\nev+jF8esOAZ3XXJXyb83EdFicT6DInnz3Tdxxj+egcc6H8MZJ5xR8u9PRLQYpncgh0IhyLKMQCB3\nm/v4+DhkWUZPT4/uY8xwwsdPwM1fuBnXP3L9ooqR6O2OzG8ukfOLnB0QP78ehhWDSCQCANqE9tFo\nNGu7LMuQZRlutxuKoiAajeY9xkzXnHsNDk8dxkOTD5kdhYio6AwrBsFgEFVVVQAAh8MBSZKytrvd\nbgwPDwMAEokE6uvrMTo6Ou8xZlq1YhXu3HAnrn/kenxw5IMFnaOpqam4oUqM+c0lcn6RswPi59fD\nsGKQSqVgs9m05Xg8PmufqakpeL1eDA4Oasv5jjHTxc6LcZr9NNz9y7vNjkJEVFSG9hnka1+vrKxE\nX18ffD4fYrGYrmPMdseGO7D98e34/Tu/L/hY0dsdmd9cIucXOTsgfn49Vhp1YqvVikQi/Wx+MpmE\n3W7P2h6JRGCxWFBfX4+GhgaMj4/nPQYAOjs7UV1drX2Puro67RYu8xdm9PKVf3El/u7Rv8MVn7yi\noOMPHDhQknxGLTM/83NZjOWJiQmMjIwAgHa9zMewR0uj0SjC4TA8Hg+8Xi9aWlpQV1eHVCoFq9UK\nr9eLhoYGuN1u9PT0oLW1FTU1NTmP0cKa9Gjp0VLvp7DuH9bhp1f8FA0nNZgdh4hoXqY+WlpfXw8g\n/dRQ5jd4AGhubgYAdHV1QVEUBAIBV
FVVYePGjXMeU26sq6347kXfxZaHt5RFcSIiWiy+dLZAR6aP\nwBVwYeDzA+g4q0PXMRMTE9otnYiY31wi5xc5OyB+ftNfOlvKVlSswF0X34UbpRvx3ofvmR2HiGhR\neGewSO1j7TjzhDNxS9MtZkchIsqJYxOVwKHUITT6G3Gg+wD+vPLPzY5DRDQLm4lKoNpajWvOvQb9\nUn/efTOPfomK+c0lcn6RswPi59eDxaAI+j/fj58f/jkeP/y42VGIiBZEdzNRLBZDJBLBvn37cN55\n56GhoUH3ywzFUo7NRBk/fvbH+P6T38c+zz5UWFhjiah8FKWZKBqNor29HTt27EAikUBLSwveeust\nbN++He3t7dpbkcvdV8/6KlavXI2RAyNmRyEiKpyah9/vX9T2YtIR11RP/+5p9aTbT1Kn3p/KuX3v\n3r2lDVRkzG8ukfOLnF1Vxc+v59qZ987A4/EASI8oGovFtJFGDx06lLWdgHNPPhcbnBtw289uMzsK\nEVFBdPcZtLe3o7u7G2NjY3A4HBgbG8O+ffuMzpelnPsMMl77w2s4e/hsPHnVk1hrX2t2HCKi4j5a\nmkqltFnJbrzxxrK/KJvlpONPQt/n+rB1z1azoxAR6aa7GKiqisHBQTQ0NCAajSKVShmZS2jXffY6\nPP/G89j90u6s9aI/q8z85hI5v8jZAfHz66G7GPh8PthsNgwODiIcDmNsbMzIXEI7duWxuL31dlz/\nyPX4aPojs+MQEeWVt89gYGAAl19+ec7hpKPRKEZHR7F9+3bDAs4kQp9BhqqqaPlhC76y7iv46/P+\n2uw4RLSMFW1soqGhIUiSBKvVCpvNhkQigVQqhZaWFvT19RUtcD4iFQMAeO6N57D+/vX41TW/gv24\n2bO2ERGVgq5rZyHPqiaTSTUSiaipVErX/uPj46okSXO+i+D3+1W/36/29/dr62688UZt29EKjFsW\nrnnoGvWah65RVVX8Z5WZ31wi5xc5u6qKn1/PtbOgcROsVivq6+tRWVmZd99IJAIAcLvdANJNSjPJ\nsozm5mZ4PB4oigJZlgEAgUAAa9euRW1tbSHRytatTbci+HwQz/7+WbOjEBHNybBBdILBIKqqqgAA\nDocDkiRlbVcURVvncDgQi8UApIvB5OQk1q9fb1S0krIfZ8e3L/w2rn/kelx44YVmx1kUkWd6Apjf\nTCJnB8TPr4dhxSCVSsFms2nL8Xg8a7vH49HeXo5EInC5XACARCIBWZbh9XqNilZyPa4evPbOaxh9\nftTsKEREORk6vKaqo7M3EomgsbFRe1rJ4/HA7XYjHo9rTUeiW1mxEr4v+fA3w3+D8+87Hw888wD+\n+OEfzY5VMNGftWZ+84icHRA/vx4r9e4YjUaxefNmVFVVob29HbW1tdi4ceOc+1utViQSCQBAMpmE\n3Z77aRpZlrFt2zYA6SYim82GtrY22O12KIqi9TlkdHZ2akNnW61W1NXVabdwmb+wcly+4JQL0F/b\nj/dXvY9/ee5fcMPuG3CRehEuPe1SfOPL3zA9n57lzAi15ZKH+csrH5fLZ3liYgIjIyMAoH+qAb29\n0Y2NjWoymVQ3b96sLc8nEoloTwQNDQ2p0WhUVdX0E0kZPp9P+1qSJFWSJO1Jpf7+fu2YjALilr2X\nEi+pA3sG1BO9J6ru+93q+PPj6gcffWB2LCJagvRcOwt+mihjZn9ALvX19QDSv/lnfoMHgObmZgCA\nJEkYGBiA0+mEzWaDxWKB2+2GJEkIhUJYs2ZNzhfdlgpHlQPbmrfh8HWHcVX9Vbj76btx6s5T8e29\n38bLUy+bHY+Ilhndo5Z2d3fDYrFAURQ0NDRAURQEg0Gj82UR7aWzo01MTGi3dLk8/8bz8O334UfP\n/ggXnHIBel29aK1tLZuZ0/LlL3fMbx6RswPi5y/qqKU+nw8NDQ1wOByora0teSFYDs488Uzcfcnd\nO
HzdYVz66Utx86M3w3m3Ezse34E33n3D7HhEtITpvjPIjEOUGa3UYrFgeHjY0HBHE/3OYCH2vbIP\nw+Fh/OuL/4pLnJeg19WLC065ABaLxexoRCSIoo1NBAAulwuDg4OwWq1QVVVr4y+l5VgMMpJ/TOKB\nZx7Arv27sMKyAj2uHnzjnG+gcnX+t8GJaHkrajNR5pFPt9uN5ubmkheCpSDz6NdCVH2sCls+uwUv\n/N8X8A9f/Af8/PDPUX1XNTz/5kHktUjxQs5jMfnLAfObR+TsgPj59dD9nkFLSwtaW1vhcDgAmNNM\nROmfe1N1E5qqm/D6O6/jB9EfYOPoRpz48RPR6+pFx1kdOG7VcWbHJCLB6G4mcjqd2LFjhzZIHZuJ\nyseR6SN4+ODDGA4P46nfPYWvn/N19Lh6sG7NOrOjEVEZKGqfQXt7u+lPELEY5Pfb1G/h3+/HfdH7\ncPoJp6PX1YuvrPsKjllxjNnRiMgkRe0zSCaTaG1txcDAAAYGBjA4OLjogMtNKdodT7Weitvct+Hw\n9YfR6+rFcHgYp+48FTfLN+O3qd8u6tyit5syv3lEzg6In18P3X0GAwMDRuagIjtmxTFoP7Md7We2\n41dv/gq+/T40+Btw/qfOR6+rFxc7L8aKihVmxySiMpG3mcjr9aKvry9nMSjV3McZbCZanPc+fA+j\nz41iODyMN959A12NXbiq/ir82Sf+zOxoRGQgPdfOvHcGmaeHXC4XX3QS3HGrjsM367+Jb9Z/E/tf\n3Y9d4V1Yd886tNa2otfViwtPvZB/x0TLVN47g3LoOM4Q/c6gHMc3Sb2fwg+f+SF27d+FaXUaPY09\nuLLuSlhXW2ftW475C8H85hE5OyB+/qJ0ICuKUrRAVH6sq6249jPX4rne5+D7kg+/fOWXqLmrBlf9\n5Crse2Wf2fGIqETy3hnYbDZ0d3fPqioWi0WblKZURL8zEMUb776BH0R/AN9+H+wfs6PX1YvLz7oc\nHz/m42ZHI6IFKMp7Bk6nE/39/Tm3ZeYwLhUWg9KaVqfxyMFHMBwexi9e/gWuOPsK9Lh6cMYJZ5gd\njYgKUJRi4HK5EA6HixpsoUQvBiK3Ox6eOoyb77sZkirh0/ZPo9fVi79c95c4duWxZkfTTeSfPyB2\nfpGzA+LnL0qfgcvlWnCAUCgEWZYRCARybg8EAggEAlmPreY7hsxxSuUpuKrhKhy+7jCuPe9aBCIB\nnLLzFAxKg4glY2bHI6JF0j0cRaEikQhisRja2toQCATgcrm0qTCB9HSYDocDNTU1aG9vR3d3N2w2\nGxRFmfMY0e8Mlppfv/Vr+Pb78MAzD+C8k89Dr6sXX1z7Rb7MRlRmijocRaGCwSCqqqoApN9VkCQp\na7uiKNo6h8MBRVEwOjqqzbOc6xgqL6etOQ13bLgDL1//MjrO7MBtP78NNXfV4Hs/+x5e+8NrZscj\nogIYVgxSqRRsNpu2HI/Hs7Z7PB6tAzoSicDlciGVSsFut895jOhEH99krvwfW/UxXFl3JZ66+in8\n5PKf4OWpl3HGP56BzWOb8Wjs0bK5m1uqP38RiJwdED+/HrqLQSAQgNPp1D5r167Ne4yei0AkEkFj\nY6PWHFQuFw5amPqT6uG71IdDWw7houqLsOXhLVh3zzrc+eSdSPwxYXY8IpqD7oHqfD4f9u/fr81n\nkI/VakUikf7Pn0wms37jn0mWZe19BT3HdHZ2orq6Wtu/rq5O6+XPVO9yXc6sK5c8RuavXF2JM949\nA3evuxuraldhV3gXvvVP38IFp1yAWztvxXknn4fHHnusbPOX47LI+Zuamsoqz1LPPzExgZGREQDQ\nrpf56O5A7unpwa5du3SdFACi0SjC4TA8Hg+8Xi9aWlpQV1eHVCql9Qv4/X50dXUBSBcFm82W8xgt\nLDuQhfbmu29i5MAIdu3fhcpjK9Hj6sHXzv4aPnHMJ8yORrSkFbU
DOR6Pw+Vy6Z7PINPsI8uy9hs8\nADQ3NwMAJEnCwMAAnE4nbDYbLBbLnMcsFZnKLarF5j/h4yeg7/N9mLx2Etvc2/CfB/8Tp9x5Cq55\n6Bo898ZzxQk5j+X+8zeTyNkB8fProbuZqLu7O2tZz+iWmQ7imdNjZl5ga25u1pqE8h1DS0uFpQIb\nnBuwwbkBv3v7dwjsD2DDP29AjbUGva5ebDpjk1AvsxEtBYa9Z2AENhMtXR8e+RD//pt/x67wLhx4\n/QA66zrR3diNWlut2dGIhFfUOZCPfhPZYrFg377SjmrJYrA8TMYn4dvvw/3P3I/GkxrR4+rBlz79\nJays0H0jS0QzFLXPIBwOa5/t27ezGWcBRG93LFX+tfa1uL31drx8/cu44uwr4H3Ci+qd1bh14la8\n8vYrCz4vf/7mETk7IH5+PRb00llzczMikUixsxBlWb1yNb7xF9/AL/7qF3joaw/h9Xdex1nDZ6Et\n2IY9L+3BtDptdkSiJUN3M5HX69W+jsfjiEQi2L17t2HBcmEzEf3hv/+AHz37IwyHh/Heh++hp7EH\nnXWdsB+X+z0WIipyn0EoFMpabm5u1v0CWrGwGFCGqqp46ndPYTg8jH/79b/hstMuQ4+rB+d/6nzO\n40x0lKIUg6mpKQSDQdTW1mL9+vWIxWKIRCKQJAnDw8NFDZyP6MVg5tujIirX/PH34trLbMetOg69\nrl5ccfYVOP7Y47P2K9f8eomcX7TsqqpiWp3GR9Mf4Yh6BBN7J3D+F87HEfUIjkwfwRH1SHpbnq+P\nTB/RznH018U4R97z/c+20c2jea+deR/P2Lx5M6xWK8bGxuDz+fDSSy/B5XKhtpaP/FF5sB9nxw2f\nuwHXn389Ho09iuHwMG6Sb0LHmR3oPbcX5/zZOWZHNJV2UTPxAvXiCy8icmxkcRfNEmTP7DutTqPC\nUoEVlhXpIdkPAcdGjsWKihVYWbFSW7/C8j/LOb4+et+5vs55jjznP7biWKxcpe97rbCswChG8/47\n0TXt5cGDBwGk50PO9aJYqYh+Z0Cl88rbr+C+6H3w7/fjlMpT0OPqwbo168z9rc6k3yABaBeHhVy4\n5jtu3nNYinRhLOI5CvkzL6XmxqJPe2n2FJgsBlSoj6Y/wkO/eQj3Ru/F6++8vvALSJEvNKU8R4XF\nsJHqSRB6rp18i6eERGs3PZqI+VdWrMSX130ZX173ZSHzzyRyfpGzA+Ln1yNvMYhEInA6nQDSs5Nl\nvrZYLJicnDQ2HRERlUTeZqJUKjXntsxQ1KXCZiIiosIV9T2DcsBiQERUuKKOTUSLJ/r4JsxvLpHz\ni5wdED+/HoYWg1AoBFmWEQgE5tynv78/5/J8xxARUXEZ1kwUiUQQi8XQ1taGQCAAl8ulzWSW4ff7\nMTQ0pL3HAKTfZbDb7fD5fFi/fn12WDYTEREVzNRmomAwiKqqKgCAw+GAJEmz9unq6oLD4chaFwgE\nMDk5OasQEBGRcQwrBqlUCjabTVuOx+O6jkskEpBlOWuU1KVC9HZH5jeXyPlFzg6In18PQ/sMFtKk\n4/F44Ha7EY/HIcuyAamIiOhohhUDq9WqjWOUTCZht+cfbz4QCGhDZdvtdiiKYlQ8U4j+BiPzm0vk\n/CJnB8TPr4dhw1F0dHQgHA7D7XYjFouhpaUFQLr5aK6X1RwOhzbXcjwe146ZqbOzE9XV1QDSBaeu\nrk77i8rcynGZy1zm8nJenpiYwMjICABo18u8VAP5/X5VkiTV7/dr6xobG7Wvx8bG1KqqKjUQCGjr\nxsfH1fGh/3m1AAANSElEQVTxcdXr9c46n8FxDbd3716zIywK85tL5PwiZ1dV8fPruXYaOlCdx+MB\nALjdbm3dzFFPN23ahE2bNmUd09bWZmQkIiLKgcNREBEtcRyOgoiIdGExKKFMB4+omN9cIucXOTsg\nfn49WAyIiIh9BkRESx37DIi
ISBcWgxISvd2R+c0lcn6RswPi59eDxYCIiNhnQES01LHPgIiIdGEx\nKCHR2x2Z31wi5xc5OyB+fj1YDIiIiH0GRERLHfsMiIhIFxaDEhK93ZH5zSVyfpGzA+Ln14PFgIiI\njO0zCIVCsFqtUBRFm+jmaP39/dixY4euY9hnQERUOFP7DCKRCIA/zXIWjUZn7eP3+xEKhQo6hoiI\nis+wYhAMBlFVVQUgPdG9JEmz9unq6oLD4SjoGJGJ3u7I/OYSOb/I2QHx8+thWDFIpVKw2Wzacjwe\nN+QYIiJaPEM7kBfSvr+U+wSamprMjrAozG8ukfOLnB0QP78ehhUDq9WKRCIBAEgmk7Db7YYcQ0RE\ni7fSqBN3dHQgHA7D7XYjFouhpaUFQLopyGq1FnTMTJ2dnaiurgaQLh51dXVa1c6065Xr8s6dO4XK\ny/zltSxy/plt7uWQZ6nnn5iYwMjICABo18u8VAP5/X5VkiTV7/dr6xobG7Wvx8bG1KqqKjUQCMx7\nTIbBcQ23d+9esyMsCvObS+T8ImdXVfHz67l2cmwiIqIljmMTERGRLiwGJTSz3VFEzG8ukfOLnB0Q\nP78eLAZERMT5DIiIljr2GRARkS4sBiUkersj85tL5PwiZwfEz68HiwEREbHPgIhoqWOfARER6cJi\nUEKitzsyv7lEzi9ydkD8/HqwGBAREfsMiIiWOvYZEBGRLiwGJSR6uyPzm0vk/CJnB8TPrweLARER\nsc+AiGipY58BERHpYmgxCIVCkGUZgUBA9/b+/n4AmPMYkYne7sj85hI5v8jZAfHz62FYMYhEIgAA\nt9sNAIhGo7q2BwIBrF27FrW1tUZFIyKioxhWDILBIKqqqgAADocDkiTp2h4IBDA5OYn169cbFc00\nTU1NZkdYFOY3l8j5Rc4OiJ9fD8OKQSqVgs1m05bj8biu7YlEArIsw+v1GhWNiIiOYmifQb7e61zb\nPR4P3G434vE4ZFk2KpopRG93ZH5ziZxf5OyA+Pn1WGnUia1WKxKJBAAgmUzCbrfPuT2VSsFutyMQ\nCMBms6GtrQ12ux2Komh9ChmdnZ2orq7WzlFXV6fdwmX+wsp1+cCBA2WVh/nLK99Sz8/l0i1PTExg\nZGQEALTrZT6GvWcQjUYRDofh8Xjg9XrR0tKCuro6pFIpWK3WWdubm5uRSCTgcrlQWVmJgYEBXH75\n5airq/tTWL5nQERUMFPfM6ivrwcAyLKs/QYPAM3NzTm319fXw+12Q5IkhEIhrFmzJqsQEBGRcfgG\ncglNTExot3QiYn5ziZxf5OyA+Pn5BjIREenCOwMioiWOdwZERKQLi0EJZR79EhXzm0vk/CJnB8TP\nrweLARERsc+AiGipY58BERHpwmJQQqK3OzK/uUTOL3J2QPz8erAYEBER+wyIiJY69hkQEZEuLAYl\nJHq7I/ObS+T8ImcHyi//9DTwwQfAe+8Bb78NJBLAG28Ar74KHD4MKArwm98AL7wA/Nd/6TunYfMZ\nEBEZZXoa+Oij0n1efBF4+unSfs/5PqoKrFoFrFyp76OHcH0Gr7yioqICWZ8VKzBrXWa9xZL+EC0l\npb4Yltun0IvhUvtUFNimo6fPQLhicNJJKqanoX2OHEHW8tHrVTVdDPQUjkKKzEL3zRSnXF/n216M\n40rxPco1m5kX0A8/5MXQzIvhcmd6MQiFQrBarVAUBR6PR9f2+Y5ZyNNEqpr+6CkchRSZhex74MAE\nzjqrKSvT0V/nWqfna6P2nfn1b387gU99qqkss+nZ/u67E7DZmky/kK1cubAL+RNPTGD9+iYhL4ai\nzwcgen49186VRn3zSCQCAHC73VAUBdFoVJvdbK7tmbBzHbMQM38zXGnYn1af3/zmAC67rMncEIuw\nc+cBXHddk9kxFkz0/C++eAAXX9xkdowFOXDggNAXU9Hz62HY7xfBYBBVVVUAAIfDAUmS8m4PB
oOw\nWq1zHiO6VCpldoRFYX5ziZxf5OyA+Pn1MKwYpFIp2Gw2bTkej+fdnu8YIiIyhqEtj/naqATquy6K\nQ4cOmR1hUZjfXCLnFzk7IH5+PQxrRbdarUgkEgCAZDIJu90+5/ZUKqVtn++Y2tpaWAR/TvT+++83\nO8KiML+5RM4vcnZA7Py1tbV59zGsGHR0dCAcDsPtdiMWi6GlpQVA+sJvtVqztiuKgpaWFqiqmvOY\njIMHDxoVl4hoWTOsmSjzFJAsy7BarairqwMANDc3z7l9rmOIiMhYQr101t/fjx07dpgdgwTl9XrR\n19dndgyisiTMqyt+vx+hUMjsGAsWCAQQCAQwMDBgdpSCjY+PQ5Zl9PT0mB1lwSRJwp49e8yOsSD9\n/f0A0v+GRBSJRBAKhYTMH4lEUFFRAafTCafTKeT/gVAoBFmW8/78hSkGXV1dcDgcZsdYEFmW0dzc\nDI/HA0VRIMuy2ZF0k2UZsixrfTsHDhwwO9KCiPzgQSAQwNq1a3V1Apaj7du3o62tDalUCtFo1Ow4\nBUkmk5iensbBgwcxNjYm3C9z0WgUDocDbrcbDodj3p+/MMVAZIqiaC/QORwOKIpiciL93G43hoeH\nAaSf9BKxHycajcLtdpsdY8ECgQAmJyexfv16s6MUbHx8HOeeey4AoK+vb9EjCpTazH834XAY1dXV\n5oVZoMydpaIo8/78WQxKwOPxaOMsRSIR7T+HKKampuD1ejE4OGh2lAXJPK4sqkQiAVmW4fV6zY5S\nsHA4jHg8jmg0KmT+DFmW0d7ebnaMgtXX16OmpgY2my3rhd5cWAxKKBKJoLGxUbjfrisrK9HX1wef\nz4dYLGZ2nIKIflcApH+ZcLvdiMfjQjUxZqxZs0b7jVTUfr89e/agsrLS7BgFS6VScDqdCAQC8Hg8\n8/7/ZTEoIVmWsW3bNrNjFCQSiWjtjA0NDRgfHzc5UWEURUEoFILf70cikRCuzToQCGgXULvdLlQT\nI5DOXFNTAyD9oum+fftMTrQwmYE1RRMIBNDd3Y22tjaMjY3N+/9XmGIwPj6OcDiMe++91+woC+L3\n+7XHGkX67U6W5aw3xUXrxGxra0NbWxssFgumpqaE60h2OBzauznxeFy4JsZNmzZpBSyVSuG8884z\nOVHhRCvAR/vkJz8JIN3/kRkINBeh3jMQlSRJaG9vh81mQyKRwPj4uDCdgVNTUwgGgwDS/ylEu7NZ\nCjJ3BrFYDFu3bjU5TeECgQBsNhvC4bCQ/35isRiGhoa0BylE4/V64XA4kEgkcs4rk8FiQERE4jQT\nERGRcVgMiIiIxYCIiFgMiIgILAZERAQWAyIiAosBERGBxYDKiCRJqKioyBo/ZWhoaFHj4A8NDRk6\nHk5LS0vWW/GpVAoVFRVwuVxwuVzauDBE5c6wOZCJCmWxWOBwOLB582aEw2Ft3WLPaRRFUWCxWHD1\n1VdnrXc4HFr+qakpVFVVzfvmJ1E54J0BlZWGhgace+65s36bDoVCWUMgu1wuAOm7iZaWFrS3t8Pp\ndMLr9aK1tRUul0sblG50dFRbN/Muobu7W/sNPrPv+Pg4uru74XQ6cejQIW3fzZs3a+fIjC21Y8cO\nhMNhPPjgg3P+ed566y2tIM08dywWy3nOo79X5i4pV1ZFUdDS0oLW1la0t7djamoq57pCjqfli3cG\nVDYyI6MMDw/D6XRqA7TlY7FYEAwGEQqF4PP5sHv3boRCIYyOjsJutyOVSmH37t0AAKfTiba2Nvj9\nflgsFoTDYaRSKbhcLhw8eBAAsH//fu1rIN3U9JnPfAZbt27F1NQUampqkEgkMDAwgFgsho0bN2bl\nURRFK1YAMDY2pn2dOfdc55y5PhqNQpIkqKqaM2soFILL5cK2bdu0AQVzrRsdHdV9vIjDNFNx8M6A\nypLP50N3d7eufRsaGgCk513ITI1qtVqRSqUApNv1MxwOB
2KxGCKRCBRFQXt7O7q6ulBVVaXtc3QR\nmjlD1MyL5VzDemWaiTKfTLGwWCxalrnOGQ6Hte9fX18Pj8czZ9auri6oqorW1laMjY3BZrPlXLd/\n/37dx9PyxWJAZSkzZ6vP59PWxeNxANCmEJ3PzAv1nj17tK8VRUFNTQ0aGxvR0NCAYDCIYDA47yxW\ntbW12nj2mX6ChcrkmuucDodDyytJEgYGBubMGgwG0dHRgd27d8PhcMDv9+dc53K5dB9Pyxebiahs\nWCyWrAvtrl27YLPZYLFY0NzcDJ/Ph9bWVjQ0NGj7zTzm6K8zrFYrWltbtWYYID17WHt7u7b+pptu\nypkBSM/dO3Pfmc0+uQrDfMUis22uc27fvl1bn0wmMTY2hurq6pxZXS4XNm/eDKvVCovFgrGxMSST\nyVnrCjmeli8OYU1ERGwmIiIiFgMiIgKLARERgcWAiIjAYkBERGAxICIisBgQERGA/w+acxCBsjMu\nAwAAAABJRU5ErkJggg==\n", "text": [ "" ] } ], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using Pandas it is easy to store a custom data frame." ] }, { "cell_type": "code", "collapsed": false, "input": [ "df.to_hdf('store.h5', 'df')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stderr", "text": [ "/home/wd15/anaconda/lib/python2.7/site-packages/pandas/io/pytables.py:1992: PerformanceWarning: \n", "your performance may suffer as PyTables will pickle object types that it cannot\n", "map directly to c-types [inferred_type->mixed,key->axis0] [items->None]\n", "\n", " warnings.warn(ws, PerformanceWarning)\n", "/home/wd15/anaconda/lib/python2.7/site-packages/pandas/io/pytables.py:1992: PerformanceWarning: \n", "your performance may suffer as PyTables will pickle object types that it cannot\n", "map directly to c-types [inferred_type->mixed,key->block0_items] [items->None]\n", "\n", " warnings.warn(ws, PerformanceWarning)\n" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "/home/wd15/anaconda/lib/python2.7/site-packages/pandas/io/pytables.py:1992: PerformanceWarning: \n", "your performance may suffer as PyTables will pickle object types that it cannot\n", "map directly to c-types [inferred_type->mixed,key->block2_values] [items->[u'datastore', u'dependencies', u'diff', u'executable', u'input_data', u'input_datastore', u'label', u'launch_mode', u'main_file', u'outcome', u'output_data', 
u'parameters', u'platforms', u'reason', u'repeats', u'repository', u'script_arguments', u'stdout_stderr', u'tags', u'timestamp', u'user', u'version', 'suite']]\n", "\n", " warnings.warn(ws, PerformanceWarning)\n", "/home/wd15/anaconda/lib/python2.7/site-packages/pandas/io/pytables.py:1992: PerformanceWarning: \n", "your performance may suffer as PyTables will pickle object types that it cannot\n", "map directly to c-types [inferred_type->mixed,key->block2_items] [items->None]\n", "\n", " warnings.warn(ws, PerformanceWarning)\n" ] } ], "prompt_number": 19 }, { "cell_type": "code", "collapsed": false, "input": [ "store = pandas.HDFStore('store.h5')\n", "print store.df.dependencies" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0 [{u'name': u'IPython', u'module': u'python', u...\n", "1 [{u'name': u'IPython', u'module': u'python', u...\n", "2 [{u'name': u'IPython', u'module': u'python', u...\n", "3 [{u'name': u'IPython', u'module': u'python', u...\n", "4 [{u'name': u'IPython', u'module': u'python', u...\n", "5 [{u'name': u'IPython', u'module': u'python', u...\n", "6 [{u'name': u'IPython', u'module': u'python', u...\n", "7 [{u'name': u'IPython', u'module': u'python', u...\n", "8 [{u'name': u'IPython', u'module': u'python', u...\n", "9 [{u'name': u'IPython', u'module': u'python', u...\n", "10 [{u'name': u'IPython', u'module': u'python', u...\n", "11 [{u'name': u'IPython', u'module': u'python', u...\n", "12 [{u'name': u'IPython', u'module': u'python', u...\n", "13 [{u'name': u'IPython', u'module': u'python', u...\n", "14 [{u'name': u'IPython', u'module': u'python', u...\n", "15 [{u'name': u'IPython', u'module': u'python', u...\n", "16 [{u'name': u'IPython', u'module': u'python', u...\n", "17 [{u'name': u'IPython', u'module': u'python', u...\n", "Name: dependencies, dtype: object\n" ] } ], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conclusion\n", "\n", "Sumatra stores 
data in an SQL-style database, which is not ideal for pulling data into Python for manipulation. Pandas is well suited to data manipulation, and pulling the records out of Sumatra and into Pandas is very easy." ] } ], "metadata": {} } ] }