{ "metadata": { "name": "", "signature": "sha256:3e182762cb13047048d97db09b34a646a983cb94c8f2d94e55d8d9342ddf1714" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Blaze - A Quick Tour\n", "\n", "Blaze provides a lightweight interface on top of pre-existing computational infrastructure. This notebook gives a quick overview of how Blaze interacts with a variety of data types." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from blaze import Data, by, compute" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Blaze wraps pre-existing data\n", "\n", "Blaze interacts with normal Python objects. Operations on Blaze `Data` objects create expression trees. \n", "\n", "These expressions deliver an intuitive numpy/pandas-like feel." ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data(1)\n", "x" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "1" ], "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "1" ] } ], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "x.dshape" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "dshape(\"int64\")" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "x + 1" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "2" ], "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "2" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "print type(x + 1)\n", "print type(compute(x + 1))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Lists\n", "\n", 
"Starting small, Blaze interacts happily with collections of data. \n", "\n", "It uses Pandas for pretty notebook printing." ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data([1, 2, 3, 4, 5])\n", "x" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ " _2\n", "0 1\n", "1 2\n", "2 3\n", "3 4\n", "4 5" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "x[x > 2] * 10" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ " _2\n", "0 30\n", "1 40\n", "2 50" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "x.dshape" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "dshape(\"5 * int64\")" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Or Tabular, Pandas-like datasets\n", "\n", "Slightly more exciting, Blaze operates on tabular data" ] }, { "cell_type": "code", "collapsed": false, "input": [ "L = [[1, 'Alice', 100],\n", " [2, 'Bob', -200],\n", " [3, 'Charlie', 300],\n", " [4, 'Dennis', 400],\n", " [5, 'Edith', -500]]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data(L, fields=['id', 'name', 'amount'])\n", "x.dshape" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "dshape(\"5 * {id: int64, name: string, amount: int64}\")" ] } ], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "x" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ " id name amount\n", "0 1 Alice 100\n", "1 2 Bob -200\n", "2 3 Charlie 300\n", "3 4 Dennis 400\n", "4 5 Edith -500" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "deadbeats = x[x.amount < 0].name\n", "deadbeats" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ " name\n", "0 Bob\n", "1 Edith" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Or it can even just drive pandas\n", "\n", "Blaze doesn't do work, it just tells other systems to do work.\n", "\n", "In the previous example, Blaze told Python which for-loops to write. In this example, it calls the right functions in Pandas. \n", "\n", "The user experience is identical, only performance differs." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from pandas import DataFrame\n", "\n", "df = DataFrame([[1, 'Alice', 100], \n", " [2, 'Bob', -200],\n", " [3, 'Charlie', 300],\n", " [4, 'Denis', 400],\n", " [5, 'Edith', -500]], columns=['id', 'name', 'amount'])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "df" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ " id name amount\n", "0 1 Alice 100\n", "1 2 Bob -200\n", "2 3 Charlie 300\n", "3 4 Denis 400\n", "4 5 Edith -500" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data(df)\n", "x" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ " id name amount\n", "0 1 Alice 100\n", "1 2 Bob -200\n", "2 3 Charlie 300\n", "3 4 Denis 400\n", "4 5 Edith -500" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "deadbeats = x[x.amount < 0].name\n", "deadbeats" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 16, "text": [ " name\n", "1 Bob\n", "4 Edith" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calling `compute`, we see that Blaze returns a thing like what it was given." ] }, { "cell_type": "code", "collapsed": false, "input": [ "type(compute(deadbeats))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "pandas.core.series.Series" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Other data types like SQLAlchemy Tables\n", "\n", "Blaze extends beyond just Python and Pandas (that's the main motivation.) \n", "\n", "Here it drives SQLAlchemy." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sqlalchemy import Table, Column, MetaData, Integer, String, create_engine\n", "\n", "tab = Table('bank', MetaData(),\n", " Column('id', Integer),\n", " Column('name', String),\n", " Column('amount', Integer))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data(tab)\n", "x.dshape" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "dshape(\"var * {id: ?int32, name: ?string, amount: ?int32}\")" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just like computations on pandas objects produce pandas objects, computations on SQLAlchemy tables produce SQLAlchemy Select statements. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "deadbeats = x[x.amount < 0].name\n", "compute(deadbeats)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ "" ] } ], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "print compute(deadbeats) # SQLAlchemy generates actual SQL" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "SELECT bank.name \n", "FROM bank \n", "WHERE bank.amount < :amount_1\n" ] } ], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Connect to a real database\n", "\n", "When we drive a SQLAlchemy table connected to a database we get actual computation." ] }, { "cell_type": "code", "collapsed": false, "input": [ "engine = create_engine('sqlite:////home/mrocklin/workspace/blaze/blaze/examples/data/iris.db')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 22 }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data(engine)\n", "x" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "Data: Engine(sqlite:////home/mrocklin/workspace/blaze/blaze/examples/data/iris.db)
DataShape: {
iris: var * {
sepal_length: ?float64,
sepal_width: ?float64,
petal_length: ?float64,
petal_width: ?float64,
species: ?string
..." ], "metadata": {}, "output_type": "pyout", "prompt_number": 23, "text": [ "Data: Engine(sqlite:////home/mrocklin/workspace/blaze/blaze/examples/data/iris.db)\n", "DataShape: {\n", " iris: var * {\n", " sepal_length: ?float64,\n", " sepal_width: ?float64,\n", " petal_length: ?float64,\n", " petal_width: ?float64,\n", " species: ?string\n", " ..." ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "x.iris" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 24, "text": [ " sepal_length sepal_width petal_length petal_width species\n", "0 5.1 3.5 1.4 0.2 Iris-setosa\n", "1 4.9 3.0 1.4 0.2 Iris-setosa\n", "2 4.7 3.2 1.3 0.2 Iris-setosa\n", "3 4.6 3.1 1.5 0.2 Iris-setosa\n", "4 5.0 3.6 1.4 0.2 Iris-setosa\n", "5 5.4 3.9 1.7 0.4 Iris-setosa\n", "6 4.6 3.4 1.4 0.3 Iris-setosa\n", "7 5.0 3.4 1.5 0.2 Iris-setosa\n", "8 4.4 2.9 1.4 0.2 Iris-setosa\n", "9 4.9 3.1 1.5 0.1 Iris-setosa\n", "..." ] } ], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "by(x.iris.species, shortest=x.iris.sepal_length.min(), \n", " longest=x.iris.sepal_length.max())" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 25, "text": [ " species longest shortest\n", "0 Iris-setosa 5.8 4.3\n", "1 Iris-versicolor 7.0 4.9\n", "2 Iris-virginica 7.9 4.9" ] } ], "prompt_number": 25 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use URI strings to ease access\n", "\n", "Often just figuring out how to produce the relevant Python object can be a challenge.\n", "\n", "Blaze supports many formats of URI strings" ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data('sqlite:////home/mrocklin/workspace/blaze/blaze/examples/data/iris.db::iris')\n", "x" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 26, "text": [ " sepal_length sepal_width petal_length petal_width species\n", "0 5.1 3.5 1.4 0.2 Iris-setosa\n", "1 4.9 3.0 1.4 0.2 Iris-setosa\n", "2 4.7 3.2 1.3 0.2 Iris-setosa\n", "3 4.6 3.1 1.5 0.2 Iris-setosa\n", "4 5.0 3.6 1.4 0.2 Iris-setosa\n", "5 5.4 3.9 1.7 0.4 Iris-setosa\n", "6 4.6 3.4 1.4 0.3 Iris-setosa\n", "7 5.0 3.4 1.5 0.2 Iris-setosa\n", "8 4.4 2.9 1.4 0.2 Iris-setosa\n", "9 4.9 3.1 1.5 0.1 Iris-setosa\n", "..." ] } ], "prompt_number": 26 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Once you have SQL, might as well go big" ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data('impala://ec2-54-90-201-28.compute-1.amazonaws.com')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### MongoDB\n", "\n", "Github's database is mirrored in a Mongo collection hosted in the Netherlands.\n", "\n", "Connecting via ssh tunnel. See http://ghtorrent.org/ to obtain access." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "users = Data('mongodb://ghtorrentro:ghtorrentro@localhost/github::users')\n", "users" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ " avatar_url \\\n", "0 https://secure.gravatar.com/avatar/a7e55f31bb4... \n", "1 https://secure.gravatar.com/avatar/eb8139078bc... \n", "2 https://secure.gravatar.com/avatar/13c7b665e0c... \n", "3 https://secure.gravatar.com/avatar/b7937805411... \n", "4 https://secure.gravatar.com/avatar/89e109fca84... \n", "5 https://secure.gravatar.com/avatar/7490b4e3e9c... \n", "6 https://secure.gravatar.com/avatar/dc078ac4dbd... \n", "7 https://secure.gravatar.com/avatar/fb844ffed6c... \n", "8 https://secure.gravatar.com/avatar/3843ec7861e... \n", "9 https://secure.gravatar.com/avatar/f611628c558... \n", "10 https://secure.gravatar.com/avatar/c57483c5cfe... \n", "\n", " bio \\\n", "0 None \n", "1 None \n", "2 None \n", "3 \n", "4 \n", "5 Perl, C, C++, JavaScript, PHP, Haskell, Ruby, ... \n", "6 None \n", "7 Git Ninja and language-agnostic problem solver... \n", "8 \n", "9 None \n", "10 None \n", "\n", " blog company \\\n", "0 None None \n", "1 None None \n", "2 \n", "3 None Appcelerator \n", "4 http://blogs.perl.org/users/steven_haryanto - \n", "5 http://c9s.me \n", "6 azhari.harahap.us CapungRiders \n", "7 http://dukeleto.pl Leto Labs LLC \n", "8 http://alanhaggai.org/ Thought Ripples \n", "9 arisdottle.net Team Rooster Pirates \n", "10 http://www.geekfarm.org/wu/muse/WebHome.html None \n", "\n", " created_at email followers following \\\n", "0 2012-05-04T13:59:54Z None 0 0 \n", "1 2012-05-03T18:47:13Z None 0 0 \n", "2 2010-04-07T12:15:00Z vad.viktor@gmail.com 2 3 \n", "3 2012-04-02T16:13:58Z yjin@appcelerator.com 0 0 \n", "4 2010-02-26T01:28:09Z stevenharyanto@gmail.com 39 307 \n", "5 2009-02-01T15:20:08Z cornelius.howl@gmail.com 330 599 \n", "6 2010-10-31T05:53:40Z azhari@harahap.us 26 11 \n", "7 2008-10-22T03:02:15Z jonathan@leto.net 175 635 \n", "8 2009-01-13T16:25:15Z haggai@cpan.org 46 365 \n", "9 2009-05-12T19:29:09Z amiri@roosterpirates.com 16 87 \n", "10 2009-02-08T03:28:54Z 
git-c@geekfarm.org 16 87 \n", "\n", " gravatar_id hireable \\\n", "0 a7e55f31bb45321f30211e901cd89ffa None \n", "1 eb8139078bc623dee103ed3917c080dc None \n", "2 13c7b665e0cbd94e0155387c35957d13 False \n", "3 b7937805411d278ceb839175e251e2a0 False \n", "4 89e109fca8474e5636c9feef7a8422ea False \n", "5 7490b4e3e9cb85a1f7dc0c8ea01a86e5 True \n", "6 dc078ac4dbdc06d3e3c0ec0b6801b53d False \n", "7 fb844ffed6c5a2e69638627e3b721308 True \n", "8 3843ec7861e271e803ea076035d683dd False \n", "9 f611628c5588f7a0a72c65ec1f94dfb8 False \n", "10 c57483c5cfe159b98a6e33ee7e9eec38 False \n", "\n", " html_url id location \\\n", "0 https://github.com/Michaelwussler 1706010 None \n", "1 https://github.com/praiser 1703505 None \n", "2 https://github.com/vadviktor 238703 Budapest \n", "3 https://github.com/ypjin 1598831 Beijing \n", "4 https://github.com/sharyanto 211084 Jakarta, Indonesia \n", "5 https://github.com/c9s 50894 Taipei, Taiwan \n", "6 https://github.com/back2arie 461397 Indonesia \n", "7 https://github.com/leto 30298 Portland, OR \n", "8 https://github.com/alanhaggai 46288 IN \n", "9 https://github.com/amiri 83806 Los Angeles, CA \n", "10 https://github.com/wu 52700 None \n", "\n", " login name public_gists public_repos type \\\n", "0 Michaelwussler None 0 3 User \n", "1 praiser None 0 3 User \n", "2 vadviktor Vad Viktor 0 10 User \n", "3 ypjin Yuping 0 5 User \n", "4 sharyanto Steven Haryanto 5 195 User \n", "5 c9s Yo-An Lin 281 206 User \n", "6 back2arie Azhari Harahap 1 15 User \n", "7 leto Jonathan \"Duke\" Leto 276 112 User \n", "8 alanhaggai Alan Haggai Alavi 4 54 User \n", "9 amiri Amiri Barksdale 16 18 User \n", "10 wu Alex White 0 15 User \n", "\n", " url \n", "0 https://api.github.com/users/Michaelwussler \n", "1 https://api.github.com/users/praiser \n", "2 https://api.github.com/users/vadviktor \n", "3 https://api.github.com/users/ypjin \n", "4 https://api.github.com/users/sharyanto \n", "5 https://api.github.com/users/c9s \n", "6 
https://api.github.com/users/back2arie \n", "7 https://api.github.com/users/leto \n", "8 https://api.github.com/users/alanhaggai \n", "9 https://api.github.com/users/amiri \n", "..." ] } ], "prompt_number": 28 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Handle NumPy-like computations\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import h5py\n", "f = h5py.File('/home/mrocklin/Downloads/OMI-Aura_L2-OMAERO_2014m1105t2304-o54838_v003-2014m1106t215558.he5')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 29 }, { "cell_type": "code", "collapsed": false, "input": [ "x = Data(f)\n", "x.dshape" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 30, "text": [ "dshape(\"\"\"{\n", " HDFEOS: {\n", " ADDITIONAL: {FILE_ATTRIBUTES: {}},\n", " SWATHS: {\n", " ColumnAmountAerosol: {\n", " Data Fields: {\n", " AerosolIndexUV: 1643 * 60 * int16,\n", " AerosolIndexVIS: 1643 * 60 * int16,\n", " AerosolModelMW: 1643 * 60 * uint16,\n", " AerosolModelsPassedThreshold: 1643 * 60 * 10 * uint16,\n", " AerosolOpticalThicknessMW: 1643 * 60 * 14 * int16,\n", " AerosolOpticalThicknessMWPrecision: 1643 * 60 * int16,\n", " AerosolOpticalThicknessNUV: 1643 * 60 * 2 * int16,\n", " AerosolOpticalThicknessPassedThreshold: 1643 * 60 * 10 * 9 * int16,\n", " AerosolOpticalThicknessPassedThresholdMean: 1643 * 60 * 9 * int16,\n", " AerosolOpticalThicknessPassedThresholdStd: 1643 * 60 * 9 * int16,\n", " CloudFlags: 1643 * 60 * uint8,\n", " CloudPressure: 1643 * 60 * int16,\n", " EffectiveCloudFraction: 1643 * 60 * int8,\n", " InstrumentConfigurationId: 1643 * uint8,\n", " MeasurementQualityFlags: 1643 * uint8,\n", " NumberOfModelsPassedThreshold: 1643 * 60 * uint8,\n", " ProcessingQualityFlagsMW: 1643 * 60 * uint16,\n", " ProcessingQualityFlagsNUV: 1643 * 60 * uint16,\n", " RootMeanSquareErrorOfFitPassedThreshold: 1643 * 60 * 10 * int16,\n", " SingleScatteringAlbedoMW: 1643 * 60 * 14 * 
int16,\n", " SingleScatteringAlbedoMWPrecision: 1643 * 60 * int16,\n", " SingleScatteringAlbedoNUV: 1643 * 60 * 2 * int16,\n", " SingleScatteringAlbedoPassedThreshold: 1643 * 60 * 10 * 9 * int16,\n", " SingleScatteringAlbedoPassedThresholdMean: 1643 * 60 * 9 * int16,\n", " SingleScatteringAlbedoPassedThresholdStd: 1643 * 60 * 9 * int16,\n", " SmallPixelRadiancePointerUV: 1643 * 2 * int16,\n", " SmallPixelRadiancePointerVIS: 1643 * 2 * int16,\n", " SmallPixelRadianceUV: 6783 * 60 * float32,\n", " SmallPixelRadianceVIS: 6786 * 60 * float32,\n", " SmallPixelWavelengthUV: 6783 * 60 * uint16,\n", " SmallPixelWavelengthVIS: 6786 * 60 * uint16,\n", " TerrainPressure: 1643 * 60 * int16,\n", " TerrainReflectivity: 1643 * 60 * 9 * int16,\n", " XTrackQualityFlags: 1643 * 60 * uint8\n", " },\n", " Geolocation Fields: {\n", " GroundPixelQualityFlags: 1643 * 60 * uint16,\n", " Latitude: 1643 * 60 * float32,\n", " Longitude: 1643 * 60 * float32,\n", " OrbitPhase: 1643 * float32,\n", " SolarAzimuthAngle: 1643 * 60 * float32,\n", " SolarZenithAngle: 1643 * 60 * float32,\n", " SpacecraftAltitude: 1643 * float32,\n", " SpacecraftLatitude: 1643 * float32,\n", " SpacecraftLongitude: 1643 * float32,\n", " TerrainHeight: 1643 * 60 * int16,\n", " Time: 1643 * float64,\n", " ViewingAzimuthAngle: 1643 * 60 * float32,\n", " ViewingZenithAngle: 1643 * 60 * float32\n", " }\n", " }\n", " }\n", " },\n", " HDFEOS INFORMATION: {\n", " ArchiveMetadata.0: string[65535, 'A'],\n", " CoreMetadata.0: string[65535, 'A'],\n", " StructMetadata.0: string[32000, 'A']\n", " }\n", " }\"\"\")" ] } ], "prompt_number": 30 }, { "cell_type": "code", "collapsed": false, "input": [ "x.HDFEOS.SWATHS.ColumnAmountAerosol.Data_Fields.CloudPressure" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "array([[-32767, -32767, -32767, ..., -32767, -32767, -32767],
[-32767, -32767, -32767, ..., -32767, -32767, -32767],
[-32767, -32767, -32767, ..., -32767, -32767, -32767],
...,
[-32767, -32767, -32767, ..., -32767, -32767, -32767],
[-32767, -32767, -32767, ..., -32767, -32767, -32767],
[-32767, -32767, -32767, ..., -32767, -32767, -32767]], dtype=int16)" ], "metadata": {}, "output_type": "pyout", "prompt_number": 31, "text": [ "array([[-32767, -32767, -32767, ..., -32767, -32767, -32767],\n", " [-32767, -32767, -32767, ..., -32767, -32767, -32767],\n", " [-32767, -32767, -32767, ..., -32767, -32767, -32767],\n", " ..., \n", " [-32767, -32767, -32767, ..., -32767, -32767, -32767],\n", " [-32767, -32767, -32767, ..., -32767, -32767, -32767],\n", " [-32767, -32767, -32767, ..., -32767, -32767, -32767]], dtype=int16)" ] } ], "prompt_number": 31 }, { "cell_type": "code", "collapsed": false, "input": [ "x.HDFEOS.SWATHS.ColumnAmountAerosol.Data_Fields.CloudPressure.max()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "1013" ], "metadata": {}, "output_type": "pyout", "prompt_number": 32, "text": [ "1013" ] } ], "prompt_number": 32 } ], "metadata": {} } ] }