{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# How to leverage the entire PyData Stack"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# A quick poll ..."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"### Who uses pandas?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"### Who uses numpy?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Blaze - A Quick Tour\n",
"\n",
"Blaze provides a lightweight interface on top of pre-existing computational infrastructure. This notebook gives a quick overview of how Blaze interacts with a variety of data types."
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"collapsed": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"%reload_ext autotime\n",
"\n",
"from blaze import Data, by, compute"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Blaze wraps pre-existing data\n",
"\n",
"Blaze interacts with normal Python objects. Operations on Blaze `Data` objects create expression trees.\n",
"\n",
"These expressions deliver an intuitive NumPy/pandas-like feel."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Lists\n",
"\n",
"Starting small, Blaze interacts happily with collections of data.\n",
"\n",
"It uses pandas to pretty-print results in the notebook."
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
" \n",
"0 1\n",
"1 2\n",
"2 3\n",
"3 4\n",
"4 5"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 7 ms\n"
]
}
],
"source": [
"x = Data([1, 2, 3, 4, 5])\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
" \n",
"0 30\n",
"1 40\n",
"2 50"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 18.5 ms\n"
]
}
],
"source": [
"x[x > 2] * 10"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"dshape(\"5 * int64\")"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 1.49 ms\n"
]
}
],
"source": [
"x.dshape"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Or tabular, pandas-like datasets\n",
"\n",
"Slightly more exciting: Blaze also operates on tabular data."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 1.75 ms\n"
]
}
],
"source": [
"L = [[1, 'Alice', 100],\n",
" [2, 'Bob', -200],\n",
" [3, 'Charlie', 300],\n",
" [4, 'Dennis', 400],\n",
" [5, 'Edith', -500]]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 1.93 ms\n"
]
}
],
"source": [
"x = Data(L, fields=['id', 'name', 'amount'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"x.amount.mean()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"dshape(\"5 * {id: int64, name: string, amount: int64}\")"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 1.84 ms\n"
]
}
],
"source": [
"x.dshape"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Here's `x` again"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"   id     name  amount\n",
"0   1    Alice     100\n",
"1   2      Bob    -200\n",
"2   3  Charlie     300\n",
"3   4   Dennis     400\n",
"4   5    Edith    -500"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 9.45 ms\n"
]
}
],
"source": [
"x"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"    name\n",
"0    Bob\n",
"1  Edith"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 12.1 ms\n"
]
}
],
"source": [
"deadbeats = x[x.amount < 0].name\n",
"deadbeats"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Or it can even just drive pandas"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Blaze doesn't do the work itself; it tells other systems to do it.\n",
"\n",
"In the previous example, Blaze told Python which for-loops to write. In this example, it calls the right functions in pandas.\n",
"\n",
"The user experience is mostly identical; only performance differs."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 2.52 ms\n"
]
}
],
"source": [
"from pandas import DataFrame\n",
"\n",
"df = DataFrame([[1, 'Alice', 100], \n",
" [2, 'Bob', -200],\n",
" [3, 'Charlie', 300],\n",
" [4, 'Denis', 400],\n",
" [5, 'Edith', -500]], columns=['id', 'name', 'amount'])"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"   id     name  amount\n",
"0   1    Alice     100\n",
"1   2      Bob    -200\n",
"2   3  Charlie     300\n",
"3   4    Denis     400\n",
"4   5    Edith    -500"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 4.79 ms\n"
]
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"   id     name  amount\n",
"0   1    Alice     100\n",
"1   2      Bob    -200\n",
"2   3  Charlie     300\n",
"3   4    Denis     400\n",
"4   5    Edith    -500"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 10.8 ms\n"
]
}
],
"source": [
"x = Data(df)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"    name\n",
"1    Bob\n",
"4  Edith"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 19.6 ms\n"
]
}
],
"source": [
"deadbeats = x[x.amount < 0].name\n",
"deadbeats"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Outputs are Blaze expressions"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"blaze.expr.expressions.Field"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 1.52 ms\n"
]
}
],
"source": [
"type(deadbeats)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### `compute` turns Blaze expressions into something concrete"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"1      Bob\n",
"4    Edith\n",
"Name: name, dtype: object"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 4.96 ms\n"
]
}
],
"source": [
"compute(deadbeats)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 3.33 ms\n"
]
}
],
"source": [
"type(compute(deadbeats))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Blaze also works with other data types like SQLAlchemy `Table`s"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Blaze extends beyond Python and pandas (that's its main motivation).\n",
"\n",
"Here it drives SQLAlchemy."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 1.97 ms\n"
]
}
],
"source": [
"from sqlalchemy import Table, Column, MetaData, Integer, String, create_engine\n",
"\n",
"tab = Table('bank', MetaData(),\n",
" Column('id', Integer),\n",
" Column('name', String),\n",
" Column('amount', Integer))"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"dshape(\"var * {id: ?int32, name: ?string, amount: ?int32}\")"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 2.62 ms\n"
]
}
],
"source": [
"x = Data(tab)\n",
"x.dshape"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Just as computations on pandas objects produce pandas objects, computations on SQLAlchemy tables produce SQLAlchemy `Select` statements."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 7.63 ms\n"
]
}
],
"source": [
"deadbeats = x[x.amount < 0].name\n",
"compute(deadbeats)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SELECT bank.name \n",
"FROM bank \n",
"WHERE bank.amount < :amount_1\n",
"time: 3.31 ms\n"
]
}
],
"source": [
"print(compute(deadbeats)) # SQLAlchemy generates SQL"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Let's connect to a real database\n",
"\n",
"When we drive a SQLAlchemy table that is connected to a database, we get actual computation."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 10 ms\n"
]
}
],
"source": [
"engine = create_engine('sqlite:///../blaze/blaze/examples/data/iris.db')"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Data: Engine(sqlite:///../blaze/blaze/examples/data/iris.db)\n",
"DataShape: {\n",
" iris: var * {\n",
" sepal_length: ?float64,\n",
" sepal_width: ?float64,\n",
" petal_length: ?float64,\n",
" petal_width: ?float64,\n",
" species: ?string\n",
" ..."
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 8.96 ms\n"
]
}
],
"source": [
"x = Data(engine)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['iris']"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 1.2 ms\n"
]
}
],
"source": [
"x.fields"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/html": [
"5.843333333333335"
],
"text/plain": [
"5.843333333333335"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 10.6 ms\n"
]
}
],
"source": [
"x.iris.sepal_length.mean()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"           species  longest  shortest\n",
"0      Iris-setosa      5.8       4.3\n",
"1  Iris-versicolor      7.0       4.9\n",
"2   Iris-virginica      7.9       4.9"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 51 ms\n"
]
}
],
"source": [
"by(\n",
" x.iris.species,\n",
" shortest=x.iris.sepal_length.min(),\n",
" longest=x.iris.sepal_length.max()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SELECT iris.species, max(iris.sepal_length) AS longest, min(iris.sepal_length) AS shortest \n",
"FROM iris GROUP BY iris.species\n",
"time: 8.3 ms\n"
]
}
],
"source": [
"print(compute(_))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Use URI strings to ease access\n",
"\n",
"Often, just figuring out how to produce the relevant Python object is a challenge.\n",
"\n",
"Blaze supports many URI string formats."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 7.4 ms\n"
]
}
],
"source": [
"x = Data('sqlite:///../blaze/blaze/examples/data/iris.db::iris')"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"   sepal_length  sepal_width  petal_length  petal_width      species\n",
"0           5.1          3.5           1.4          0.2  Iris-setosa\n",
"1           4.9          3.0           1.4          0.2  Iris-setosa\n",
"2           4.7          3.2           1.3          0.2  Iris-setosa\n",
"3           4.6          3.1           1.5          0.2  Iris-setosa\n",
"4           5.0          3.6           1.4          0.2  Iris-setosa\n",
"5           5.4          3.9           1.7          0.4  Iris-setosa\n",
"6           4.6          3.4           1.4          0.3  Iris-setosa\n",
"7           5.0          3.4           1.5          0.2  Iris-setosa\n",
"8           4.4          2.9           1.4          0.2  Iris-setosa\n",
"9           4.9          3.1           1.5          0.1  Iris-setosa\n",
"..."
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"time: 16.7 ms\n"
]
}
],
"source": [
"x"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.4.3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}