{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Extracting Vega Lite Transformations\n", "\n", "In this notebook, we use extract out the [transformations](https://vega.github.io/vega-lite/docs/transform.html) in the Vega Lite spec generated by Altair and push them to the server, by adding them to the SQL query with Ibis." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import altair as alt\n", "import ibis.mapd\n", "import IPython.display \n", "import ibis_vega_transform" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Carrier names\n", "\n", "First we connect to the table using Ibis:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "conn = ibis.mapd.connect(\n", " host='metis.mapd.com', user='mapd', password='HyperInteractive',\n", " port=443, database='mapd', protocol= 'https'\n", ")\n", "t = conn.table(\"flights_donotmodify\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we compose an Altair chart using an ibis expression." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "c = alt.Chart(t[t.carrier_name]).mark_bar().encode(\n", " x='carrier_name',\n", " y='count()'\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we enable rendering that extracts the aggregate expressions and adds them onto the Ibis expresion:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "application/vnd.vega.ibis.v5+json": { "span": { "uber-trace-id": "67d471afbbb277a0:d9eaff11875463bd:0:1" }, "spec": { "$schema": "https://vega.github.io/schema/vega-lite/v4.0.0.json", "config": { "view": { "continuousHeight": 300, "continuousWidth": 400 } }, "data": { "name": "ibis:-447731155248692576" }, "encoding": { "x": { "field": "carrier_name", "type": "nominal" }, "y": { "aggregate": "count", "type": "quantitative" } }, "mark": "bar" } }, "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The only data loaded into the browser for this chart is one row for each carrier, because the counting transformation is pushed to the SQL statement." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Delay by Month" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/vnd.vega.ibis.v5+json": { "span": { "uber-trace-id": "12c41fd8914dc7cc:8e825d42f3bad2e3:0:1" }, "spec": { "$schema": "https://vega.github.io/schema/vega-lite/v4.0.0.json", "config": { "view": { "continuousHeight": 300, "continuousWidth": 400 } }, "data": { "name": "ibis:6158538728759919949" }, "encoding": { "color": { "aggregate": "average", "field": "depdelay", "type": "quantitative" }, "x": { "field": "flight_dayofmonth", "type": "ordinal" }, "y": { "field": "flight_month", "type": "ordinal" } }, "mark": "rect" } }, "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "delay_by_month = alt.Chart(t[t.flight_dayofmonth, t.flight_month, t.depdelay]).mark_rect().encode(\n", " x='flight_dayofmonth:O',\n", " y='flight_month:O',\n", " color='average(depdelay)'\n", ")\n", "delay_by_month" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Interactive support: grouping by month\n", "Now let's use widgets to facet this by month:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "application/vnd.vega.ibis.v5+json": { "span": { "uber-trace-id": "6e5ebe4c3daf09e0:d5067b39b8eec36b:0:1" }, "spec": { "$schema": "https://vega.github.io/schema/vega-lite/v4.0.0.json", "config": { "view": { "continuousHeight": 300, "continuousWidth": 400 } }, "data": { "name": "ibis:2803383568713325053" }, "encoding": { "x": { "field": "flight_dayofmonth", "type": "ordinal" }, "y": { "aggregate": "average", "field": "depdelay", "type": "quantitative" } }, "mark": "line", "selection": { "selector001": { "bind": { "input": "range", "max": 12, "min": 1, "step": 1 }, "fields": [ "flight_month" ], "init": { "flight_month": 1 }, "type": "single" } }, "transform": [ { "filter": { "selection": "selector001" } } ] } }, "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "slider = alt.binding_range(min=1, max=12, step=1)\n", "select_month = alt.selection_single(fields=['flight_month'],\n", " bind=slider, init={'flight_month': 1})\n", "\n", "alt.Chart(t[t.flight_dayofmonth, t.depdelay, t.flight_month]).mark_line().encode(\n", " x='flight_dayofmonth:O',\n", " y='average(depdelay)'\n", ").add_selection(\n", " select_month\n", ").transform_filter(\n", " select_month\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combing Tables: Comparing with Political Contributions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's make a graph that also uses data from another table. Let's see how average delay time for flights from a state relates to the average political contribution size." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "t_contributions = conn.table(\"contributions_donotmodify\")" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "application/vnd.vega.ibis.v5+json": { "span": { "uber-trace-id": "a5ecdd291f7702bd:573951263b3ce912:0:1" }, "spec": { "$schema": "https://vega.github.io/schema/vega-lite/v4.0.0.json", "config": { "view": { "continuousHeight": 300, "continuousWidth": 400 } }, "layer": [ { "data": { "name": "ibis:1204177582417492997" }, "encoding": { "color": { "value": "green" }, "size": { "aggregate": "count", "type": "quantitative" }, "x": { "field": "contributor_state", "type": "nominal" }, "y": { "aggregate": "mean", "axis": { "grid": false }, "field": "amount", "type": "quantitative" } }, "mark": "circle" }, { "data": { "name": "ibis:-6295626855409094441" }, "encoding": { "color": { "value": "firebrick" }, "size": { "aggregate": "count", "type": "quantitative" }, "x": { "field": "origin_state", "type": "nominal" }, "y": { "aggregate": "mean", "axis": { "grid": false }, "field": "depdelay", "type": "quantitative" } }, "mark": "square" } ], "resolve": { "scale": { "size": "independent", "y": "independent" } } } }, "text/plain": [ "alt.LayerChart(...)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "amount_by_state = alt.Chart(\n", " t_contributions[[\"contributor_state\", \"amount\"]]\n", ").mark_circle().encode(\n", " x=\"contributor_state\",\n", " y=alt.Y(\"mean(amount):Q\", axis=alt.Axis(grid=False)),\n", " color=alt.Color(value=\"green\"),\n", " size=\"count()\"\n", ")\n", "\n", "delay_by_state = alt.Chart(\n", " t[[\"origin_state\", \"depdelay\"]]\n", ").mark_square().encode(\n", " x=\"origin_state\",\n", " y=alt.Y(\n", " \"mean(depdelay):Q\",\n", " axis=alt.Axis(grid=False)\n", " ),\n", " color=alt.Color(value=\"firebrick\"),\n", " size=\"count()\"\n", ")\n", "\n", "combined = (amount_by_state + delay_by_state).resolve_scale(y='independent', size='independent')\n", "combined" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }