{ "metadata": { "name": "", "signature": "sha256:57345d38a361f2725d4669523bb997200374f7a71028f1b54f3f99e03971bedc" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [matta](https://github.com/carnby/matta) - view and scaffold d3.js visualizations in IPython notebooks\n", "\n", "## Let's Make Scaffold a Barchart\n", "\n", "By [@carnby](https://twitter.com/carnby)\n", "\n", "Probably you have seen the barchart example by [Mike Bostock](http://bost.ocks.org/mike/). It is [here](http://bl.ocks.org/mbostock/3885304). It is embedded below so you can see it. \n", "\n", "In this notebook I explain how to use [matta](https://github.com/carnby/matta) to implement this barchart.\n", "\n", "**Why use matta?** Because one thing is to have an example of a visualization, and another one is to have a [reusable implementation](http://bost.ocks.org/mike/chart/). Reusable implementations are not about having a function, are about an entire context where you can easily use your visualization with other datasets.\n", "\n", "**How do we do it?** In this notebook we see the basic _scaffolding_ done by matta to reproduce the example chart to visualize a pandas DataFrame. By being able to use a DataFrame, we can forget about converting the dataset to the specific layout the visualization designed had in mind, and instead, you can focus on converting to a DataFrame (which will probably be very, very easy)\n", "\n", "Let's begin." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "from IPython.display import IFrame\n", "IFrame('http://bl.ocks.org/mbostock/raw/3885304', 1000, 550)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " " ], "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initial Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we load matta. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you see the README, you will notice that you can install matta's javascript and css into your IPython profile. In this way you do not need to issue a `init_javascript` call. It is here just for demonstration - if you use a core matta visualization and export the notebook to NBViewer, you will need to execute it, to allow your visitor's browser to load the required js/css files.\n", "\n", "If you installed matta into your profile, then using the function will do no harm - it detects that matta was loaded and does nothing." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import matta\n", "matta.init_javascript(path='https://rawgit.com/carnby/matta/master/matta/libs/')" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", "\n", "matta Javascript code added.\n", " " ], "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data\n", "\n", "Mike's example loads a TSV (Tab Separated Values) file with letter frequency. We can load directly into a pandas DataFrame." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import pandas as pd\n", "\n", "df = pd.read_csv('http://bl.ocks.org/mbostock/raw/3885304/964f9100166627a89c7e6c23ce8128f5aefd5510/data.tsv', delimiter='\\t')\n", "df.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table>\n", "<thead><tr><th></th><th>letter</th><th>frequency</th></tr></thead>\n", "<tbody>\n", "<tr><th>0</th><td>A</td><td>0.08167</td></tr>\n", "<tr><th>1</th><td>B</td><td>0.01492</td></tr>\n", "<tr><th>2</th><td>C</td><td>0.02782</td></tr>\n", "<tr><th>3</th><td>D</td><td>0.04253</td></tr>\n", "<tr><th>4</th><td>E</td><td>0.12702</td></tr>\n", "</tbody>\n", "</table>\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ " letter frequency\n", "0 A 0.08167\n", "1 B 0.01492\n", "2 C 0.02782\n", "3 D 0.04253\n", "4 E 0.12702" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sketching the Visualization\n", "\n", "First, let's sketch the visualization by defining what are its options and code.\n", "\n", "The visualization options or arguments are contained in a dictionary. Note that the dictionary contains a subdictionary named `variables`. Those variables will be exposed as methods of the scaffolded visualization, and are available in code as `_variable_name`.\n", "\n", "Note also the `data` dictionary. It indicates that the visualization receives a pandas DataFrame. This dataframe is available internally as the `_data_dataframe` variable." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# the options\n", "barchart_args = {\n", " 'requirements': ['d3'],\n", " 'visualization_name': 'barchart',\n", " 'visualization_js': './barchart.js',\n", " 'figure_id': None,\n", " 'container_type': 'svg',\n", " 'data': {\n", " 'dataframe': None,\n", " },\n", " 'options': {\n", " 'background_color': None,\n", " 'x_axis': True,\n", " 'y_axis': True,\n", " },\n", " 'variables': {\n", " 'width': 960,\n", " 'height': 500,\n", " 'padding': {'left': 30, 'top': 20, 'right': 30, 'bottom': 30},\n", " 'x': 'x',\n", " 'y': 'y',\n", " 'y_axis_ticks': 10,\n", " 'color': 'steelblue',\n", " 'y_label': None,\n", " 'rotate_label': True,\n", " },\n", "}" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the visualization code. Note that is almost a copy-and-paste version of the original example. We just renamed the variables to `_variable_name` and used other auxiliary variables like `_vis_width` which are exposed by matta.\n", "\n", "Note that the code is not strictly javascript. 
Actually, the file is expected to be a [jinja2](http://jinja.pocoo.org/) template.\n", "\n", "We save this template as `barchart.js`, as `barchart_args['visualization_js']` points to it." ] }, { "cell_type": "code", "collapsed": false, "input": [ "barchart_code = '''\n", "var x = d3.scale.ordinal()\n", " .rangeRoundBands([0, _vis_width], .1);\n", "\n", "var y = d3.scale.linear()\n", " .range([_vis_height, 0]);\n", "\n", "if (_y_label == null) {\n", " _y_label = _y;\n", "}\n", "\n", "x.domain(_data_dataframe.map(function(d) { return d[_x]; }));\n", "y.domain([0, d3.max(_data_dataframe, function(d) { return d[_y]; })]);\n", "\n", "{% if options.x_axis %}\n", " var xAxis = d3.svg.axis()\n", " .scale(x)\n", " .orient(\"bottom\");\n", "\n", " container.append(\"g\")\n", " .attr(\"class\", \"x axis\")\n", " .attr(\"transform\", \"translate(0,\" + _vis_height + \")\")\n", " .call(xAxis);\n", "{% endif %}\n", "\n", "{% if options.y_axis %}\n", " var yAxis = d3.svg.axis()\n", " .scale(y)\n", " .orient(\"left\");\n", "\n", " if (_y_axis_ticks != null) {\n", " yAxis.ticks(_y_axis_ticks);\n", " }\n", "\n", " var y_label = container.append(\"g\")\n", " .attr(\"class\", \"y axis\")\n", " .call(yAxis)\n", " .append(\"text\");\n", "\n", " if (_rotate_label) {\n", " y_label.attr(\"transform\", \"rotate(-90)\")\n", " .attr(\"y\", 6)\n", " .attr(\"dy\", \".71em\")\n", " .style(\"text-anchor\", \"end\");\n", " } else {\n", " y_label\n", " .attr(\"y\", 6)\n", " .attr('x', 12)\n", " .attr(\"dy\", \".71em\")\n", " .style(\"text-anchor\", \"start\");\n", " }\n", "\n", " y_label.text(_y_label);\n", "{% endif %}\n", "\n", "var bar = container.selectAll(\".bar\")\n", " .data(_data_dataframe);\n", "\n", "bar.enter().append('rect').classed('bar', true);\n", "\n", "bar.exit().remove();\n", "\n", "bar.attr(\"x\", function(d) { return x(d[_x]); })\n", " .attr(\"width\", x.rangeBand())\n", " .attr(\"y\", function(d) { return y(d[_y]); })\n", " .attr(\"height\", function(d) { return _vis_height - 
y(d[_y]); })\n", " .attr('fill', _color);\n", "'''\n", "\n", "with open('./barchart.js', 'w') as f:\n", " f.write(barchart_code)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the actual matta code to display the visualization in the notebook. Note that the keyword arguments are keys from the barchart_args dictionary. If you use a keyword argument not present in the dictionary, an `Exception` will be raised. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "from matta.sketch import build_sketch\n", "barchart = build_sketch(barchart_args)\n", "barchart(dataframe=df, x='letter', y='frequency', rotate_label=False)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", "\n", "
\n", "\n", "" ], "metadata": {}, "output_type": "display_data" } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's it! :)\n", "\n", "We implemented a barchart almost by copying and pasting. The cool thing is that we didn't have to worry about data formats, since we knew the data was a DataFrame. We also didn't have to worry about dependencies, like loading `d3.js`, or about making the visualization reusable, because matta does all that.\n", "\n", "The next step is to scaffold a reusable visualization. Actually, the code is very similar:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "barchart(x='letter', y='frequency').scaffold(filename='./scaffolded_barchart.js', define_js_module=False)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This creates a file named `scaffolded_barchart.js`, which contains a reusable visualization. All variables declared in the arguments dictionary are available as property methods. The values specified when defining the arguments or when scaffolding serve as defaults, but everything can be changed. Note that we did not specify a DataFrame this time!"
] }, { "cell_type": "code", "collapsed": false, "input": [ "!cat ./scaffolded_barchart.js" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\r\n", "\r\n", "/**\r\n", " * mod_barchart was scaffolded using matta\r\n", " * Variables that start with an underscore (_) are passed as arguments in Python.\r\n", " * Variables that start with _data are data parameters of the visualization, and expected to be given as datum.\r\n", " *\r\n", " * For instance, d3.select('#figure').datum({'graph': a_json_graph, 'dataframe': a_json_dataframe}).call(visualization)\r\n", " * will fill the variables _data_graph and _data_dataframe.\r\n", " */\r\n", "\r\n", "var matta_barchart = function() {\r\n", " var __fill_data__ = function(__data__) {\r\n", " \r\n", " func_barchart.dataframe(__data__.dataframe);\r\n", " \r\n", " };\r\n", "\r\n", " var func_barchart = function (selection) {\r\n", " console.log('selection', selection);\r\n", "\r\n", " var _vis_width = _width - _padding.left - _padding.right;\r\n", " var _vis_height = _height - _padding.top - _padding.bottom;\r\n", "\r\n", " selection.each(function(__data__) {\r\n", " __fill_data__(__data__);\r\n", "\r\n", " var container = null;\r\n", "\r\n", " if (d3.select(this).select(\"svg.barchart-container\").empty()) {\r\n", " \r\n", " var svg = d3.select(this).append(\"svg\")\r\n", " .attr(\"width\", _width)\r\n", " .attr(\"height\", _height)\r\n", " .attr('class', 'barchart-container');\r\n", "\r\n", " \r\n", "\r\n", " container = svg.append(\"g\")\r\n", " .classed('barchart-container', true)\r\n", " .attr('transform', 'translate(' + _padding.left + ',' + _padding.top + ')');\r\n", "\r\n", " \r\n", " } else {\r\n", " container = d3.select(this).select(\"svg.barchart-container\");\r\n", " }\r\n", "\r\n", " console.log('container', container.node());\r\n", "\r\n", " \r\n", " \r\n", "var x = d3.scale.ordinal()\r\n", " .rangeRoundBands([0, _vis_width], .1);\r\n", "\r\n", "var y = 
d3.scale.linear()\r\n", " .range([_vis_height, 0]);\r\n", "\r\n", "if (_y_label == null) {\r\n", " _y_label = _y;\r\n", "}\r\n", "\r\n", "x.domain(_data_dataframe.map(function(d) { return d[_x]; }));\r\n", "y.domain([0, d3.max(_data_dataframe, function(d) { return d[_y]; })]);\r\n", "\r\n", "\r\n", " var xAxis = d3.svg.axis()\r\n", " .scale(x)\r\n", " .orient(\"bottom\");\r\n", "\r\n", " container.append(\"g\")\r\n", " .attr(\"class\", \"x axis\")\r\n", " .attr(\"transform\", \"translate(0,\" + _vis_height + \")\")\r\n", " .call(xAxis);\r\n", "\r\n", "\r\n", "\r\n", " var yAxis = d3.svg.axis()\r\n", " .scale(y)\r\n", " .orient(\"left\");\r\n", "\r\n", " if (_y_axis_ticks != null) {\r\n", " yAxis.ticks(_y_axis_ticks);\r\n", " }\r\n", "\r\n", " var y_label = container.append(\"g\")\r\n", " .attr(\"class\", \"y axis\")\r\n", " .call(yAxis)\r\n", " .append(\"text\");\r\n", "\r\n", " if (_rotate_label) {\r\n", " y_label.attr(\"transform\", \"rotate(-90)\")\r\n", " .attr(\"y\", 6)\r\n", " .attr(\"dy\", \".71em\")\r\n", " .style(\"text-anchor\", \"end\");\r\n", " } else {\r\n", " y_label\r\n", " .attr(\"y\", 6)\r\n", " .attr('x', 12)\r\n", " .attr(\"dy\", \".71em\")\r\n", " .style(\"text-anchor\", \"start\");\r\n", " }\r\n", "\r\n", " y_label.text(_y_label);\r\n", "\r\n", "\r\n", "var bar = container.selectAll(\".bar\")\r\n", " .data(_data_dataframe);\r\n", "\r\n", "bar.enter().append('rect').classed('bar', true);\r\n", "\r\n", "bar.exit().remove();\r\n", "\r\n", "bar.attr(\"x\", function(d) { return x(d[_x]); })\r\n", " .attr(\"width\", x.rangeBand())\r\n", " .attr(\"y\", function(d) { return y(d[_y]); })\r\n", " .attr(\"height\", function(d) { return _vis_height - y(d[_y]); })\r\n", " .attr('fill', _color);\r\n", " \r\n", "\r\n", " });\r\n", " };\r\n", "\r\n", " \r\n", " var _data_dataframe = null;\r\n", " func_barchart.dataframe = function(__) {\r\n", " if (arguments.length) {\r\n", " _data_dataframe = __;\r\n", " console.log('DATA dataframe', _data_dataframe);\r\n", " 
return func_barchart;\r\n", " }\r\n", " return _data_dataframe;\r\n", " };\r\n", " \r\n", "\r\n", " \r\n", " \r\n", " var _color = \"steelblue\";\r\n", " func_barchart.color = function(__) {\r\n", " if (arguments.length) {\r\n", " _color = __;\r\n", " console.log('setted color', _color);\r\n", " return func_barchart;\r\n", " }\r\n", " return _color;\r\n", " };\r\n", " \r\n", " var _y_label = null;\r\n", " func_barchart.y_label = function(__) {\r\n", " if (arguments.length) {\r\n", " _y_label = __;\r\n", " console.log('setted y_label', _y_label);\r\n", " return func_barchart;\r\n", " }\r\n", " return _y_label;\r\n", " };\r\n", " \r\n", " var _y_axis_ticks = 10;\r\n", " func_barchart.y_axis_ticks = function(__) {\r\n", " if (arguments.length) {\r\n", " _y_axis_ticks = __;\r\n", " console.log('setted y_axis_ticks', _y_axis_ticks);\r\n", " return func_barchart;\r\n", " }\r\n", " return _y_axis_ticks;\r\n", " };\r\n", " \r\n", " var _height = 500;\r\n", " func_barchart.height = function(__) {\r\n", " if (arguments.length) {\r\n", " _height = __;\r\n", " console.log('setted height', _height);\r\n", " return func_barchart;\r\n", " }\r\n", " return _height;\r\n", " };\r\n", " \r\n", " var _padding = {\"top\": 20, \"right\": 30, \"left\": 30, \"bottom\": 30};\r\n", " func_barchart.padding = function(__) {\r\n", " if (arguments.length) {\r\n", " _padding = __;\r\n", " console.log('setted padding', _padding);\r\n", " return func_barchart;\r\n", " }\r\n", " return _padding;\r\n", " };\r\n", " \r\n", " var _width = 960;\r\n", " func_barchart.width = function(__) {\r\n", " if (arguments.length) {\r\n", " _width = __;\r\n", " console.log('setted width', _width);\r\n", " return func_barchart;\r\n", " }\r\n", " return _width;\r\n", " };\r\n", " \r\n", " var _rotate_label = true;\r\n", " func_barchart.rotate_label = function(__) {\r\n", " if (arguments.length) {\r\n", " _rotate_label = __;\r\n", " console.log('setted rotate_label', _rotate_label);\r\n", " return func_barchart;\r\n", 
" }\r\n", " return _rotate_label;\r\n", " };\r\n", " \r\n", " var _y = \"frequency\";\r\n", " func_barchart.y = function(__) {\r\n", " if (arguments.length) {\r\n", " _y = __;\r\n", " console.log('setted y', _y);\r\n", " return func_barchart;\r\n", " }\r\n", " return _y;\r\n", " };\r\n", " \r\n", " var _x = \"letter\";\r\n", " func_barchart.x = function(__) {\r\n", " if (arguments.length) {\r\n", " _x = __;\r\n", " console.log('setted x', _x);\r\n", " return func_barchart;\r\n", " }\r\n", " return _x;\r\n", " };\r\n", " \r\n", " \r\n", "\r\n", " \r\n", " return func_barchart;\r\n", "};\r\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Testing the Visualization\n", "\n", "To test the visualization, we will serialize the DataFrame and then display an IFrame with the visualization using a very simple template (which we, again, copied from the original source by Mike).\n", "\n", "matta includes a `dump_data` function that calls a JSON serializer under the hoods. This serializer is able to handle DataFrames and other typical python data structures." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "from matta import dump_data\n", "dump_data(df, './data.json')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "!head ./data.json" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[{\"frequency\": 0.08167, \"letter\": \"A\"}, {\"frequency\": 0.01492, \"letter\": \"B\"}, {\"frequency\": 0.027819999999999998, \"letter\": \"C\"}, {\"frequency\": 0.04253, \"letter\": \"D\"}, {\"frequency\": 0.12702, \"letter\": \"E\"}, {\"frequency\": 0.02288, \"letter\": \"F\"}, {\"frequency\": 0.02015, \"letter\": \"G\"}, {\"frequency\": 0.06094, \"letter\": \"H\"}, {\"frequency\": 0.06966, \"letter\": \"I\"}, {\"frequency\": 0.0015300000000000001, \"letter\": \"J\"}, {\"frequency\": 0.00772, \"letter\": \"K\"}, {\"frequency\": 0.04025, \"letter\": \"L\"}, {\"frequency\": 0.024059999999999998, \"letter\": \"M\"}, {\"frequency\": 0.06749, \"letter\": \"N\"}, {\"frequency\": 0.07507, \"letter\": \"O\"}, {\"frequency\": 0.01929, \"letter\": \"P\"}, {\"frequency\": 0.00095, \"letter\": \"Q\"}, {\"frequency\": 0.05987000000000001, \"letter\": \"R\"}, {\"frequency\": 0.06327, \"letter\": \"S\"}, {\"frequency\": 0.09056, \"letter\": \"T\"}, {\"frequency\": 0.02758, \"letter\": \"U\"}, {\"frequency\": 0.00978, \"letter\": \"V\"}, {\"frequency\": 0.0236, \"letter\": \"W\"}, {\"frequency\": 0.0015, \"letter\": \"X\"}, {\"frequency\": 0.01974, \"letter\": \"Y\"}, {\"frequency\": 0.00074, \"letter\": \"Z\"}]" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's write the HTML file. 
Note the following code:\n", "\n", "```javascript\n", "d3.json('./data.json', function(json) {\n", " var barchart = matta_barchart();\n", " d3.select('body').datum({dataframe: json}).call(barchart)\n", " \n", "});\n", "```\n", "If you would like to change the width and the x attribute of the visualization, you would say instead:\n", " \n", "```javascript\n", "var barchart = matta_barchart().width(700).x('other_column_in_the_dataframe');\n", "```" ] }, { "cell_type": "code", "collapsed": false, "input": [ "with open('./test_barchart.html', 'w') as f:\n", " f.write('''\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "''')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are viewing this in NBViewer then you will not see the IFrame. But trust me, it works ;)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from IPython.display import IFrame\n", "IFrame('http://localhost:8888/files/test_barchart.html', 1000, 600)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " " ], "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "" ] } ], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "I hope you found this useful, and that you start creating visualizations using matta. I do! :)" ] } ], "metadata": {} } ] }