{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Chapter2\n", "D3.js in Actionの2章の勉強ノートです。\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "loaded nvd3 IPython extension\n", "run nvd3.ipynb.initialize_javascript() to set up the notebook\n", "help(nvd3.ipynb.initialize_javascript) for options\n" ] }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "$.getScript(\"https://cdnjs.cloudflare.com/ajax/libs/nvd3/1.7.0/nv.d3.min.js\")" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "$.getScript(\"https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min.js\", function() {\n", " $.getScript(\"https://cdnjs.cloudflare.com/ajax/libs/nvd3/1.7.0/nv.d3.min.js\", function() {})});" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%load_ext sage\n", "from IPython.core.display import HTML\n", "from string import Template\n", "import json\n", "import nvd3\n", "nvd3.ipynb.initialize_javascript(use_remote=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## データの読み込み\n", "D3では様々なデータをサポートしています。\n", "- TEXT: d3.text()\n", "- XML: d3.xml()\n", "- CSV: d3.csv()\n", "- JSON: d3.json()\n", "- HTML: d3.html()\n", "\n", "pythonとのインタフェースを取ることを考えると、一般的に構造を保持できるJSONとCSVがデータの受け渡しに使われるます。\n", "\n", "例として以下のようなcities.csvを読み込んでみましょう。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\"label\",\"population\",\"country\",\"x\",\"y\"\r\n", "\"San Francisco\", 750000,\"USA\",37,-122\r\n", "\"Fresno\", 500000,\"USA\",36,-119\r\n", "\"Lahore\",12500000,\"Pakistan\",31,74\r\n", "\"Karachi\",13000000,\"Pakistan\",24,67\r\n", "\"Rome\",2500000,\"Italy\",41,12\r\n", "\"Naples\",1000000,\"Italy\",40,14\r\n", "\"Rio\",12300000,\"Brazil\",-22,-43\r\n", "\"Sao Paolo\",12300000,\"Brazil\",-23,-46" ] } ], "source": [ "!cat data/cities.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "読み込まれたデータは、function(error, data)形式のコールバックで与えられます。\n", "このコールバックの中で実行したい処理を記述する方式になります。" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "d3.csv(\"data/cities.csv\",function(error,data) {console.log(error,data)});" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "d3.csv(\"data/cities.csv\",function(error,data) {console.log(error,data)});" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "javascriptのコンソールに以下のようにデータの内容が出力されます。\n", "\n", "![cities console log](images/cities_log.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "これを見るとCSVのデータがヘッダのカラム名をキーとする辞書の配列として渡されていることがわかります。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pythonデータの受け渡し\n", "jupyterの計算結果をD3に渡す方法を以下に紹介します。\n", "\n", "jsonとTemplateライブラリを使用します。" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from string import Template\n", "import json\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "pandasを使ってcities.csvを読み込み、データフレームdfにセットします。" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
labelpopulationcountryxy
0San Francisco750000USA37-122
1Fresno500000USA36-119
2Lahore12500000Pakistan3174
3Karachi13000000Pakistan2467
4Rome2500000Italy4112
\n", "
" ], "text/plain": [ " label population country x y\n", "0 San Francisco 750000 USA 37 -122\n", "1 Fresno 500000 USA 36 -119\n", "2 Lahore 12500000 Pakistan 31 74\n", "3 Karachi 13000000 Pakistan 24 67\n", "4 Rome 2500000 Italy 41 12" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"data/cities.csv\")\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Templateを使ってscriptタグにdfを変数dataに代入します。\n", "\n", "$python_dataの置換で、json.dumpsとto_dict(orient='records')を使用するのがポイントです。" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\n" ] } ], "source": [ "data_text = Template('''\n", "\n", "''')\n", "data_text = data_text.substitute({'python_data': json.dumps(df.to_dict(orient='records'))})\n", "print data_text" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "以下のコマンドを実行して、javascriptのコンソールをみてください。d3で読み込んだ時と同じログが出力されています。" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "HTML(data_text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## スケールマッピング\n", "D3のスケール処理はとても良くできており、データに応じて選べるようになっています。\n", "- d3.scale.linear(): 線形補間\n", "- d3.scale.quantile(): 分位数(区間で分けられた値)\n", "\n", "最初に、線形補間を見てみましょう。\n", "domain関数では、問題領域(ここではデータの分布領域)をrange関数で指定された範囲にマッピングします。\n", "\n", "console.logの代わりにelement.textを使うとjupyterのノートブックに出力できます(ただし1回だけ指定可能みたい)。\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "var newRamp = d3.scale.linear().domain([500000,13000000]).range([0, 500]);\n", "element.text(newRamp(1000000) + \", \" +newRamp(9000000)+\", \"+newRamp.invert(313));" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "var newRamp = d3.scale.linear().domain([500000,13000000]).range([0, 500]);\n", "element.text(newRamp(1000000) + \", \" +newRamp(9000000)+\", \"+newRamp.invert(313));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "D3のscaleの凄いのは、数値だけでなくカラーにもマッピングできることです。" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "var newRamp = d3.scale.linear().domain([500000,13000000]).range([\"blue\",\"red\"]);\n", "element.text(newRamp(1000000) + \", \" +newRamp(9000000));" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "var newRamp = d3.scale.linear().domain([500000,13000000]).range([\"blue\",\"red\"]);\n", "element.text(newRamp(1000000) + \", \" +newRamp(9000000));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### カテゴリわけ\n", "次にquantile関数を使っていくつかのカテゴリに分けてみます。\n", "\n", "以下の例では、sampleArrayのデータを3つのグループに分けて、small, medium, largetとします。\n", "値ではなく、ソートした後にデータ数が等分になるようにカテゴリわけしているみたいです。\n", "\n", "[1,10,44], [58,66,124], [423, 524, 900]から\n", "\n", "(-∞..44], (44..124], (124..∞)の区間をsmall, medium, largeにマッピングしています。\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "var sampleArray = [423,124,66,424,58,10,900,44,1];\n", "var qScaleName =\n", "d3.scale.quantile().domain(sampleArray).range([\"small\", \"medium\",\"large\"]);\n", "element.text(qScaleName(68) + \", \" +qScaleName(20)+\", \"+qScaleName(10000));" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "var sampleArray = [423,124,66,424,58,10,900,44,1];\n", "var qScaleName =\n", "d3.scale.quantile().domain(sampleArray).range([\"small\", \"medium\",\"large\"]);\n", "element.text(qScaleName(68) + \", \" +qScaleName(20)+\", \"+qScaleName(10000));\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## データバインディング\n", "D3.jsの最も重要な機能は、データバインディングだと思います。\n", "\n", "1章で見たのはenterメソッドでしたが、exit, updateについてその動きを確認してみましょう。\n", "\n", "update, enter, exitの違いが、Fig. 2.24に解説されているので、引用します。\n", "\n", "![Fig. 2.24](images/Fig_2.24.png)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "バインディングする要素よりもデータが多い場合のenterの動作を見てみましょう。" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%HTML\n", "
\n", "
\n", "
" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "var sampleData = [1, 2, 3, 4];\n", "\n", "d3.select('#ex1').selectAll('div')\n", " .data(sampleData)\n", " .enter()\n", " .append(\"div\")\n", " .html(function (d) { return d; })" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "var sampleData = [1, 2, 3, 4];\n", "\n", "d3.select('#ex1').selectAll('div')\n", " .data(sampleData)\n", " .enter()\n", " .append(\"div\")\n", " .html(function (d) { return d; })" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "この例では、1個のdivに対してデータの1がバインディングされて、残りの2, 3, 4に対しては新たにdiv要素を追加し、そこにデータの値をhtmlでセットします。\n", "\n", "結果は期待に反し、2, 3, 4だけが表示されてました。最初のdivに対しては何も処理をしていないため、このようになります。\n", "\n", "それでは、最初の1に対してもhtmlの処理を追加してみましょう。\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%HTML\n", "
\n", "
\n", "
" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "var sampleData = [1, 2, 3, 4];\n", "\n", "d3.select('#ex2').selectAll('div')\n", " .data(sampleData)\n", " .html(function (d) { return d; })\n", " .enter()\n", " .append(\"div\")\n", " .html(function (d) { return d; });" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "var sampleData = [1, 2, 3, 4];\n", "\n", "d3.select('#ex2').selectAll('div')\n", " .data(sampleData)\n", " .html(function (d) { return d; })\n", " .enter()\n", " .append(\"div\")\n", " .html(function (d) { return d; });" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "これでは同じ処理を2度書かなくてはなりません。\n", "そこで、最初にdivを削除してから処理をします。" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "var sampleData = [1, 2, 3, 4];\n", "// remove all divs under #ex2\n", "d3.select('#ex2').selectAll('div').remove();\n", "\n", "d3.select('#ex2').selectAll('div')\n", " .data(sampleData)\n", " .enter()\n", " .append(\"div\")\n", " .html(function (d) { return d; });" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "var sampleData = [1, 2, 3, 4];\n", "// remove all divs under #ex2\n", "d3.select('#ex2').selectAll('div').remove();\n", "\n", "d3.select('#ex2').selectAll('div')\n", " .data(sampleData)\n", " .enter()\n", " .append(\"div\")\n", " .html(function (d) { return d; });" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### exitによる削除\n", "次に要素よりもデータが少ない場合に使用するexitを試してみます。\n", "\n", "4個のdiv要素にa, b, c, dがセットされているところに、1, 2の2個の要素をバインディングします。\n" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "
a
\n", "
b
\n", "
c
\n", "
d
\n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%HTML\n", "
\n", "
a
\n", "
b
\n", "
c
\n", "
d
\n", "
" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "var sampleData = [1, 2];\n", "\n", "d3.select('#ex3').selectAll('div')\n", " .data(sampleData)\n", " .html(function (d) { return d; })\n", " .exit()\n", " .remove();" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "var sampleData = [1, 2];\n", "\n", "d3.select('#ex3').selectAll('div')\n", " .data(sampleData)\n", " .html(function (d) { return d; })\n", " .exit()\n", " .remove();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 棒グラフの表示\n", "svgタグに棒グラフを描いてみましょう。\n", "\n", "- rect要素を追加し、xを10pxずつ移動し、yをデータの値*10pxで表示" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", " \n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%HTML\n", "
\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "d3.select('#ex4').select(\"svg\")\n", " .selectAll(\"rect\")\n", " .data([15, 50, 22, 8, 100, 10])\n", " .enter()\n", " .append(\"rect\")\n", " .attr(\"width\", 10)\n", " .attr(\"height\", function(d) {return d;})\n", " .style(\"fill\", \"blue\")\n", " .style(\"stroke\", \"red\")\n", " .style(\"stroke-width\", \"1px\")\n", " .style(\"opacity\", .25)\n", " .attr(\"x\", function(d, i) {return i * 10})\n", " .attr(\"y\", function(d) {return 100 - d;});" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "d3.select('#ex4').select(\"svg\")\n", " .selectAll(\"rect\")\n", " .data([15, 50, 22, 8, 100, 10])\n", " .enter()\n", " .append(\"rect\")\n", " .attr(\"width\", 10)\n", " .attr(\"height\", function(d) {return d;})\n", " .style(\"fill\", \"blue\")\n", " .style(\"stroke\", \"red\")\n", " .style(\"stroke-width\", \"1px\")\n", " .style(\"opacity\", .25)\n", " .attr(\"x\", function(d, i) {return i * 10})\n", " .attr(\"y\", function(d) {return 100 - d;});" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## CSVデータから棒グラフを作る\n", "2章のメインテーマは、CSVファイルから棒グラフを作成することです。\n", "\n", "例題にしたがって、cities.csvから棒グラフを作ってみます。" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", " \n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%HTML\n", "
\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "// dataフォルダのcities.csvを読み込み、dataViz関数を呼び出す\n", "d3.csv(\"data/cities.csv\",function(error,data) {dataViz(data);});\n", "function dataViz(incomingData) {\n", " var maxPopulation = d3.max(incomingData, function(el) {\n", " // 人口のデータを文字列から数値に変換\n", " return parseInt(el.population);}\n", " );\n", " // 人口の最大値を0-230の範囲にスケーリングするyScaleを作成\n", " var yScale = d3.scale.linear().domain([0,maxPopulation]).range([0,230]);\n", " // 棒グラフの作成\n", " d3.select('#ex5').select(\"svg\").attr(\"style\",\"height: 240px; width: 300px;\");\n", " d3.select(\"#ex5 svg\")\n", " .selectAll(\"rect\")\n", " .data(incomingData)\n", " .enter()\n", " .append(\"rect\")\n", " .attr(\"width\", 25)\n", " .attr(\"height\", function(d) {return yScale(parseInt(d.population));})\n", " .attr(\"x\", function(d,i) {return i * 30;})\n", " .attr(\"y\", function(d) {return 240 - yScale(parseInt(d.population));})\n", " .style(\"fill\", \"blue\")\n", " .style(\"stroke\", \"red\")\n", " .style(\"stroke-width\", \"1px\")\n", " .style(\"opacity\", .25);\n", "}" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%javascript\n", "// dataフォルダのcities.csvを読み込み、dataViz関数を呼び出す\n", "d3.csv(\"data/cities.csv\",function(error,data) {dataViz(data);});\n", "function dataViz(incomingData) {\n", " var maxPopulation = d3.max(incomingData, function(el) {\n", " // 人口のデータを文字列から数値に変換\n", " return parseInt(el.population);}\n", " );\n", " // 人口の最大値を0-230の範囲にスケーリングするyScaleを作成\n", " var yScale = d3.scale.linear().domain([0,maxPopulation]).range([0,230]);\n", " // 棒グラフの作成\n", " d3.select('#ex5').select(\"svg\").attr(\"style\",\"height: 240px; width: 300px;\");\n", " d3.select(\"#ex5 svg\")\n", " .selectAll(\"rect\")\n", " .data(incomingData)\n", " .enter()\n", " .append(\"rect\")\n", " .attr(\"width\", 25)\n", " .attr(\"height\", function(d) {return yScale(parseInt(d.population));})\n", " .attr(\"x\", function(d,i) {return i * 30;})\n", " .attr(\"y\", function(d) {return 240 - yScale(parseInt(d.population));})\n", " .style(\"fill\", \"blue\")\n", " .style(\"stroke\", \"red\")\n", " .style(\"stroke-width\", \"1px\")\n", " .style(\"opacity\", .25);\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }