{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 2\n",
"Study notes for Chapter 2 of *D3.js in Action*.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loaded nvd3 IPython extension\n",
"run nvd3.ipynb.initialize_javascript() to set up the notebook\n",
"help(nvd3.ipynb.initialize_javascript) for options\n"
]
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"$.getScript(\"https://cdnjs.cloudflare.com/ajax/libs/nvd3/1.7.0/nv.d3.min.js\")"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"$.getScript(\"https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min.js\", function() {\n",
" $.getScript(\"https://cdnjs.cloudflare.com/ajax/libs/nvd3/1.7.0/nv.d3.min.js\", function() {})});"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%load_ext sage\n",
"from IPython.core.display import HTML\n",
"from string import Template\n",
"import json\n",
"import nvd3\n",
"nvd3.ipynb.initialize_javascript(use_remote=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading data\n",
"D3 provides loader functions for several data formats:\n",
"- TEXT: d3.text()\n",
"- XML: d3.xml()\n",
"- CSV: d3.csv()\n",
"- JSON: d3.json()\n",
"- HTML: d3.html()\n",
"\n",
"When exchanging data with Python, JSON and CSV are the usual choices, because both preserve the structure of the data.\n",
"\n",
"As an example, let's load the following cities.csv."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\"label\",\"population\",\"country\",\"x\",\"y\"\r\n",
"\"San Francisco\", 750000,\"USA\",37,-122\r\n",
"\"Fresno\", 500000,\"USA\",36,-119\r\n",
"\"Lahore\",12500000,\"Pakistan\",31,74\r\n",
"\"Karachi\",13000000,\"Pakistan\",24,67\r\n",
"\"Rome\",2500000,\"Italy\",41,12\r\n",
"\"Naples\",1000000,\"Italy\",40,14\r\n",
"\"Rio\",12300000,\"Brazil\",-22,-43\r\n",
"\"Sao Paolo\",12300000,\"Brazil\",-23,-46"
]
}
],
"source": [
"!cat data/cities.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loaded data is handed to a callback of the form function(error, data).\n",
"Whatever processing you want to run on the data goes inside this callback."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"d3.csv(\"data/cities.csv\",function(error,data) {console.log(error,data)});"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"d3.csv(\"data/cities.csv\",function(error,data) {console.log(error,data)});"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"The contents of the data are printed to the JavaScript console as follows.\n",
"\n",
"![cities console log](images/cities_log.png)\n"
]
},
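{
"cell_type": "markdown",
"metadata": {},
"source": [
"This shows that the CSV data is passed as an array of objects (dictionaries) keyed by the header column names. Every value, including population, arrives as a string."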
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Passing Python data to D3\n",
"This section shows one way to pass results computed in Jupyter to D3.\n",
"\n",
"It uses the json module and the Template class from the string module."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from string import Template\n",
"import json\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use pandas to read cities.csv into the data frame df."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
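{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>label</th>\n",
"      <th>population</th>\n",
"      <th>country</th>\n",
"      <th>x</th>\n",
"      <th>y</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>San Francisco</td>\n",
"      <td>750000</td>\n",
"      <td>USA</td>\n",
"      <td>37</td>\n",
"      <td>-122</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>Fresno</td>\n",
"      <td>500000</td>\n",
"      <td>USA</td>\n",
"      <td>36</td>\n",
"      <td>-119</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2</th>\n",
"      <td>Lahore</td>\n",
"      <td>12500000</td>\n",
"      <td>Pakistan</td>\n",
"      <td>31</td>\n",
"      <td>74</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>3</th>\n",
"      <td>Karachi</td>\n",
"      <td>13000000</td>\n",
"      <td>Pakistan</td>\n",
"      <td>24</td>\n",
"      <td>67</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>4</th>\n",
"      <td>Rome</td>\n",
"      <td>2500000</td>\n",
"      <td>Italy</td>\n",
"      <td>41</td>\n",
"      <td>12</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"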
],
"text/plain": [
" label population country x y\n",
"0 San Francisco 750000 USA 37 -122\n",
"1 Fresno 500000 USA 36 -119\n",
"2 Lahore 12500000 Pakistan 31 74\n",
"3 Karachi 13000000 Pakistan 24 67\n",
"4 Rome 2500000 Italy 41 12"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(\"data/cities.csv\")\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use Template to embed df in a script tag, assigning it to the JavaScript variable data.\n",
"\n",
"The key point is to fill in $python_data with json.dumps applied to df.to_dict(orient='records')."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\n"
]
}
],
"source": [
"data_text = Template('''\n",
"<script>\n",
"  var data = $python_data;\n",
"  console.log(data);\n",
"</script>\n",
"''')\n",
"data_text = data_text.substitute({'python_data': json.dumps(df.to_dict(orient='records'))})\n",
"print(data_text)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the following command and look at the JavaScript console. It logs the same data as when the file was loaded with d3."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"HTML(data_text)"
]
},
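{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scale mapping\n",
"D3's scales are very well designed, and you can pick one to suit your data.\n",
"- d3.scale.linear(): linear interpolation\n",
"- d3.scale.quantile(): quantiles (values split into intervals)\n",
"\n",
"Let's look at linear interpolation first.\n",
"domain() sets the input domain (here, the range over which the data is distributed), and the scale maps it onto the output range set with range().\n",
"\n",
"Using element.text instead of console.log writes the result into the notebook's output area (it appears to work only once per cell, since each call replaces the previous text).\n"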
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"var newRamp = d3.scale.linear().domain([500000,13000000]).range([0, 500]);\n",
"element.text(newRamp(1000000) + \", \" +newRamp(9000000)+\", \"+newRamp.invert(313));"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"var newRamp = d3.scale.linear().domain([500000,13000000]).range([0, 500]);\n",
"element.text(newRamp(1000000) + \", \" +newRamp(9000000)+\", \"+newRamp.invert(313));"
]
},
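{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a check on the cell above, the linear scale can be worked out by hand. This is a sketch that assumes plain linear interpolation between the domain endpoints $d_0, d_1$ and the range endpoints $r_0, r_1$; these symbols are notation introduced here for illustration and do not come from the book.\n",
"\n",
"$$\\text{scale}(x) = r_0 + \\frac{x - d_0}{d_1 - d_0} (r_1 - r_0), \\qquad \\text{invert}(y) = d_0 + \\frac{y - r_0}{r_1 - r_0} (d_1 - d_0)$$\n",
"\n",
"With $d_0 = 500000$, $d_1 = 13000000$, $r_0 = 0$, $r_1 = 500$:\n",
"\n",
"$$\\text{scale}(1000000) = \\frac{500000}{12500000} \\cdot 500 = 20, \\qquad \\text{scale}(9000000) = \\frac{8500000}{12500000} \\cdot 500 = 340$$\n",
"\n",
"$$\\text{invert}(313) = 500000 + \\frac{313}{500} \\cdot 12500000 = 8325000$$"
]
},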
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What is impressive about D3 scales is that they can map to colors as well as to numbers."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"var newRamp = d3.scale.linear().domain([500000,13000000]).range([\"blue\",\"red\"]);\n",
"element.text(newRamp(1000000) + \", \" +newRamp(9000000));"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"var newRamp = d3.scale.linear().domain([500000,13000000]).range([\"blue\",\"red\"]);\n",
"element.text(newRamp(1000000) + \", \" +newRamp(9000000));"
]
},
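{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Categorizing\n",
"Next, let's use the quantile scale to split values into a few categories.\n",
"\n",
"The example below splits the data in sampleArray into three groups labeled small, medium, and large.\n",
"The groups are not equal-width value ranges: the scale appears to sort the data and choose thresholds so that each category receives the same number of data points.\n",
"\n",
"Sorted, the data falls into the groups [1, 10, 44], [58, 66, 124], and [423, 424, 900].\n",
"\n",
"The thresholds are interpolated quantiles of the sorted data (about 53.3 and 223.7 here; qScaleName.quantiles() returns the exact values). Values below the first threshold map to small, values between the two map to medium, and everything else maps to large.\n"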
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"var sampleArray = [423,124,66,424,58,10,900,44,1];\n",
"var qScaleName =\n",
"d3.scale.quantile().domain(sampleArray).range([\"small\", \"medium\",\"large\"]);\n",
"element.text(qScaleName(68) + \", \" +qScaleName(20)+\", \"+qScaleName(10000));"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"var sampleArray = [423,124,66,424,58,10,900,44,1];\n",
"var qScaleName =\n",
"d3.scale.quantile().domain(sampleArray).range([\"small\", \"medium\",\"large\"]);\n",
"element.text(qScaleName(68) + \", \" +qScaleName(20)+\", \"+qScaleName(10000));\n"
]
},
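{
"cell_type": "markdown",
"metadata": {},
"source": [
"The thresholds used by the quantile scale above can be reproduced by hand. This is a sketch that assumes D3 3.x computes the $p$-quantile of the sorted values $x_1 \\le \\dots \\le x_n$ by linear interpolation between neighbouring values; the symbols $H$, $h$, and $q_p$ are notation introduced here.\n",
"\n",
"$$H = (n - 1) p + 1, \\qquad h = \\lfloor H \\rfloor, \\qquad q_p = x_h + (H - h)(x_{h+1} - x_h)$$\n",
"\n",
"For the sorted data 1, 10, 44, 58, 66, 124, 423, 424, 900 ($n = 9$):\n",
"\n",
"$$p = \\tfrac{1}{3}: \\quad H = \\tfrac{11}{3}, \\quad h = 3, \\quad q_{1/3} = 44 + \\tfrac{2}{3}(58 - 44) \\approx 53.3$$\n",
"\n",
"$$p = \\tfrac{2}{3}: \\quad H = \\tfrac{19}{3}, \\quad h = 6, \\quad q_{2/3} = 124 + \\tfrac{1}{3}(423 - 124) \\approx 223.7$$\n",
"\n",
"Under that assumption, 20 falls below the first threshold (small), 68 lies between the two (medium), and 10000 lies above the second (large)."
]
},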
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Data binding\n",
"I think data binding is the most important feature of D3.js.\n",
"\n",
"Chapter 1 covered the enter method; here let's see how exit and update behave.\n",
"\n",
"The differences between update, enter, and exit are explained in Fig. 2.24, reproduced here.\n",
"\n",
"![Fig. 2.24](images/Fig_2.24.png)\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at how enter behaves when there is more data than elements to bind it to."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%HTML\n",
"<div id=\"ex1\"><div></div></div>"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"var sampleData = [1, 2, 3, 4];\n",
"\n",
"d3.select('#ex1').selectAll('div')\n",
" .data(sampleData)\n",
" .enter()\n",
" .append(\"div\")\n",
" .html(function (d) { return d; })"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"var sampleData = [1, 2, 3, 4];\n",
"\n",
"d3.select('#ex1').selectAll('div')\n",
" .data(sampleData)\n",
" .enter()\n",
" .append(\"div\")\n",
" .html(function (d) { return d; })"
]
},
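{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this example, the data value 1 is bound to the one existing div; for the remaining values 2, 3, and 4, new div elements are appended and their HTML is set to the data value.\n",
"\n",
"Contrary to expectation, only 2, 3, and 4 are displayed. Nothing sets the content of the first div, because the chain after enter() applies only to the newly appended elements.\n",
"\n",
"So let's add the html call for the first value as well.\n"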
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%HTML\n",
"<div id=\"ex2\"><div></div></div>"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"var sampleData = [1, 2, 3, 4];\n",
"\n",
"d3.select('#ex2').selectAll('div')\n",
" .data(sampleData)\n",
" .html(function (d) { return d; })\n",
" .enter()\n",
" .append(\"div\")\n",
" .html(function (d) { return d; });"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"var sampleData = [1, 2, 3, 4];\n",
"\n",
"d3.select('#ex2').selectAll('div')\n",
" .data(sampleData)\n",
" .html(function (d) { return d; })\n",
" .enter()\n",
" .append(\"div\")\n",
" .html(function (d) { return d; });"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This forces us to write the same processing twice.\n",
"To avoid that, remove the existing divs first and then bind the data."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"var sampleData = [1, 2, 3, 4];\n",
"// remove all divs under #ex2\n",
"d3.select('#ex2').selectAll('div').remove();\n",
"\n",
"d3.select('#ex2').selectAll('div')\n",
" .data(sampleData)\n",
" .enter()\n",
" .append(\"div\")\n",
" .html(function (d) { return d; });"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"var sampleData = [1, 2, 3, 4];\n",
"// remove all divs under #ex2\n",
"d3.select('#ex2').selectAll('div').remove();\n",
"\n",
"d3.select('#ex2').selectAll('div')\n",
" .data(sampleData)\n",
" .enter()\n",
" .append(\"div\")\n",
" .html(function (d) { return d; });"
]
},
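{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Removing elements with exit\n",
"Next, let's try exit, which is used when there is less data than elements.\n",
"\n",
"Starting from four div elements containing a, b, c, and d, we bind the two-element array [1, 2].\n"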
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div id=\"ex3\">\n",
"<div>a</div>\n",
"<div>b</div>\n",
"<div>c</div>\n",
"<div>d</div>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%HTML\n",
"<div id=\"ex3\">\n",
"<div>a</div>\n",
"<div>b</div>\n",
"<div>c</div>\n",
"<div>d</div>\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"var sampleData = [1, 2];\n",
"\n",
"d3.select('#ex3').selectAll('div')\n",
" .data(sampleData)\n",
" .html(function (d) { return d; })\n",
" .exit()\n",
" .remove();"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"var sampleData = [1, 2];\n",
"\n",
"d3.select('#ex3').selectAll('div')\n",
" .data(sampleData)\n",
" .html(function (d) { return d; })\n",
" .exit()\n",
" .remove();"
]
},
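{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Drawing a bar chart\n",
"Let's draw a bar chart inside an svg tag.\n",
"\n",
"- Append one rect per data value, shift x by 10px for each bar, set the height to the data value in px, and set y to 100 minus the value so that all bars share a baseline at y = 100"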
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div id=\"ex4\">\n",
"  <svg></svg>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%HTML\n",
"<div id=\"ex4\">\n",
"  <svg></svg>\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"d3.select('#ex4').select(\"svg\")\n",
" .selectAll(\"rect\")\n",
" .data([15, 50, 22, 8, 100, 10])\n",
" .enter()\n",
" .append(\"rect\")\n",
" .attr(\"width\", 10)\n",
" .attr(\"height\", function(d) {return d;})\n",
" .style(\"fill\", \"blue\")\n",
" .style(\"stroke\", \"red\")\n",
" .style(\"stroke-width\", \"1px\")\n",
" .style(\"opacity\", .25)\n",
" .attr(\"x\", function(d, i) {return i * 10})\n",
" .attr(\"y\", function(d) {return 100 - d;});"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"d3.select('#ex4').select(\"svg\")\n",
" .selectAll(\"rect\")\n",
" .data([15, 50, 22, 8, 100, 10])\n",
" .enter()\n",
" .append(\"rect\")\n",
" .attr(\"width\", 10)\n",
" .attr(\"height\", function(d) {return d;})\n",
" .style(\"fill\", \"blue\")\n",
" .style(\"stroke\", \"red\")\n",
" .style(\"stroke-width\", \"1px\")\n",
" .style(\"opacity\", .25)\n",
" .attr(\"x\", function(d, i) {return i * 10})\n",
" .attr(\"y\", function(d) {return 100 - d;});"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Building a bar chart from CSV data\n",
"The main theme of Chapter 2 is building a bar chart from a CSV file.\n",
"\n",
"Following the book's example, let's build a bar chart from cities.csv."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div id=\"ex5\">\n",
"  <svg></svg>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%HTML\n",
"<div id=\"ex5\">\n",
"  <svg></svg>\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"// Load cities.csv from the data folder and pass the rows to dataViz\n",
"d3.csv(\"data/cities.csv\",function(error,data) {dataViz(data);});\n",
"function dataViz(incomingData) {\n",
" var maxPopulation = d3.max(incomingData, function(el) {\n",
" // Convert the population from a string to a number\n",
" return parseInt(el.population, 10);}\n",
" );\n",
" // Create yScale, which maps 0..maxPopulation onto 0..230 px\n",
" var yScale = d3.scale.linear().domain([0,maxPopulation]).range([0,230]);\n",
" // Draw the bar chart\n",
" d3.select('#ex5').select(\"svg\").attr(\"style\",\"height: 240px; width: 300px;\");\n",
" d3.select(\"#ex5 svg\")\n",
" .selectAll(\"rect\")\n",
" .data(incomingData)\n",
" .enter()\n",
" .append(\"rect\")\n",
" .attr(\"width\", 25)\n",
" .attr(\"height\", function(d) {return yScale(parseInt(d.population, 10));})\n",
" .attr(\"x\", function(d,i) {return i * 30;})\n",
" .attr(\"y\", function(d) {return 240 - yScale(parseInt(d.population, 10));})\n",
" .style(\"fill\", \"blue\")\n",
" .style(\"stroke\", \"red\")\n",
" .style(\"stroke-width\", \"1px\")\n",
" .style(\"opacity\", .25);\n",
"}"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%javascript\n",
"// Load cities.csv from the data folder and pass the rows to dataViz\n",
"d3.csv(\"data/cities.csv\",function(error,data) {dataViz(data);});\n",
"function dataViz(incomingData) {\n",
" var maxPopulation = d3.max(incomingData, function(el) {\n",
" // Convert the population from a string to a number\n",
" return parseInt(el.population, 10);}\n",
" );\n",
" // Create yScale, which maps 0..maxPopulation onto 0..230 px\n",
" var yScale = d3.scale.linear().domain([0,maxPopulation]).range([0,230]);\n",
" // Draw the bar chart\n",
" d3.select('#ex5').select(\"svg\").attr(\"style\",\"height: 240px; width: 300px;\");\n",
" d3.select(\"#ex5 svg\")\n",
" .selectAll(\"rect\")\n",
" .data(incomingData)\n",
" .enter()\n",
" .append(\"rect\")\n",
" .attr(\"width\", 25)\n",
" .attr(\"height\", function(d) {return yScale(parseInt(d.population, 10));})\n",
" .attr(\"x\", function(d,i) {return i * 30;})\n",
" .attr(\"y\", function(d) {return 240 - yScale(parseInt(d.population, 10));})\n",
" .style(\"fill\", \"blue\")\n",
" .style(\"stroke\", \"red\")\n",
" .style(\"stroke-width\", \"1px\")\n",
" .style(\"opacity\", .25);\n",
"}"
]
},
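{
"cell_type": "markdown",
"metadata": {},
"source": [
"The bar geometry produced by the cell above can be checked by hand. This is a sketch that assumes the maximum population in cities.csv is Karachi's 13000000, as in the listing earlier; $p$ is notation introduced here for a city's population.\n",
"\n",
"$$\\text{yScale}(p) = \\frac{p}{13000000} \\cdot 230, \\qquad y = 240 - \\text{yScale}(p)$$\n",
"\n",
"For San Francisco ($p = 750000$) this gives a height of about 13.3 px and $y \\approx 226.7$; for Karachi the height is 230 px and $y = 10$, so the tallest bar stops 10 px below the top of the 240 px svg."
]
},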
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}