<!DOCTYPE html> <html lang="en"> <head> <script async src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script> <script> (adsbygoogle = window.adsbygoogle || []).push({ google_ad_client: "ca-pub-3523953066677938", enable_page_level_ads: true }); </script> <!-- Global site tag (gtag.js) - Google Analytics --> <script async src="https://www.googletagmanager.com/gtag/js?id=UA-79254642-6"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'UA-79254642-6'); </script> <meta charset="utf-8"> <title>Data manipulation in d3.js</title> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta name="description" content="A few code chuncks describing the most common data manipulation needed in d3.js. Includes filtering, sorting, nesting and more."> <meta name="keywords" content="Data,Dataviz,Datavisualization,Javascript, JS, d3.js, axis, x, y"> <meta name="author" content="Yan Holtz"> <link rel="icon" href="../img/logo/D3_single_small.png"> <meta property="og:title" content="Data manipulation in d3.js"> <meta property="og:image" content="img/overview_RGG.png"> <meta property="og:description" content="A few code chuncks describing the most common data manipulation needed in d3.js. Includes filtering, sorting, nesting and more.."> <!-- Bootstrap core CSS --> <link href="../vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet"> <!-- Custom fonts for this template --> <link href="../vendor/font-awesome/css/font-awesome.min.css" rel="stylesheet" type="text/css"> <!-- Custom styles for this template --> <link href="../css/agency.css" rel="stylesheet"> <!-- PRISM --> <script src="../LIB/prism.js"></script> <link href="../LIB/prism.css" rel="stylesheet" /> <!-- D3.JS v4 --> <script src="../js/d3.v4.js"></script> <!-- JQUERY --> <script src="../vendor/jquery/jquery.min.js"></script> <!-- HTML TO CANVAS --> <script src="../js/html2canvas.js"></script> <!-- Function to parse html and js code chunks made by Yan Holtz --> <script src="../js/myParser.js"></script> <!-- Bootstrap js --> <script src="../vendor/bootstrap/js/bootstrap.bundle.min.js"></script> <script src="../js/agency.min.js"></script> </head> <body data-spy="scroll" data-target="#myScrollspy" data-offset="1"> <!-- THIS ALLOWS TO INSERT THE MENU THAT IS STORED IN A MENU.HTML FILE--> <nav class="navbar navbar-expand-lg fixed-top" id="mainNav"></nav> <script> $(function(){ $("#mainNav").load("../html_chunk/menu.html"); }); </script> <!-- THIS ALLOWS TO INSERT THE MODAL OF THE MENU THAT IS STORED IN A MENU_MODAL.HTML FILE--> <div id="modal_menu_insertion"> </div> <script> $(function(){ $("#modal_menu_insertion").load("../html_chunk/menu_modal.html"); }); </script> <!-- Header --> <header class="masthead" style="padding-top: 150px; padding-bottom: 80px"> <div class="textlanding"> <h1>Data manipulation in d3.js</h1> <hr class="short_hr"> <br> <ul class="list-inline social-buttons"> <li class="list-inline-item"> <a href="https://twitter.com/R_Graph_Gallery"> <i class="fa fa-twitter"></i> </a> </li> <li class="list-inline-item social-buttons"> <a href="https://github.com/holtzy"> <i class="fa fa-github" style="color: white"></i> </a> </li> <li class="list-inline-item social-buttons"> <a href="https://www.linkedin.com/in/yan-holtz-2477534a/"> <i class="fa fa-linkedin"></i> </a> </li> </ul> <br><br> <p style="max-width: 700px; margin: auto">This post describes the most common <u>data manipulation tasks</u> you will have to perform using <code>d3.js</code>. It includes <a href="#sorting">sorting</a>, <a href="#filtering">filtering</a>, <a href="#grouping">grouping</a>, <a href="#nesting">nesting</a> and more.</p> </div> </header> <!-- TABLE of CONTENT --> <div> <nav class="col-sm-3 col-4" id="myScrollspy"> <ul class="nav nav-pills flex-column"> <li class="nav-item"> <a class="nav-link" href="#math">Mathematics</a> </li> <li class="nav-item"> <a class="nav-link" href="#list">Lists and Arrays</a> </li> <li class="nav-item"> <a class="nav-link" href="#filtering">Filter</a> </li> <li class="nav-item"> <a class="nav-link" href="#sorting">Sort</a> </li> <li class="nav-item"> <a class="nav-link" href="#nesting">Nest</a> </li> <li class="nav-item"> <a class="nav-link" href="#grouping">Group</a> </li> <li class="nav-item"> <a class="nav-link" href="#loop">Loop</a> </li> <li class="nav-item"> <a class="nav-link" href="#reshape">Reshape</a> </li> <li class="nav-item"> <a class="nav-link" href="#stack">Stack</a> </li> </ul> </nav> </div> <!-- ==================== MATH ==================== --> <section id="math"> <div class="container" > <h1>Mathematics</h1> <hr> <h6>Get the maximum value of a column called <code>value</code></h6> <pre class="language-js"> d3.max(data, function(d) { return +d.value; })]) </pre> <br> <h6>Get the minimum value of a column called <code>value</code></h6> <pre class="language-js"> d3.min(data, function(d) { return +d.value; })]) </pre> <br> </div> </section> <!-- ==================== MATH ==================== --> <section id="list"> <div class="container" > <h1>Object and Arrays</h1> <hr> <h6>JavaScript <code>objects</code> are containers for named values called properties or methods. We can call any element of the object with its <u>name</u>:</h6> <pre class="language-js"> var myObject = {name:"Nicolas", sex:"Male", age:34} myObject.name myObject["name"] </pre> <br> <h6>Create an <code>array</code> of 5 numbers</h6> <p>An array is a special variable, which can hold more than one value at a time. Arrays use numbered indexes.</p> <pre class="language-js"> var myArray = [12, 34, 23, 12, 89] </pre> <br> <h6>Access the first element in the array.</h6> <pre class="language-js"> myArray[0] </pre> <br> <h6>Access the last element in the array.</h6> <pre class="language-js"> myArray[myArray.length - 1] </pre> <br> <h6>Remove last element of the array. Add a new element a the end.</h6> <pre class="language-js"> myArray.pop() myArray.push(45) </pre> <br> <h6>What is the index of the element '34' in the array?.</h6> <pre class="language-js"> myArray.indexOf(34) </pre> <br> <h6>Remove elements that are provided in a second array:</h6> <pre class="language-js"> var myArray = ['a', 'b', 'c', 'd', 'e', 'f', 'g']; var toRemove = ['b', 'c', 'g']; filteredArray = myArray.filter( function( el ) { return !toRemove.includes( el ); } ); </pre> <br> <h6>Keep only elements that are provided in a second array:</h6> <pre class="language-js"> var myArray = ['a', 'b', 'c', 'd', 'e', 'f', 'g']; var tokeep = ['b', 'c', 'g', 'o', 'z']; filteredArray = myArray.filter( function( el ) { return tokeep.includes( el ); } ); </pre> <br> <u>Note</u>: an array can be composed by several objects! </div> </section> <!-- ==================== FILTERING ==================== --> <section id="filtering"> <div class="container" > <h1>Filtering</h1> <hr> <h6>Keep rows where the variable <code>name</code> is <code>toto</code></h6> <pre class="language-js"> data.filter(function(d){ return d.name == "toto" }) </pre> <br> <h6>Keep rows where the variable <code>name</code> is <u>different</u> from <code>toto</code></h6> <pre class="language-js"> data.filter(function(d){ return d.name != "toto" }) </pre> <br> <h6>Keep rows where the variable <code>name</code> is <code>toto</code> OR <code>tutu</code></h6> <pre class="language-js"> data.filter(function(d){ return (d.name == "toto" || d.name == "tutu") }) </pre> <br> <h6>Keep rows where the variable <code>name</code> has a value included in the list <code>tokeep</code></h6> <pre class="language-js"> tokeep = ["option1", "option2", "option3"] data.filter(function(d,i){ return tokeep.indexOf(d.name) >= 0 }) </pre> <br> <h6>Keep the 10 first rows</h6> <pre class="language-js"> data.filter(function(d,i){ return i<10 }) </pre> <br> <h6>color points using <code>ifelse</code> statement</h6> <pre class="language-js"> .style("fill", function(d){ if(d.x<140){return "orange"} else {return "blue"}}) </pre> <br> </div> </section> <!-- ==================== SORTING==================== --> <section id="sorting"> <div class="container" > <h1>Sorting</h1> <hr> <h6>Sorting on 1 numeric column called <code>value</code>. Use <code>+</code> instead of <code>-</code> for reverse order.</h6> <pre class="language-js"> data.sort(function(a,b) { return +a.value - +b.value }) </pre> <br> <h6>Sorting alphabetically on 1 categoric column called <code>name</code>. Use <code>descending</code> for reverse order.</h6> <pre class="language-js"> data.sort(function (a,b) {return d3.ascending(a.name, b.name);}); </pre> <br> <h6>Sorting alphabetically on 2 categoric columns called <code>name1</code> and <code>name2</code>.</h6> <pre class="language-js"> data.sort(function(a,b) { return d3.ascending(a.name1, b.name1) || d3.ascending(a.name1, b.name2) } ) </pre> <br> <h6>Sorting on 1 categoric columns called <code>name1</code> and then on 1 numeric called <code>value</code>.</h6> <pre class="language-js"> data.sort(function(a,b) { return d3.ascending(a.name1, b.name1) || (a.value - b.value) } ) </pre> <br> <h6>Sorting on 1 categoric columns called <code>name1</code> according to the order provided in the variable <code>targetOrder</code>.</h6> <pre class="language-js"> data.sort(function(a,b) { return targetOrder.indexOf( a.name1 ) > targetOrder.indexOf( b.name1 ); }); </pre> <br> </div> </section> <!-- ==================== NESTING ==================== --> <section id="nesting"> <div class="container" > <h1>Nesting</h1> <hr> <h6>Sorting on 1 numeric column called <code>value</code>. Use <code>+</code> instead of <code>-</code> for reverse order.</h6> <pre class="language-js"> data.sort(function(a,b) { return +a.value - +b.value }) </pre> <br> <h6>Sorting alphabetically on 1 categoric column called <code>name</code>. Use <code>descending</code> for reverse order.</h6> <pre class="language-js"> data.sort(function (a,b) {return d3.ascending(a.name, b.name);}); </pre> <br> </div> </section> <!-- ==================== GROUPING ==================== --> <section id="grouping"> <div class="container" > <h1>Grouping</h1> <hr> <h6>Get a list of unique entries of a column called <code>name</code></h6> <pre class="language-js"> var allGroup = d3.map(data, function(d){return(d.name)}).keys() </pre> <br> </div> </section> <!-- ==================== GROUPING ==================== --> <section id="loop"> <div class="container" > <h1>Loop</h1> <hr> <h6>A <code>for</code> loop from one to ten:</h6> <pre class="language-js"> var i for (i = 0; i < 10; i++) { console.log(i) } </pre> <br> <h6>A <code>for</code> loop for all the elements of a list: (Note that it returns 0, 1, 2, not a, b, c)</h6> <pre class="language-js"> var allGroup = ["a", "b", "c"] for (i in allGroup){ console.log(i) } </pre> <br> <h6>A <code>while</code> loop to count from 0 to 10</h6> <pre class="language-js"> while (i < 10) { console.log(i) i++; } </pre> <br> </div> </section> <!-- ==================== RESHAPE ==================== --> <section id="reshape"> <div class="container" > <h1>Reshape</h1> <hr> <p>It is a common task in data science to swap between <u>wide</u> (or untidy) format to <u>long</u> (or tidy) format. In R, there is a package called <code>tidyr</code> that is entirely dedicated to it. It is definitely doable in Javascript using the code snippets below. In case you're not familiar with this concept, here is a description of what these formats are:</p> <center><img width="650px" src="../img/other/long_vs_wide_data_format.png" alt="long versus wide data format summary" style="padding: 30px; margin: 20px; border: solid; border-width: 1px; border-color: #B8B8B8; border-radius: 4px"></center> <p><u>Note</u>: it is strongly advised to perform these data wrangling steps out of your javascript to save <u>loading time</u> of your dataviz</p> <br> <h6>Going from <code>wide to long</code> format.</h6> <pre class="language-js"> d3.csv("https://raw.githubusercontent.com/holtzy/D3-graph-gallery/master/DATA/data_correlogram.csv", function(data) { // Going from wide to long format var data_long = []; data.forEach(function(d) { var x = d[""]; delete d[""]; for (prop in d) { var y = prop, value = d[prop]; data_long.push({ x: x, y: y, value: +value }); } }); // Show result console.log(data_long) </pre> <br> <br> <h6>Going from <code>long to wide</code> format.</h6> <pre class="language-js"> d3.csv("https://raw.githubusercontent.com/holtzy/D3-graph-gallery/master/DATA/data.csv", function(data) { //todo }) </pre> </div> </section> <!-- ==================== RESHAPE ==================== --> <section id="stack"> <div class="container" > <h1>Stack</h1> <hr> <p>Stacking data is a common practice in dataviz, notably for <a href="../barplot.html">barcharts</a> and <a href="../area.html">area charts</a>. Stacking applies when a dataset is composed by <u>groups</u> (like species) and <u>subgroups</u> like soil condition. Stacking is possible thanks to the <code>d3.stack()</code> function, which is part of the <a href="https://github.com/d3/d3-shape#stacks">d3-shape module</a>. Here is an illustration of what happens when you read data from <code>.csv</code> format and stack it.</p> <center><img width="950px" src="../img/other/stackingExpla.png" alt="Stacking data explanation" style="padding: 30px; margin: 20px; border: solid; border-width: 1px; border-color: #B8B8B8; border-radius: 4px"></center> <br> <script> d3.csv("https://raw.githubusercontent.com/holtzy/D3-graph-gallery/master/DATA/data_stacked.csv", function(data) { console.log(data) // List of subgroups = header of the csv files = soil condition here var subgroups = data.columns.slice(1) // List of groups = species here = value of the first column called group var groups = d3.map(data, function(d){return(d.group)}).keys() //stack the data? --> stack per subgroup var stackedData = d3.stack() .keys(subgroups) (data) console.log(stackedData) }) </script> <h6>Stacking from <code>.csv</code> format</h6> <pre class="language-js"> d3.csv("https://raw.githubusercontent.com/holtzy/D3-graph-gallery/master/DATA/data_stacked.csv", function(data) { // Have a look to the data console.log(data) // List of subgroups = header of the csv files = soil condition here var subgroups = data.columns.slice(1) // List of groups = species here = value of the first column called group var groups = d3.map(data, function(d){return(d.group)}).keys() //stack the data? --> stack per subgroup var stackedData = d3.stack() .keys(subgroups) (data) // Have a look to the stacked data console.log(stackedData) // Get the stacked data for the second group }) </pre> <br> <h6>Get the key of each element of stackedData ???</h6> </div> </section> <!-- ============================ CONTACT SECTION ============================ --> <!-- ANCHOR --> <a name="contactanchor"></a> <section id="contact" class="bg" style="background-color: white"></section> <!-- THIS ALLOWS TO INSERT THE CONTACT CHUNK THAT IS STORED IN A CONTACT.HTML FILE--> <script> $(function(){ $("#contact").load("../html_chunk/contact.html"); }); </script> <!-- ============================ FOOTER SECTION ============================ --> <footer class="bg-light" id="myFooter"></footer> <!-- THIS ALLOWS TO INSERT THE FOOTER THAT IS STORED IN A FOOTER.HTML FILE--> <script> $(function(){ $("#myFooter").load("../html_chunk/footer.html"); }); </script> <!-- ============================ --> </body> </html>