<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <title>Boxplot with individual data points – the R Graph Gallery</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta name="generator" content="pandoc" /> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta name="description" content="This post explains how to build a boxplot with ggplot2, adding individual data points with jitter on top of it."> <meta name="keywords" content="R,ggplot2,tidyverse,Example,Data,Dataviz,Datavisualization,Plot,Chart,Graph,Learning,Caveat,Pitfall,Mistake"> <meta name="author" content="Yan Holtz"> <link rel="icon" href="img/logo/R_single_small.png"> <!-- Control appearance when shared by social media --> <meta property="og:title" content="Boxplot with individual data points" /> <meta property="og:image" content="https://github.com/holtzy/R-graph-gallery/raw/master/img/logo/R_single_big.png" /> <meta property="og:description" content="This post explains how to build a boxplot with ggplot2, adding individual data points with jitter on top of it." 
/> <meta property='og:url' content="https://www.r-graph-gallery.com/89-box-and-scatter-plot-with-ggplot2.html" /> <meta property="og:type" content="website" /> <!-- Bootstrap core CSS --> <link href="vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet"> <!-- Custom fonts for this template --> <link href="vendor/font-awesome/css/font-awesome.min.css" rel="stylesheet" type="text/css"> <!-- Custom styles for this template --> <link href="css/agency.css" rel="stylesheet"> <!-- JQUERY --> <script src="vendor/jquery/jquery.min.js"></script> </head> <body data-spy="scroll" data-target="#myScrollspy" data-offset="1"> <!-- THIS ALLOWS TO INSERT THE MENU THAT IS STORED IN A MENU.HTML FILE--> <nav class="navbar navbar-expand-lg fixed-top" id="mainNav"></nav> <script> $(function(){ $("#mainNav").load("html_chunk/menu.html"); }); </script> <!-- THIS ALLOWS TO INSERT THE MODAL OF THE MENU THAT IS STORED IN A MENU_MODAL.HTML FILE--> <div id="modal_menu_insertion"> </div> <script> $(function(){ $("#modal_menu_insertion").load("html_chunk/menu_modal.html"); }); </script> <!-- Header = Title in big + social media Icon + quick description --> <header class="masthead" style="padding-bottom: 30px;"> <div class="textlanding"> <center><h1>Boxplot with individual data points</h1></center> <hr class="short_hr"> <br> <ul class="list-inline social-buttons"> <li class="list-inline-item"> <a href="https://twitter.com/R_Graph_Gallery"> <i class="fa fa-twitter"></i> </a> </li> <li class="list-inline-item social-buttons"> <a href="https://github.com/holtzy"> <i class="fa fa-github" style="color: white"></i> </a> </li> <li class="list-inline-item social-buttons"> <a href="https://www.linkedin.com/in/yan-holtz-2477534a/"> <i class="fa fa-linkedin"></i> </a> </li> <li class="list-inline-item social-buttons"> <a href="https://www.yan-holtz.com"> <i class="fa fa-home"></i> </a> </li> </ul> <center><p style="max-width: 600px; margin-top: 40px">A <a href="boxplot.html">boxplot</a> summarizes the 
distribution of a continuous variable. It is often criticized for hiding the underlying distribution of each group. Thus, showing individual observations using jitter on top of the boxes is good practice. This post explains how to do so using <a href="ggplot2-package.html">ggplot2</a>.</p></center> <div style="text-align:center"> <a class="btn btn-secondary btn-xl text-uppercase js-scroll-trigger" href='boxplot.html'>Boxplot Section</a> <a class="btn btn-secondary btn-xl text-uppercase js-scroll-trigger" href='https://www.data-to-viz.com/caveat/boxplot.html'>Boxplot pitfalls</a> </div> </div> </header> <!-- THIS ALLOWS TO INSERT THE ADVERTISEMENT BANNER THAT IS STORED IN A BANNER.HTML FILE--> <div id="position_for_images"></div> <script> $(function(){ $("#position_for_images").load("html_chunk/images.html"); }); </script> <!-- STYLE for chart pages but not the rest of the website --> <style> img { margin-top: 20px; } </style> <div class="container" style="padding-top: 100px"> <div class="row"> <div class="col-md-6 col-sm-12 align-self-center"> <p>If you’re not convinced about the danger of using a basic boxplot, please read <a href="https://www.data-to-viz.com/caveat/boxplot.html">this post</a>, which explains it in depth.</p> <p>Fortunately, <a href="ggplot2-package.html">ggplot2</a> makes it a breeze to add individual observations on top of the boxes thanks to the <code>geom_jitter()</code> function. 
This function shifts each dot by a small random amount, controlled by its <code>width</code> and <code>height</code> arguments, which reduces overlaps. The <code>size</code> argument only sets the size of the dots.</p> <p>Now, do you see the bimodal distribution hidden behind group B?</p> </div> <div class="col-md-6 col-sm-12"> <p><img src="89-box-and-scatter-plot-with-ggplot2_files/figure-html/unnamed-chunk-1-1.png" width="100%" /></p> </div> </div> <div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" data-line-number="1"><span class="co"># Libraries</span></a> <a class="sourceLine" id="cb1-2" data-line-number="2"><span class="kw">library</span>(tidyverse)</a> <a class="sourceLine" id="cb1-3" data-line-number="3"><span class="kw">library</span>(hrbrthemes)</a> <a class="sourceLine" id="cb1-4" data-line-number="4"><span class="kw">library</span>(viridis)</a> <a class="sourceLine" id="cb1-5" data-line-number="5"></a> <a class="sourceLine" id="cb1-6" data-line-number="6"><span class="co"># Create a dataset</span></a> <a class="sourceLine" id="cb1-7" data-line-number="7">data <-<span class="st"> </span><span class="kw">data.frame</span>(</a> <a class="sourceLine" id="cb1-8" data-line-number="8"> <span class="dt">name=</span><span class="kw">c</span>( <span class="kw">rep</span>(<span class="st">"A"</span>,<span class="dv">500</span>), <span class="kw">rep</span>(<span class="st">"B"</span>,<span class="dv">500</span>), <span class="kw">rep</span>(<span class="st">"B"</span>,<span class="dv">500</span>), <span class="kw">rep</span>(<span class="st">"C"</span>,<span class="dv">20</span>), <span class="kw">rep</span>(<span class="st">'D'</span>, <span class="dv">100</span>) ),</a> <a class="sourceLine" id="cb1-9" data-line-number="9"> <span class="dt">value=</span><span class="kw">c</span>( <span class="kw">rnorm</span>(<span class="dv">500</span>, <span class="dv">10</span>, <span class="dv">5</span>), <span class="kw">rnorm</span>(<span class="dv">500</span>, <span class="dv">13</span>, <span 
class="dv">1</span>), <span class="kw">rnorm</span>(<span class="dv">500</span>, <span class="dv">18</span>, <span class="dv">1</span>), <span class="kw">rnorm</span>(<span class="dv">20</span>, <span class="dv">25</span>, <span class="dv">4</span>), <span class="kw">rnorm</span>(<span class="dv">100</span>, <span class="dv">12</span>, <span class="dv">1</span>) )</a> <a class="sourceLine" id="cb1-10" data-line-number="10">)</a> <a class="sourceLine" id="cb1-11" data-line-number="11"></a> <a class="sourceLine" id="cb1-12" data-line-number="12"><span class="co"># Plot</span></a> <a class="sourceLine" id="cb1-13" data-line-number="13">data <span class="op">%>%</span></a> <a class="sourceLine" id="cb1-14" data-line-number="14"><span class="st"> </span><span class="kw">ggplot</span>( <span class="kw">aes</span>(<span class="dt">x=</span>name, <span class="dt">y=</span>value, <span class="dt">fill=</span>name)) <span class="op">+</span></a> <a class="sourceLine" id="cb1-15" data-line-number="15"><span class="st"> </span><span class="kw">geom_boxplot</span>() <span class="op">+</span></a> <a class="sourceLine" id="cb1-16" data-line-number="16"><span class="st"> </span><span class="kw">scale_fill_viridis</span>(<span class="dt">discrete =</span> <span class="ot">TRUE</span>, <span class="dt">alpha=</span><span class="fl">0.6</span>) <span class="op">+</span></a> <a class="sourceLine" id="cb1-17" data-line-number="17"><span class="st"> </span><span class="kw">geom_jitter</span>(<span class="dt">color=</span><span class="st">"black"</span>, <span class="dt">size=</span><span class="fl">0.4</span>, <span class="dt">alpha=</span><span class="fl">0.9</span>) <span class="op">+</span></a> <a class="sourceLine" id="cb1-18" data-line-number="18"><span class="st"> </span><span class="kw">theme_ipsum</span>() <span class="op">+</span></a> <a class="sourceLine" id="cb1-19" data-line-number="19"><span class="st"> </span><span class="kw">theme</span>(</a> <a class="sourceLine" 
id="cb1-20" data-line-number="20"> <span class="dt">legend.position=</span><span class="st">"none"</span>,</a> <a class="sourceLine" id="cb1-21" data-line-number="21"> <span class="dt">plot.title =</span> <span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">11</span>)</a> <a class="sourceLine" id="cb1-22" data-line-number="22"> ) <span class="op">+</span></a> <a class="sourceLine" id="cb1-23" data-line-number="23"><span class="st"> </span><span class="kw">ggtitle</span>(<span class="st">"A boxplot with jitter"</span>) <span class="op">+</span></a> <a class="sourceLine" id="cb1-24" data-line-number="24"><span class="st"> </span><span class="kw">xlab</span>(<span class="st">""</span>)</a></code></pre></div> <p><br><br><br> In case you’re not convinced, here is what the basic <a href="boxplot.html">boxplot</a> and the basic <a href="violin.html">violin plot</a> look like:</p> <div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb2-1" data-line-number="1"><span class="co"># Basic boxplot</span></a> <a class="sourceLine" id="cb2-2" data-line-number="2">data <span class="op">%>%</span></a> <a class="sourceLine" id="cb2-3" data-line-number="3"><span class="st"> </span><span class="kw">ggplot</span>( <span class="kw">aes</span>(<span class="dt">x=</span>name, <span class="dt">y=</span>value, <span class="dt">fill=</span>name)) <span class="op">+</span></a> <a class="sourceLine" id="cb2-4" data-line-number="4"><span class="st"> </span><span class="kw">geom_boxplot</span>() <span class="op">+</span></a> <a class="sourceLine" id="cb2-5" data-line-number="5"><span class="st"> </span><span class="kw">scale_fill_viridis</span>(<span class="dt">discrete =</span> <span class="ot">TRUE</span>, <span class="dt">alpha=</span><span class="fl">0.6</span>, <span class="dt">option=</span><span class="st">"A"</span>) <span class="op">+</span></a> <a class="sourceLine" id="cb2-6" 
data-line-number="6"><span class="st"> </span><span class="kw">theme_ipsum</span>() <span class="op">+</span></a> <a class="sourceLine" id="cb2-7" data-line-number="7"><span class="st"> </span><span class="kw">theme</span>(</a> <a class="sourceLine" id="cb2-8" data-line-number="8"> <span class="dt">legend.position=</span><span class="st">"none"</span>,</a> <a class="sourceLine" id="cb2-9" data-line-number="9"> <span class="dt">plot.title =</span> <span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">11</span>)</a> <a class="sourceLine" id="cb2-10" data-line-number="10"> ) <span class="op">+</span></a> <a class="sourceLine" id="cb2-11" data-line-number="11"><span class="st"> </span><span class="kw">ggtitle</span>(<span class="st">"Basic boxplot"</span>) <span class="op">+</span></a> <a class="sourceLine" id="cb2-12" data-line-number="12"><span class="st"> </span><span class="kw">xlab</span>(<span class="st">""</span>)</a> <a class="sourceLine" id="cb2-13" data-line-number="13"></a> <a class="sourceLine" id="cb2-14" data-line-number="14"><span class="co"># Violin basic</span></a> <a class="sourceLine" id="cb2-15" data-line-number="15">data <span class="op">%>%</span></a> <a class="sourceLine" id="cb2-16" data-line-number="16"><span class="st"> </span><span class="kw">ggplot</span>( <span class="kw">aes</span>(<span class="dt">x=</span>name, <span class="dt">y=</span>value, <span class="dt">fill=</span>name)) <span class="op">+</span></a> <a class="sourceLine" id="cb2-17" data-line-number="17"><span class="st"> </span><span class="kw">geom_violin</span>() <span class="op">+</span></a> <a class="sourceLine" id="cb2-18" data-line-number="18"><span class="st"> </span><span class="kw">scale_fill_viridis</span>(<span class="dt">discrete =</span> <span class="ot">TRUE</span>, <span class="dt">alpha=</span><span class="fl">0.6</span>, <span class="dt">option=</span><span class="st">"A"</span>) <span class="op">+</span></a> <a class="sourceLine" 
id="cb2-19" data-line-number="19"><span class="st"> </span><span class="kw">theme_ipsum</span>() <span class="op">+</span></a> <a class="sourceLine" id="cb2-20" data-line-number="20"><span class="st"> </span><span class="kw">theme</span>(</a> <a class="sourceLine" id="cb2-21" data-line-number="21"> <span class="dt">legend.position=</span><span class="st">"none"</span>,</a> <a class="sourceLine" id="cb2-22" data-line-number="22"> <span class="dt">plot.title =</span> <span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">11</span>)</a> <a class="sourceLine" id="cb2-23" data-line-number="23"> ) <span class="op">+</span></a> <a class="sourceLine" id="cb2-24" data-line-number="24"><span class="st"> </span><span class="kw">ggtitle</span>(<span class="st">"Violin chart"</span>) <span class="op">+</span></a> <a class="sourceLine" id="cb2-25" data-line-number="25"><span class="st"> </span><span class="kw">xlab</span>(<span class="st">""</span>)</a></code></pre></div> <p><img src="89-box-and-scatter-plot-with-ggplot2_files/figure-html/unnamed-chunk-3-1.png" width="50%" /><img src="89-box-and-scatter-plot-with-ggplot2_files/figure-html/unnamed-chunk-3-2.png" width="50%" /></p> <!-- Close container --> </div> <!-- ============================ RELATED SECTION ============================ --> <section class="bg-light" id="portfolio_landing" style="padding-top: 30px; padding-bottom: 30px; margin-top: 100px;"> <div class="container"> <p class="mySeryTitle">Related chart types</p> <hr> <div class="row"> <div class="col-md-2 col-sm-4 portfolio-item" > <a class="portfolio-link" href="violin.html"> <div class="portfolio-hover"> <div class="portfolio-hover-content"> <i class="fa fa-plus fa-3x"></i> </div> </div> <img class="img-fluid" src="img/section/Violin150.png" alt=""> </a> <div class="captionPortfolio">Violin</div> </div> <div class="col-md-2 col-sm-4 portfolio-item"> <a class="portfolio-link" href="density-plot.html"> <div class="portfolio-hover"> 
<div class="portfolio-hover-content"> <i class="fa fa-plus fa-3x"></i> </div> </div> <img class="img-fluid" src="img/section/Density150.png" alt=""> </a> <div class="captionPortfolio">Density</div> </div> <div class="col-md-2 col-sm-4 portfolio-item"> <a class="portfolio-link" href="histogram.html"> <div class="portfolio-hover"> <div class="portfolio-hover-content"> <i class="fa fa-plus fa-3x"></i> </div> </div> <img class="img-fluid" src="img/section/Histogram150.png" alt=""> </a> <div class="captionPortfolio">Histogram</div> </div> <div class="col-md-2 col-sm-4 portfolio-item"> <a class="portfolio-link" href="boxplot.html"> <div class="portfolio-hover"> <div class="portfolio-hover-content"> <i class="fa fa-plus fa-3x"></i> </div> </div> <img class="img-fluid" src="img/section/Box1150.png" alt=""> </a> <div class="captionPortfolio">Boxplot</div> </div> <div class="col-md-2 col-sm-4 portfolio-item"> <a class="portfolio-link" href="ridgeline-plot.html"> <div class="portfolio-hover"> <div class="portfolio-hover-content"> <i class="fa fa-plus fa-3x"></i> </div> </div> <img class="img-fluid" src="img/section/Joyplot150.png" alt=""> </a> <div class="captionPortfolio">Ridgeline</div> </div> </div> </div> </section> <!-- ============================ CONTACT SECTION ============================ --> <section id="contact" class="bg" style="background-color: white; padding-top: 60px"> <div class="container"> <div class="row"> <div class="col-lg-2 text-center"></div> <div class="col-lg-8 text-center"> <br><br><br> <h2 class="section-heading text-uppercase" style="color: black">Contact</h2> <p>This document is a work by <a href="https://www.yan-holtz.com">Yan Holtz</a>. Any feedback is highly encouraged. 
You can file an issue on <a href="https://github.com/holtzy/D3-graph-gallery/issues">GitHub</a>, drop me a message on <a href="https://twitter.com/R_Graph_Gallery">Twitter</a>, or send an email pasting <a href="">yan.holtz.data</a> with <a href="">gmail.com</a>.</p> <div style="text-align:center"> <a class="btn btn-primary btn-xl text-uppercase js-scroll-trigger" href="https://github.com/holtzy">Github</a> <a class="btn btn-primary btn-xl text-uppercase js-scroll-trigger" href="https://twitter.com/R_Graph_Gallery">Twitter</a> </div> </div> </div> </div> </section> <!-- ============================ FOOTER SECTION ============================ --> <footer class="bg-light" id="myFooter"> <div class="container" > <div class="row"> <div class="col-md-4"> <span class="copyright">Copyright © the R graph gallery 2018</span> </div> <div class="col-md-4"> <ul class="list-inline social-buttons"> <li class="list-inline-item"> <a href="https://twitter.com/R_Graph_Gallery"> <i class="fa fa-twitter"></i> </a> </li> <li class="list-inline-item"> <a href="https://github.com/holtzy"> <i class="fa fa-github"></i> </a> </li> <li class="list-inline-item"> <a href="https://www.linkedin.com/in/yan-holtz-2477534a/"> <i class="fa fa-linkedin"></i> </a> </li> </ul> </div> <div class="col-md-4"> <ul class="list-inline quicklinks"> <li class="list-inline-item"> <a href="#">Privacy Policy</a> </li> <li class="list-inline-item"> <a href="#">Terms of Use</a> </li> </ul> </div> </div> </div> </footer> <script> /* add bootstrap table styles to pandoc tables */ function bootstrapStylePandocTables() { $('tr.header').parent('thead').parent('table').addClass('table table-condensed'); } $(document).ready(function () { bootstrapStylePandocTables(); }); </script> <!-- ============================ JAVASCRIPT SECTION ============================ --> <!-- Bootstrap core JavaScript --> <script src="vendor/bootstrap/js/bootstrap.bundle.min.js"></script> <!-- Custom scripts for this template --> <script 
src="js/agency.min.js"></script> <!-- Global site tag (gtag.js) - Google Analytics --> <script async src="https://www.googletagmanager.com/gtag/js?id=UA-79254642-1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'UA-79254642-1'); </script> </body> </html>