--- title: "Faceting (Part 2)" description: | A look at faceting in vega-lite. author: - name: Kris Sankaran url: {} date: 02-02-2021 output: distill::distill_article: self_contained: false --- ```{r setup, include=FALSE} knitr::opts_chunk$set(cache = FALSE, message = FALSE, warning = FALSE, echo = TRUE) library("robservable") ``` _[Reading](https://observablehq.com/@uwdata/multi-view-composition?collection=@uwdata/visualization-curriculum), [Recording](https://mediaspace.wisc.edu/media/1_o82g7gyk), [Notebook](https://observablehq.com/@krisrs1128/examples-of-faceting), [Rmarkdown](https://raw.githubusercontent.com/krisrs1128/stat479/master/_posts/2021-01-20-week2-2/week2-2.Rmd)_ In these notes, we’re going to see how to facet using vega-lite. We’re going to look at the weather data, which shows the precipitation and temperature for NYC and Seattle over the course of a year. ```{r} robservable("@krisrs1128/examples-of-faceting", include = 4, height = 280) ``` The most direct way to facet is to use the row and column encoding types. For example, here we are faceting histograms of temperature across the type of weather. ````js { const seattle = weather.filter(d => d.location == "Seattle"); const colors = { domain: ['drizzle', 'fog', 'rain', 'snow', 'sun'], range: ['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52'] // weather themed colors }; return vl.markBar() .data(seattle) .encode( vl.x().fieldQ('temp_max').title('Temperature (°C)'), vl.y().count().title("Number of Days"), vl.color().fieldN("weather").scale(colors), vl.column().fieldN("weather") ) .width(175) .height(120) .render() } ```` ```{r layout="l-page"} robservable("@krisrs1128/examples-of-faceting", include = 5, height = 200) ``` If we want, we can facet by both weather and city, using `row` and `column` to distinguish between variations along rows and columns. ````js { const colors = { domain: ['drizzle', 'fog', 'rain', 'snow', 'sun'], range: ['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52'] // weather themed colors }; return vl.markBar() .data(weather) .encode( vl.x().fieldQ('temp_max').title('Temperature (°C)'), vl.y().count().title("Number of Days"), vl.color().fieldN("weather").scale(colors), vl.column().fieldN("weather"), vl.row().fieldN("location") ) .width(175) .height(120) .render() } ```` ```{r layout="l-page"} robservable("@krisrs1128/examples-of-faceting", include = 6) ``` There is an alternative approach which is more general. The approach above defined the facet within the encoding for the `markBar`. There are situations where we might want to facet a plot that is made from more than just one type of mark (e.g., a line overlaid on points). In this case, we will want to create a facet call that can be applied to a full plot specification. This is exactly analogous to `facet_wrap` and `facet_grid`, where the faceting operator can be applied to several geom layers simultaneously. Notice that we need to put the `data` call *after* the facet -- the faceting won't know to what it should apply otherwise (try moving the command up). ````js { const colors = { domain: ['drizzle', 'fog', 'rain', 'snow', 'sun'], range: ['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52'] }; return vl.markBar() .encode( vl.x().fieldQ('temp_max').title('Temperature (°C)'), vl.y().count().title("Number of Days"), vl.color().fieldN("weather").scale(colors) ) .width(175) .height(120) .facet({column: vl.field("weather"), row: vl.field("location")}) .data(weather) .render() } ```` ```{r layout="l-page"} robservable("@krisrs1128/examples-of-faceting", include = 7) ``` Here’s an example that shows why this is useful: How do the minimum and maximum temperatures vary between NYC and Seattle? In addition to showing the minimum and maximum temperatures (a ribbon), we may want to show the midpoint between these as its own line. For just NYC, we can make this plot using `vl.layer` to superimpose the two marks, ````js { const nyc_ = weather.derive({temp_mid: d => 0.5 * (d.temp_min + d.temp_max)}) .filter(d => d.location == "New York") // ribbon layer const tempMinMax = vl.markArea({opacity: 0.3}).encode( vl.x().month('date'), // averages months over several years vl.y().average('temp_max').title('Temperature (°C)'), vl.y2().average('temp_min') ); // line layer const tempMid = vl.markLine().encode( vl.x().month('date'), vl.y().average('temp_mid') ); // overlay return vl.layer(tempMinMax, tempMid) .data(nyc_) .render(); } ```` ```{r} robservable("@krisrs1128/examples-of-faceting", include = 8, height = 220) ``` Now, within each facet, we want to overlay two types of marks. We can no longer specify `.column()` or `.row()` within the `.encode()` calls for each mark. Instead, we have to specify a facet at the end, after we've already overlayed the two elements. ````js { const weather_ = weather.derive({temp_mid: d => 0.5 * (d.temp_min + d.temp_max)}); // same ribbon, colored by location const tempMinMax = vl.markArea({opacity: 0.3}).encode( vl.x().month('date'), vl.y().average('temp_max').title('Temperature (°C)'), vl.y2().average('temp_min'), vl.color().fieldN("location") ); // same line, colored by location const tempMid = vl.markLine().encode( vl.x().month('date'), vl.y().average('temp_mid'), vl.color().fieldN("location") ); return vl.layer(tempMinMax, tempMid) .width(300) .height(200) .facet({column: vl.field("location")}) .data(weather_) .render(); } ```` ```{r} robservable("@krisrs1128/examples-of-faceting", include = 9, height = 230) ```