Visualizations and the Grammar of Graphics

MACS 30500 University of Chicago

ID  \(N\)  \(\bar{X}\)  \(\bar{Y}\)  \(\sigma_{X}\)  \(\sigma_{Y}\)       \(R\)
1     142     54.26327     47.83225        16.76514        26.93540  -0.0644719
2     142     54.26610     47.83472        16.76983        26.93974  -0.0641284
3     142     54.26144     47.83025        16.76590        26.93988  -0.0617148
4     142     54.26993     47.83699        16.76996        26.93768  -0.0694456
5     142     54.26015     47.83972        16.76996        26.93000  -0.0655833
6     142     54.26734     47.83955        16.76896        26.93027  -0.0629611
7     142     54.26881     47.83545        16.76670        26.94000  -0.0685042
8     142     54.26030     47.83983        16.76774        26.93019  -0.0603414
9     142     54.26732     47.83772        16.76001        26.93004  -0.0683434
10    142     54.26873     47.83082        16.76924        26.93573  -0.0685864
11    142     54.26588     47.83150        16.76885        26.93861  -0.0686092
12    142     54.26785     47.83590        16.76676        26.93610  -0.0689797
13    142     54.26692     47.83160        16.77000        26.93790  -0.0665752
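
The thirteen datasets in the table agree in sample size and, to about two decimal places, in their means, standard deviations, and correlation. A minimal sketch of how such a per-dataset summary could be computed with dplyr is given here. The data frame `toy` and its columns `id`, `x`, and `y` are hypothetical stand-ins simulated for illustration; they are not the data behind the table.

```r
library(dplyr)

# Hypothetical stand-in data: three simulated datasets of 142 points each
set.seed(1)
toy <- data.frame(
  id = rep(1:3, each = 142),
  x  = rnorm(3 * 142, mean = 54, sd = 17),
  y  = rnorm(3 * 142, mean = 48, sd = 27)
)

# One row per dataset: N, means, standard deviations, Pearson correlation
toy_summary <- toy %>%
  group_by(id) %>%
  summarise(
    n      = n(),
    mean_x = mean(x),
    mean_y = mean(y),
    sd_x   = sd(x),
    sd_y   = sd(y),
    r      = cor(x, y)
  )
toy_summary
```

Summaries like these say nothing about the shape of each point cloud, which is the case for plotting the data rather than relying on the numbers alone.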

Grammar

The whole system and structure of a language or of languages in general, usually taken as consisting of syntax and morphology (including inflections) and sometimes also phonology and semantics.

Grammar of graphics

  • “The fundamental principles or rules of an art or science”
  • A grammar used to describe and create a wide range of statistical graphics
  • Layered grammar of graphics
    • ggplot2
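
To make the layered idea concrete, here is a small sketch that builds one plot from separate grammatical pieces: data, aesthetic mappings, two geometric layers, a scale, and a coordinate system. It uses the `mpg` dataset that ships with ggplot2; the choice of variables and palette is an assumption made only for illustration.

```r
library(ggplot2)

p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +  # data + mappings
  geom_point(aes(color = class)) +                            # layer 1: points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +   # layer 2: linear fit
  scale_color_brewer(palette = "Dark2") +                     # scale for color
  coord_cartesian() +                                         # coordinate system
  labs(x = "Engine displacement (litres)",
       y = "Highway miles per gallon")
p
```

Each `+` adds one component, so a layer, scale, or coordinate system can be swapped without rewriting the rest of the plot.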

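Carte figurative des pertes successives en hommes de l’Armée Française dans la campagne de Russie 1812–1813 (Figurative map of the successive losses in men of the French Army in the Russian campaign, 1812–1813) by Charles Joseph Minard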

Building Minard’s map in R

troops
## # A tibble: 51 × 5
##     long   lat survivors direction group
##    <dbl> <dbl>     <int>     <chr> <int>
## 1   24.0  54.9    340000         A     1
## 2   24.5  55.0    340000         A     1
## 3   25.5  54.5    340000         A     1
## 4   26.0  54.7    320000         A     1
## 5   27.0  54.8    300000         A     1
## 6   28.0  54.9    280000         A     1
## 7   28.5  55.0    240000         A     1
## 8   29.0  55.1    210000         A     1
## 9   30.0  55.2    180000         A     1
## 10  30.3  55.3    175000         A     1
## # ... with 41 more rows
cities
## # A tibble: 20 × 3
##     long   lat           city
##    <dbl> <dbl>          <chr>
## 1   24.0  55.0          Kowno
## 2   25.3  54.7          Wilna
## 3   26.4  54.4       Smorgoni
## 4   26.8  54.3      Moiodexno
## 5   27.7  55.2      Gloubokoe
## 6   27.6  53.9          Minsk
## 7   28.5  54.3     Studienska
## 8   28.7  55.5        Polotzk
## 9   29.2  54.4           Bobr
## 10  30.2  55.3        Witebsk
## 11  30.4  54.5         Orscha
## 12  30.4  53.9        Mohilow
## 13  32.0  54.8       Smolensk
## 14  33.2  54.9    Dorogobouge
## 15  34.3  55.2          Wixma
## 16  34.4  55.5          Chjat
## 17  36.0  55.5        Mojaisk
## 18  37.6  55.8         Moscou
## 19  36.6  55.3      Tarantino
## 20  36.5  55.0 Malo-Jarosewii
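
The slides do not show how `troops` and `cities` are loaded. As a rough sketch for experimenting with the plotting code, the rows printed above can be typed in by hand with `tibble::tribble()`. The names `troops_head` and `cities_copy` are hypothetical, the values are copied from the printed output (which may be rounded for display), and `troops_head` holds only the ten rows shown, not all 51.

```r
library(tibble)

# First ten rows of `troops`, as printed above (41 rows are not shown)
troops_head <- tribble(
  ~long, ~lat, ~survivors, ~direction, ~group,
   24.0, 54.9,    340000L,        "A",     1L,
   24.5, 55.0,    340000L,        "A",     1L,
   25.5, 54.5,    340000L,        "A",     1L,
   26.0, 54.7,    320000L,        "A",     1L,
   27.0, 54.8,    300000L,        "A",     1L,
   28.0, 54.9,    280000L,        "A",     1L,
   28.5, 55.0,    240000L,        "A",     1L,
   29.0, 55.1,    210000L,        "A",     1L,
   30.0, 55.2,    180000L,        "A",     1L,
   30.3, 55.3,    175000L,        "A",     1L
)

# All twenty rows of `cities`, as printed above
cities_copy <- tribble(
  ~long, ~lat, ~city,
   24.0, 55.0, "Kowno",
   25.3, 54.7, "Wilna",
   26.4, 54.4, "Smorgoni",
   26.8, 54.3, "Moiodexno",
   27.7, 55.2, "Gloubokoe",
   27.6, 53.9, "Minsk",
   28.5, 54.3, "Studienska",
   28.7, 55.5, "Polotzk",
   29.2, 54.4, "Bobr",
   30.2, 55.3, "Witebsk",
   30.4, 54.5, "Orscha",
   30.4, 53.9, "Mohilow",
   32.0, 54.8, "Smolensk",
   33.2, 54.9, "Dorogobouge",
   34.3, 55.2, "Wixma",
   34.4, 55.5, "Chjat",
   36.0, 55.5, "Mojaisk",
   37.6, 55.8, "Moscou",
   36.6, 55.3, "Tarantino",
   36.5, 55.0, "Malo-Jarosewii"
)
```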

Minard’s grammar

  • Troops
    • Latitude
    • Longitude
    • Survivors
    • Advance/retreat
  • Cities
    • Latitude
    • Longitude
    • City name

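library(ggplot2)

# Troop paths: line width encodes survivors, color encodes advance vs. retreat,
# and group keeps each group of troops as a separate path
plot_troops <- ggplot(data = troops,
                      mapping = aes(x = long, y = lat)) +
  geom_path(aes(size = survivors,
                color = direction,
                group = group))
plot_troops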
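
In ggplot2 3.4.0 and later, line thickness is controlled by the `linewidth` aesthetic, and mapping `size` on a line geom produces a deprecation warning. The sketch below shows the same kind of path layer written in that newer style. It assumes ggplot2 3.4.0 or later, and `toy_troops` is a hypothetical four-row stand-in whose values are copied from the first rows of `troops` printed earlier.

```r
library(ggplot2)

# Hypothetical stand-in: the first four printed rows of `troops`
toy_troops <- data.frame(
  long      = c(24.0, 24.5, 25.5, 26.0),
  lat       = c(54.9, 55.0, 54.5, 54.7),
  survivors = c(340000, 340000, 340000, 320000),
  direction = "A",
  group     = 1
)

# Same mappings as the path layer above, with linewidth in place of size
plot_toy <- ggplot(toy_troops, aes(x = long, y = lat)) +
  geom_path(aes(linewidth = survivors,
                color = direction,
                group = group),
            lineend = "round") +
  scale_linewidth(range = c(0.5, 12))
plot_toy
```

With this style, a later `scale_size()` call for the path would correspondingly become `scale_linewidth()`.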

# Add city labels from a second data frame; x and y are inherited from the plot
plot_both <- plot_troops + 
  geom_text(data = cities, mapping = aes(label = city), size = 4)
plot_both

# Tune the scales, project the coordinates, and label the plot
# (coord_map() requires the mapproj package to be installed)
plot_polished <- plot_both +
  scale_size(range = c(0, 12),
             breaks = c(10000, 20000, 30000),
             labels = c("10,000", "20,000", "30,000")) + 
  scale_color_manual(values = c("tan", "grey50")) +
  coord_map() +
  labs(title = "Map of Napoleon's Russian campaign of 1812",
       x = NULL,
       y = NULL)
plot_polished

# Remove axes, gridlines, background, and the legend
plot_polished +
  theme_void() +
  theme(legend.position = "none")