This post describes how to build a choropleth map with R and the ggplot2 package. It shows how to load geospatial data in R, merge region features and build the map. Reproducible code is provided.
Two inputs are needed to build a choropleth map: a geospatial object describing the region boundaries (here a .geoJSON file) and a numeric variable used to color each region (here the number of restaurants per city district).
Loading a .geoJSON file has been extensively described in chart #325. The geojsonio library can read this format in R. To plot it with ggplot2, we first need to convert it to a data frame using the tidy() function of the broom library. The geom_polygon() function can then draw this type of object.
# Geospatial data available in the geojson format
library(geojsonio)
spdf <- geojson_read("https://raw.githubusercontent.com/gregoiredavid/france-geojson/master/communes.geojson", what = "sp")
# Since it is a bit too much data, I select only a subset of it
# (communes whose code starts with one of these department numbers):
spdf <- spdf[substr(spdf@data$code, 1, 2) %in% c("06", "83", "13", "30", "34", "11", "66"), ]
We now have a geospatial object called spdf. This object could be plotted as is using the plot() function.
However, an additional step is required to plot it with ggplot2, which expects a data frame as input. The conversion can be made using the tidy() function of the broom package, as shown below.
Finally, geom_polygon() is used to plot the shape.
# plot(spdf)
# Convert the object to a data frame AND keep track of the commune code (takes ~2 minutes)
library(broom)
spdf_fortified <- tidy(spdf, region = "code")
# Now the shape can be plotted as described before:
library(ggplot2)
ggplot() +
  geom_polygon(data = spdf_fortified, aes(x = long, y = lat, group = group), fill = "white", color = "grey") +
  theme_void() +
  coord_map()  # coord_map() requires the mapproj package
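To make the data frame requirement more concrete, here is a minimal sketch built on a hand-made table instead of the real communes. The toy_shape object and its two square "districts" are purely hypothetical; only the column names (long, lat, group, id) mirror what the conversion step above produces. coord_equal() is used here so that the example does not depend on a map projection.

```r
library(ggplot2)

# Hypothetical toy data: two square "districts" in the long format
# that geom_polygon() expects (one row per polygon vertex).
toy_shape <- data.frame(
  long  = c(0, 1, 1, 0,   1, 2, 2, 1),
  lat   = c(0, 0, 1, 1,   0, 0, 1, 1),
  group = rep(c("A.1", "B.1"), each = 4),  # one group per polygon ring
  id    = rep(c("A", "B"), each = 4)       # region identifier, useful later for the merge
)

# Same layers as for the real shape, applied to the toy table
p_toy <- ggplot(toy_shape) +
  geom_polygon(aes(x = long, y = lat, group = group), fill = "white", color = "grey") +
  theme_void() +
  coord_equal()
```

Printing p_toy draws the two squares. The group aesthetic is what tells ggplot2 where one polygon ends and the next one starts.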
The number of restaurants per city district has been found on the internet, and a clean version is stored on the gallery website. It can thus be read directly with read.table(). Before building a choropleth map, it is good practice to check the distribution of your variable.
Here, we have a ‘long tail’ distribution: a few cities have a lot of restaurants. We will thus probably need to apply a log scale to the color palette, so that these high values do not absorb all the variation.
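The effect of the log scale can be checked on a few made-up numbers. The sketch below uses a hypothetical vector nb (not the real restaurant counts) and computes where each value would fall between the two ends of a color palette, first on the raw scale and then on the log10 scale.

```r
# Hypothetical counts with a long tail: most districts are small, one is huge.
nb <- c(1, 2, 3, 5, 8, 1000)

# Position of each value between the palette ends (0 = lowest, 1 = highest)
# on the raw scale: the largest value takes almost the whole range.
raw_pos <- (nb - min(nb)) / (max(nb) - min(nb))

# Same computation on the log10 scale: the small values are spread out.
log_nb  <- log10(nb)
log_pos <- (log_nb - min(log_nb)) / (max(log_nb) - min(log_nb))

round(data.frame(nb, raw_pos, log_pos), 3)
```

On the raw scale, a district with 8 restaurants sits at less than 1% of the palette range and is almost indistinguishable from a district with 1. On the log scale it sits at about 30%.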
# Read the data
data <- read.table("https://www.r-graph-gallery.com/wp-content/uploads/2017/12/data_on_french_states.csv", header = TRUE, sep = ";")
head(data)
# Distribution of the number of restaurants
library(dplyr)
data %>%
  ggplot(aes(x = nb_equip)) +
  geom_histogram(bins = 20, fill = 'skyblue', color = '#69b3a2') +
  scale_x_log10()
This is a key step when building a choropleth map: your two inputs must share an id so that they can be linked!
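A minimal sketch of this merge, on toy tables rather than the real data, could look like this. The objects shape and values are hypothetical stand-ins for spdf_fortified and data, and the column name depcom (the district code in the numeric table) is an assumption to check against head(data). Replacing missing counts with a small positive value is also an assumption: it treats districts absent from the table as having no restaurant while keeping a later log scale defined.

```r
# Hypothetical stand-in for the fortified shape: one row per polygon vertex,
# with the commune code stored in the id column.
shape <- data.frame(
  long  = c(0, 1, 1,   1, 2, 2),
  lat   = c(0, 0, 1,   0, 0, 1),
  group = rep(c("06001.1", "06002.1"), each = 3),
  id    = rep(c("06001", "06002"), each = 3),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the numeric table; depcom is an assumed column name.
values <- data.frame(
  depcom   = "06001",
  nb_equip = 12,
  stringsAsFactors = FALSE
)

# Look up the value of each vertex by its code. match() keeps the rows of
# shape in their original order, which matters because polygons are drawn
# in row order. With dplyr, the equivalent is:
#   left_join(shape, values, by = c("id" = "depcom"))
shape$nb_equip <- values$nb_equip[match(shape$id, values$depcom)]

# Assumption: a district missing from the table has no restaurant.
# A small positive value is used instead of 0 so that log() stays finite.
shape$nb_equip[is.na(shape$nb_equip)] <- 0.001
```

Both code columns should be stored as text: if one of them is read as a number, leading zeros such as the one in "06001" are lost and the codes no longer match.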
We can now make a first basic choropleth map. We just need to add fill = our value to the aesthetics of the polygons.
ggplot() +
  geom_polygon(data = spdf_fortified, aes(fill = nb_equip, x = long, y = lat, group = group)) +
  theme_void() +
  coord_map()
There is a bit of work to do to get a decent figure. We need to change the color palette, improve the legend, use a log scale transformation for the color scale, change the background and add titles and explanations. Here is the code to do that, and the final result!
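As an illustration of these customizations, here is a minimal sketch on toy data. It is not the exact code behind the final figure: the toy table, the title, the caption, the legend breaks and the background color are all hypothetical choices, and the viridis palette is used through scale_fill_viridis_c() from ggplot2 as one possible option.

```r
library(ggplot2)

# Hypothetical toy data standing in for the merged shape:
# three square "districts" with very different restaurant counts.
toy <- data.frame(
  long     = c(0, 1, 1, 0,   1, 2, 2, 1,   2, 3, 3, 2),
  lat      = rep(c(0, 0, 1, 1), times = 3),
  group    = rep(c("A.1", "B.1", "C.1"), each = 4),
  nb_equip = rep(c(1, 10, 100), each = 4)
)

p <- ggplot() +
  geom_polygon(data = toy, aes(x = long, y = lat, group = group, fill = nb_equip), alpha = 0.9) +
  theme_void() +
  # Log-transformed color scale with explicit legend breaks and title
  scale_fill_viridis_c(trans = "log10", breaks = c(1, 10, 100), name = "Number of restaurants") +
  # Hypothetical title and caption
  labs(
    title   = "Restaurants per district (toy data)",
    caption = "Illustrative values only"
  ) +
  theme(
    plot.background = element_rect(fill = "#f5f5f2", color = NA),  # light background
    plot.title      = element_text(hjust = 0.5),                   # centered title
    legend.position = "bottom"
  ) +
  coord_equal()
```

With the real data, the data argument would be the merged spdf_fortified table and coord_equal() would be replaced by coord_map(), as in the previous maps.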