---
layout: post
title: "sf 0.6-0 news"
date: "`r format(Sys.time(), '%d %B, %Y')`"
comments: true
author: Edzer Pebesma
categories: r
---
TOC
[DOWNLOADHERE]
Version 0.6-0 of the sf package (an R package for
handling vector geometries in R) has been released to
CRAN. It contains several innovations, summarized in the
[NEWS](https://cran.r-project.org/web/packages/sf/news.html) file.
This blog post will illustrate some of thise further.
## Ring directions
Consider the following two polygons:
```{r fig=TRUE, fig.path = "images/", label="pol1", fig.height=3}
library(sf)
p1 = rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))
p2 = 0.5 * p1 + 0.25
(pol1 = st_polygon(list(p1, p2)))
(pol2 = st_polygon(list(p1, p2[5:1,])))
opar = par(mfrow = c(1, 2), mar = rep(0, 4))
plot(pol1, col = grey(.8), rule = "winding")
plot(pol2, col = grey(.8), rule = "winding")
```
Although the simple feature standard describes that all secondary
rings indicate holes, it also specifies that outer rings should
be counter clockwise and inner rings (holes) clockwise. It doesn't
specify that polygons for which the hole has the same ring direction
as the outer ring are invalid - and they aren't. But how should
software deal with them? In prior `sf` versions, `plot` (and ggplot)
would take the `winding` rule, requiring holes to have the opposite
direction as outer rings. This has been changed into `evenodd` as
default, which plots both cases with holes:
```{r fig=TRUE, fig.path = "images/", label="pol2", fig.height=3}
library(sf)
p1 = rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))
p2 = 0.5 * p1 + 0.25
(pol1 = st_polygon(list(p1, p2)))
(pol2 = st_polygon(list(p1, p2[5:1,])))
opar = par(mfrow = c(1, 2), mar = rep(0, 4))
plot(pol1, col = grey(.8)) # role = "evenodd"
plot(pol2, col = grey(.8)) # role = "evenodd"
```
In addition, `st_sfc` and `st_read` gained a parameter
`check_ring_dir`, by default `FALSE`, which when `TRUE` will
check every ring and revert them to counter clockwise for outer,
and clockwise for inner (hole) rings. By default this is `FALSE`
because it is an expensive operation for large datasets.
## Higher-order geometry differences
This was reported [here](http://r-spatial.org/r/2017/12/21/geoms.html); two
nice graphs are available from `?st_difference`
```{r fig=TRUE, fig.path = "images/", label="diff"}
set.seed(131)
m = rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))
p = st_polygon(list(m))
n = 100
l = vector("list", n)
for (i in 1:n)
l[[i]] = p + 10 * runif(2)
s = st_sfc(l)
d = st_difference(s) # sequential differences: s1, s2-s1, s3-s2-s1, ...
i = st_intersection(s) # all intersections
plot(s, col = sf.colors(categorical = TRUE, alpha = .5))
title("overlapping squares")
par(mfrow = c(1, 2), mar = c(0,0,2.4,0))
plot(d, col = sf.colors(categorical = TRUE, alpha = .5))
title("non-overlapping differences")
plot(i, col = sf.colors(categorical = TRUE, alpha = .5))
title("non-overlapping intersections")
```
## Spherical geometry
All geometric operations (area, length, intersects,
intersection, union etc) provided by the GEOS library assume
two-dimensional coordinates. If your data have geographic
(longitude-latitude) coordinates, this is may be quite OK
when your area is small and close to the equator, otherwise
it is not. One way out is to project the data using a suitable
projection, the other is to use spherical geometry: algorithms
that compute _on the sphere_ (or, more precisely, on the
[_spheroid_](https://en.wikipedia.org/wiki/Spheroid). This has
a number of advantages:
* it is easy, you have no-worries about which projection to choose
* it is always correct
however, it comes at some computational cost.
Spherical geometry functions were formerly taken from
R package `geosphere`; with the new `sf` they use package
[lwgeom](https://cran.r-project.org/web/packages/lwgeom/index.html),
which interfaces liblwgeom, the library that also
PostGIS uses. (and the development of which was funded by
[palantir](https://www.directionsmag.com/article/1638)).
Liblwgeom functions `st_make_valid`, `st_geohash`, `st_plit`,
which were formerly in `sf` have now been moved to `lwgeom`.
Other functions in `lwgeom` enable the following functions to work
with geographic coordinates: `st_length`, `st_area` `st_distance`,
`st_is_within_distance`, `st_segmentize`.
Where `geosphere` could only compute distances between points,
`st_distance` now computes distances between arbitrary simple
feature geometries. `st_distance` is clearly slower when computed
on a spheroid than when computed on the sphere. For point data,
faster results are obtained when we assume the Earth is a sphere:
```{r}
n = 2000
df = data.frame(x = runif(n), y = runif(n))
pts = st_as_sf(df, coords = c("x", "y"))
system.time(x0 <- st_distance(pts))
st_crs(pts) = 4326 # spheroid
system.time(x1 <- st_distance(pts))
st_crs(pts) = "+proj=longlat +ellps=sphere" # sphere
system.time(x2 <- st_distance(pts))
system.time(x3 <- dist(as.matrix(df)))
```
## Hausdorff and Frechet distance
For two-dimensional (flat) geometries,
`st_distance` now has the option to compute [Hausdorff
distances](https://en.wikipedia.org/wiki/Hausdorff_distance),
and (if `sf` was linked to GEOS 3.7.0) also [Frechet
distance](https://en.wikipedia.org/wiki/Fr%C3%A9chet_distance).
## snap!
For two-dimensional (flat) geometries, `st_snap` is now available;
we refer to the [PostGIS](https://postgis.net/docs/ST_Snap.html)
documentation for examples what it does.
## join to largest matching feature
This feature was reported and illustrated here:
## polygon geometries have zero length
Function `st_length` now returns zero for non-linear geometries
_including polygons_. For length of polygon rings, `st_cast` to
`MULTILINESTRING` first.
## printing coordinates now honors `digits` setting
Printing of geometries, as well as `st_as_text` now use the default
digits of R:
```{r}
st_point(c(1/3, 1/6))
options(digits = 3)
st_point(c(1/3, 1/6))
st_as_text(st_point(c(1/3, 1/6)), digits = 16)
```
Before `sf` 0.6, `as.character` was used, which used around 16 digits.