Tidyverse methods for sf objects. Geometries are sticky, use as.data.frame to let dplyr
's own methods drop them. Use these methods without the .sf suffix and after loading the tidyverse package with the generic (or after loading package tidyverse).
filter.sf(.data, ..., .dots) arrange.sf(.data, ..., .dots) group_by.sf(.data, ..., add = FALSE) ungroup.sf(x, ...) mutate.sf(.data, ..., .dots) transmute.sf(.data, ..., .dots) select.sf(.data, ...) rename.sf(.data, ...) slice.sf(.data, ..., .dots) summarise.sf(.data, ..., .dots, do_union = TRUE) distinct.sf(.data, ..., .keep_all = FALSE) gather.sf( data, key, value, ..., na.rm = FALSE, convert = FALSE, factor_key = FALSE ) spread.sf( data, key, value, fill = NA, convert = FALSE, drop = TRUE, sep = NULL ) sample_n.sf(tbl, size, replace = FALSE, weight = NULL, .env = parent.frame()) sample_frac.sf( tbl, size = 1, replace = FALSE, weight = NULL, .env = parent.frame() ) nest.sf(.data, ...) separate.sf( data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE, extra = "warn", fill = "warn", ... ) separate_rows.sf(data, ..., sep = "[^[:alnum:]]+", convert = FALSE) unite.sf(data, col, ..., sep = "_", remove = TRUE) unnest.sf(data, ..., .preserve = NULL) inner_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) left_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) right_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) full_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) semi_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) anti_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
.data | data object of class sf |
---|---|
... | other arguments |
.dots | see corresponding function in package |
add | see corresponding function in dplyr |
x | A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details. |
do_union | logical; in case |
.keep_all | see corresponding function in dplyr |
data | see original function docs |
key | see original function docs |
value | see original function docs |
na.rm | see original function docs |
convert | see separate_rows |
factor_key | see original function docs |
fill | see original function docs |
drop | see original function docs |
sep | see separate_rows |
tbl | see original function docs |
size | see original function docs |
replace | see original function docs |
weight | see original function docs |
.env | see original function docs |
col | see separate |
into | see separate |
remove | see separate |
extra | see separate |
.preserve | see unnest |
y | A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details. |
by | A character vector of variables to join by. If To join by different variables on To join by multiple variables, use a vector with length > 1.
For example, To perform a cross-join, generating all combinations of |
copy | If |
suffix | If there are non-joined duplicate variables in |
an object of class sf
select
keeps the geometry regardless whether it is selected or not; to deselect it, first pipe through as.data.frame
to let dplyr's own select
drop it.
In case one or more of the arguments (expressions) in the summarise
call creates a geometry list-column, the first of these will be the (active) geometry of the returned object. If this is not the case, a geometry column is created, depending on the value of do_union
.
In case do_union
is FALSE
, summarise
will simply combine geometries using c.sfg. When polygons sharing a boundary are combined, this leads to geometries that are invalid; see for instance https://github.com/r-spatial/sf/issues/681.
distinct
gives distinct records for which all attributes and geometries are distinct; st_equals is used to find out which geometries are distinct.
nest
assumes that a simple feature geometry list-column was among the columns that were nested.
#> Reading layer `nc' from data source `/tmp/RtmpCdQsky/temp_libpath64f92385e079/sf/shape/nc.shp' using driver `ESRI Shapefile' #> Simple feature collection with 100 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> CRS: 4267#> Warning: plotting the first 10 out of 14 attributes; use max.plot = 14 to plot alltitle("the ten counties with smallest area")#> [1] "sf" "grouped_df" "tbl_df" "tbl" "data.frame"nc2 <- nc %>% mutate(area10 = AREA/10) nc %>% transmute(AREA = AREA/10, geometry = geometry) %>% class()#> [1] "sf" "data.frame"#> [1] "sf" "data.frame"#> [1] "SID74" "SID79" "geometry"#> [1] "SID74" "SID79" "geometry"#> [1] "sf" "data.frame"#> [1] "sf" "data.frame"#> Simple feature collection with 2 features and 15 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -81.74107 ymin: 36.23436 xmax: -80.90344 ymax: 36.58965 #> CRS: 4267 #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 SID74 #> 1 0.114 1.442 1825 1825 Ashe 37009 37009 5 1091 1 #> 2 0.061 1.231 1827 1827 Alleghany 37005 37005 3 487 0 #> NWBIR74 BIR79 SID79 NWBIR79 area_cl geometry #> 1 10 1364 0 19 (0.1,0.12] MULTIPOLYGON (((-81.47276 3... #> 2 10 542 3 12 (0,0.1] MULTIPOLYGON (((-81.23989 3...nc$area_cl = cut(nc$AREA, c(0, .1, .12, .15, .25)) nc.g <- nc %>% group_by(area_cl) nc.g %>% summarise(mean(AREA))#> Simple feature collection with 4 features and 2 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> CRS: 4267 #> # A tibble: 4 x 3 #> area_cl `mean(AREA)` geometry #> <fct> <dbl> <MULTIPOLYGON [°]> #> 1 (0,0.1] 0.0760 (((-77.96073 34.18924, -77.96587 34.24229, -77.97528 … #> 2 (0.1,0.12] 0.112 (((-84.29104 35.21054, -84.22594 35.2616, -84.17973 3… #> 3 (0.12,0.1… 0.134 (((-76.54427 34.58783, -76.55515 34.61066, -76.53775 … #> 4 (0.15,0.2… 0.190 (((-76.64705 34.90633, -76.62562 34.89065, -76.75021 …#> mean(AREA) #> 1 0.12626#> [1] 100#> VAR SID geometry #> Length:200 Min. : 0.000 MULTIPOLYGON :200 #> Class :character 1st Qu.: 2.000 epsg:4267 : 0 #> Mode :character Median : 5.000 +proj=long...: 0 #> Mean : 7.515 #> 3rd Qu.: 9.000 #> Max. :57.000library(tidyr) nc$row = 1:100 # needed for spread to work nc %>% select(SID74, SID79, geometry, row) %>% gather("VAR", "SID", -geometry, -row) %>% spread(VAR, SID) %>% head()#> Simple feature collection with 6 features and 3 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -81.74107 ymin: 36.07282 xmax: -75.77316 ymax: 36.58965 #> CRS: 4267 #> row SID74 SID79 geometry #> 1 1 1 0 MULTIPOLYGON (((-81.47276 3... #> 2 2 0 3 MULTIPOLYGON (((-81.23989 3... #> 3 3 5 6 MULTIPOLYGON (((-80.45634 3... #> 4 4 1 2 MULTIPOLYGON (((-76.00897 3... #> 5 5 9 3 MULTIPOLYGON (((-77.21767 3... #> 6 6 7 5 MULTIPOLYGON (((-76.74506 3...