summarise() is typically used on grouped data created by group_by().
The output will have one row for each group.
summarise(.data, ...)
summarize(.data, ...)
| Argument | Description |
|---|---|
| .data | A tbl. All main verbs are S3 generics with methods for different tbl classes. |
| ... | Name-value pairs of summary functions. The name will be the name of the variable in the result. The value should be an expression that returns a single value. These arguments are automatically quoted and evaluated in the context of the data frame. They support unquoting and splicing. |
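
The unquoting and splicing mentioned for `...` can be sketched as follows. This is a minimal illustration, assuming the rlang package is installed alongside dplyr; the names `col`, `summaries`, and `by_cyl` are hypothetical and not part of the original page.

```r
library(dplyr)
library(rlang)

# Unquoting: a column stored as a symbol is injected with !!
col <- sym("disp")  # hypothetical choice of column
mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(!!col), n = n())

# Splicing: a named list of quoted expressions is expanded with !!!
# into several name-value pairs
summaries <- exprs(
  mean_disp = mean(disp),
  max_hp    = max(hp)
)
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(!!!summaries)
```

Building the expressions separately like this is mainly useful when the set of summaries is chosen programmatically rather than typed inline.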
An object of the same class as .data. One grouping level will
be dropped.
Center: mean(), median()
Spread: sd(), IQR(), mad()
Range: min(), max(), quantile()
Count: n(), n_distinct()
Logical: any(), all()
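
As a rough illustration of the function families listed above, the sketch below applies one or two from each to the built-in `mtcars` data. The output column names and the object `stats_by_cyl` are chosen only for this example.

```r
library(dplyr)

stats_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    centre     = median(mpg),                        # Center
    spread     = IQR(mpg),                           # Spread
    lowest     = min(mpg),                           # Range
    q90        = quantile(mpg, 0.9, names = FALSE),  # Range: one quantile, so one value per group
    n_cars     = n(),                                # Count: rows in the group
    n_gears    = n_distinct(gear),                   # Count: distinct values
    any_manual = any(am == 1)                        # Logical
  )
```

Each expression returns a single value per group, which is what `summarise()` expects here; asking `quantile()` for several probabilities at once would not satisfy that.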
Data frames are the only backend that supports creating a variable and using it in the same summary. See examples for more details.
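
A small sketch of what this means in practice, assuming a local data frame: a summary defined earlier in the call can be reused later in the same call. Because later expressions see earlier results, giving each summary a fresh name is one way to avoid overwriting a column that a later expression still needs. The names `reused` and `safe` are hypothetical.

```r
library(dplyr)

# Create `total` and `n`, then use both in the same summarise() call
reused <- mtcars %>%
  group_by(cyl) %>%
  summarise(total = sum(disp), n = n(), mean = total / n)

# Fresh names keep the original `disp` column visible to sd()
safe <- mtcars %>%
  group_by(cyl) %>%
  summarise(disp_mean = mean(disp), disp_sd = sd(disp))
```

Code meant to run on other backends should not rely on the first pattern, since the text above says only data frames support it.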
When applied to a data frame, row names are silently dropped. To preserve them,
convert them to an explicit variable with tibble::rownames_to_column().
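
For instance, the car names in `mtcars` are stored as row names. The sketch below, which assumes the tibble package is installed, moves them into a column first so they can be used inside a summary. The column name `model` and the objects `cars` and `best` are hypothetical.

```r
library(dplyr)
library(tibble)

# Turn the row names into an ordinary column before summarising
cars <- mtcars %>%
  rownames_to_column(var = "model")

# The names are now available inside summarise(), e.g. to report
# which model has the highest mpg within each cylinder group
best <- cars %>%
  group_by(cyl) %>%
  summarise(best_mpg = max(mpg), best_model = model[which.max(mpg)])
```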
Other single table verbs: arrange,
filter, mutate,
select, slice
    # A summary applied to ungrouped tbl returns a single row
    mtcars %>%
      summarise(mean = mean(disp), n = n())
    #>       mean  n
    #> 1 230.7219 32

    # Usually, you'll want to group first
    mtcars %>%
      group_by(cyl) %>%
      summarise(mean = mean(disp), n = n())
    #> # A tibble: 3 × 3
    #>     cyl     mean     n
    #>   <dbl>    <dbl> <int>
    #> 1     4 105.1364    11
    #> 2     6 183.3143     7
    #> 3     8 353.1000    14

    # Each summary call removes one grouping level (since that group
    # is now just a single row)
    mtcars %>%
      group_by(cyl, vs) %>%
      summarise(cyl_n = n()) %>%
      group_vars()
    #> [1] "cyl"

    # Note that with data frames, newly created summaries immediately
    # overwrite existing variables
    mtcars %>%
      group_by(cyl) %>%
      summarise(disp = mean(disp), sd = sd(disp))
    #> # A tibble: 3 × 3
    #>     cyl     disp    sd
    #>   <dbl>    <dbl> <dbl>
    #> 1     4 105.1364    NA
    #> 2     6 183.3143    NA
    #> 3     8 353.1000    NA