Aggregate data using summary statistics such as mean or median. Summaries can be calculated by group.
summarize(
  .df,
  ...,
  .by = NULL,
  .sort = TRUE,
  .groups = "drop_last",
  .unpack = FALSE
)
summarise(
  .df,
  ...,
  .by = NULL,
  .sort = TRUE,
  .groups = "drop_last",
  .unpack = FALSE
)
.df
A data.frame or data.table
...
Aggregations to perform
.by
Columns to group by.
A single column can be passed with .by = d.
Multiple columns can be passed with .by = c(c, d).
tidyselect can be used:
Single predicate: .by = where(is.character)
Multiple predicates: .by = c(where(is.character), where(is.factor))
A combination of predicates and column names: .by = c(where(is.character), b)
.sort
experimental: Default TRUE. If FALSE the original order of the grouping variables will be preserved.
.groups
Grouping structure of the result
"drop_last": Drop the last level of grouping
"drop": Drop all groups
"keep": Keep all groups
.unpack
experimental: Default FALSE. Should unnamed data frame inputs be unpacked.
The user must opt in to this option as it can lead to a reduction in performance.
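The effect of .sort and .groups can be illustrated with a minimal sketch. The table scores and its columns are hypothetical, and the sketch assumes that tidytable(), group_by(), and group_vars() are available from the installed version of tidytable; .groups is assumed to matter only when the input was grouped with group_by().

```r
library(tidytable)

# Hypothetical data: group "b" appears before group "a"
scores <- tidytable(
  team = c("b", "a", "b"),
  half = c("x", "y", "y"),
  pts = c(1, 2, 3)
)

# .sort = TRUE (the default) orders the result by the grouping column
sorted <- scores %>%
  summarize(total = sum(pts), .by = team)

# .sort = FALSE keeps the groups in order of first appearance
unsorted <- scores %>%
  summarize(total = sum(pts), .by = team, .sort = FALSE)

# Group the input first so that .groups has something to act on
grouped <- scores %>%
  group_by(team, half)

# "keep" retains both grouping columns on the result
kept <- grouped %>%
  summarize(total = sum(pts), .groups = "keep")

# "drop" returns an ungrouped result
dropped <- grouped %>%
  summarize(total = sum(pts), .groups = "drop")

# Inspect the grouping columns left on each result
group_vars(kept)
group_vars(dropped)
```

Comparing sorted$team with unsorted$team shows the ordering difference, and group_vars() shows which grouping columns remain.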
library(tidytable)

df <- data.table(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b"),
  d = c("a", "a", "b")
)
df %>%
  summarize(avg_a = mean(a),
            max_b = max(b),
            .by = c)
#> # A tidytable: 2 × 3
#>   c     avg_a max_b
#>   <chr> <dbl> <int>
#> 1 a       1.5     5
#> 2 b       3       6
df %>%
  summarize(avg_a = mean(a),
            .by = c(c, d))
#> # A tidytable: 2 × 3
#>   c     d     avg_a
#>   <chr> <chr> <dbl>
#> 1 a     a       1.5
#> 2 b     b       3
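The tidyselect form of .by described in the arguments can be sketched in the same way. This is an illustrative example rather than part of the original examples: it rebuilds the same data and assumes where() is available after loading tidytable. The result name by_chr is hypothetical.

```r
library(tidytable)

# Same data as in the examples above
df <- data.table(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b"),
  d = c("a", "a", "b")
)

# Group by every character column (here c and d) using a predicate
by_chr <- df %>%
  summarize(avg_a = mean(a),
            .by = where(is.character))

by_chr
```

Because c and d are the only character columns, this should group the same way as .by = c(c, d).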