Why tidytable?
tidyverse-like syntax with data.table speed
rlang compatibility
Includes functions that dtplyr is missing, including many tidyr functions
Note: tidytable functions do not use data.table’s modify-by-reference, and instead use the copy-on-modify principles followed by the tidyverse and base R.
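As a small illustration of the copy-on-modify note, the sketch below adds a column with mutate.() and then checks that the input table is unchanged. The names df and new_df are illustrative, and the sketch assumes a tidytable version that exports the verb.() functions described in this README.

```r
library(tidytable)

# Illustrative input table (hypothetical data)
df <- data.table::data.table(x = c(1, 2, 3))

# mutate.() returns a new table rather than modifying df by reference
new_df <- df %>%
  mutate.(double_x = x * 2)

names(df)      # the original still has only the column x
names(new_df)  # the result has x and double_x
```

This is the same behavior you would expect from dplyr::mutate(), so code ported from the tidyverse should not see its inputs change unexpectedly.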
Install the released version from CRAN with:
install.packages("tidytable")
Or install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("markfairbanks/tidytable")
tidytable uses verb.() syntax to replicate tidyverse functions:
library(tidytable)

test_df <- data.table(x = c(1,2,3), y = c(4,5,6), z = c("a","a","b"))

test_df %>%
  select.(x, y, z) %>%
  filter.(x < 4, y > 1) %>%
  arrange.(x, y) %>%
  mutate.(double_x = x * 2,
          double_y = y * 2)
#> # tidytable [3 × 5]
#>       x     y z     double_x double_y
#>   <dbl> <dbl> <chr>    <dbl>    <dbl>
#> 1     1     4 a            2        8
#> 2     2     5 a            4       10
#> 3     3     6 b            6       12
A full list of functions can be found here.
Grouping is done from inside any function that supports it (such as summarize.() and mutate.()), using the .by argument:
A single column can be passed with .by = z
Multiple columns can be passed with .by = c(y, z)
tidyselect can also be used, including using predicates:
Single predicate: .by = where(is.character)
Multiple predicates: .by = c(where(is.character), where(is.factor))
A combination of predicates and column names: .by = c(where(is.character), y)
test_df %>%
  summarize.(avg_x = mean(x),
             count = n.(),
             .by = z)
#> # tidytable [2 × 3]
#>   z     avg_x count
#>   <chr> <dbl> <int>
#> 1 a       1.5     2
#> 2 b       3       1
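The summarize.() example collapses each group to one row. A grouped mutate.() keeps every row instead, which the following minimal sketch shows. It reuses the same illustrative data; the name with_group_mean is hypothetical, and a tidytable version that exports the verb.() functions is assumed.

```r
library(tidytable)

# Same illustrative data as the summarize.() example
test_df <- data.table::data.table(x = c(1, 2, 3),
                                  y = c(4, 5, 6),
                                  z = c("a", "a", "b"))

# Attach the per-group mean of x to every row of its group
with_group_mean <- test_df %>%
  mutate.(avg_x = mean(x), .by = z)

# Rows in group "a" carry 1.5, the row in group "b" carries 3
with_group_mean
```

Because .by is an argument of the call itself, the grouping applies only to that one step and there is nothing to ungroup afterwards.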
tidyselect support
tidytable allows you to select/drop columns just like you would in the tidyverse.
Normal selection can be mixed with:
Predicates: where(is.numeric), where(is.character), etc.
Select helpers: everything(), starts_with(), ends_with(), contains(), any_of(), etc.

test_df <- data.table(a = c(1,2,3), b = c(4,5,6), c = c("a","a","b"), d = c("a","b","c"))

test_df %>%
  select.(a, where(is.character))
#> # tidytable [3 × 3]
#>       a c     d
#>   <dbl> <chr> <chr>
#> 1     1 a     a
#> 2     2 a     b
#> 3     3 b     c
To drop columns use a - sign:
test_df %>%
  select.(-a, -where(is.character))
#> # tidytable [3 × 1]
#>       b
#>   <dbl>
#> 1     4
#> 2     5
#> 3     6
These same ideas can be used whenever selecting columns in tidytable functions - for example when using count.(), drop_na.(), mutate_across.(), pivot_longer.(), etc.
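For instance, a predicate can pick the columns to reshape in pivot_longer.(). The sketch below is only an illustration: the name long_df is hypothetical, the cols argument is assumed to mirror tidyr::pivot_longer(), and a tidytable version that exports the verb.() functions is assumed.

```r
library(tidytable)

# Same illustrative data as the select.() examples
test_df <- data.table::data.table(a = c(1, 2, 3),
                                  b = c(4, 5, 6),
                                  c = c("a", "a", "b"),
                                  d = c("a", "b", "c"))

# Stack the numeric columns (a and b) into name/value pairs;
# the character columns c and d are kept as identifiers
long_df <- test_df %>%
  pivot_longer.(cols = where(is.numeric))

long_df
```

With three rows and two numeric columns, the result should have six rows, one per original row and pivoted column.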
A full overview of selection options can be found here.
rlang compatibility
rlang can be used to write custom functions with tidytable functions:
mutate.()
df <- data.table(x = c(1,1,1), y = c(1,1,1), z = c("a","a","b"))

# Using enquo() with !!
add_one <- function(data, add_col) {
  add_col <- enquo(add_col)

  data %>%
    mutate.(new_col = !!add_col + 1)
}

# Using the {{ }} shortcut
add_one <- function(data, add_col) {
  data %>%
    mutate.(new_col = {{ add_col }} + 1)
}

df %>%
  add_one(x)
#> # tidytable [3 × 4]
#>       x     y z     new_col
#>   <dbl> <dbl> <chr>   <dbl>
#> 1     1     1 a           2
#> 2     1     1 a           2
#> 3     1     1 b           2
summarize.()
df <- data.table(x = 1:10, y = c(rep("a", 6), rep("b", 4)), z = c(rep("a", 6), rep("b", 4)))

find_mean <- function(data, grouping_cols, col) {
  data %>%
    summarize.(avg = mean({{ col }}),
               .by = {{ grouping_cols }})
}

df %>%
  find_mean(grouping_cols = c(y, z), col = x)
#> # tidytable [2 × 3]
#>   y     z       avg
#>   <chr> <chr> <dbl>
#> 1 a     a       3.5
#> 2 b     b       8.5
All tidytable functions automatically convert data.frame and tibble inputs to a data.table:
library(dplyr)
library(data.table)

test_df <- tibble(x = c(1,2,3), y = c(4,5,6), z = c("a","a","b"))

test_df %>%
  mutate.(double_x = x * 2) %>%
  is.data.table()
#> [1] TRUE
dt() helper
The dt() function makes regular data.table syntax pipeable, so you can easily mix tidytable syntax with data.table syntax.
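A minimal sketch of such a mixed pipeline follows. The data and the names df and result are illustrative. Each dt() call takes the usual data.table i, j and by arguments, and the filter.() step assumes a tidytable version that exports the verb.() functions.

```r
library(tidytable)

# Illustrative data (hypothetical)
df <- data.table::data.table(x = c(1, 2, 3),
                             y = c(4, 5, 6),
                             z = c("a", "a", "b"))

result <- df %>%
  filter.(x < 4, y > 1) %>%                    # tidytable syntax
  dt(order(x, y)) %>%                          # data.table i: reorder rows
  dt(, double_x := x * 2) %>%                  # data.table j: add a column
  dt(, list(avg_x = mean(double_x)), by = z)   # data.table j with by: aggregate

result
```

Here group a averages the doubled values 2 and 4, and group b has the single doubled value 6.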
For those interested in performance, speed comparisons can be found here.