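A few notes:

- The `summarize()` tests were performed on a different dataset from `case_when()`, so timings are not comparable across functions.
- `setDTthreads(4)` was used for `data.table` & `tidytable` timings.
- Modify-by-reference was used in `data.table` when being compared to `mutate.()` & `dplyr::mutate()`.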
- `fill.()` & `tidyr::fill()` both work with character/factor/logical columns, whereas `data.table::nafill()` does not. Testing only included numeric columns due to this constraint.
- `dtplyr` is missing timings for functions that are not yet implemented in the package.
- `pandas` comparisons are in the process of being added; more will be added soon.

Why are some `tidytable` functions faster than their `data.table` counterpart?
`tidytable` uses `data.table` in the background, so it is not inherently faster. The `tidytable` runs were slightly shorter on those specific functions on this iteration of the tests. However, one goal of these tests is to show that the “time cost” of translating `tidyverse` syntax to `data.table` is negligible to the user (especially on medium-to-large datasets).

#> Date last run: 2023-04-20
#> # A tidytable: 13 × 6
#>    func_tested  data.table tidytable dtplyr tidyverse pandas
#>    <chr>             <dbl>     <dbl>  <dbl>     <dbl>  <dbl>
#>  1 arrange            61.4      33.5   71.1      79.1   716.
#>  2 case_when          10.2       9.9     NA      53.6   64.4
#>  3 distinct           29.5      30.7   33.4      68.4   309.
#>  4 fill                 40        74   58.3      37.8    724
#>  5 filter             189.      190.   205.      218.   904.
#>  6 inner_join         39.7        40   49.1      102.     NA
#>  7 left_join          68.1      67.2   79.9      196.     NA
#>  8 mutate             61.8      67.3   133.      63.3   780.
#>  9 nest               19.6      20.3   95.8      56.8     NA
#> 10 pivot_longer       61.3      37.7   70.1      213.     NA
#> 11 pivot_wider        61.3      73.9   184.      141.     NA
#> 12 summarize          433.      390.   397.      802.  3080.
#> 13 unnest             264.      75.4     NA      92.7     NA
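
For reference, the `data.table` calls mentioned in the notes at the top of this section might look roughly like the sketch below. It is not the benchmark code itself: the toy table and its column names are hypothetical, and it assumes the `data.table` side of the `mutate` comparison used `:=` (modify-by-reference).

```r
library(data.table)

# Use 4 threads for data.table operations, mirroring the setting in the notes.
setDTthreads(4)

# Small illustrative table; the column names are hypothetical.
dt <- data.table(x = c(1, NA, NA, 4, NA))

# Modify-by-reference: `:=` adds the column in place, without copying `dt`.
dt[, x_doubled := x * 2]

# nafill() works on numeric columns; "locf" carries the last
# non-missing observation forward.
dt[, x_filled := nafill(x, type = "locf")]

print(dt)
```

Because `nafill()` is limited to numeric columns, a comparison against `fill.()` or `tidyr::fill()` has to be restricted to numeric data, which is the constraint the notes describe.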