Speed Comparisons

Below are speed comparisons of various functions. More functions will be added over time.

A few notes:

  • Comparing times across different functions isn’t very useful. For example, the summarize() tests were run on a different dataset from the case_when() tests.
  • setDTthreads(4) was used for data.table & tidytable timings.
  • Modify-by-reference was used in data.table when comparing against mutate.() & dplyr::mutate().
  • fill.() & tidyr::fill() both work with character/factor/logical columns, whereas data.table::nafill() does not. Testing only included numeric columns due to this constraint.
  • dtplyr is missing timings for functions that are not yet implemented in the package.
  • pandas comparisons are still being added, so some pandas timings are missing for now.
  • All tests are run 5 times. The times shown are the median of those 5 runs.
  • All timings are in milliseconds.
  • All tests can be found in the source code here.
  • FAQ - Why are some tidytable functions faster than their data.table counterpart?
    • All R functions have some slight natural variation in their execution time. If a tidytable function appears to be “faster” than its data.table counterpart, it’s due to this variation. However, one goal of these tests is to show that the “time cost” of translating tidyverse syntax to data.table is negligible to the user.
  • Lastly, I’d like to mention that these tests were not rigorously designed to cover all angles equally. They are only meant to give general insight into the performance of these packages.
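The thread setting, the modify-by-reference comparison, and the "median of 5 runs in milliseconds" reporting described in the notes can be sketched roughly as follows. This is only an illustration, not the benchmark code from the package source: the helper `median_ms()`, the column name `double_x`, and the data size are hypothetical, and base R's `system.time()` stands in for whatever timing tool the real tests use.

```r
library(data.table)

# Same thread count as stated in the notes above.
setDTthreads(4)

# Hypothetical helper: call the zero-argument function `f` `times` times
# and return the median elapsed time in milliseconds.
median_ms <- function(f, times = 5) {
  elapsed <- vapply(
    seq_len(times),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
  median(elapsed) * 1000
}

# Illustrative data only; the real tests use their own datasets.
dt <- data.table(x = runif(1e5))

# Modify-by-reference: `:=` adds or updates the column in place,
# so `dt` itself is changed and no copy of the table is made.
by_ref_ms <- median_ms(function() dt[, double_x := x * 2])

by_ref_ms
```

Because `:=` updates `dt` in place, the column `double_x` exists in `dt` after the timing runs, which is the behaviour the data.table side of the mutate comparison relies on.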
#> # tidytable [13 × 7]
#>    func_tested  data.table tidytable dtplyr tidyverse pandas tidytable_vs_dplyr
#>    <chr>             <dbl>     <dbl>  <dbl>     <dbl>  <dbl> <chr>             
#>  1 arrange            43.8      46.6   44.5    1351.   355   3.4%              
#>  2 case_when          26.3      25.3   NA       335.    59.2 7.6%              
#>  3 distinct           18.5      19     20        53.5  309   35.5%             
#>  4 fill               28.6      44.8   NA        66.7  846   67.2%             
#>  5 filter            228.      226.   238       261.   707   86.9%             
#>  6 inner_join         44.3      47.3  116        84.1   NA   56.2%             
#>  7 left_join          70.9      51.2  158.       87     NA   58.9%             
#>  8 mutate             69        56.3  538.       77.8   86.4 72.4%             
#>  9 nest                9.6      14.5   NA        30.7   NA   47.2%             
#> 10 pivot_longer       11.1      13.6   NA        47.7   NA   28.5%             
#> 11 pivot_wider        94.8      99.7   NA        78.8   NA   126.5%            
#> 12 summarize         176.      177.   178.      236.   834   75.2%             
#> 13 unnest             14.7       7.7   NA        30.4   NA   25.3%
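The tidytable_vs_dplyr column is not defined in the notes. Judging from the numbers, it appears to be the tidytable time expressed as a percentage of the tidyverse time, so values below 100% mean tidytable was faster. The sketch below checks that assumed definition against three rows copied from the table. The formula is an inference from the printed values, not something confirmed by the source.

```r
# Three rows copied from the table above (times in milliseconds).
timings <- data.frame(
  func_tested = c("distinct", "fill", "pivot_wider"),
  tidytable   = c(19, 44.8, 99.7),
  tidyverse   = c(53.5, 66.7, 78.8)
)

# Assumed definition: tidytable time as a percentage of the tidyverse time.
pct <- timings$tidytable / timings$tidyverse * 100
timings$tidytable_vs_dplyr <- paste0(round(pct, 1), "%")

timings$tidytable_vs_dplyr
#> [1] "35.5%"  "67.2%"  "126.5%"
```

Rows where the printed times are rounded to three significant figures (for example filter and summarize) may differ from this calculation in the last digit, since the original percentages were presumably computed from unrounded timings.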