Slower fsetdiff vs dplyr::setdiff #7783

@cgiachalis

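This came up while profiling code in which a loop calls fsetdiff on every pass, among other things. In the benchmarks below, fsetdiff is roughly 11 times slower than dplyr::setdiff (by median) and allocates about 1.8 times as much memory.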

library(data.table)

olddt <- data.table::data.table(a = paste0(1:10000, "a"))
newdt <- data.table::data.table(a = paste0(1:10010, "a"))


bench::mark(DT = data.table::fsetdiff(newdt, olddt),
            DPLYR = dplyr::setdiff(newdt, olddt),
            min_iterations = 30,
            relative = TRUE)[, 1:5]
# A tibble: 2 × 5
  expression     min  median `itr/sec` mem_alloc
  <bch:expr>   <dbl>   <dbl>     <dbl>     <dbl>
1 DT         10.7859 11.0259    1        1.81242
2 DPLYR       1       1        10.0019   1

bench::mark(DT = data.table::fsetdiff(newdt, olddt),
            DPLYR = dplyr::setdiff(newdt, olddt),
            min_iterations = 30,
            relative = FALSE)[, 1:5]

# A tibble: 2 × 5
  expression      min   median `itr/sec` mem_alloc
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>
1 DT           4.77ms   5.66ms   171.193     763KB
2 DPLYR         432µs  505.4µs  1799.31      421KB
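
For comparison, here is a minimal sketch of one possible alternative for this particular input: an anti-join on all columns followed by `unique()`, which should mirror the default `all = FALSE` behaviour of fsetdiff when both tables share the same columns and types. The names `res_fsetdiff` and `res_antijoin` are introduced here only for illustration, and this sketch only checks that the two results agree. It was not benchmarked, so whether it is actually faster is an open assumption.

```r
library(data.table)

# Same inputs as in the benchmark above
olddt <- data.table(a = paste0(1:10000, "a"))
newdt <- data.table(a = paste0(1:10010, "a"))

# Reference result: rows of newdt that do not appear in olddt
res_fsetdiff <- fsetdiff(newdt, olddt)

# Anti-join on every column, then drop duplicate rows to match
# the all = FALSE semantics of fsetdiff
res_antijoin <- unique(newdt[!olddt, on = names(newdt)])

# Both approaches should return the same rows in the same order
identical(res_fsetdiff$a, res_antijoin$a)
```

If the two results agree on real data as well, the anti-join form could be added as a third expression in the `bench::mark()` call above to compare timings.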

Session info

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.5.3 (2026-03-11 ucrt)
#>  os       Windows 11 x64 (build 26200)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United Kingdom.utf8
#>  ctype    English_United Kingdom.utf8
#>  date     2026-06-07
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  bench         1.1.4   2025-01-16 [1] CRAN (R 4.5.2)
#>  cli           3.6.6   2026-04-09 [1] CRAN (R 4.5.3)
#>  data.table  * 1.18.4  2026-05-06 [1] CRAN (R 4.5.3)
#>  digest        0.6.39  2025-11-19 [1] CRAN (R 4.5.2)
#>  dplyr         1.2.1   2026-04-03 [1] CRAN (R 4.5.3)
#>  evaluate      1.0.5   2025-08-27 [1] CRAN (R 4.5.2)
#>  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.5.2)
#>  fs            2.1.0   2026-04-18 [1] CRAN (R 4.5.3)
#>  generics      0.1.4   2025-05-09 [1] CRAN (R 4.5.2)
#>  glue          1.8.1   2026-04-17 [1] CRAN (R 4.5.3)
#>  htmltools     0.5.9   2025-12-04 [1] CRAN (R 4.5.2)
#>  knitr         1.51    2025-12-20 [1] CRAN (R 4.5.2)
#>  lifecycle     1.0.5   2026-01-08 [1] CRAN (R 4.5.2)
#>  magrittr      2.0.5   2026-04-04 [1] CRAN (R 4.5.3)
#>  otel          0.2.0   2025-08-29 [1] CRAN (R 4.5.2)
#>  pillar        1.11.1  2025-09-17 [1] CRAN (R 4.5.2)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.5.2)
#>  profmem       0.7.0   2025-05-02 [1] CRAN (R 4.5.2)
#>  R6            2.6.1   2025-02-15 [1] CRAN (R 4.5.2)
#>  reprex        2.1.1   2024-07-06 [1] CRAN (R 4.5.2)
#>  rlang         1.2.0   2026-04-06 [1] CRAN (R 4.5.3)
#>  rmarkdown     2.31    2026-03-26 [1] CRAN (R 4.5.3)
#>  rstudioapi    0.18.0  2026-01-16 [1] CRAN (R 4.5.2)
#>  sessioninfo   1.2.4   2026-06-04 [1] CRAN (R 4.5.3)
#>  tibble        3.3.1   2026-01-11 [1] CRAN (R 4.5.2)
#>  tidyselect    1.2.1   2024-03-11 [1] CRAN (R 4.5.2)
#>  utf8          1.2.6   2025-06-08 [1] CRAN (R 4.5.2)
#>  vctrs         0.7.3   2026-04-11 [1] CRAN (R 4.5.3)
#>  withr         3.0.2   2024-10-28 [1] CRAN (R 4.5.2)
#>  xfun          0.57    2026-03-20 [1] CRAN (R 4.5.3)
#>  yaml          2.3.12  2025-12-10 [1] CRAN (R 4.5.2)
#> 
#>  [1] C:/Program Files/R/library
#>  [3] C:/Program Files/R/R-4.5.3/library
#>  * ── Packages attached to the search path.
#> 
#> ──────────────────────────────────────────────────────────────────────────────
