Superseded
extract() has been superseded by separate_wider_regex().
Given a regular expression with capturing groups, extract() turns each group into a new column. If the groups don't match, or the input is NA, the output will be NA. Passing the same name more than once in the into argument merges the corresponding groups into a single column, while passing NA in into drops that group from the resulting tidytable.
extract(
.df,
col,
into,
regex = "([[:alnum:]]+)",
remove = TRUE,
convert = FALSE,
...
)
.df
A data.table or data.frame.
col
Column to extract from.
into
New column names to split into. A character vector.
regex
A regular expression to extract the desired values. There should be one group (defined by ()) for each element of into.
remove
If TRUE, remove the input column from the output data.table.
convert
If TRUE, runs type.convert() on the resulting columns. Useful if a resulting column should be type integer/double.
...
Additional arguments passed on to methods.
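The remove and convert arguments can be illustrated with a short sketch. It assumes tidytable is attached and uses tidytable() to build the input; the object names kept and typed are hypothetical and chosen only for this illustration. No output is shown because it was not run here.

```r
library(tidytable)

df <- tidytable(x = c(NA, "a-b-1", "a-d-3", "b-c-2", "d-e-7"))

# remove = FALSE keeps the input column x next to the new columns
kept <- df %>%
  extract(x, c("A", "B"), "([[:alnum:]]+)-([[:alnum:]]+)", remove = FALSE)

# convert = TRUE runs type.convert() on the new columns, so the
# digit-only group should come back as a numeric column instead of character
typed <- df %>%
  extract(x, c("A", "N"), "([[:alnum:]]+)-[[:alnum:]]+-(\\d+)", convert = TRUE)
```

Inspecting names(kept) and class(typed$N) is a quick way to confirm the effect of each argument.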
df <- data.table(x = c(NA, "a-b-1", "a-d-3", "b-c-2", "d-e-7"))
df %>% extract(x, "A")
#> # A tidytable: 5 × 1
#> A
#> <chr>
#> 1 NA
#> 2 a
#> 3 a
#> 4 b
#> 5 d
df %>% extract(x, c("A", "B"), "([[:alnum:]]+)-([[:alnum:]]+)")
#> # A tidytable: 5 × 2
#> A B
#> <chr> <chr>
#> 1 NA NA
#> 2 a b
#> 3 a d
#> 4 b c
#> 5 d e
# If no match, NA:
df %>% extract(x, c("A", "B"), "([a-d]+)-([a-d]+)")
#> # A tidytable: 5 × 2
#> A B
#> <chr> <chr>
#> 1 NA NA
#> 2 a b
#> 3 a d
#> 4 b c
#> 5 NA NA
# Drop groups by passing NA
df %>% extract(x, c("A", NA, "B"), "([a-d]+)-([a-d]+)-(\\d+)")
#> # A tidytable: 5 × 2
#> A B
#> <chr> <chr>
#> 1 NA NA
#> 2 a 1
#> 3 a 3
#> 4 b 2
#> 5 NA NA
# Merge groups by passing the same name
df %>% extract(x, c("A", "B", "A"), "([a-d]+)-([a-d]+)-(\\d+)")
#> # A tidytable: 5 × 2
#> A B
#> <chr> <chr>
#> 1 NANA NA
#> 2 a1 b
#> 3 a3 d
#> 4 b2 c
#> 5 NANA NA
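Since extract() is superseded, a similar split can be sketched with separate_wider_regex(). This sketch assumes tidytable's version follows the tidyr interface, where patterns is a character vector whose named elements become new columns and whose unnamed elements are matched but dropped; check ?separate_wider_regex for the installed version. The data frame df2 and the result name wide are hypothetical, and the input deliberately contains no NA or non-matching rows so that handling of those cases is left out.

```r
library(tidytable)

# Hypothetical input in which every row matches the full pattern
df2 <- tidytable(x = c("a-b-1", "a-d-3", "b-c-2"))

# Named patterns become columns A, B and C; the unnamed "-" pieces
# are matched as separators and do not produce columns
wide <- df2 %>%
  separate_wider_regex(
    x,
    patterns = c(A = "[a-d]+", "-", B = "[a-d]+", "-", C = "\\d+")
  )
```

Compared with extract(), the pattern is written piece by piece instead of as one regular expression with capturing groups, which makes it clearer which part feeds which column.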