Collapse Column(s) Flexibly
collapse_col_flex.Rd
Collapses values in the column(s) specified using a defined method for each
record, where a record is defined as a unique observation comprised of all
columns NOT specified in collapse
).
Arguments
- df
A data.frame
- ...
Column-method pairs specifying the
method
to use for each column to be collapsed; column names can be bare variables or strings, methods must be strings, e.g. column_name = method.- method
A string identifying a function to use; one of "unique", "first", or "last"
- delim
A delimiter to place between vector elements (default: "|").
Value
A data.frame with the specified columns collapsed. Also NOTE the following:
For all methods, rows will be reordered by the unique combination of columns not collapsed (due to
dplyr::group_by()
use)._For "unique" method, collapsed columns will be converted to
character
.
See also
collapse_col()
for a simpler approach and lengthen_col()
for the
pseudo-reverse operation that lengthens/expands one or more specified columns.
Examples
cc_df <- tibble::tibble(
x = c(1, 2, 3, 3, 4, 4, 4),
y = c("a", "a", "b", "b", "c", "c", "e"),
z = c("Z", "Y", "X", "X", "W", "V", "U")
)
cc_df
#> # A tibble: 7 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Z
#> 2 2 a Y
#> 3 3 b X
#> 4 3 b X
#> 5 4 c W
#> 6 4 c V
#> 7 4 e U
# completely duplicated rows (3-4) are collapsed with any column(s) specified
collapse_col_flex(cc_df, y)
#> # A tibble: 6 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Z
#> 2 2 a Y
#> 3 3 b X
#> 4 4 e U
#> 5 4 c V
#> 6 4 c W
# individual columns
collapse_col_flex(cc_df, z, method = "unique")
#> # A tibble: 5 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Z
#> 2 2 a Y
#> 3 3 b X
#> 4 4 c W|V
#> 5 4 e U
collapse_col_flex(cc_df, z, method = "first")
#> # A tibble: 5 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Z
#> 2 2 a Y
#> 3 3 b X
#> 4 4 c W
#> 5 4 e U
collapse_col_flex(cc_df, z, method = "last")
#> # A tibble: 5 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Z
#> 2 2 a Y
#> 3 3 b X
#> 4 4 c V
#> 5 4 e U
# multiple columns can be collapsed using the same method
collapse_col_flex(cc_df, x, z, method = "unique")
#> # A tibble: 4 × 3
#> x y z
#> <chr> <chr> <chr>
#> 1 1|2 a Z|Y
#> 2 3 b X
#> 3 4 c W|V
#> 4 4 e U
collapse_col_flex(cc_df, x, z, method = "first")
#> # A tibble: 4 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Z
#> 2 3 b X
#> 3 4 c W
#> 4 4 e U
collapse_col_flex(cc_df, x, z, method = "last")
#> # A tibble: 4 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 2 a Y
#> 2 3 b X
#> 3 4 c V
#> 4 4 e U
# ...or using different methods
collapse_col_flex(cc_df, x = "unique", z = "unique")
#> # A tibble: 4 × 3
#> x y z
#> <chr> <chr> <chr>
#> 1 1|2 a Z|Y
#> 2 3 b X
#> 3 4 c W|V
#> 4 4 e U
collapse_col_flex(cc_df, x = "first", z = "unique")
#> # A tibble: 4 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Z|Y
#> 2 3 b X
#> 3 4 c W|V
#> 4 4 e U
collapse_col_flex(cc_df, x = "first", z = "last")
#> # A tibble: 4 × 3
#> x y z
#> <dbl> <chr> <chr>
#> 1 1 a Y
#> 2 3 b X
#> 3 4 c V
#> 4 4 e U