dplyr operations that I didn't know

a summary of some tricky functions

selection

  • X:Y (all columns from X through Y)
  • starts_with (string)
  • ends_with (string)
  • matches (regex)
  • contains (string)
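
As a quick check of the helpers listed above, here is a minimal sketch on a made-up tibble. The data frame `df` and its column names are invented for illustration; only `select()` and the helper functions come from dplyr.

```r
library(dplyr)

# Hypothetical toy data; the column names are invented for this example.
df <- tibble(
  id      = 1:3,
  len_a   = c(1, 2, 3),
  len_b   = c(4, 5, 6),
  width_a = c(0.1, 0.2, 0.3),
  note    = c("p", "q", "r")
)

# X:Y keeps every column from len_a through width_a, in table order
range_cols <- df %>% select(len_a:width_a)

# names that begin with "len"
prefix_cols <- df %>% select(starts_with("len"))

# names that end with "_a"
suffix_cols <- df %>% select(ends_with("_a"))

# names that match a regular expression
regex_cols <- df %>% select(matches("^(len|width)_[ab]$"))

# names that contain the literal string "n"
contains_cols <- df %>% select(contains("n"))
```

Note that `contains()` takes a literal string while `matches()` takes a regular expression, which is the main reason to keep both around.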

conditions

  • if_else(condition, TRUE_operation, FALSE_operation)

  • case_when(condition ~ operation_if_TRUE)
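
A small sketch of both functions side by side, assuming a made-up `score` column (the data and the cut-offs are invented for illustration). `if_else()` picks between two values, while `case_when()` checks its conditions in order and uses the first one that is TRUE.

```r
library(dplyr)

# Hypothetical scores; the NA is there to show how missing values behave.
df <- tibble(score = c(35, 60, 85, NA))

df <- df %>%
  mutate(
    # two-way choice: "yes" where the condition is TRUE, "no" where it is FALSE
    passed = if_else(score >= 50, "yes", "no"),
    # multi-way choice: conditions are tried top to bottom, first match wins
    grade = case_when(
      score >= 80 ~ "high",
      score >= 50 ~ "medium",
      score < 50  ~ "low"
    )
  )
```

In both cases a missing `score` stays missing: `if_else()` returns NA when the condition is NA, and `case_when()` returns NA when no condition matches.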

across and c_across

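across() applies a function to each selected column (i.e., each column vector). c_across() works on rows instead: it gathers the selected columns of one row into a single vector, and it must be used together with rowwise().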

library(dplyr)

# column-wise: replace each of x and y with its own mean, i.e. x = mean(x), y = mean(y)
df %>% mutate(across(c(x,y), mean))

## row-wise: you can't pass the function as an argument the same way; you wrap c_across() in the call yourself. How weird!
df %>% rowwise() %>% mutate(m = mean(c_across(c(x,y))))
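
To see the difference on concrete numbers, here is a minimal sketch with a made-up two-column tibble (the values are invented for illustration). The trailing `ungroup()` is my addition; it just drops the row-wise grouping so later steps behave normally.

```r
library(dplyr)

# Hypothetical data: three rows, two numeric columns.
df <- tibble(x = c(1, 2, 3), y = c(4, 6, 8))

# Column-wise: every value in x becomes mean(x), every value in y becomes mean(y).
col_means <- df %>% mutate(across(c(x, y), mean))

# Row-wise: m holds the mean of x and y within each row.
row_means <- df %>%
  rowwise() %>%
  mutate(m = mean(c_across(c(x, y)))) %>%
  ungroup()
```

Here `col_means` has x = 2 and y = 6 in every row, while `row_means$m` is 2.5, 4 and 5.5.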

lambda function (in purrr package)

~ starts a lambda (anonymous function), .x is its argument

## add 1 to each selected column, i.e. x = x + 1, y = y + 1
df %>% mutate(across(c(x,y), ~ .x + 1))
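
For comparison, the same step can be written with an ordinary anonymous function. This is a small sketch on made-up data; the tibble and the argument name `v` are invented for illustration.

```r
library(dplyr)

# Hypothetical data with two numeric columns.
df <- tibble(x = c(1, 2, 3), y = c(10, 20, 30))

# purrr-style lambda: ~ starts the function, .x is the current column
formula_style <- df %>% mutate(across(c(x, y), ~ .x + 1))

# the same thing spelled out as a regular anonymous function
function_style <- df %>% mutate(across(c(x, y), function(v) v + 1))
```

Both versions give the same columns. On R 4.1 or later, `\(v) v + 1` is a shorter way to write `function(v) v + 1`.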

to be continued …
