Introduction
The select() function in dplyr keeps a subset of a data frame's columns and returns them in the order you list them. Columns can be chosen by name, by position, by name pattern, or by type.
Basic Selection
library(dplyr)
df <- tibble(
id = 1:5,
name = c("Alice", "Bob", "Charlie", "David", "Eve"),
age = c(25, 30, 35, 40, 45),
score = c(85, 90, 78, 92, 88)
)
# Select single column
select(df, name)
# Select multiple columns
select(df, name, age)
# Select a contiguous range of columns, from name through age
select(df, name:age)
Selection Helpers
# Redefine df with three score columns (length-1 values are recycled to 5 rows)
df <- tibble(
id = 1:5,
name = c("Alice", "Bob", "Charlie", "David", "Eve"),
age = c(25, 30, 35, 40, 45),
score1 = 85,
score2 = 90,
score3 = 78
)
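# Names starting with a prefix: score1, score2, score3
select(df, starts_with("score"))
# Names ending with a suffix: name, age
select(df, ends_with("e"))
# Names containing a substring: name, age
select(df, contains("a"))
# Names matching a regular expression: score1, score2, score3
select(df, matches("^s"))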
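When the column names live in a character vector rather than being typed out, tidyselect also provides all_of() and any_of(), and num_range() for numbered columns. The block below is a minimal sketch of how they behave; the vector `wanted` and its missing "height" entry are hypothetical, made up for illustration.

```r
library(dplyr)

# Same example data as above, repeated so this block runs on its own
df <- tibble(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(25, 30, 35, 40, 45),
  score1 = 85,
  score2 = 90,
  score3 = 78
)

# Hypothetical vector of wanted columns; "height" does not exist in df
wanted <- c("name", "score1", "height")

# any_of() silently skips names that are missing from the data
lenient <- select(df, any_of(wanted))

# all_of() is strict: it raises an error if any name is missing
strict <- tryCatch(
  select(df, all_of(wanted)),
  error = function(e) "error"
)

# num_range() builds names from a prefix plus a numeric range
first_two_scores <- select(df, num_range("score", 1:2))
```

any_of() suits optional columns, while all_of() is the safer choice when a missing column should stop the script.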
Advanced Selection
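# Select all columns
select(df, everything())
# Select by column position: the first three columns (id, name, age)
select(df, 1:3)
# Drop a column and keep the rest
select(df, -id)
# Select columns whose values satisfy a predicate (here, numeric columns)
select(df, where(is.numeric))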
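Selections can also be combined with the boolean operators `&`, `|`, and `!`, which is useful when a single helper is too broad. The following is a small sketch on the same example data; the result names (`numeric_no_id` and so on) are chosen here for illustration only.

```r
library(dplyr)

# Same example data as above, repeated so this block runs on its own
df <- tibble(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(25, 30, 35, 40, 45),
  score1 = 85,
  score2 = 90,
  score3 = 78
)

# Intersection and negation: numeric columns, but not id
numeric_no_id <- select(df, where(is.numeric) & !id)

# Union: the name column plus every score column
name_or_scores <- select(df, name | starts_with("score"))

# last_col() picks the final column of the data frame
last_one <- select(df, last_col())
```

Because id is an integer column, where(is.numeric) alone would include it; the `& !id` part removes it again.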
Rename with Select
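# Rename while selecting (new = old); only the renamed column is kept
select(df, full_name = name)
# Rename and keep all other columns (the renamed column moves to the front)
select(df, full_name = name, everything())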
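Renaming inside select() changes which columns are kept and where they sit, which can be surprising. The sketch below contrasts it with dplyr's rename(), which leaves every column in place; the new name `full_name` is an illustrative choice, not something the data requires.

```r
library(dplyr)

# Same example data as above, repeated so this block runs on its own
df <- tibble(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(25, 30, 35, 40, 45),
  score1 = 85,
  score2 = 90,
  score3 = 78
)

# select() with a rename keeps only the renamed column
only_renamed <- select(df, full_name = name)

# Adding everything() keeps the rest, with the renamed column moved first
renamed_first <- select(df, full_name = name, everything())

# rename() keeps every column in its original position
renamed_in_place <- rename(df, full_name = name)
```

If the goal is only a new name, rename() is usually the more direct tool; select() fits when renaming and subsetting happen together.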
Summary
select() provides flexible ways to choose columns of a data frame in dplyr: by name, range, position, name pattern, or type. Learning the selection helpers keeps column manipulation short and readable.