Introduction
Factors represent categorical data in R. A factor stores each value as an integer code that points into a fixed set of levels, and modeling functions such as lm() and glm() rely on this structure to treat a variable as categorical.
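As a minimal sketch of that internal representation, the snippet below inspects a small factor. The `sizes` vector is a hypothetical example and does not come from the rest of this tutorial.

```r
# Hypothetical example vector used only for illustration
sizes <- factor(c("small", "large", "small", "medium"))

typeof(sizes)    # "integer": values are stored as integer codes
levels(sizes)    # "large" "medium" "small" (sorted alphabetically by default)
unclass(sizes)   # integer codes 3 1 3 2, plus a "levels" attribute
```

Because levels are sorted alphabetically unless stated otherwise, the codes follow that order rather than the order in which values first appear.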
Creating Factors
# Using factor()
gender <- factor(c("Male", "Female", "Male", "Female"))
gender
# With specified levels
education <- factor(c("BS", "MS", "PhD", "BS"),
                    levels = c("HS", "BS", "MS", "PhD"))
education
# Ordered factors
rank <- factor(c("Junior", "Senior", "Mid", "Junior"),
               levels = c("Junior", "Mid", "Senior"),
               ordered = TRUE)
rank
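Two consequences of specifying levels are worth seeing in action. The sketch below uses hypothetical variables (`degree`, `seniority`) and a made-up "MBA" entry to show that values missing from `levels` become NA, and that ordered factors can be compared by level order.

```r
# A value that is not listed in `levels` becomes NA ("MBA" is hypothetical)
degree <- factor(c("BS", "MBA", "PhD"),
                 levels = c("HS", "BS", "MS", "PhD"))
is.na(degree)          # FALSE TRUE FALSE

# Ordered factors support comparisons that follow the level order
seniority <- factor(c("Junior", "Senior", "Mid", "Junior"),
                    levels = c("Junior", "Mid", "Senior"),
                    ordered = TRUE)
seniority < "Senior"   # TRUE FALSE TRUE TRUE
max(seniority)         # Senior
```

The same comparison on an unordered factor produces a warning and NA values, which is one reason to set ordered = TRUE when the categories have a natural rank.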
Factor Functions
gender <- factor(c("Male", "Female", "Male"))
levels(gender) # Get levels
nlevels(gender) # Number of levels
table(gender) # Frequency table
summary(gender) # Summary
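These functions report on the declared levels, not only on the values present. As an illustrative sketch, the block below rebuilds the education factor from the previous section, where "HS" is a level that never occurs in the data.

```r
education <- factor(c("BS", "MS", "PhD", "BS"),
                    levels = c("HS", "BS", "MS", "PhD"))

table(education)                 # "HS" is listed with a count of 0
nlevels(education)               # 4: unused levels are still counted
nlevels(droplevels(education))   # 3: after dropping the unused "HS" level
```

Whether to keep or drop unused levels depends on the analysis; keeping them preserves empty categories in tables and plots.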
Modifying Factors
gender <- factor(c("Male", "Female", "Male"))
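# Add a level without renaming existing ones
# (default levels are sorted alphabetically: "Female", "Male")
levels(gender) <- c(levels(gender), "Other")
# Reorder levels: make "Male" the reference (first) level
gender <- relevel(gender, ref = "Male")
# Convert to the underlying integer codes (not the original labels)
as.numeric(gender)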
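A common pitfall arises when the labels themselves look like numbers. In that case as.numeric() still returns the integer codes, so a sketch like the following, using a hypothetical `scores` factor, converts through character first to recover the values.

```r
# Hypothetical factor whose labels look like numbers
scores <- factor(c("10", "20", "10", "30"))

as.numeric(scores)                 # 1 2 1 3: the integer codes
as.numeric(as.character(scores))   # 10 20 10 30: the original values
```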
Summary
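Factors are R's standard representation for categorical data. Declaring levels explicitly, ordering them when the categories have a natural rank, and choosing a reference level control how tables, summaries, and models treat the variable.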