R Apply Family — Vectorized Function Application
Learning Objectives
By the end of this tutorial, you will be able to:
- Use apply() for matrix and array operations
- Apply lapply() and sapply() for list/vector iteration
- Use vapply() for type-safe apply operations
- Apply tapply() for grouped calculations
- Use mapply() for multivariate apply
- Know when to use each apply function vs loops
Why Apply Functions?
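Apply functions replace explicit loops with more compact, functional code. The main benefit is readability; a loop over a preallocated vector is not necessarily slower.
# Verbose: explicit loop
result <- numeric(5)
for (i in 1:5) {
  result[i] <- i^2
}
# Concise: apply
result <- sapply(1:5, function(x) x^2)
# [1]  1  4  9 16 25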
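For simple arithmetic like this, R's operators are already vectorized, so neither a loop nor an apply call is strictly needed. A minimal sketch comparing the three styles (the variable names are illustrative, not from the original example):

```r
n <- 5

# Loop over a preallocated result vector
loop_result <- numeric(n)
for (i in seq_len(n)) {
  loop_result[i] <- i^2
}

# sapply(): same computation without index bookkeeping
sapply_result <- sapply(seq_len(n), function(x) x^2)

# Vectorized arithmetic: `^` works element-wise on the whole vector
vec_result <- seq_len(n)^2

# All three approaches produce the same values
identical(loop_result, sapply_result)
identical(sapply_result, vec_result)
```

Apply functions become useful when the per-element work is a function call that is not itself vectorized.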
apply() — Apply Over Matrix Margins
# Create a matrix
m <- matrix(1:12, nrow = 3, ncol = 4)
m
#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7   10
# [2,]    2    5    8   11
# [3,]    3    6    9   12
# Apply over rows (MARGIN = 1)
apply(m, 1, sum)
# [1] 22 26 30
# Apply over columns (MARGIN = 2)
apply(m, 2, sum)
# [1] 6 15 24 33
# Apply custom function
apply(m, 1, function(x) max(x) - min(x))
# [1] 9 9 9
apply(m, 2, mean)
# [1] 2 5 8 11
# Multiple return values
apply(m, 2, function(x) c(mean = mean(x), sd = sd(x)))
#      [,1] [,2] [,3] [,4]
# mean    2    5    8   11
# sd      1    1    1    1
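apply() also forwards extra arguments to the function and works on arrays with more than two dimensions. The sketch below assumes a small matrix with one missing value and a 3 x 4 x 2 array; both are made up for illustration:

```r
m <- matrix(1:12, nrow = 3, ncol = 4)

# Arguments after FUN are passed through to it (here: na.rm = TRUE)
m_na <- m
m_na[2, 3] <- NA
row_means <- apply(m_na, 1, mean, na.rm = TRUE)

# rowSums()/colSums() are built-in shortcuts for the most common cases
all.equal(rowSums(m), apply(m, 1, sum))

# 3-D array: MARGIN = 3 applies FUN to each 3 x 4 slice
arr <- array(1:24, dim = c(3, 4, 2))
slice_sums <- apply(arr, 3, sum)

# MARGIN = c(1, 2) keeps rows and columns and collapses the third dimension
cell_sums <- apply(arr, c(1, 2), sum)
dim(cell_sums)
```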
lapply() — Apply Over List, Returns List
# lapply always returns a list
x <- list(a = 1, b = 2, c = 3, d = 4)
lapply(x, function(x) x^2)
# $a
# [1] 1
# $b
# [1] 4
# $c
# [1] 9
# $d
# [1] 16
# With character vector
fruits <- c("apple", "banana", "cherry")
lapply(fruits, function(x) nchar(x))
# [[1]]
# [1] 5
# [[2]]
# [1] 6
# [[3]]
# [1] 6
# Passing the function by name (no anonymous wrapper needed)
lapply(fruits, nchar)
# [[1]]
# [1] 5
# [[2]]
# [1] 6
# [[3]]
# [1] 6
# With data frames
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
lapply(df, mean)
# $a
# [1] 3
# $b
# [1] 8
# $c
# [1] 13
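Two lapply() patterns come up often: forwarding extra arguments to the function, and iterating over indices when the function needs the position or name of each element. A small sketch, assuming a made-up list called words:

```r
words <- list(first = c("b", "a", "c"), second = c("z", "y"))

# Arguments after FUN are forwarded to every call
sorted_desc <- lapply(words, sort, decreasing = TRUE)

# Iterate over indices to reach both the name and the value
labelled <- lapply(seq_along(words), function(i) {
  paste0(names(words)[i], ": ", length(words[[i]]), " items")
})

sorted_desc$first
labelled[[2]]
```

Note that iterating over seq_along(words) drops the names from the result, because the input is then a plain integer vector.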
sapply() — Simplified Apply
# sapply simplifies result when possible
x <- 1:5
sapply(x, function(x) x^2)
# [1] 1 4 9 16 25 (simplified to vector)
# vs lapply
lapply(x, function(x) x^2)
# [[1]]
# [1] 1
# ...
# With names
scores <- list(Alice = 95, Bob = 87, Charlie = 92)
sapply(scores, function(x) x * 1.1)
# Alice Bob Charlie
# 104.50 95.70 101.20
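# Length-1 results always simplify to a vector
sapply(list(1, "a", TRUE), class)
# [1] "numeric"   "character" "logical"
# When results differ in length, sapply cannot simplify and returns a list
sapply(1:3, seq_len)
# [[1]]
# [1] 1
# [[2]]
# [1] 1 2
# [[3]]
# [1] 1 2 3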
# simplify = FALSE — always returns list
sapply(1:5, function(x) x^2, simplify = FALSE)
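The shape of sapply()'s result depends on what the function returns, which is why it can surprise you in scripts. The sketch below (illustrative inputs only) runs through the four cases:

```r
# Every call returns length 1 -> atomic vector
as_vector <- sapply(1:3, function(i) i * 2)

# Every call returns the same length > 1 -> matrix, one column per input
as_matrix <- sapply(1:3, function(i) c(min = i, max = i^2))

# Calls return different lengths -> list
as_list <- sapply(1:3, function(i) letters[seq_len(i)])

# Zero-length input -> empty list, not an empty vector
from_empty <- sapply(integer(0), function(i) i * 2)

class(as_vector)
dim(as_matrix)
class(as_list)
class(from_empty)
```

The last case is the usual source of bugs: code that expects a numeric vector receives list() when the input happens to be empty.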
vapply() — Type-Safe Apply
# vapply requires specifying return type
x <- 1:5
vapply(x, function(x) x^2, numeric(1))
# [1] 1 4 9 16 25
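# Type safety: a result that does not match FUN.VALUE raises an error
vapply(x, function(x) x^2, character(1))
# Error: values must be type 'character',
#  but FUN(X[[1]]) result is type 'double'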
# With data frames
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
vapply(df, mean, numeric(1))
# a b c
# 3 8 13
# Returns named vector
vapply(list(Alice = 95, Bob = 87), function(x) x, numeric(1))
# Alice Bob
# 95 87
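FUN.VALUE is a template, so it can also describe results longer than one element, and it fixes the result type even for empty input. A minimal sketch reusing the df from above:

```r
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)

# A length-2 template gives a 2 x 3 matrix; row names come from FUN.VALUE
col_ranges <- vapply(df, function(col) range(col),
                     FUN.VALUE = c(min = 0, max = 0))

# Empty input: vapply() still returns the declared type, sapply() returns a list
empty_v <- vapply(list(), length, FUN.VALUE = integer(1))
empty_s <- sapply(list(), length)

col_ranges
class(empty_v)
class(empty_s)
```

The integer results of range() are promoted to double to match the template; vapply() allows promotion (logical to integer to double) but never demotion.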
tapply() — Apply by Group
# Grouped calculations
grades <- c(85, 92, 78, 95, 88, 92, 85, 78)
group <- c("A", "A", "B", "A", "B", "B", "A", "B")
tapply(grades, group, mean)
#     A     B
# 89.25 84.00
# Multiple summary statistics per group (returns a list-like array)
tapply(grades, group, function(x) c(mean = mean(x), sd = sd(x)))
# $A
#      mean        sd
# 89.250000  5.057997
#
# $B
#      mean        sd
# 84.000000  7.118052
# Multiple grouping variables
gender <- c("M", "F", "M", "F", "M", "F", "M", "F")
tapply(grades, list(group, gender), mean)
#      F  M
# A 93.5 85
# B 85.0 83
# With NA: extra arguments such as na.rm are passed on to the function
grades_with_na <- c(85, NA, 78, 95, 88)
tapply(grades_with_na, group[1:5], mean, na.rm = TRUE)
#  A  B
# 90 83
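tapply() can be read as "split, then apply". The sketch below makes the two steps explicit with split() and sapply(), and shows aggregate() as an alternative that returns a data frame; it reuses the grades and group vectors from above:

```r
grades <- c(85, 92, 78, 95, 88, 92, 85, 78)
group  <- c("A", "A", "B", "A", "B", "B", "A", "B")

by_tapply <- tapply(grades, group, mean)

# split() makes the grouping step explicit; sapply() summarises each piece
by_split <- sapply(split(grades, group), mean)

# aggregate() returns a data frame instead of a named array
by_aggregate <- aggregate(grades ~ group,
                          data = data.frame(grades, group),
                          FUN = mean)

all.equal(as.vector(by_tapply), unname(by_split))
by_aggregate
```

Which form to use is mostly a matter of what the next step needs: a named array for quick lookup, or a data frame for further joins and plotting.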
mapply() — Multivariate Apply
# Apply function to multiple vectors
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30, 40, 50)
mapply(function(a, b) a + b, x, y)
# [1] 11 22 33 44 55
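# More than two vectors
mapply(function(a, b, c) a + b + c, x, y, x * y)
# [1]  21  62 123 204 305
# With names: an unnamed character vector as first argument supplies the names
mapply(paste, c("A", "B", "C"), c(1, 2, 3), sep = "-")
#     A     B     C
# "A-1" "B-2" "C-3"
# Map() is mapply() with SIMPLIFY = FALSE, so it always returns a list
Map("+", x, y)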
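Three details of mapply() are worth seeing in code: arguments in MoreArgs are passed unchanged to every call, shorter inputs are recycled, and results of unequal length come back as a list. A sketch, where scale_shift() is a hypothetical helper defined only for this example:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30, 40, 50)

# Hypothetical helper used only for illustration
scale_shift <- function(value, factor, offset) value * factor + offset

# MoreArgs: offset is passed as-is to each call, not iterated over
shifted <- mapply(scale_shift, c(1, 2, 3), c(10, 20, 30),
                  MoreArgs = list(offset = 0.5))

# Recycling: the shorter vector c(10, 20) is reused
recycled <- mapply(function(a, b) a + b, 1:4, c(10, 20))

# Results of different lengths cannot be simplified -> list
reps <- mapply(rep, 1:3, 3:1)

# SIMPLIFY = FALSE gives the same result as Map()
as_list <- mapply(`+`, x, y, SIMPLIFY = FALSE)
identical(as_list, Map(`+`, x, y))
```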
rapply() — Recursive Apply
# Apply to nested lists
nested <- list(a = 1, b = list(c = 2, d = 3), e = 4)
rapply(nested, function(x) x^2, how = "unlist")
#   a b.c b.d   e
#   1   4   9  16
rapply(nested, function(x) x + 10, how = "list")
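rapply() can also restrict the function to leaves of a given class through its classes argument. The sketch below assumes a made-up nested list that mixes numbers and strings:

```r
mixed <- list(a = 1, b = list(c = "two", d = 3), e = "four")

# how = "replace": only numeric leaves are transformed, the rest are kept
doubled <- rapply(mixed, function(v) v * 2,
                  classes = "numeric", how = "replace")

# how = "unlist": non-matching leaves are dropped, the rest are flattened
upper <- rapply(mixed, toupper,
                classes = "character", how = "unlist")

str(doubled)
upper
```

With how = "replace" the original nesting is preserved, which makes it a convenient way to clean up one type of value inside a deeply nested structure.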
Decision Guide: Which Apply Function?
| Function | Input | Output | Use Case |
|---|---|---|---|
| apply() | Matrix/array | Vector/array | Row/column operations |
| lapply() | List/vector | List | Always returns list |
| sapply() | List/vector | Vector/matrix/list | Simplified result |
| vapply() | List/vector | Vector/array | Type-safe sapply |
| tapply() | Vector + groups | Array | Grouped calculations |
| mapply() | Multiple vectors | Vector/matrix/list | Multivariate operations |
| rapply() | Nested list | List or vector | Recursive operations |
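To make the Output column concrete, the sketch below computes the same column means four ways and inspects what comes back. It reuses the small df from the earlier sections:

```r
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)

as_list   <- lapply(df, mean)               # list with one element per column
as_vector <- sapply(df, mean)               # simplified to a named numeric vector
as_safe   <- vapply(df, mean, numeric(1))   # same, but the type is checked
as_margin <- apply(as.matrix(df), 2, mean)  # same values via matrix margins

class(as_list)
identical(as_vector, as_safe)
all.equal(as_margin, as_safe)
```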
Practice Exercises
Exercise 1: Column Statistics
Use lapply() to calculate the mean of each column in mtcars.
Solution
lapply(mtcars, mean)
# Or with sapply for vector output
sapply(mtcars, mean)
Exercise 2: Group Summaries
Use tapply() to find the average weight of cars by number of cylinders in mtcars.
Solution
tapply(mtcars$wt, mtcars$cyl, mean)
#        4        6        8
# 2.285727 3.117143 3.999214
Key Takeaways
- apply() works on matrix margins (rows or columns)
- lapply() always returns a list: a safe default
- sapply() simplifies when possible: convenient but unpredictable
- vapply() is type-safe: best for production code
- tapply() does grouped calculations, like SQL GROUP BY
- mapply() applies to multiple vectors simultaneously
- Vectorized operations are faster than loops in R
Next: Learn about R String Functions — advanced text manipulation.