R Apply Family — Vectorized Function Application

Learning Objectives

By the end of this tutorial, you will be able to:

  • Use apply() for matrix and array operations
  • Apply lapply() and sapply() for list/vector iteration
  • Use vapply() for type-safe apply operations
  • Apply tapply() for grouped calculations
  • Use mapply() for multivariate apply
  • Know when to use each apply function vs loops

Why Apply Functions?

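Apply functions replace explicit loops with more compact, functional code. They are not automatically faster than a well-written loop; the main gain is readability:

# Explicit loop
result <- numeric(5)
for (i in 1:5) {
  result[i] <- i^2
}

# Same result with sapply
result <- sapply(1:5, function(x) x^2)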
# [1]  1  4  9 16 25
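
As a small sketch of the comparison above, the same squares can be computed three ways. The variable names are illustrative only. For simple arithmetic like this, R's built-in vectorization needs neither a loop nor an apply call.

```r
n <- 5

# 1. Preallocated for loop
loop_result <- numeric(n)
for (i in seq_len(n)) {
  loop_result[i] <- i^2
}

# 2. sapply(): the same iteration, expressed as a function call
sapply_result <- sapply(seq_len(n), function(x) x^2)

# 3. Plain vectorized arithmetic: no explicit iteration at all
vector_result <- seq_len(n)^2

# All three hold the same values
all.equal(loop_result, sapply_result)  # TRUE
all.equal(loop_result, vector_result)  # TRUE
```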

apply() — Apply Over Matrix Margins

# Create a matrix
m <- matrix(1:12, nrow = 3, ncol = 4)
m
#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7   10
# [2,]    2    5    8   11
# [3,]    3    6    9   12

# Apply over rows (MARGIN = 1)
apply(m, 1, sum)
# [1] 22 26 30

# Apply over columns (MARGIN = 2)
apply(m, 2, sum)
# [1]  6 15 24 33

# Apply custom function
apply(m, 1, function(x) max(x) - min(x))
# [1] 9 9 9

apply(m, 2, mean)
# [1] 2 5 8 11

# Multiple return values (one result column per input column)
apply(m, 2, function(x) c(mean = mean(x), sd = sd(x)))
#      [,1] [,2] [,3] [,4]
# mean    2    5    8   11
# sd      1    1    1    1
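
The learning objectives mention arrays as well as matrices, so here is a minimal sketch of apply() on a three-dimensional array. The array `arr` is made up for illustration. The last lines also show rowSums(), a dedicated base R function that is usually preferable to apply() for plain row sums.

```r
# A 2 x 3 x 2 array: two "layers", each a 2 x 3 matrix
arr <- array(1:12, dim = c(2, 3, 2))

# Sum within each layer (MARGIN = 3)
layer_sums <- apply(arr, 3, sum)
layer_sums
# [1] 21 57

# Sum across layers for every row/column cell (MARGIN = c(1, 2))
cell_sums <- apply(arr, c(1, 2), sum)
cell_sums
#      [,1] [,2] [,3]
# [1,]    8   12   16
# [2,]   10   14   18

# For simple row or column sums, rowSums()/colSums() do the same job
m <- matrix(1:12, nrow = 3, ncol = 4)
all.equal(rowSums(m), apply(m, 1, sum))  # TRUE
```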

lapply() — Apply Over List, Returns List

# lapply always returns a list
x <- list(a = 1, b = 2, c = 3, d = 4)
lapply(x, function(x) x^2)
# $a
# [1] 1
# $b
# [1] 4
# $c
# [1] 9
# $d
# [1] 16

# With character vector
fruits <- c("apple", "banana", "cherry")
lapply(fruits, function(x) nchar(x))
# [[1]]
# [1] 5
# [[2]]
# [1] 6
# [[3]]
# [1] 6

# Passing the function by name (no anonymous wrapper needed)
lapply(fruits, nchar)
# [[1]]
# [1] 5
# [[2]]
# [1] 6
# [[3]]
# [1] 6

# With data frames
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
lapply(df, mean)
# $a
# [1] 3
# $b
# [1] 8
# $c
# [1] 13

sapply() — Simplified Apply

# sapply simplifies result when possible
x <- 1:5
sapply(x, function(x) x^2)
# [1]  1  4  9 16 25  (simplified to vector)

# vs lapply
lapply(x, function(x) x^2)
# [[1]]
# [1] 1
# ...

# With names
scores <- list(Alice = 95, Bob = 87, Charlie = 92)
sapply(scores, function(x) x * 1.1)
#   Alice     Bob Charlie
#  104.50   95.70  101.20

# When it can't simplify (results differ in length), sapply returns a list
sapply(1:3, seq_len)
# [[1]]
# [1] 1
# [[2]]
# [1] 1 2
# [[3]]
# [1] 1 2 3

# simplify = FALSE — always returns list
sapply(1:5, function(x) x^2, simplify = FALSE)
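
Simplification does not always produce a vector. When every call returns a vector of the same length greater than one, sapply() builds a matrix with one column per input element. A minimal sketch, with illustrative names:

```r
x <- 1:3

# Each call returns a named vector of length 2
res <- sapply(x, function(v) c(value = v, square = v^2))
res
#        [,1] [,2] [,3]
# value     1    2    3
# square    1    4    9

dim(res)
# [1] 2 3
```

Because the shape of the result depends on what the function returns, it is worth checking with dim() or str() before relying on it.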

vapply() — Type-Safe Apply

# vapply requires specifying return type
x <- 1:5
vapply(x, function(x) x^2, numeric(1))
# [1]  1  4  9 16 25

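# Type safety: a mismatch between FUN.VALUE and the actual result is an error
vapply(x, function(x) x^2, character(1))
# Error: values must be type 'character',
#  but FUN(X[[1]]) result is type 'double'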

# With data frames
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
vapply(df, mean, numeric(1))
# a  b  c
# 3  8 13

# Returns named vector
vapply(list(Alice = 95, Bob = 87), function(x) x, numeric(1))
# Alice   Bob
#    95    87
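
One case where the difference from sapply() matters is empty input. The sketch below uses a hypothetical helper `square` to show that sapply() falls back to an empty list, while vapply() still returns the declared type. It also shows that FUN.VALUE can describe results longer than one element.

```r
square <- function(v) v^2
empty <- numeric(0)

# sapply() has nothing to simplify, so it returns an empty list
s_empty <- sapply(empty, square)
is.list(s_empty)     # TRUE

# vapply() returns an empty vector of the declared type
v_empty <- vapply(empty, square, numeric(1))
is.numeric(v_empty)  # TRUE

# FUN.VALUE = numeric(2): each call must return two numbers,
# and the result is a matrix with one column per input element
ranges <- vapply(1:3, function(v) c(lo = v - 1, hi = v + 1), numeric(2))
dim(ranges)
# [1] 2 3
```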

tapply() — Apply by Group

# Grouped calculations
grades <- c(85, 92, 78, 95, 88, 92, 85, 78)
group <- c("A", "A", "B", "A", "B", "B", "A", "B")

tapply(grades, group, mean)
#     A     B
# 89.25 84.00

# Multiple statistics per group (returns a list-like array)
tapply(grades, group, function(x) c(mean = mean(x), sd = sd(x)))
# $A
#      mean        sd
# 89.250000  5.057997
#
# $B
#      mean        sd
# 84.000000  7.118052

# Multiple grouping variables
gender <- c("M", "F", "M", "F", "M", "F", "M", "F")
tapply(grades, list(group, gender), mean)
#      F  M
# A 93.5 85
# B 85.0 83

# With NA
grades_with_na <- c(85, NA, 78, 95, 88)
tapply(grades_with_na, group[1:5], mean, na.rm = TRUE)
#  A  B
# 90 83
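
One way to picture what tapply() does is as a split-then-summarise step. The sketch below rebuilds the grouped mean with split() and sapply() on the same grades data. It illustrates the idea and is not a description of how tapply() is implemented internally.

```r
grades <- c(85, 92, 78, 95, 88, 92, 85, 78)
group  <- c("A", "A", "B", "A", "B", "B", "A", "B")

# split() breaks the vector into one piece per group ...
pieces <- split(grades, group)

# ... and sapply() summarises each piece
by_split <- sapply(pieces, mean)

by_tapply <- tapply(grades, group, mean)

# Same numbers either way
all.equal(as.vector(by_tapply), unname(by_split))  # TRUE
```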

mapply() — Multivariate Apply

# Apply function to multiple vectors
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30, 40, 50)

mapply(function(a, b) a + b, x, y)
# [1] 11 22 33 44 55

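# More than two vectors
mapply(function(a, b, c) a + b + c, x, y, x * y)
# [1]  21  62 123 204 305

# With names (taken from the first argument when it is a character vector)
mapply(paste, c("A", "B", "C"), c(1, 2, 3), sep = "-")
#     A     B     C
# "A-1" "B-2" "C-3"

# Map() is mapply() with SIMPLIFY = FALSE, so it returns a list
Map("+", x, y)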
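
Two mapply() arguments are worth a short sketch: MoreArgs, for values that stay fixed across calls, and SIMPLIFY, which controls whether the result is collapsed. The function and argument names below (`base`, `exp`, `offset`) are made up for illustration.

```r
# MoreArgs holds arguments that are the same for every call
shifted <- mapply(function(base, exp, offset) base^exp + offset,
                  c(2, 3, 4), c(2, 2, 2),
                  MoreArgs = list(offset = 1))
shifted
# [1]  5 10 17

# Results of different lengths cannot be simplified, so a list comes back:
# rep(1, 3), rep(2, 2), rep(3, 1)
reps <- mapply(rep, 1:3, 3:1)
is.list(reps)  # TRUE

# Map() gives the same result as mapply(..., SIMPLIFY = FALSE)
x <- c(1, 2, 3)
y <- c(10, 20, 30)
identical(Map(`+`, x, y), mapply(`+`, x, y, SIMPLIFY = FALSE))  # TRUE
```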

rapply() — Recursive Apply

# Apply to nested lists
nested <- list(a = 1, b = list(c = 2, d = 3), e = 4)

rapply(nested, function(x) x^2, how = "unlist")
#   a b.c b.d   e
#   1   4   9  16

rapply(nested, function(x) x + 10, how = "list")
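
rapply() can also restrict the function to leaves of a given class. With how = "replace", non-matching leaves are kept unchanged and the nesting is preserved. A minimal sketch using a made-up list `mixed`:

```r
mixed <- list(a = 1, b = list(c = "text", d = 3), e = "more")

# Only numeric leaves are multiplied; character leaves pass through untouched
res <- rapply(mixed, function(v) v * 10, classes = "numeric", how = "replace")

res$a    # 10
res$b$c  # "text"
res$b$d  # 30
res$e    # "more"
```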

Decision Guide: Which Apply Function?

Function   Input             Output                        Use Case
apply()    Matrix/array      Vector/array                  Row/column operations
lapply()   List/vector       List                          Always returns list
sapply()   List/vector       Vector/matrix/list            Simplified result
vapply()   List/vector       Vector/array (declared type)  Type-safe sapply
tapply()   Vector + groups   Array                         Grouped calculations
mapply()   Multiple vectors  Vector/matrix/list            Multivariate operations
rapply()   Nested list       List/vector                   Recursive operations
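
To make the output column of the guide concrete, the sketch below runs the same summary through several members of the family and inspects what comes back. The inputs `x` and `m` are illustrative.

```r
x <- list(a = 1:3, b = 4:6)

as_list   <- lapply(x, sum)              # list with elements a and b
as_vector <- sapply(x, sum)              # named integer vector
as_typed  <- vapply(x, sum, integer(1))  # named integer vector, type checked

is.list(as_list)                # TRUE
identical(as_vector, as_typed)  # TRUE

m <- matrix(1:6, nrow = 2)
row_totals <- apply(m, 1, sum)  # plain vector, one value per row
row_totals
# [1]  9 12
```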

Practice Exercises

Exercise 1: Column Statistics

Use lapply() to calculate the mean of each column in mtcars.

Solution

lapply(mtcars, mean)

# Or with sapply for vector output
sapply(mtcars, mean)

Exercise 2: Group Summaries

Use tapply() to find the average weight of cars by number of cylinders in mtcars.

Solution

tapply(mtcars$wt, mtcars$cyl, mean)
#        4        6        8
# 2.285727 3.117143 3.999214

Key Takeaways

  • apply() works on matrix margins (rows or columns)
  • lapply() always returns a list — safe default
  • sapply() simplifies when possible — convenient but unpredictable
  • vapply() is type-safe — best for production code
  • tapply() does grouped calculations — like SQL GROUP BY
  • mapply() applies to multiple vectors simultaneously
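  • Apply functions still loop internally; they mainly make code clearer, while truly vectorized operations (e.g. x^2, colMeans()) are usually faster still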

Next: Learn about R String Functions — advanced text manipulation.
