R Hypothesis Testing — Statistical Significance

Learning Objectives

By the end of this tutorial, you will be able to:

Conduct one-sample, two-sample, and paired t-tests
Perform chi-square tests for independence and goodness of fit
Run one-way and two-way ANOVA
Apply non-parametric tests (Wilcoxon, Kruskal-Wallis)
Interpret p-values and confidence intervals

Hypothesis Testing Framework

Step	Description
1. State hypotheses	H₀ (null) and H₁ (alternative)
2. Choose significance level	α = 0.05 (typical)
3. Calculate test statistic	t, F, χ², etc.
4. Calculate p-value	Probability of observing data at least as extreme as the sample, assuming H₀ is true
5. Make decision	Reject or fail to reject H₀
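
The five steps above can be traced by hand for a one-sample t-test. The following is a minimal sketch, assuming a simulated sample and a null value of 50; the names mu0, alpha, t_stat, p_value and decision are illustrative only. The last lines cross-check the manual calculation against t.test().

```r
# Hypothetical sample: 30 simulated draws (assumed data, for illustration)
set.seed(1)
x <- rnorm(30, mean = 52, sd = 10)

# Step 1: H0: mu = 50 versus H1: mu != 50
mu0 <- 50

# Step 2: significance level
alpha <- 0.05

# Step 3: test statistic t = (sample mean - mu0) / (sd / sqrt(n))
n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Step 4: two-sided p-value from the t distribution with n - 1 df
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Step 5: decision
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"

# Cross-check against the built-in test
builtin <- t.test(x, mu = mu0)
all.equal(unname(builtin$statistic), t_stat)
all.equal(builtin$p.value, p_value)
```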

One-Sample t-test

# Test if mean equals a value
x <- rnorm(30, mean = 52, sd = 10)

# H₀: μ = 50
t.test(x, mu = 50)

# Alternative hypotheses
t.test(x, mu = 50, alternative = "greater")  # H₁: μ > 50
t.test(x, mu = 50, alternative = "less")     # H₁: μ < 50

# Extract results
result <- t.test(x, mu = 50)
result$statistic    # t-value
result$p.value      # p-value
result$conf.int     # 95% CI
result$estimate     # sample mean

Two-Sample t-test

# Independent samples
group1 <- rnorm(30, mean = 50, sd = 10)
group2 <- rnorm(30, mean = 55, sd = 10)

# Welch's t-test (default, does not assume equal variances)
t.test(group1, group2)

# Student's t-test (assumes equal variances)
t.test(group1, group2, var.equal = TRUE)

# Paired t-test
before <- rnorm(20, mean = 50, sd = 10)
after <- before + rnorm(20, mean = 5, sd = 5)
t.test(before, after, paired = TRUE)
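
A paired t-test is equivalent to a one-sample t-test on the within-pair differences, which is why pairing matters when the two measurements come from the same subjects. The sketch below illustrates this with simulated data; the seed and the object names paired and one_sample are assumptions made for the example.

```r
# Simulated paired measurements (assumed data, for illustration)
set.seed(1)
before <- rnorm(20, mean = 50, sd = 10)
after <- before + rnorm(20, mean = 5, sd = 5)

# Paired t-test on the two vectors
paired <- t.test(before, after, paired = TRUE)

# One-sample t-test on the differences, H0: mean difference = 0
one_sample <- t.test(before - after, mu = 0)

# Both approaches give the same t statistic and p-value
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)
```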

Chi-Square Tests

Test of Independence

# Contingency table
data <- matrix(c(50, 30, 20, 40), nrow = 2,
               dimnames = list(Gender = c("M", "F"),
                               Preference = c("A", "B")))

# Chi-square test
chisq.test(data)

# Expected frequencies
chisq.test(data)$expected

# Cramér's V (effect size)
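cramers_v <- function(x) {
  # Use the uncorrected statistic; chisq.test() applies Yates' correction to 2x2 tables by default
  chi2 <- unname(chisq.test(x, correct = FALSE)$statistic)
  n <- sum(x)
  k <- min(nrow(x), ncol(x))
  sqrt(chi2 / (n * (k - 1)))
}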
cramers_v(data)
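
The chi-square approximation can be unreliable when expected counts are small. A common rule of thumb is to switch to Fisher's exact test when any expected count falls below 5. The sketch below applies that rule to a hypothetical 2x2 table; the table values and the names tab, small_cells and test are assumptions for illustration.

```r
# Hypothetical small-sample contingency table (assumed counts)
tab <- matrix(c(8, 2, 1, 5), nrow = 2,
              dimnames = list(Group = c("Treated", "Control"),
                              Outcome = c("Improved", "Not improved")))

# Expected counts under independence (warning suppressed; it is what we are checking for)
expected <- suppressWarnings(chisq.test(tab)$expected)

# Rule of thumb: fall back to Fisher's exact test if any expected count is below 5
small_cells <- any(expected < 5)
test <- if (small_cells) fisher.test(tab) else chisq.test(tab)

test$method
test$p.value
```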

Goodness of Fit

# Observed frequencies
observed <- c(30, 20, 50)
expected <- c(1/3, 1/3, 1/3)

chisq.test(observed, p = expected)
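
To see what chisq.test() computes here, the statistic can be built by hand as the sum of (observed - expected)² / expected over the categories. This is a minimal sketch using the same observed counts; the names p_null, expected_counts, chi2 and builtin are illustrative.

```r
observed <- c(30, 20, 50)
p_null <- c(1/3, 1/3, 1/3)

# Expected counts under H0
expected_counts <- sum(observed) * p_null

# Chi-square statistic: sum of (O - E)^2 / E, which is 14 for these counts
chi2 <- sum((observed - expected_counts)^2 / expected_counts)

# Degrees of freedom: number of categories minus 1
df <- length(observed) - 1
p_value <- pchisq(chi2, df = df, lower.tail = FALSE)

# Cross-check against the built-in test
builtin <- chisq.test(observed, p = p_null)
all.equal(unname(builtin$statistic), chi2)
all.equal(builtin$p.value, p_value)
```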

ANOVA (Analysis of Variance)

One-Way ANOVA

# Compare means across groups
data <- data.frame(
  score = c(rnorm(30, mean = 70), rnorm(30, mean = 75), rnorm(30, mean = 80)),
  group = rep(c("A", "B", "C"), each = 30)
)

# One-way ANOVA
result <- aov(score ~ group, data = data)
summary(result)

# Post-hoc comparisons
TukeyHSD(result)

# Effect size (eta-squared)
summary_result <- summary(result)
ss_between <- summary_result[[1]]["group", "Sum Sq"]
ss_total <- sum(summary_result[[1]][, "Sum Sq"])
eta_sq <- ss_between / ss_total
cat("Eta-squared:", eta_sq, "\n")
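
When group variances differ, Welch's one-way test is one alternative to the classic F-test. Base R provides it through oneway.test(). The sketch below uses simulated groups with deliberately unequal standard deviations; the data frame name scores and the chosen sd values are assumptions for the example.

```r
# Simulated groups with unequal spread (assumed data, for illustration)
set.seed(1)
scores <- data.frame(
  score = c(rnorm(30, mean = 70, sd = 1),
            rnorm(30, mean = 75, sd = 3),
            rnorm(30, mean = 80, sd = 6)),
  group = factor(rep(c("A", "B", "C"), each = 30))
)

# Welch's ANOVA: does not assume equal variances
welch <- oneway.test(score ~ group, data = scores, var.equal = FALSE)

# Classic F-test: same F statistic as aov()
classic <- oneway.test(score ~ group, data = scores, var.equal = TRUE)

welch$p.value
classic$p.value
```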

Two-Way ANOVA

# Two factors
data <- expand.grid(
  factor1 = c("A", "B"),
  factor2 = c("X", "Y", "Z"),
  rep = 1:10  # 10 replicates per cell (rep = 10 would give only one row per cell)
)
data$value <- rnorm(nrow(data), mean = 50)

# Two-way ANOVA
result <- aov(value ~ factor1 * factor2, data = data)
summary(result)

# Interaction plot
interaction.plot(data$factor1, data$factor2, data$value)

ANOVA Assumptions

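# Check assumptions on the one-way model
# (rebuild the one-way data, since `data` was reused for the two-way example)
data <- data.frame(
  score = c(rnorm(30, mean = 70), rnorm(30, mean = 75), rnorm(30, mean = 80)),
  group = factor(rep(c("A", "B", "C"), each = 30))
)
result <- aov(score ~ group, data = data)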

# Normality (Shapiro-Wilk test)
shapiro.test(residuals(result))

# Homogeneity of variances (Levene's test)
library(car)
leveneTest(score ~ group, data = data)

# If assumptions violated, use non-parametric
kruskal.test(score ~ group, data = data)
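
If the car package is not installed, base R offers other tests for homogeneity of variances. The sketch below shows bartlett.test(), which is sensitive to non-normality, and fligner.test(), which is rank-based and more robust. The data frame scores is simulated for the example and is not part of the tutorial's data.

```r
# Simulated one-way layout (assumed data, for illustration)
set.seed(1)
scores <- data.frame(
  score = c(rnorm(30, mean = 70), rnorm(30, mean = 75), rnorm(30, mean = 80)),
  group = factor(rep(c("A", "B", "C"), each = 30))
)

# Bartlett's test: assumes normality within groups
bartlett <- bartlett.test(score ~ group, data = scores)

# Fligner-Killeen test: rank-based, less sensitive to non-normality
fligner <- fligner.test(score ~ group, data = scores)

c(bartlett = bartlett$p.value, fligner = fligner$p.value)
```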

Non-Parametric Tests

Wilcoxon Tests

# Mann-Whitney U test (two independent samples)
wilcox.test(group1, group2)

# Wilcoxon signed-rank test (paired samples)
wilcox.test(before, after, paired = TRUE)

# One-sample Wilcoxon
wilcox.test(x, mu = 50)

Kruskal-Wallis Test

# Non-parametric alternative to one-way ANOVA
# (`data` here is the one-way data frame with columns score and group)
kruskal.test(score ~ group, data = data)

# Post-hoc
pairwise.wilcox.test(data$score, data$group, p.adjust.method = "bonferroni")
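
The p.adjust.method argument controls how the pairwise p-values are corrected for multiple comparisons. The sketch below applies p.adjust() directly to three hypothetical raw p-values to show how Bonferroni and the less conservative Holm method differ; the raw values are made up for illustration.

```r
# Hypothetical raw p-values from three pairwise comparisons
raw_p <- c(0.010, 0.020, 0.040)

# Bonferroni: multiply each p-value by the number of comparisons (capped at 1)
bonf <- p.adjust(raw_p, method = "bonferroni")  # 0.03 0.06 0.12

# Holm: step-down procedure, never less powerful than Bonferroni
holm <- p.adjust(raw_p, method = "holm")        # 0.03 0.04 0.04

rbind(raw = raw_p, bonferroni = bonf, holm = holm)
```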

Friedman Test

# Non-parametric repeated measures
data <- data.frame(
  subject = rep(1:10, 3),
  condition = rep(c("A", "B", "C"), each = 10),
  response = rnorm(30)
)

friedman.test(response ~ condition | subject, data = data)

Effect Sizes

# Cohen's d (two groups)
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

cohens_d(group1, group2)

# Interpretation:
# |d| < 0.2: negligible
# 0.2 ≤ |d| < 0.5: small
# 0.5 ≤ |d| < 0.8: medium
# |d| ≥ 0.8: large
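
The thresholds above can be wrapped in a small helper so that effect sizes are labelled consistently. This is a sketch only; interpret_d is a hypothetical function name, and the cut-offs are conventions rather than strict rules.

```r
# Hypothetical helper: label |d| using the conventional thresholds listed above
interpret_d <- function(d) {
  cut(abs(d),
      breaks = c(0, 0.2, 0.5, 0.8, Inf),
      labels = c("negligible", "small", "medium", "large"),
      right = FALSE)  # intervals are closed on the left: [0.2, 0.5), etc.
}

# Example values, including a negative effect
labels <- as.character(interpret_d(c(0.1, -0.3, 0.5, 1.2)))
labels
```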

Power Analysis

library(pwr)

# Power of a two-sample t-test with n = 30 per group
pwr.t.test(n = 30, d = 0.5, sig.level = 0.05, type = "two.sample")

# Sample size per group needed for 80% power
pwr.t.test(power = 0.8, d = 0.5, sig.level = 0.05, type = "two.sample")

# Smallest detectable effect size (Cohen's f) for one-way ANOVA
pwr.anova.test(k = 3, n = 30, sig.level = 0.05, power = 0.8)

# Total sample size for a chi-square test
pwr.chisq.test(w = 0.3, df = 1, sig.level = 0.05, power = 0.8)
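
For t-tests, base R's power.t.test() offers a similar calculation without the pwr package. It takes the raw difference (delta) and standard deviation (sd) instead of Cohen's d, so setting sd = 1 makes delta play the role of d. A minimal sketch, with pow and size as illustrative names:

```r
# Power for n = 30 per group and a standardized difference of 0.5
pow <- power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05,
                    type = "two.sample")
pow$power

# Sample size per group needed for 80% power (round up to a whole number)
size <- power.t.test(power = 0.8, delta = 0.5, sd = 1, sig.level = 0.05,
                     type = "two.sample")
ceiling(size$n)
```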

Practical Examples

Example 1: A/B Test

# Simulate A/B test
set.seed(42)
control <- rbinom(1000, 1, prob = 0.10)   # 10% conversion
treatment <- rbinom(1000, 1, prob = 0.12)  # 12% conversion

# Proportion test
prop.test(
  x = c(sum(treatment), sum(control)),
  n = c(length(treatment), length(control))
)
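
Without the continuity correction, prop.test() for two groups is equivalent to a pooled two-proportion z-test: its X-squared statistic equals z². The sketch below computes z by hand on the same simulated data and compares it with prop.test(correct = FALSE); the names p_pool, se, z and builtin are illustrative.

```r
set.seed(42)
control <- rbinom(1000, 1, prob = 0.10)
treatment <- rbinom(1000, 1, prob = 0.12)

n_c <- length(control)
n_t <- length(treatment)
p_c <- mean(control)    # observed conversion rate, control
p_t <- mean(treatment)  # observed conversion rate, treatment

# Pooled proportion and standard error under H0: equal conversion rates
p_pool <- (sum(control) + sum(treatment)) / (n_c + n_t)
se <- sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))

# z statistic and two-sided p-value
z <- (p_t - p_c) / se
p_value <- 2 * pnorm(-abs(z))

# Cross-check: uncorrected prop.test reports X-squared = z^2
builtin <- prop.test(x = c(sum(treatment), sum(control)),
                     n = c(n_t, n_c), correct = FALSE)
all.equal(unname(builtin$statistic), z^2)
all.equal(builtin$p.value, p_value)
```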

Practice Exercises

Exercise 1: Drug Efficacy

Test if a drug lowers blood pressure using a paired t-test.

Solution

set.seed(42)
before <- rnorm(30, mean = 140, sd = 10)
after <- before - rnorm(30, mean = 5, sd = 8)

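# One-sided test: H₁ is that the mean of (before - after) is greater than 0
t.test(before, after, paired = TRUE, alternative = "greater")
# Significant at α = 0.05 if p < 0.05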

Key Takeaways

t-test: Compare means (1 or 2 groups)
Chi-square: Test association between categorical variables
ANOVA: Compare means across 3+ groups
Non-parametric alternatives: Wilcoxon, Kruskal-Wallis, and Friedman tests when assumptions are violated
Always check assumptions: normality, equal variances
Report effect sizes along with p-values
Use power analysis to determine sample size

Next: Learn about R Linear Regression — modeling relationships.
