R Hypothesis Testing: Statistical Significance
Learning Objectives
By the end of this tutorial, you will be able to:
- Conduct one-sample, two-sample, and paired t-tests
- Perform chi-square tests for independence and goodness of fit
- Run one-way and two-way ANOVA
- Apply non-parametric tests (Wilcoxon, Kruskal-Wallis)
- Interpret p-values and confidence intervals
Hypothesis Testing Framework
| Step | Description |
|---|---|
| 1. State hypotheses | H₀ (null) and H₁ (alternative) |
| 2. Choose significance level | α = 0.05 (typical) |
| 3. Calculate test statistic | t, F, χ², etc. |
| 4. Calculate p-value | Probability of observing data at least as extreme as the sample if H₀ is true |
| 5. Make decision | Reject or fail to reject H₀ |
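The five steps in the table can be traced by hand for a one-sample t-test. The following is a minimal sketch, not a replacement for `t.test()`; the seed and the names `mu0`, `alpha`, `t_stat`, and `decision` are illustrative assumptions rather than part of the tutorial.

```r
# Step 1: H0: mu = 50 versus H1: mu != 50 (two-sided)
set.seed(1)                          # hypothetical seed, for reproducibility only
x <- rnorm(30, mean = 52, sd = 10)
mu0 <- 50

# Step 2: significance level
alpha <- 0.05

# Step 3: test statistic t = (sample mean - mu0) / (s / sqrt(n))
n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Step 4: two-sided p-value from the t distribution with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Step 5: decision
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"

# Cross-check against the built-in test
builtin <- t.test(x, mu = mu0)
all.equal(unname(builtin$statistic), t_stat)
all.equal(builtin$p.value, p_value)
```

Both `all.equal()` calls should return `TRUE`, which shows that `t.test()` follows the same recipe.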
One-Sample t-test
# Test if mean equals a value
x <- rnorm(30, mean = 52, sd = 10)
# H₀: μ = 50
t.test(x, mu = 50)
# One-sided alternative hypotheses
t.test(x, mu = 50, alternative = "greater") # H₁: μ > 50
t.test(x, mu = 50, alternative = "less") # H₁: μ < 50
# Extract results
result <- t.test(x, mu = 50)
result$statistic # t-value
result$p.value # p-value
result$conf.int # 95% CI
result$estimate # sample mean
Two-Sample t-test
# Independent samples
group1 <- rnorm(30, mean = 50, sd = 10)
group2 <- rnorm(30, mean = 55, sd = 10)
# Welch's t-test (default, does not assume equal variances)
t.test(group1, group2)
# Student's t-test (assumes equal variances)
t.test(group1, group2, var.equal = TRUE)
# Paired t-test
before <- rnorm(20, mean = 50, sd = 10)
after <- before + rnorm(20, mean = 5, sd = 5)
t.test(before, after, paired = TRUE)
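A paired t-test is equivalent to a one-sample t-test on the within-pair differences. The sketch below illustrates this with simulated data; the seed and the names `paired_result` and `diff_result` are assumptions made for the example.

```r
set.seed(2)                          # hypothetical seed, for reproducibility only
before <- rnorm(20, mean = 50, sd = 10)
after  <- before + rnorm(20, mean = 5, sd = 5)

# Paired test on the two measurement vectors
paired_result <- t.test(before, after, paired = TRUE)

# One-sample test on the differences (before - after) against 0
diff_result <- t.test(before - after, mu = 0)

# The statistic and p-value agree
all.equal(unname(paired_result$statistic), unname(diff_result$statistic))
all.equal(paired_result$p.value, diff_result$p.value)
```

Note that the difference is taken as `before - after`, so a negative estimate here means the values increased.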
Chi-Square Tests
Test of Independence
# Contingency table
data <- matrix(c(50, 30, 20, 40), nrow = 2,
               dimnames = list(Gender = c("M", "F"),
                               Preference = c("A", "B")))
# Chi-square test
chisq.test(data)
# Expected frequencies
chisq.test(data)$expected
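# Cramér's V (effect size)
cramers_v <- function(x) {
  # Use the uncorrected statistic: Yates' continuity correction
  # (the default for 2x2 tables) would shrink V
  chi2 <- unname(chisq.test(x, correct = FALSE)$statistic)
  n <- sum(x)
  k <- min(nrow(x), ncol(x))
  sqrt(chi2 / (n * (k - 1)))
}
cramers_v(data)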
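To see where the chi-square statistic comes from, it can be rebuilt from the observed and expected counts. This is a minimal sketch under the assumption that no continuity correction is wanted; the name `tab` is chosen here to avoid overwriting `data`.

```r
tab <- matrix(c(50, 30, 20, 40), nrow = 2,
              dimnames = list(Gender = c("M", "F"),
                              Preference = c("A", "B")))

# Expected count in each cell: (row total * column total) / grand total
n <- sum(tab)
expected <- outer(rowSums(tab), colSums(tab)) / n

# Pearson chi-square statistic and its degrees of freedom
chi2 <- sum((tab - expected)^2 / expected)
df <- (nrow(tab) - 1) * (ncol(tab) - 1)
p_value <- pchisq(chi2, df = df, lower.tail = FALSE)

# Cross-check against chisq.test() without Yates' correction
builtin <- chisq.test(tab, correct = FALSE)
all.equal(unname(builtin$statistic), chi2)
all.equal(builtin$p.value, p_value)
```

With the default `correct = TRUE`, `chisq.test()` applies a continuity correction to 2x2 tables, so its statistic would be somewhat smaller than the hand-computed value.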
Goodness of Fit
# Observed frequencies
observed <- c(30, 20, 50)
expected <- c(1/3, 1/3, 1/3)
chisq.test(observed, p = expected)
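In the call above, `p` holds expected proportions, not counts. The following sketch converts them to expected counts and rebuilds the goodness-of-fit statistic by hand; `expected_prob` and `expected_count` are names introduced for this illustration.

```r
observed <- c(30, 20, 50)
expected_prob <- c(1/3, 1/3, 1/3)

# Expected counts under H0: total sample size times each proportion
expected_count <- sum(observed) * expected_prob

# Pearson statistic with (number of categories - 1) degrees of freedom
chi2 <- sum((observed - expected_count)^2 / expected_count)
df <- length(observed) - 1
p_value <- pchisq(chi2, df = df, lower.tail = FALSE)

# Cross-check against the built-in test
builtin <- chisq.test(observed, p = expected_prob)
all.equal(unname(builtin$statistic), chi2)
all.equal(builtin$p.value, p_value)
```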
ANOVA (Analysis of Variance)
One-Way ANOVA
# Compare means across groups
data <- data.frame(
  score = c(rnorm(30, mean = 70), rnorm(30, mean = 75), rnorm(30, mean = 80)),
  group = rep(c("A", "B", "C"), each = 30)
)
# One-way ANOVA
result <- aov(score ~ group, data = data)
summary(result)
# Post-hoc comparisons
TukeyHSD(result)
# Effect size (eta-squared)
summary_result <- summary(result)
ss_between <- summary_result[[1]]["group", "Sum Sq"]
ss_total <- sum(summary_result[[1]][, "Sum Sq"])
eta_sq <- ss_between / ss_total
cat("Eta-squared:", eta_sq, "\n")
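Eta-squared is the share of total variation explained by group membership. As a rough cross-check, the sums of squares can be computed directly from group means; the seed and the data frame name `scores` are assumptions made for this sketch.

```r
set.seed(3)                          # hypothetical seed, for reproducibility only
scores <- data.frame(
  score = c(rnorm(30, mean = 70), rnorm(30, mean = 75), rnorm(30, mean = 80)),
  group = rep(c("A", "B", "C"), each = 30)
)

grand_mean  <- mean(scores$score)
group_means <- tapply(scores$score, scores$group, mean)
group_sizes <- tapply(scores$score, scores$group, length)

# Between-group SS: group size times squared distance of each group mean from the grand mean
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)

# Within-group SS: squared distance of each observation from its own group mean
ss_within <- sum((scores$score - ave(scores$score, scores$group))^2)

eta_sq <- ss_between / (ss_between + ss_within)

# Cross-check against the ANOVA table (row 1 = group, row 2 = residuals)
fit_table <- summary(aov(score ~ group, data = scores))[[1]]
all.equal(fit_table[1, "Sum Sq"], ss_between)
all.equal(fit_table[2, "Sum Sq"], ss_within)
```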
Two-Way ANOVA
# Two factors
data <- expand.grid(
  factor1 = c("A", "B"),
  factor2 = c("X", "Y", "Z"),
  rep = 1:10  # 10 replicates per cell; a single value here would give one row per cell
)
data$value <- rnorm(nrow(data), mean = 50)
# Two-way ANOVA
result <- aov(value ~ factor1 * factor2, data = data)
summary(result)
# Interaction plot
interaction.plot(data$factor1, data$factor2, data$value)
ANOVA Assumptions
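# Check assumptions for the one-way model
# `data` was overwritten in the two-way example, so rebuild the one-way data first
data <- data.frame(
  score = c(rnorm(30, mean = 70), rnorm(30, mean = 75), rnorm(30, mean = 80)),
  group = factor(rep(c("A", "B", "C"), each = 30))
)
result <- aov(score ~ group, data = data)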
# Normality (Shapiro-Wilk test)
shapiro.test(residuals(result))
# Homogeneity of variances (Levene's test)
library(car)
leveneTest(score ~ group, data = data)
# If assumptions are violated, use a non-parametric alternative
kruskal.test(score ~ group, data = data)
Non-Parametric Tests
Wilcoxon Tests
# Mann-Whitney U test (two independent samples)
wilcox.test(group1, group2)
# Wilcoxon signed-rank test (paired samples)
wilcox.test(before, after, paired = TRUE)
# One-sample Wilcoxon
wilcox.test(x, mu = 50)
Kruskal-Wallis Test
# Non-parametric one-way ANOVA
kruskal.test(score ~ group, data = data)
# Post-hoc
pairwise.wilcox.test(data$score, data$group, p.adjust.method = "bonferroni")
Friedman Test
# Non-parametric repeated measures
data <- data.frame(
  subject = rep(1:10, 3),
  condition = rep(c("A", "B", "C"), each = 10),
  response = rnorm(30)
)
friedman.test(response ~ condition | subject, data = data)
Effect Sizes
# Cohen's d (two groups)
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}
cohens_d(group1, group2)
# Interpretation:
# |d| < 0.2: negligible
# 0.2 ≤ |d| < 0.5: small
# 0.5 ≤ |d| < 0.8: medium
# |d| ≥ 0.8: large
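The thresholds above can be wrapped in a small helper that labels a computed effect size. `interpret_d` is a hypothetical function written for this sketch, not part of any package.

```r
# Label the magnitude of Cohen's d using the cut-offs listed above.
# Intervals are closed on the left: [0, 0.2), [0.2, 0.5), [0.5, 0.8), [0.8, Inf)
interpret_d <- function(d) {
  labels <- c("negligible", "small", "medium", "large")
  bins <- cut(abs(d), breaks = c(0, 0.2, 0.5, 0.8, Inf),
              labels = labels, right = FALSE)
  as.character(bins)
}

interpret_d(c(0.1, -0.35, 0.5, 1.2))
```

The helper is vectorized, so it can label several effect sizes at once. These cut-offs are conventions, and what counts as a meaningful effect depends on the field.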
Power Analysis
library(pwr)
# Power of a two-sample t-test with n = 30 per group
pwr.t.test(n = 30, d = 0.5, sig.level = 0.05, type = "two.sample")
# Sample size per group needed for 80% power
pwr.t.test(power = 0.8, d = 0.5, sig.level = 0.05, type = "two.sample")
# Minimum detectable effect size (Cohen's f) for a one-way ANOVA
pwr.anova.test(k = 3, n = 30, sig.level = 0.05, power = 0.8)
# Total sample size needed for a chi-square test
pwr.chisq.test(w = 0.3, df = 1, sig.level = 0.05, power = 0.8)
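If the `pwr` package is not available, base R's `power.t.test()` from the `stats` package covers the t-test cases. The sketch below assumes a standardized effect, expressed as `delta = 0.5` with `sd = 1`, which corresponds to d = 0.5; its results should be close to those from `pwr.t.test()`.

```r
# Power for n = 30 per group and a standardized difference of 0.5
power_result <- power.t.test(n = 30, delta = 0.5, sd = 1,
                             sig.level = 0.05, type = "two.sample")
power_result$power

# Sample size per group for 80% power; round up to a whole participant
size_result <- power.t.test(power = 0.8, delta = 0.5, sd = 1,
                            sig.level = 0.05, type = "two.sample")
n_per_group <- ceiling(size_result$n)
n_per_group
```

With 30 per group the power is below 50%, which is why the second call asks for a considerably larger sample.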
Practical Examples
Example 1: A/B Test
# Simulate A/B test
set.seed(42)
control <- rbinom(1000, 1, prob = 0.10) # 10% conversion
treatment <- rbinom(1000, 1, prob = 0.12) # 12% conversion
# Proportion test
prop.test(
  x = c(sum(treatment), sum(control)),
  n = c(length(treatment), length(control))
)
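The proportion test is closely related to the classic two-proportion z-test: without the continuity correction, its chi-square statistic is the square of z. A minimal sketch of that check follows; the names `p_pool`, `se`, and `z` are introduced here for illustration.

```r
set.seed(42)
control   <- rbinom(1000, 1, prob = 0.10)
treatment <- rbinom(1000, 1, prob = 0.12)

x <- c(sum(treatment), sum(control))        # conversions per arm
n <- c(length(treatment), length(control))  # visitors per arm

# Pooled conversion rate under H0 (no difference between arms)
p_hat  <- x / n
p_pool <- sum(x) / sum(n)

# z statistic for the difference in proportions
se <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z  <- (p_hat[1] - p_hat[2]) / se
p_value <- 2 * pnorm(-abs(z))

# Cross-check: prop.test() without continuity correction reports z^2
builtin <- prop.test(x = x, n = n, correct = FALSE)
all.equal(unname(builtin$statistic), z^2)
all.equal(builtin$p.value, p_value)
```

By default `prop.test()` uses `correct = TRUE`, so the p-value in the example above will be slightly larger than this uncorrected one.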
Practice Exercises
Exercise 1: Drug Efficacy
Test if a drug lowers blood pressure using a paired t-test.
Solution
set.seed(42)
before <- rnorm(30, mean = 140, sd = 10)
after <- before - rnorm(30, mean = 5, sd = 8)
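# One-sided test: H₁ is that the mean of (before - after) is greater than 0
t.test(before, after, paired = TRUE, alternative = "greater")
# The reduction is statistically significant at α = 0.05 if p < 0.05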
Key Takeaways
- t-test: Compare means (1 or 2 groups)
- Chi-square: Test association between categorical variables
- ANOVA: Compare means across 3+ groups
- Non-parametric alternatives: Wilcoxon, Kruskal-Wallis when assumptions are violated
- Always check assumptions: normality, equal variances
- Report effect sizes along with p-values
- Use power analysis to determine sample size
Next: Learn about R Linear Regression, which covers modeling relationships.