Mann-Whitney U Test

The Mann-Whitney U test (also called the Wilcoxon rank-sum test) is the standard nonparametric alternative to the independent two-sample t-test. It tests whether values from one group tend to be larger than values from the other.

import numpy as np
from scipy import stats

np.random.seed(42)

# Non-normal data: response times (milliseconds)
group_a = np.array([250, 280, 230, 310, 270, 290, 260, 320, 245, 275, 1200])  # outlier!
group_b = np.array([200, 220, 195, 235, 215, 205, 240, 210, 225, 190])

print(f"Group A: median={np.median(group_a):.1f}, mean={np.mean(group_a):.1f}")
print(f"Group B: median={np.median(group_b):.1f}, mean={np.mean(group_b):.1f}")

# Mann-Whitney U test
U_stat, p_mw = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')
print(f"\nMann-Whitney: U={U_stat:.2f}, p={p_mw:.4f}")

# Compare: independent t-test (affected by outlier)
t_stat, p_t = stats.ttest_ind(group_a, group_b)
print(f"Independent t-test: t={t_stat:.4f}, p={p_t:.4f}")

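# Effect size: rank-biserial correlation
# Assumes U_stat is the U statistic for group_a (the first sample),
# which is what recent SciPy versions return.
n1, n2 = len(group_a), len(group_b)
r_rb = 2*U_stat/(n1*n2) - 1  # positive when group_a tends to be larger
print(f"Effect size (rank biserial r) = {r_rb:.4f}")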

The U Statistic

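U counts the number of pairs in which a value from group 1 exceeds a value from group 2. A tied pair conventionally counts as one half, a term the indicator formula here leaves out: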

$U = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} \mathbb{1}(x_{1i} > x_{2j})$

# Manual U computation (this data has no ties, so a strict > comparison is enough)
U_manual = sum(1 for a in group_a for b in group_b if a > b)
print(f"Manual U = {U_manual} (fraction > : {U_manual/(n1*n2):.3f})")
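The pairwise count can also be obtained from ranks, which is where the name "rank-sum test" comes from. The sketch below is illustrative rather than a reference implementation: the helper name mann_whitney_u is hypothetical, and it assumes the usual convention that tied values share their average rank. It pools both groups, ranks them, and uses U₁ = R₁ − n₁(n₁+1)/2, where R₁ is the sum of group 1's ranks; the two U values always add up to n₁n₂.

```python
import numpy as np
from scipy import stats


def mann_whitney_u(x, y):
    """Return (U_x, U_y) computed from pooled ranks (hypothetical helper).

    Ties receive average ranks, so a tied pair contributes 0.5 to each U.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    # Rank all observations together; rankdata averages ranks for ties by default
    ranks = stats.rankdata(np.concatenate([x, y]))
    rank_sum_x = ranks[:n1].sum()
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x  # the two statistics sum to n1 * n2
    return u_x, u_y


# Same response-time data as in the example above
a = [250, 280, 230, 310, 270, 290, 260, 320, 245, 275, 1200]
b = [200, 220, 195, 235, 215, 205, 240, 210, 225, 190]

u_a, u_b = mann_whitney_u(a, b)
print(f"U for group A = {u_a:.1f}, U for group B = {u_b:.1f}")
```

For this data the rank-based value for group A should agree with the pairwise count from the manual computation, since there are no ties.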

Key Takeaways

Tests whether values from one group tend to be larger — stochastic dominance
Robust to outliers — uses ranks, not raw values
Does NOT test medians directly (common misconception) unless distributions are identical in shape
U ranges from 0 to n₁×n₂: U = n₁n₂/2 means neither group tends to have larger values
Effect size: rank-biserial r, with |r| = |1 − 2U/(n₁n₂)|. As a rough guide, |r| ≈ 0.1 is small, ≈ 0.3 medium, ≥ 0.5 large
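Two of these points can be checked numerically. The sketch below reuses the response-time data and replaces the outlier with a far more extreme, made-up value (120000 is an assumption chosen purely for illustration). The helper u_first_sample is hypothetical; it counts pairs directly, with ties as one half. Because U depends only on the ordering of the values, it should not change, while the mean shifts substantially. The fraction U/(n₁n₂) estimates the probability that a random value from group A exceeds one from group B, and the rank-biserial correlation is that fraction rescaled to the range −1 to 1.

```python
import numpy as np

a = np.array([250, 280, 230, 310, 270, 290, 260, 320, 245, 275, 1200])
b = np.array([200, 220, 195, 235, 215, 205, 240, 210, 225, 190])


def u_first_sample(x, y):
    """U for x: number of (x, y) pairs with x > y, ties counting one half."""
    diff = np.subtract.outer(x, y)  # all pairwise differences x_i - y_j
    return np.sum(diff > 0) + 0.5 * np.sum(diff == 0)


# Hypothetical: make the outlier 100 times larger
a_extreme = a.copy()
a_extreme[-1] = 120000

u_orig = u_first_sample(a, b)
u_extreme = u_first_sample(a_extreme, b)

n1, n2 = len(a), len(b)
prob_superiority = u_orig / (n1 * n2)  # estimate of P(A > B)
r_rb = 2 * prob_superiority - 1        # rank-biserial correlation, in [-1, 1]

print(f"Mean of A: {a.mean():.1f} -> {a_extreme.mean():.1f}")
print(f"U: {u_orig:.1f} -> {u_extreme:.1f}")
print(f"P(A > B) = {prob_superiority:.3f}, rank-biserial r = {r_rb:.3f}")
```

A t-test on the modified data would be affected by the inflated mean and variance, whereas the rank-based statistic is identical in both cases.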
