Kruskal-Wallis Test — Nonparametric One-Way ANOVA

Kruskal-Wallis Test

The Kruskal-Wallis test is the nonparametric alternative to one-way ANOVA. It tests whether k independent groups come from the same distribution. When the group distributions have the same shape and spread, a significant result can be read more specifically as a difference in medians.

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

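where Rᵢ is the sum of ranks in group i, nᵢ is the number of observations in group i, N is the total number of observations, and k is the number of groups. Ranks are assigned across the pooled data from all groups.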
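To make the formula concrete, here is a minimal sketch that computes H by hand and compares it with `scipy.stats.kruskal`. The three small groups are hypothetical toy values chosen to have no ties, and the p-value uses the usual chi-square approximation with k − 1 degrees of freedom, which can be rough for samples this small.

```python
import numpy as np
from scipy import stats

# Hypothetical toy data: three small groups with no tied values
groups = [
    np.array([2.1, 3.4, 1.9, 4.0]),
    np.array([5.2, 4.8, 6.1, 3.9, 5.5]),
    np.array([7.3, 6.4, 8.0, 6.9]),
]

# Rank all observations together (rank 1 = smallest value)
pooled = np.concatenate(groups)
ranks = stats.rankdata(pooled)
N = len(pooled)
k = len(groups)

# Split the pooled ranks back into their groups and sum them to get R_i
sizes = [len(g) for g in groups]
rank_sums = [r.sum() for r in np.split(ranks, np.cumsum(sizes)[:-1])]

# H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H_manual = 12.0 / (N * (N + 1)) * sum(R**2 / n for R, n in zip(rank_sums, sizes)) - 3 * (N + 1)

# Chi-square approximation with k - 1 degrees of freedom
p_manual = stats.chi2.sf(H_manual, k - 1)

H_scipy, p_scipy = stats.kruskal(*groups)
print(f"manual: H = {H_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  H = {H_scipy:.4f}, p = {p_scipy:.4f}")
```

The two results should agree here because the data contain no ties. When ties are present, `scipy.stats.kruskal` divides H by a tie-correction factor, so a hand computation from the uncorrected formula would come out slightly smaller.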

import numpy as np
from scipy import stats
# scikit-posthocs (pip install scikit-posthocs) is optional; it is imported in the try block further down
import matplotlib.pyplot as plt

np.random.seed(42)

# Test: Does pain relief differ across 3 medication types?
# Data is not normally distributed (skewed)
drug_a = np.random.lognormal(2.0, 0.6, 25)  # pain scores
drug_b = np.random.lognormal(2.3, 0.5, 25)
drug_c = np.random.lognormal(2.6, 0.7, 25)

# Kruskal-Wallis
H, p = stats.kruskal(drug_a, drug_b, drug_c)
df = 3 - 1  # k-1

print(f"Kruskal-Wallis H({df}) = {H:.4f}, p = {p:.4f}")
print(f"Decision: {'Reject H₀ — groups differ' if p < 0.05 else 'Fail to reject H₀'}")

# Effect size: eta-squared for Kruskal-Wallis, (H - k + 1) / (n - k)
k = 3  # number of groups
n = len(drug_a) + len(drug_b) + len(drug_c)
eta2 = (H - k + 1) / (n - k)
print(f"Effect size η²_KW = {eta2:.4f}")

# If significant → post-hoc pairwise comparisons (Dunn's test)
try:
    import scikit_posthocs as sp
    data_combined = [drug_a, drug_b, drug_c]
    posthoc = sp.posthoc_dunn(data_combined, p_adjust='bonferroni')
    print("\nDunn's Post-hoc Test (Bonferroni corrected):")
    print(posthoc.round(4))
except ImportError:
    # Fallback (not Dunn's test): pairwise Mann-Whitney U tests judged against a Bonferroni-adjusted alpha
    pairs = [('A vs B', drug_a, drug_b), ('A vs C', drug_a, drug_c), ('B vs C', drug_b, drug_c)]
    bonf_alpha = 0.05 / 3
    for name, g1, g2 in pairs:
        _, p_mw = stats.mannwhitneyu(g1, g2, alternative='two-sided')
        print(f"{name}: p={p_mw:.4f} → {'Significant' if p_mw < bonf_alpha else 'Not significant'} (Bonferroni α={bonf_alpha:.4f})")

# Box plots
fig, ax = plt.subplots(figsize=(8, 5))
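ax.boxplot([drug_a, drug_b, drug_c], patch_artist=True)
# Set tick labels separately: boxplot's labels= argument is deprecated in Matplotlib 3.9+
ax.set_xticklabels(['Drug A', 'Drug B', 'Drug C'])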
ax.set_title(f'Pain Scores by Drug\nKruskal-Wallis H={H:.3f}, p={p:.4f}')
ax.set_ylabel('Pain Score')
plt.tight_layout()
plt.savefig('kruskal_wallis.png', dpi=150)
plt.show()

Key Takeaways

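  1. Nonparametric alternative to one-way ANOVA that works on ranks rather than raw values
  2. Assumes independent observations and data that are at least ordinal; identically shaped distributions are needed only to interpret the result as a difference in medians
  3. If significant: use Dunn's test (post-hoc) with Bonferroni or Benjamini-Hochberg (BH) correction
  4. More robust than ANOVA for skewed or heavy-tailed distributions
  5. Slightly less powerful than ANOVA when normality truly holds (asymptotic relative efficiency of 3/π ≈ 0.955)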
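
To illustrate the post-hoc step in takeaway 3, the following is a rough sketch of Dunn's pairwise comparisons written out by hand, with both Bonferroni and Benjamini-Hochberg adjustments. The function names `dunn_raw_pvalues`, `bonferroni`, and `benjamini_hochberg` are hypothetical helpers, not part of any library. The sketch assumes no tied values and uses simulated lognormal data, so it is meant to show the mechanics rather than replace `scikit_posthocs.posthoc_dunn`.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def dunn_raw_pvalues(groups):
    """Unadjusted two-sided p-values for Dunn's test (assumes no tied values)."""
    sizes = [len(g) for g in groups]
    N = sum(sizes)
    # Rank the pooled data, then average the ranks within each group
    ranks = stats.rankdata(np.concatenate(groups))
    mean_ranks = [r.mean() for r in np.split(ranks, np.cumsum(sizes)[:-1])]
    pvals = {}
    for i, j in combinations(range(len(groups)), 2):
        # Standard error of the mean-rank difference (no tie correction)
        se = np.sqrt(N * (N + 1) / 12.0 * (1.0 / sizes[i] + 1.0 / sizes[j]))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        pvals[(i, j)] = 2 * stats.norm.sf(abs(z))
    return pvals


def bonferroni(p):
    """Multiply each p-value by the number of tests, capped at 1."""
    p = np.asarray(p, dtype=float)
    return np.minimum(p * len(p), 1.0)


def benjamini_hochberg(p):
    """BH-adjusted p-values (step-up procedure)."""
    p = np.asarray(p, dtype=float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted


# Simulated skewed data, three groups of 25 (illustrative only)
rng = np.random.default_rng(0)
groups = [
    rng.lognormal(2.0, 0.6, 25),
    rng.lognormal(2.3, 0.5, 25),
    rng.lognormal(2.6, 0.7, 25),
]

raw = dunn_raw_pvalues(groups)
pairs = list(raw)
p_raw = np.array([raw[pair] for pair in pairs])
p_bonf = bonferroni(p_raw)
p_bh = benjamini_hochberg(p_raw)

for pair, pr, pb, ph in zip(pairs, p_raw, p_bonf, p_bh):
    print(f"groups {pair}: raw p = {pr:.4f}, Bonferroni = {pb:.4f}, BH = {ph:.4f}")
```

BH-adjusted p-values are never larger than the Bonferroni-adjusted ones, which is why BH is often preferred when many pairs are compared and controlling the false discovery rate is acceptable.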
