Statistical Power
Power = P(Reject H₀ | H₁ is true) = 1 − β
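Power is the probability that a test rejects H₀ when a true effect of the assumed size exists, where β is the Type II error rate (the probability of missing that effect). Low power means wasted resources and missed discoveries.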
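To make the definition concrete, power can be estimated as the fraction of simulated experiments that reject H₀ when H₁ is true. The sketch below is only an illustration under stated assumptions: it uses a two-sample z-test with known σ (simpler than the t-test used later with statsmodels), and the function name `rejection_rate`, the seed, and the number of simulations are all hypothetical choices.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_diff, n_per_group, sigma=1.0, alpha=0.05,
                   n_sims=2000, seed=0):
    """Fraction of simulated two-sample z-tests (known sigma) that reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    se = sigma * (2 / n_per_group) ** 0.5          # std. error of the mean difference
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        treated = [rng.gauss(true_diff, sigma) for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / se
        rejections += abs(z) > z_crit              # True counts as 1
    return rejections / n_sims

power_estimate = rejection_rate(true_diff=0.5, n_per_group=64)   # H1 is true
type_i_estimate = rejection_rate(true_diff=0.0, n_per_group=64)  # H0 is true
print(f"Estimated power (d=0.5, n=64 per group): {power_estimate:.3f}")
print(f"Estimated Type I error rate:             {type_i_estimate:.3f}")
```

With a true difference of zero, the same rejection rate estimates the Type I error rate instead, which should land near α.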
Factors Affecting Power
| Factor | Effect on Power |
|---|---|
| Sample size (↑) | ↑ Power |
| Effect size (↑) | ↑ Power |
| α (↑) | ↑ Power (but ↑ Type I error) |
| σ (↑) | ↓ Power |
| One-tailed test | ↑ Power (vs two-tailed), provided the effect lies in the hypothesized direction |
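The directions in the table can be checked numerically. The following is a minimal sketch, assuming a normal approximation to the two-sample test rather than the exact t-based calculation; `approx_power` and the scenario values are hypothetical names and numbers chosen for illustration. Here `delta` is the raw mean difference, so the standardized effect size is `delta / sigma`.

```python
from statistics import NormalDist

def approx_power(delta, sigma, n_per_group, alpha=0.05, two_sided=True):
    """Normal-approximation power for comparing two group means (delta > 0 assumed)."""
    norm = NormalDist()
    shift = (delta / sigma) * (n_per_group / 2) ** 0.5   # expected z statistic under H1
    if two_sided:
        z_crit = norm.inv_cdf(1 - alpha / 2)
        return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)
    z_crit = norm.inv_cdf(1 - alpha)
    return norm.cdf(shift - z_crit)

baseline = dict(delta=0.5, sigma=1.0, n_per_group=50, alpha=0.05)
scenarios = {
    "baseline": baseline,
    "larger n (100)": {**baseline, "n_per_group": 100},
    "larger effect (0.8)": {**baseline, "delta": 0.8},
    "larger alpha (0.10)": {**baseline, "alpha": 0.10},
    "larger sigma (2.0)": {**baseline, "sigma": 2.0},
    "one-tailed": {**baseline, "two_sided": False},
}
results = {name: approx_power(**kwargs) for name, kwargs in scenarios.items()}
for name, power in results.items():
    print(f"{name:<22} power = {power:.3f}")
```

Each scenario changes one factor relative to the baseline, so any difference in power can be attributed to that factor alone.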
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from statsmodels.stats.power import TTestIndPower, TTestPower
# ==========================================
# A PRIORI POWER ANALYSIS
# ==========================================
analysis = TTestIndPower()
# Goal: 80% power to detect medium effect (d=0.5) at α=0.05
n_needed = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                                alternative='two-sided')
print("For d=0.5, α=0.05, power=0.80:")
print(f"Need n = {int(np.ceil(n_needed))} per group")
# Power curves: power vs sample size for different effect sizes
n_range = np.arange(10, 300, 5)
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
for d, label in [(0.2, 'Small (d=0.2)'), (0.5, 'Medium (d=0.5)'), (0.8, 'Large (d=0.8)')]:
    # Leaving power unset makes solve_power return the power for each n
    powers = [analysis.solve_power(effect_size=d, alpha=0.05, nobs1=n, alternative='two-sided')
              for n in n_range]
    axes[0].plot(n_range, powers, linewidth=2, label=label)
axes[0].axhline(0.80, color='red', linewidth=1.5, linestyle='--', label='80% power target')
axes[0].axhline(0.90, color='orange', linewidth=1.5, linestyle=':', label='90% power target')
axes[0].set_xlabel('Sample Size per Group')
axes[0].set_ylabel('Power (1 − β)')
axes[0].set_title('Power Curves for Two-Sample T-Test\n(α = 0.05, Two-Tailed)')
axes[0].legend()
axes[0].grid(True, alpha=0.3)
# Effect size vs required n for 80% power
effect_sizes = np.arange(0.1, 1.5, 0.05)
n_required = [int(np.ceil(analysis.solve_power(effect_size=d, alpha=0.05, power=0.80,
                                               alternative='two-sided')))
              for d in effect_sizes]
axes[1].plot(effect_sizes, n_required, 'b-', linewidth=2)
axes[1].axvline(0.2, color='gray', linestyle=':', label='Small (d=0.2)')
axes[1].axvline(0.5, color='blue', linestyle=':', label='Medium (d=0.5)')
axes[1].axvline(0.8, color='red', linestyle=':', label='Large (d=0.8)')
axes[1].set_xlabel("Cohen's d (Effect Size)")
axes[1].set_ylabel('Required n per Group')
axes[1].set_title('Required Sample Size for 80% Power\n(α = 0.05, Two-Tailed)')
axes[1].legend()
axes[1].grid(True, alpha=0.3)
axes[1].set_ylim(0, 1000)  # d = 0.1 needs n > 1000 per group, so that end of the curve is clipped
plt.tight_layout()
plt.savefig('power_analysis.png', dpi=150)
plt.show()
# ==========================================
# POST-HOC POWER ANALYSIS
# ==========================================
# After collecting data: what power did we have?
n_per_group = 25
observed_d = 0.45
achieved_power = analysis.solve_power(effect_size=observed_d, alpha=0.05, nobs1=n_per_group,
                                      alternative='two-sided')
print("\nPost-hoc power analysis:")
print(f"n={n_per_group}, d={observed_d}, α=0.05 → Power = {achieved_power:.3f}")
print(f"This study was {'adequately' if achieved_power >= 0.80 else 'under'}-powered")
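One caution about the post-hoc calculation above: when the observed effect size is plugged in as if it were the true effect, the resulting power is determined by the p-value. The sketch below illustrates this for a two-sided z-test, which is an assumption made for simplicity; for the t-test the mapping also depends on the degrees of freedom, but the same logic applies. The function name `observed_power_from_p` is hypothetical.

```python
from statistics import NormalDist

def observed_power_from_p(p_value, alpha=0.05):
    """Observed power of a two-sided z-test, treating the observed effect as the true one."""
    norm = NormalDist()
    z_obs = norm.inv_cdf(1 - p_value / 2)    # |z| implied by the two-sided p-value
    z_crit = norm.inv_cdf(1 - alpha / 2)     # rejection threshold at level alpha
    return norm.cdf(z_obs - z_crit) + norm.cdf(-z_obs - z_crit)

observed = {p: observed_power_from_p(p) for p in (0.001, 0.01, 0.05, 0.20, 0.50)}
for p, power in observed.items():
    print(f"p = {p:<5} -> observed power = {power:.3f}")
```

Under this approximation a result sitting exactly at p = α has observed power of about 0.5, and any non-significant result has observed power below that, so the number cannot explain why a test failed to reject.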
Standard Power Thresholds
| Power | Assessment |
|---|---|
| < 0.50 | Very underpowered: more likely to miss a real effect than to detect it |
| 0.50–0.79 | Underpowered: risky |
| ≥ 0.80 | Conventional minimum (Cohen's recommendation) |
| ≥ 0.90 | Strong: suitable for high-stakes decisions |
| ≥ 0.95 | Very strong: some clinical trials target this |
Key Takeaways
- Always conduct an a priori power analysis before collecting data
- 80% power is the conventional minimum; many journals require it
- Post-hoc power computed from the observed effect size is controversial: it is circular, because it is a direct function of the p-value and adds no new information
- Underpowered studies waste resources and produce false negatives
- Small effect sizes require large n, so choosing the effect size to plan for is the key decision
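The last point follows from the usual normal-approximation formula for a two-sided, two-sample comparison, n per group ≈ 2(z₁₋α/₂ + z₁₋β)² / d², in which n grows with 1/d². The sketch below evaluates it for Cohen's three benchmark effect sizes. It is an approximation only, and `approx_n_per_group` is a hypothetical helper; a t-based calculation such as `TTestIndPower.solve_power` will typically ask for slightly more per group.

```python
from math import ceil
from statistics import NormalDist

def approx_n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation n per group for a two-sided, two-sample comparison."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_power = norm.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

sample_sizes = {d: approx_n_per_group(d) for d in (0.2, 0.5, 0.8)}
for d, n in sample_sizes.items():
    print(f"d = {d}: roughly {n} per group")
```

Because d enters squared in the denominator, halving the effect size you plan for roughly quadruples the required sample size.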