One-Sample T-Test
The one-sample t-test tests whether a population mean μ equals a hypothesized value μ₀, when σ is unknown (the typical real-world case).
The t-distribution has heavier tails than the normal, accounting for uncertainty in estimating σ with s.
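For reference, the test statistic can be written out explicitly. This is the standard textbook form, with x̄ the sample mean, s the sample standard deviation (n − 1 in the denominator), and n the sample size; the distributional statement assumes the normality and independence conditions listed under Assumptions.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad
s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
t \sim t_{n-1} \ \text{under } H_0 .
```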
Assumptions
- Random sample from the population
- Approximately normal population, OR a large sample (n ≥ 30 is a common rule of thumb; by the CLT the t-test is then robust to moderate non-normality)
- Independence of observations
- Interval or ratio level measurement
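The robustness claim in the assumptions can be probed with a small simulation. The sketch below is illustrative only: the choice of an Exponential(1) population, the seed, the sample sizes, and the helper name `type1_error_rate` are all assumptions made for this example. It draws skewed data for which H₀ is true and counts how often the t-test rejects anyway.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily


def type1_error_rate(n, n_sims=5000, alpha=0.05):
    """Fraction of true-null t-tests that reject, for Exponential(1) data."""
    # Exponential(1) has true mean 1.0, so H0: mu = 1.0 holds by construction.
    samples = rng.exponential(scale=1.0, size=(n_sims, n))
    # One t-test per row (axis=1); pvalue is an array of length n_sims.
    result = stats.ttest_1samp(samples, popmean=1.0, axis=1)
    return float(np.mean(result.pvalue < alpha))


for n in (5, 30, 200):
    print(f"n={n:>3}: empirical Type I error = {type1_error_rate(n):.3f}")
```

If the rule of thumb holds, the rejection rate should move toward the nominal 0.05 as n grows; the exact figures depend on the seed and on how skewed the population is.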
Checking normality:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
np.random.seed(42)
sample = np.array([22.1, 23.5, 21.8, 24.2, 22.9, 23.1, 22.7, 24.0, 21.5, 23.8,
                   22.3, 23.6, 21.9, 24.1, 22.6, 23.3, 22.0, 24.3, 21.7, 23.9])
# Normality checks
stat_sw, p_sw = stats.shapiro(sample)
print(f"Shapiro-Wilk: W={stat_sw:.4f}, p={p_sw:.4f}")
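# A large p-value does not prove normality; it only means the test found no evidence against it
print("No evidence against normality" if p_sw > 0.05 else "Evidence of non-normality")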
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hist(sample, bins=8, edgecolor='black', color='steelblue', alpha=0.7)
axes[0].set_title('Histogram of Sample')
stats.probplot(sample, dist='norm', plot=axes[1])
axes[1].set_title('Q-Q Plot (should be roughly linear)')
plt.tight_layout()
plt.show()
Complete Worked Example
# Scenario: A factory claims its bolts have mean diameter 10.00 mm.
# Quality control engineer samples 20 bolts.
bolt_diameters = np.array([9.98, 10.02, 9.97, 10.05, 9.99,
                           10.01, 9.96, 10.03, 9.98, 10.00,
                           9.94, 10.04, 9.97, 10.02, 9.99,
                           10.01, 9.96, 10.03, 9.98, 10.00])
mu_0 = 10.00
alpha = 0.05
n = len(bolt_diameters)
x_bar = bolt_diameters.mean()
s = bolt_diameters.std(ddof=1)
se = s / np.sqrt(n)
t_stat = (x_bar - mu_0) / se
df = n - 1
# Two-tailed p-value
p_value = 2 * stats.t.sf(abs(t_stat), df=df)
# Critical value
t_crit = stats.t.ppf(1 - alpha/2, df=df)
# 95% Confidence interval
ci = stats.t.interval(1-alpha, df=df, loc=x_bar, scale=se)
# Effect size (Cohen's d)
cohen_d = (x_bar - mu_0) / s
print("=== One-Sample T-Test Results ===")
print(f"H₀: μ = {mu_0} mm")
print(f"H₁: μ ≠ {mu_0} mm (two-tailed)")
print(f"\nDescriptive Statistics:")
print(f" n = {n}")
print(f" x̄ = {x_bar:.4f} mm")
print(f" s = {s:.4f} mm")
print(f" SE = {se:.4f} mm")
print(f"\nTest Results:")
print(f" t({df}) = {t_stat:.4f}")
print(f" Critical value: ±{t_crit:.4f}")
print(f" p-value = {p_value:.4f}")
print(f"\nEffect size: Cohen's d = {cohen_d:.4f}")
print(f" (|d| < 0.2: negligible, 0.2-0.5: small, 0.5-0.8: medium, >0.8: large)")
print(f"\n95% CI: ({ci[0]:.4f}, {ci[1]:.4f}) mm")
print(f"\nDecision: {'Reject H₀' if p_value < alpha else 'Fail to Reject H₀'}")
# Verify using scipy
t2, p2 = stats.ttest_1samp(bolt_diameters, popmean=mu_0)
print(f"\nScipy verification: t={t2:.4f}, p={p2:.4f}")
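The confidence interval and the two-tailed test carry the same information: H₀ is rejected at level α exactly when μ₀ falls outside the (1 − α) interval. The following sketch checks this on the bolt data from the worked example; the variable names `mu0_outside_ci` and `reject_h0` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

bolt_diameters = np.array([9.98, 10.02, 9.97, 10.05, 9.99,
                           10.01, 9.96, 10.03, 9.98, 10.00,
                           9.94, 10.04, 9.97, 10.02, 9.99,
                           10.01, 9.96, 10.03, 9.98, 10.00])
mu_0, alpha = 10.00, 0.05

n = len(bolt_diameters)
x_bar = bolt_diameters.mean()
se = bolt_diameters.std(ddof=1) / np.sqrt(n)

# (1 - alpha) confidence interval for the mean
ci_low, ci_high = stats.t.interval(1 - alpha, df=n - 1, loc=x_bar, scale=se)

# Two-tailed p-value from scipy's built-in test
p_value = stats.ttest_1samp(bolt_diameters, popmean=mu_0).pvalue

mu0_outside_ci = not (ci_low <= mu_0 <= ci_high)
reject_h0 = p_value < alpha

print(f"CI: ({ci_low:.4f}, {ci_high:.4f}), mu_0 outside CI: {mu0_outside_ci}")
print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
print(f"Decisions agree: {mu0_outside_ci == reject_h0}")
```

Reporting the interval alongside the p-value therefore costs nothing extra and also shows the range of means compatible with the data.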
T-Distribution: Degrees of Freedom Effect
x = np.linspace(-5, 5, 500)
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, stats.norm.pdf(x), 'k-', linewidth=2, label='Normal (df=∞)')
# Loop variable is named nu so it does not overwrite df = n - 1 from the worked example
for nu in [1, 2, 5, 10, 30]:
    ax.plot(x, stats.t.pdf(x, df=nu), linewidth=1.5, label=f't (df={nu})')
ax.set_xlim(-5, 5)
ax.set_title('T-Distribution vs Normal: Tails Get Lighter as df Increases')
ax.legend()
ax.set_xlabel('t')
ax.set_ylabel('Density')
plt.savefig('t_distribution.png', dpi=150)
plt.show()
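The same convergence can be read off numerically. As a rough companion to the plot, this sketch tabulates the two-tailed 5% critical value for several degrees of freedom next to the normal value; the particular df values are arbitrary choices for illustration.

```python
from scipy import stats

alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)  # normal critical value (df = infinity)

dfs = [1, 2, 5, 10, 30, 100, 1000]
t_crits = [stats.t.ppf(1 - alpha / 2, df=nu) for nu in dfs]

print(f"{'df':>6} {'t_crit':>9} {'t_crit - z':>11}")
for nu, tc in zip(dfs, t_crits):
    print(f"{nu:>6} {tc:9.4f} {tc - z_crit:11.4f}")
print(f"{'inf':>6} {z_crit:9.4f} {0.0:11.4f}")
```

The critical value shrinks toward the normal one as df grows, which is why the distinction between t and z matters most for small samples.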
Power Analysis
from statsmodels.stats.power import TTestPower
analysis = TTestPower()
# Given: α=0.05, d=0.5, desired power=0.80 → needed n?
n_needed = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80, alternative='two-sided')
print(f"For d=0.5, α=0.05, power=0.80: need n = {int(np.ceil(n_needed))}")
# Post-hoc (observed) power for our sample: it uses the observed effect size, so interpret it with caution
our_power = analysis.solve_power(effect_size=abs(cohen_d), alpha=0.05, nobs=n, alternative='two-sided')
print(f"Our study power: {our_power:.4f}")
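To see where such power figures come from, the two-sided power of the one-sample t-test can be computed directly from the noncentral t-distribution. This is a minimal sketch that assumes normal data and a true standardized effect d; the function name `one_sample_t_power` is hypothetical, and the code is not claimed to mirror statsmodels internals, although the results should agree closely.

```python
import numpy as np
from scipy import stats


def one_sample_t_power(d, n, alpha=0.05):
    """Two-sided power of the one-sample t-test for standardized effect d."""
    df = n - 1
    nc = d * np.sqrt(n)  # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # P(|T| > t_crit) when T follows a noncentral t with parameter nc
    upper = stats.nct.sf(t_crit, df, nc)
    lower = stats.nct.cdf(-t_crit, df, nc)
    return float(upper + lower)


for n in (10, 20, 34, 50):
    print(f"n={n:>2}: power = {one_sample_t_power(0.5, n):.3f}")
```

Evaluating the function over a range of n gives a power curve, which is often more informative for planning than a single required sample size.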
Key Takeaways
- T-test vs Z-test: use t-test when σ is unknown (almost always)
- The t-statistic measures how many standard errors x̄ lies from μ₀
- Degrees of freedom = n − 1; as n grows, the t-distribution approaches the standard normal (z)
- Always report: t, df, p, 95% CI, and Cohen's d
- Check assumptions: normality (Q-Q plot, Shapiro-Wilk) and independence
- Run the power analysis before the study to find the needed n; power computed afterwards from the observed effect is "post-hoc" and should be interpreted with caution