t-Tests
ℹ️ Why It Matters
t-tests are a cornerstone of statistical inference: they let you assess whether an observed difference between sample means is larger than random sampling variation alone would plausibly produce. From A/B testing in tech companies to clinical trials in medicine, t-tests support decision-making across many quantitative fields. Mastering them lets you draw rigorous conclusions from data.
Overview
A t-test compares a sample mean to a hypothesized value or compares means between groups. The one-sample t-test compares a single group to a known value. The two-sample t-test compares the means of two independent groups. The paired t-test compares means from the same subjects measured twice (before/after). Welch's t-test is the robust default for two independent groups because it does not assume equal variances. All t-tests assume independent observations and approximate normality, which is critical for small samples. For large samples (commonly taken as $n \geq 30$), the central limit theorem (CLT) keeps the test approximately valid even without strict normality. Always report an effect size (Cohen's d) alongside p-values to communicate practical significance.
Key Concepts
One-Sample t-Statistic
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
Here,
- $\bar{x}$ = Sample mean
- $\mu_0$ = Hypothesized population mean
- $s$ = Sample standard deviation
- $n$ = Sample size
- $df = n - 1$ = Degrees of freedom
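The one-sample statistic can be computed by hand as a sanity check. The sketch below uses only the standard library; the function name `one_sample_t` and the data are hypothetical and chosen purely for illustration.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD, uses the n - 1 denominator
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements, not taken from any real study.
scores = [1, 2, 3, 4, 5]
t_stat, df = one_sample_t(scores, mu0=2)
print(f"t = {t_stat:.3f}, df = {df}")
```

The same statistic should match what `scipy.stats.ttest_1samp` reports, which additionally returns the p-value.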
Two-Sample t-Statistic (Pooled)
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Here,
- $s_p$ = Pooled SD: $\sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
- $df$ = $n_1 + n_2 - 2$
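A minimal sketch of the pooled statistic, assuming two independent samples with roughly equal variances. The helper name `pooled_t` and the data are hypothetical.

```python
import math
import statistics


def pooled_t(x, y):
    """Return (t, df, s_p) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # sample variances
    df = n1 + n2 - 2
    s_p = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / df)  # pooled SD
    t = (statistics.mean(x) - statistics.mean(y)) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, df, s_p


# Hypothetical groups for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t_stat, df, s_p = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}, pooled SD = {s_p:.3f}")
```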
Welch's t-Statistic
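$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Here,
- $\nu$ = Welch-Satterthwaite approximation (non-integer): $\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$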
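Welch's statistic and its degrees of freedom depend only on each group's mean, SD, and size, so they can be sketched from summary statistics. The function name `welch_t` and all numbers below are hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t-test from summary statistics."""
    a = sd1 ** 2 / n1  # squared standard error contributed by group 1
    b = sd2 ** 2 / n2  # squared standard error contributed by group 2
    t = (mean1 - mean2) / math.sqrt(a + b)
    # Welch-Satterthwaite degrees of freedom (generally non-integer).
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics with clearly unequal spreads.
t_stat, df = welch_t(mean1=10.0, sd1=3.0, n1=10, mean2=8.0, sd2=1.0, n2=10)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

When both groups have the same SD and the same size, the Welch degrees of freedom reduce to the pooled value $n_1 + n_2 - 2$; the more the variances differ, the further the value drops below it.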
Paired t-Statistic
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}, \quad df = n - 1$$
Here,
- $\bar{d}$ = Mean of differences $d_i = x_{1i} - x_{2i}$
- $s_d$ = Standard deviation of differences
- $n$ = Number of pairs
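The paired test is a one-sample test applied to the within-pair differences. A minimal sketch, with a hypothetical helper `paired_t` and made-up measurements:

```python
import math
import statistics


def paired_t(x1, x2):
    """Return (t, df) for a paired t-test on matched observations."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x1, x2)]  # d_i = x_1i - x_2i
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical matched measurements on four subjects.
first = [5, 7, 9, 11]
second = [4, 5, 6, 7]
t_stat, df = paired_t(first, second)
print(f"t = {t_stat:.3f}, df = {df}")
```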
Cohen's d (Effect Size)
$$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}$$
Here,
- $s_p$ = Pooled standard deviation
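Cohen's d with the pooled SD as the standardizer can be sketched as follows. The function name `cohens_d` and the data are hypothetical; other standardizers (for example, the control group's SD) exist and give different values.

```python
import math
import statistics


def cohens_d(x, y):
    """Cohen's d for two independent samples, standardized by the pooled SD."""
    n1, n2 = len(x), len(y)
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var)


# Hypothetical samples for illustration only.
treated = [3, 4, 5, 6, 7]
control = [1, 2, 3, 4, 5]
d = cohens_d(treated, control)
print(f"d = {d:.2f}")
```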
t-Test Selection Guide
| Scenario | Test | scipy Function | df Formula |
|---|---|---|---|
| Sample mean vs. known value | One-sample | ttest_1samp(x, mu0) | $n - 1$ |
| Two independent groups (equal var) | Pooled | ttest_ind(x, y, equal_var=True) | $n_1 + n_2 - 2$ |
| Two independent groups (unequal var) | Welch's | ttest_ind(x, y, equal_var=False) | Satterthwaite |
| Same subjects measured twice | Paired | ttest_rel(x, y) | $n - 1$ ($n$ = number of pairs) |
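
The four calls in the table can be exercised together. This is a sketch on simulated data, assuming NumPy and SciPy are installed; the seed, group sizes, and distribution parameters are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

# Simulated data; every parameter here is an arbitrary illustration.
rng = np.random.default_rng(0)
x = rng.normal(loc=100, scale=15, size=30)
y = rng.normal(loc=108, scale=25, size=35)
x_after = x + rng.normal(loc=2, scale=5, size=30)  # second measurement on the same subjects

results = {
    "one-sample": stats.ttest_1samp(x, 100),
    "pooled": stats.ttest_ind(x, y, equal_var=True),
    "welch": stats.ttest_ind(x, y, equal_var=False),
    "paired": stats.ttest_rel(x, x_after),
}

for name, res in results.items():
    print(f"{name:>10}: t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```

Note that `ttest_rel` requires both arrays to have the same length, whereas `ttest_ind` accepts groups of different sizes.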
Assumptions
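- Independence: Observations must be independent within and between groups, with no clustering. For repeated measures on the same subjects, use the paired test instead.
- Normality: Data (or the differences, for a paired test) should be approximately normal. This is critical for small samples (commonly taken as $n < 30$).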
- Equal Variances (pooled only): Test with Levene's test. When in doubt, use Welch's.
- Continuous Data: Dependent variable on interval or ratio scale.
- No Significant Outliers: Check with boxplots or Cook's distance.
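Several of these checks can be run in a few lines. The sketch below assumes NumPy and SciPy are available and uses simulated data with arbitrary parameters; it applies the Shapiro-Wilk test for normality (one common choice, not named in the list above), Levene's test for equal variances, and the 1.5 × IQR rule that boxplots use to flag outliers.

```python
import numpy as np
from scipy import stats

# Simulated groups; the seed and parameters are arbitrary illustrations.
rng = np.random.default_rng(1)
a = rng.normal(loc=50, scale=5, size=25)
b = rng.normal(loc=52, scale=12, size=25)

# Normality: Shapiro-Wilk on each group (small p suggests non-normality).
shapiro_a = stats.shapiro(a)
shapiro_b = stats.shapiro(b)

# Equal variances: Levene's test (small p suggests unequal variances).
levene = stats.levene(a, b)


def iqr_outliers(values):
    """Return the points beyond 1.5 * IQR from the quartiles (the boxplot rule)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    mask = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
    return values[mask]


print(f"Shapiro A p = {shapiro_a.pvalue:.3f}, Shapiro B p = {shapiro_b.pvalue:.3f}")
print(f"Levene p = {levene.pvalue:.3f}")
print(f"Outliers in A: {iqr_outliers(a).size}, in B: {iqr_outliers(b).size}")
```

These checks are diagnostics rather than gatekeepers: with small samples they have little power, so a non-significant result does not prove the assumption holds.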
Quick Example
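📝 Welch's t-Test
Group A ($n_1$, $\bar{x}_1$, $s_1$) vs. Group B ($n_2$, $\bar{x}_2$, $s_2$):
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
With $\nu$ degrees of freedom from the Welch-Satterthwaite approximation, the two-sided critical value is $t_{\alpha/2,\,\nu}$. Since $|t| < t_{\alpha/2,\,\nu}$, fail to reject $H_0$. The difference is not statistically significant.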
📝 Effect Size
Two groups have means 112 and 120, pooled SD = 15.
$$|d| = \frac{|112 - 120|}{15} \approx 0.53$$
This is a medium effect size (between 0.5 and 0.8), suggesting the difference may be practically meaningful.
Key Takeaways
📋 Summary: t-Tests
- One-sample: Compare sample mean to a known value. Use when you have a benchmark.
- Two-sample: Compare means of two independent groups. Use Welch's as the default.
- Paired: Compare means from linked observations (before/after). More powerful than an independent-samples test when the data are matched, because between-subject variability cancels out in the differences.
- Welch's: Robust to unequal variances. Nearly as powerful as pooled when variances are equal.
- Assumptions: Independence is critical. Normality matters most for small samples (commonly taken as $n < 30$).
- Effect Size: Always report Cohen's d (0.2 = small, 0.5 = medium, 0.8 = large) alongside p-values.
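- CV Model Comparison: When comparing ML models on the same cross-validation folds, use a paired test, since the scores are matched by fold. Because the folds' training sets overlap, the independence assumption is violated and the resulting p-values tend to be optimistic.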
- Confidence Intervals: Construct CIs for the mean difference — they provide richer information than binary significant/not-significant.
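As an illustration of the confidence-interval point, here is a sketch of a Welch-style interval for the difference in means computed from summary statistics. It assumes SciPy is installed; the function name `welch_ci` and the input numbers are hypothetical.

```python
import math
from scipy import stats


def welch_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Return (low, high) for mean1 - mean2 without assuming equal variances."""
    a = sd1 ** 2 / n1
    b = sd2 ** 2 / n2
    se = math.sqrt(a + b)
    # Welch-Satterthwaite degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df)  # two-sided critical value
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical summary statistics for two small groups.
low, high = welch_ci(mean1=10.0, sd1=2.0, n1=5, mean2=8.0, sd2=2.0, n2=5)
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
```

If the interval contains 0, the corresponding two-sided test at the same level fails to reject the null hypothesis; the interval's width additionally shows how precisely the difference is estimated.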
Deep Dive
For detailed explanations, worked examples, and Python implementations, explore the dedicated statistics lessons:
Z-Test
- One-Sample Z-Test — When σ is known; the simplest hypothesis test for a mean
t-Tests
- One-Sample t-Test — Comparing a sample mean to a hypothesized value with worked examples
- Two-Sample t-Test — Comparing means of two independent groups (pooled and Welch's)
- Paired t-Test — Before/after studies and matched pairs analysis
Related Topics
- F-Test for Equality of Variances — Testing whether two groups have equal variances
- Levene's Test — Robust test for homogeneity of variances
- Hypothesis Testing — The foundational framework behind all t-tests
- Power of a Test — Determining sample size needed to detect a given effect
- Effect Size — Cohen's d, Hedges' g, and practical significance
- One-Way ANOVA — Extending t-test to 3+ groups (ANOVA with 2 groups = t²)