Statistics

t-Tests

Master one-sample, two-sample, paired, and Welch's t-tests with formulas, assumptions, and real-world applications.




ℹ️ Why It Matters

t-tests are a foundation of statistical inference: they help you judge whether an observed difference between sample means is larger than random sampling variation would plausibly produce. From A/B testing in tech companies to clinical trials in medicine, t-tests support decision-making across quantitative fields. Mastering them lets you draw rigorous conclusions from data.


Overview

A t-test compares a sample mean to a hypothesized value or compares means between groups. The one-sample t-test compares a single group's mean to a known or hypothesized value. The two-sample t-test compares the means of two independent groups. The paired t-test compares means from the same subjects measured twice (before/after). Welch's t-test is the robust default for two independent groups because it does not assume equal variances. All t-tests assume independent observations (or, for the paired test, independent pairs) and approximate normality, which matters most for small samples. For larger samples ($n > 30$ is a common rule of thumb), the central limit theorem (CLT) makes the test approximately valid even without strict normality, though heavily skewed or heavy-tailed data may need more observations. Always report an effect size (Cohen's d) alongside p-values to communicate practical significance.


Key Concepts

One-Sample t-Statistic

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Here,

  • $\bar{x}$ = Sample mean
  • $\mu_0$ = Hypothesized population mean
  • $s$ = Sample standard deviation
  • $n$ = Sample size
  • $df = n - 1$ = Degrees of freedom
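
As a minimal sketch of the formula above, the following computes the one-sample t-statistic with only Python's standard library. The helper name `one_sample_t`, the data values, and the hypothesized mean of 4 are made up for illustration.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: population mean == mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample SD (divides by n - 1)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical data, chosen only to illustrate the arithmetic.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat, df = one_sample_t(sample, mu0=4)
print(f"t = {t_stat:.3f}, df = {df}")  # t = 1.323, df = 7
```

If SciPy is available, `scipy.stats.ttest_1samp(sample, 4)` should report the same statistic together with a p-value.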

Two-Sample t-Statistic (Pooled)

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Here,

  • $s_p$ = Pooled SD: $\sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
  • $df$ = $n_1 + n_2 - 2$
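
The pooled statistic can be sketched the same way. The helper `pooled_t` and both groups are hypothetical; the two groups were chosen to have equal sample variances so that the pooled assumption holds exactly.

```python
import math
import statistics


def pooled_t(x, y):
    """Return (t, df, s_p) for the equal-variance (pooled) two-sample t-test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # sample variances
    df = n1 + n2 - 2
    s_p = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / df)
    t = (statistics.mean(x) - statistics.mean(y)) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, df, s_p


# Hypothetical groups with identical sample variance (2.5 each).
group1 = [1, 2, 3, 4, 5]
group2 = [3, 4, 5, 6, 7]
t_stat, df, s_p = pooled_t(group1, group2)
print(f"t = {t_stat:.3f}, df = {df}, s_p = {s_p:.3f}")  # t = -2.000, df = 8, s_p = 1.581
```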

Welch's t-Statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Here,

  • $df$ = Welch-Satterthwaite approximation (generally non-integer)
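
The Welch-Satterthwaite degrees of freedom are commonly written as $df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$. A small sketch that works from summary statistics follows; the helper name `welch_t_and_df` is hypothetical, and the inputs are the group summaries from this lesson's Quick Example.

```python
import math


def welch_t_and_df(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's t-test computed from summary statistics."""
    a = sd1 ** 2 / n1  # squared standard error contribution of group 1
    b = sd2 ** 2 / n2  # squared standard error contribution of group 2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Group A: n=25, mean=78, SD=12; Group B: n=30, mean=84, SD=18.
t_stat, df = welch_t_and_df(78, 12, 25, 84, 18, 30)
print(f"t = {t_stat:.3f}, df = {df:.1f}")  # t = -1.474, df = 50.7
```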

Paired t-Statistic

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

Here,

  • $\bar{d}$ = Mean of differences $d_i = x_{1i} - x_{2i}$
  • $s_d$ = Standard deviation of differences
  • $n$ = Number of pairs
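
A paired test is a one-sample test on the differences, which the sketch below makes explicit. The helper `paired_t` and the before/after values are invented for illustration.

```python
import math
import statistics


def paired_t(x1, x2):
    """Return (t, df) for a paired t-test on differences d_i = x1_i - x2_i."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample SD of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1


# Hypothetical measurements on the same five subjects.
after = [12, 14, 10, 13, 16]
before = [10, 12, 9, 11, 13]
t_stat, df = paired_t(after, before)
print(f"t = {t_stat:.3f}, df = {df}")  # t = 6.325, df = 4
```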

Cohen's d (Effect Size)

$$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}$$

Here,

  • $s_p$ = Pooled standard deviation
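
Cohen's d from raw data might be computed as below. This is a sketch using the pooled SD defined earlier in this lesson; the function name `cohens_d` and the sample values are hypothetical.

```python
import math
import statistics


def cohens_d(x, y):
    """Return Cohen's d for two independent samples, using the pooled SD."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    s_p = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(x) - statistics.mean(y)) / s_p


# Hypothetical data; the sign shows the direction (group 1 is lower here).
d = cohens_d([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(f"d = {d:.3f}")  # d = -1.265
```

The sign is kept so the direction of the difference stays visible; take the absolute value when only the magnitude matters.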

t-Test Selection Guide

| Scenario | Test | scipy Function | df Formula |
|---|---|---|---|
| Sample mean vs. known value | One-sample | ttest_1samp(x, mu0) | $n - 1$ |
| Two independent groups (equal var) | Pooled | ttest_ind(x, y, equal_var=True) | $n_1 + n_2 - 2$ |
| Two independent groups (unequal var) | Welch's | ttest_ind(x, y, equal_var=False) | Welch-Satterthwaite |
| Same subjects measured twice | Paired | ttest_rel(x, y) | $n - 1$ ($n$ = number of pairs) |
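
The four calls in the table can be exercised side by side. The sketch below assumes NumPy and SciPy are installed and uses simulated data with a fixed seed; the group sizes, means, and spreads are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Two independent groups (simulated).
a = rng.normal(loc=50, scale=10, size=25)
b = rng.normal(loc=55, scale=15, size=30)

# Paired measurements on the same simulated subjects.
pre = rng.normal(loc=50, scale=10, size=20)
post = pre + rng.normal(loc=2, scale=5, size=20)

results = {
    "one-sample": stats.ttest_1samp(a, popmean=50),
    "pooled": stats.ttest_ind(a, b, equal_var=True),
    "welch": stats.ttest_ind(a, b, equal_var=False),
    "paired": stats.ttest_rel(pre, post),
}
for name, res in results.items():
    print(f"{name:>10}: t = {res.statistic:.3f}, p = {res.pvalue:.3f}")
```

Each result exposes `statistic` and `pvalue`. The paired result matches a one-sample test of `pre - post` against zero.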

Assumptions

  1. Independence: Observations must be independent within and between groups. Clustered data or repeated measures violate this; for before/after measurements on the same subjects, use the paired test instead.
  2. Normality: Data (or differences for paired) should be approximately normal. Critical for small samples ($n < 30$).
  3. Equal Variances (pooled only): Check with Levene's test. When in doubt, use Welch's.
  4. Continuous Data: Dependent variable on interval or ratio scale.
  5. No Significant Outliers: Check with boxplots; extreme values can distort both the mean and the standard deviation.
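
Some of these assumptions can be screened in code. The sketch below assumes SciPy is installed and uses simulated groups; it runs the Shapiro-Wilk test for normality (a common choice that this lesson does not itself prescribe) and Levene's test for equal variances.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # fixed seed; the data are simulated
a = rng.normal(loc=78, scale=12, size=25)
b = rng.normal(loc=84, scale=18, size=30)

# Normality screen per group (Shapiro-Wilk): a small p-value suggests non-normality.
for label, group in (("A", a), ("B", b)):
    w_stat, w_p = stats.shapiro(group)
    print(f"Shapiro-Wilk, group {label}: W = {w_stat:.3f}, p = {w_p:.3f}")

# Equal-variance screen (Levene): a small p-value suggests unequal variances.
lev_stat, lev_p = stats.levene(a, b)
print(f"Levene: W = {lev_stat:.3f}, p = {lev_p:.3f}")
```

These tests have little power in small samples, so a non-significant result does not prove the assumption holds; plots remain a useful complement.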

Quick Example

📝Welch's t-Test

Group A ($n_A = 25$, $\bar{x}_A = 78$, $s_A = 12$) vs. Group B ($n_B = 30$, $\bar{x}_B = 84$, $s_B = 18$):

$$t = \frac{78 - 84}{\sqrt{\frac{12^2}{25} + \frac{18^2}{30}}} = \frac{-6}{\sqrt{5.76 + 10.8}} = \frac{-6}{4.07} = -1.474$$

With $df \approx 50$ (the Welch-Satterthwaite approximation gives about 50.7), the two-sided critical value at $\alpha = 0.05$ is $\approx 2.009$. Since $|-1.474| < 2.009$, we fail to reject $H_0$: the difference is not statistically significant at the 5% level.
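
This example can be cross-checked with SciPy, which accepts summary statistics directly through `ttest_ind_from_stats`. The sketch assumes SciPy is installed and uses `df=50` for the critical value, matching the rounding in the text.

```python
from scipy import stats

# Welch's t-test from the summary statistics of groups A and B.
res = stats.ttest_ind_from_stats(
    mean1=78, std1=12, nobs1=25,
    mean2=84, std2=18, nobs2=30,
    equal_var=False,  # Welch's version
)

# Two-sided critical value at alpha = 0.05 with df rounded down to 50.
crit = stats.t.ppf(0.975, df=50)

print(f"t = {res.statistic:.3f}, p = {res.pvalue:.3f}, critical = {crit:.3f}")
print("reject H0" if abs(res.statistic) > crit else "fail to reject H0")
```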

📝Effect Size

Two groups have means 112 and 120, pooled SD = 15.

$$d = \frac{|112 - 120|}{15} = 0.533$$

By Cohen's conventional benchmarks this is a medium effect (between 0.5 and 0.8). Whether a difference of this size is practically meaningful depends on the context of the measurement.


Key Takeaways

📋Summary: t-Tests

  • One-sample: Compare sample mean to a known value. Use when you have a benchmark.
  • Two-sample: Compare means of two independent groups. Use Welch's as the default.
  • Paired: Compare means from linked observations (before/after). More powerful than an independent-samples test for matched data.
  • Welch's: Robust to unequal variances. Nearly as powerful as pooled when variances are equal.
  • Assumptions: Independence is critical. Normality matters most for small samples ($n < 30$).
  • Effect Size: Always report Cohen's d (0.2 = small, 0.5 = medium, 0.8 = large) alongside p-values.
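  • CV Model Comparison: When comparing two ML models on the same cross-validation folds, the scores are paired by fold, so use a paired rather than an independent-samples t-test. Because the folds' training sets overlap, the scores are not fully independent, so treat the resulting p-value as approximate (it tends to be optimistic).
  • Confidence Intervals: Construct CIs for the mean difference; they convey more than a binary significant/not-significant verdict.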
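
A confidence interval for the difference in means can be sketched with Welch's standard error and degrees of freedom. The helper `welch_ci` is a hypothetical name, SciPy is assumed to be installed, and the inputs are the Group A and Group B summaries from the Quick Example.

```python
import math
from scipy import stats


def welch_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Return (lower, upper) CI for mean1 - mean2 without assuming equal variances."""
    a, b = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))  # Welch-Satterthwaite
    crit = stats.t.ppf(1 - (1 - confidence) / 2, df)
    diff = mean1 - mean2
    return diff - crit * se, diff + crit * se


lower, upper = welch_ci(78, 12, 25, 84, 18, 30)
print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```

The interval is centered on the observed difference of -6 and contains zero, which agrees with the Quick Example's failure to reject $H_0$ while also showing the range of differences compatible with the data.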

Deep Dive

For detailed explanations, worked examples, and Python implementations, explore the dedicated statistics lessons:

Z-Test

t-Tests

  • One-Sample t-Test — Comparing a sample mean to a hypothesized value with worked examples
  • Two-Sample t-Test — Comparing means of two independent groups (pooled and Welch's)
  • Paired t-Test — Before/after studies and matched pairs analysis
