Chi-Square Tests

Master chi-square goodness of fit, test of independence, chi-square distribution, expected frequencies, Fisher's exact test, and applications in AI/ML feature selection.


ℹ️ Why It Matters

Most real-world data is categorical: an email is spam or not, a patient responds to treatment or not, a user clicks an ad or doesn't. T-tests and ANOVA assume a continuous, approximately normally distributed outcome, so they cannot be applied when the outcome itself is a category. Chi-square tests fill this gap by analyzing frequency data in contingency tables and comparing observed distributions to theoretical ones. In machine learning, chi-square tests are used for feature selection: they identify which categorical features are most strongly associated with the target variable before a classifier is trained.


Overview

Chi-square tests address two fundamental questions about categorical data. The goodness of fit test determines whether a single categorical variable follows a specified distribution (e.g., is a die fair?). The test of independence determines whether two categorical variables are associated (e.g., is gender associated with voting preference?). Both use the same test statistic: the sum of squared differences between observed and expected frequencies, each divided by its expected frequency. When expected frequencies are small ($< 5$), the chi-square approximation breaks down and Fisher's exact test should be used instead. Effect size for the test of independence is measured by Cramér's V.


Key Concepts

Chi-Square Goodness of Fit

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Here,

  • $O_i$ = Observed frequency in category $i$
  • $E_i$ = Expected frequency under $H_0$: $n \cdot p_i$
  • $k$ = Number of categories

Expected Frequency (Independence)

$$E_{ij} = \frac{R_i \cdot C_j}{n}$$

Here,

  • $R_i$ = Marginal row total for row $i$
  • $C_j$ = Marginal column total for column $j$
  • $n$ = Grand total of all observations
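The expected-frequency formula can be sketched in a few lines of plain Python. The function name `expected_frequencies` and the 2×3 table of counts are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def expected_frequencies(observed):
    """Expected counts under independence: E_ij = R_i * C_j / n."""
    row_totals = [sum(row) for row in observed]        # R_i
    col_totals = [sum(col) for col in zip(*observed)]  # C_j
    n = sum(row_totals)                                # grand total
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2x3 contingency table (made-up counts, not from the lesson).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
expected = expected_frequencies(observed)

for row in expected:
    print(row)
```

Both rows have the same total here (100 each), so both rows receive the same expected counts; the column totals 50, 60, and 90 are simply split in half.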

Chi-Square Test of Independence

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

Here,

  • $df$ = $(r-1)(c-1)$ degrees of freedom
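As a minimal sketch, the double sum can be computed directly from a table of observed counts. The helper `chi_square_independence` and the counts are hypothetical. The function also returns each cell's contribution, which helps locate where an association comes from.

```python
def chi_square_independence(observed):
    """Return (statistic, df, per-cell contributions) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Each cell contributes (O_ij - E_ij)^2 / E_ij, with E_ij = R_i * C_j / n.
    contributions = [
        [(o - r * c / n) ** 2 / (r * c / n) for o, c in zip(row, col_totals)]
        for row, r in zip(observed, row_totals)
    ]
    stat = sum(sum(row) for row in contributions)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, contributions


# Hypothetical 2x3 table of counts (not from the lesson).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
stat, df, contributions = chi_square_independence(observed)
print(f"chi2 = {stat:.3f}, df = {df}")
```

The sketch assumes every row and column total is positive; an empty row or column would make an expected count zero and the division would fail.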

Cramér's V (Effect Size)

$$V = \sqrt{\frac{\chi^2}{n \cdot (k^* - 1)}}$$

Here,

  • $k^*$ = $\min(r, c)$, the smaller of the number of rows and the number of columns
  • $V$ = Ranges from 0 (no association) to 1 (perfect association)
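A short sketch of the formula follows. It uses the figures from this lesson's voting example ($\chi^2 = 36.73$, 300 voters). The 3×3 table shape is an assumption: it is consistent with $df = 4$, but the lesson does not state the dimensions.

```python
from math import sqrt


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * (k* - 1))), with k* = min(rows, cols)."""
    k_star = min(n_rows, n_cols)
    return sqrt(chi2 / (n * (k_star - 1)))


# chi2 and n come from the lesson's voting example; the 3x3 shape is assumed.
v = cramers_v(chi2=36.73, n=300, n_rows=3, n_cols=3)
print(f"V = {v:.3f}")  # about 0.247
```

Note how a very small p-value ($p < 0.001$) can coexist with a moderate effect size, which is why the two are reported together.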

Fisher's Exact Test (2×2)

$$P = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}}$$

Here,

  • $a, b, c, d$ = Cell counts in the 2×2 table
  • $n$ = Grand total
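The formula gives the probability of one specific table when the row and column totals are held fixed. A p-value then adds up the probabilities of all tables at least as extreme as the observed one. The sketch below uses one common two-sided convention: sum every table that is no more probable than the observed one. The function names and the example counts are hypothetical, and exact fractions are used to avoid floating-point ties.

```python
from fractions import Fraction
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of a single 2x2 table with fixed margins."""
    n = a + b + c + d
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(n, a + c))


def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)

    # The top-left cell can only range over values that keep every cell >= 0.
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    p_value = Fraction(0)
    for x in range(lowest, highest + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed:
            p_value += p
    return p_value


# Hypothetical small table: rows = group, columns = outcome.
p_table = table_probability(3, 1, 1, 3)
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(float(p_table), float(p_value))
```

If SciPy is available, `scipy.stats.fisher_exact` offers a ready-made alternative.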

Degrees of Freedom

| Test | Formula | Example |
|---|---|---|
| Goodness of Fit (known $p_i$) | $k - 1$ | 6-sided die: $df = 5$ |
| Goodness of Fit (estimated params) | $k - 1 - m$ | 10 bins, estimating $\mu$ and $\sigma$: $df = 7$ |
| Independence $r \times c$ | $(r-1)(c-1)$ | $3 \times 4$ table: $df = 6$ |
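The three rules in the table are simple enough to encode as helpers, which can guard against off-by-one mistakes when reading a critical value. The function names `gof_df` and `independence_df` are hypothetical.

```python
def gof_df(k, m=0):
    """Goodness of fit: k categories, m parameters estimated from the data."""
    return k - 1 - m


def independence_df(r, c):
    """Test of independence on an r x c contingency table."""
    return (r - 1) * (c - 1)


print(gof_df(6))              # 6-sided die with known probabilities
print(gof_df(10, m=2))        # 10 bins, mean and standard deviation estimated
print(independence_df(3, 4))  # 3 x 4 table
```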

Minimum Expected Frequency Rule

  • No expected frequency should be less than 1
  • No more than 20% of expected frequencies should be less than 5
  • For $2 \times 2$ tables violating this, use Fisher's exact test
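The first two rules can be checked mechanically before running a test. This is a minimal sketch; the function name `expected_counts_ok` and the example tables are hypothetical.

```python
def expected_counts_ok(expected):
    """Check the minimum expected frequency rule on a table of expected counts."""
    cells = [e for row in expected for e in row]

    # Rule 1: no expected frequency below 1.
    if min(cells) < 1:
        return False

    # Rule 2: at most 20% of expected frequencies below 5.
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return share_below_5 <= 0.20


print(expected_counts_ok([[25, 30, 45], [25, 30, 45]]))  # all cells comfortably large
print(expected_counts_ok([[3, 3], [3, 3]]))              # every cell below 5
```

The check applies to expected counts, not observed counts: a table can contain an observed zero and still satisfy the rule.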

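Cramér's V Benchmarks

Here, $df^* = k^* - 1 = \min(r, c) - 1$.

| $df^*$ | Small | Medium | Large |
|---|---|---|---|
| 1 | 0.10 | 0.30 | 0.50 |
| 2 | 0.07 | 0.21 | 0.35 |
| 3+ | 0.06 | 0.17 | 0.29 |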
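The benchmark table can be turned into a small lookup. Three things in this sketch are assumptions rather than part of the lesson: $df^*$ is taken to be $\min(r, c) - 1$, each threshold is treated as a lower bound for its label, and values under the "small" threshold are labeled "negligible".

```python
# Thresholds (small, medium, large) copied from the benchmark table, keyed by df*.
BENCHMARKS = {
    1: (0.10, 0.30, 0.50),
    2: (0.07, 0.21, 0.35),
    3: (0.06, 0.17, 0.29),  # used for df* of 3 or more
}


def interpret_cramers_v(v, df_star):
    """Map Cramer's V to a verbal label; df_star must be at least 1."""
    small, medium, large = BENCHMARKS[min(df_star, 3)]
    if v >= large:
        return "large"
    if v >= medium:
        return "medium"
    if v >= small:
        return "small"
    return "negligible"  # label chosen for this sketch, not from the table


print(interpret_cramers_v(0.247, df_star=2))
```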

Quick Example

Goodness of Fit: Is the Die Fair?

A die is rolled 120 times, giving face counts [18, 15, 22, 17, 20, 28]. Expected: 20 per face.

$$\chi^2 = \frac{(18-20)^2}{20} + \frac{(15-20)^2}{20} + \cdots + \frac{(28-20)^2}{20} = 5.30$$

$df = 5$, critical value $\chi^2_{0.05, 5} = 11.07$. Since $5.30 < 11.07$, fail to reject $H_0$: the data are consistent with a fair die ($p = 0.38$).
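The die calculation can be reproduced with a few lines of plain Python. The counts and the critical value are taken from the example; the variable names are illustrative. No p-value is computed because the standard library has no chi-square distribution function.

```python
# Observed face counts from the example (120 rolls).
observed = [18, 15, 22, 17, 20, 28]

n = sum(observed)
k = len(observed)
expected = [n / k] * k  # fair die: n * p_i with p_i = 1/6

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

critical = 11.07  # chi-square critical value at alpha = 0.05, df = 5 (from the text)
reject = chi2 > critical

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject}")
```

If SciPy is installed, `scipy.stats.chisquare(observed)` should return the same statistic together with the p-value.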

Test of Independence

Gender vs. voting preference (300 voters). After computing expected frequencies and the chi-square statistic: $\chi^2 = 36.73$, $df = 4$, $p < 0.001$.

Reject $H_0$: gender and voting preference are significantly associated. The largest contributor is the "Other gender / Party C" cell ($\chi^2_{\text{contribution}} = 20.8$).


Key Takeaways

Summary: Chi-Square Tests

  • Goodness of Fit: Tests whether a single categorical variable matches an expected distribution. $df = k - 1$.
  • Test of Independence: Tests whether two categorical variables are associated. $df = (r-1)(c-1)$.
  • Expected Frequencies: Must be $\geq 5$ in at least 80% of cells, and never below 1. Use Fisher's exact test when this fails.
  • Effect Size: Always report Cramér's V alongside the p-value, because with a large enough sample even trivial associations become "significant."
  • Feature Selection: The chi-square test of independence is a standard filter method for categorical features in NLP and spam detection.
  • Assumptions: Categorical data, independent observations, random sampling, sufficient expected frequencies.
  • Ordinal Data: Chi-square ignores ordering. Use trend tests (Cochran-Armitage) for more power with ordinal data.
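To illustrate the feature-selection use case, the sketch below scores binary features against a binary label with the chi-square test of independence and ranks them. The dataset, feature names, and the helper `chi2_binary` are all hypothetical. The sample is deliberately tiny, so the expected counts are below 5 and only the ranking, not the significance, is meaningful.

```python
def chi2_binary(feature, labels):
    """Chi-square independence statistic for a 0/1 feature against a 0/1 label."""
    # 2x2 observed table: rows = feature value, columns = label value.
    observed = [[0, 0], [0, 0]]
    for x, y in zip(feature, labels):
        observed[x][y] += 1

    n = len(labels)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: 6 spam emails (label 1) followed by 6 non-spam (label 0).
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
features = {
    "contains_free":    [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
    "has_link":         [1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0],
    "contains_meeting": [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
}

scores = {name: chi2_binary(column, labels) for name, column in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: chi2 = {scores[name]:.3f}")
```

A filter method would keep the top-scoring features and drop the rest before training. The sketch assumes every feature and the label each take both values; a constant column would produce a zero expected count and a division error.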

Deep Dive

For detailed explanations, worked examples, and Python implementations, explore the dedicated statistics lessons:

Chi-Square Distribution

Goodness of Fit

Test of Independence

Related Tests
