Advanced Analysis of Variance

Introduction to ANOVA

Analysis of Variance (ANOVA) provides statistical methods for comparing means across multiple groups. Originally developed by Ronald Fisher in the early 20th century for agricultural experiments, ANOVA has become fundamental across scientific disciplines. The key insight is that comparing means requires understanding overall variability and partitioning it into components attributable to different sources.

ANOVA's power derives from comparing within-group variability to between-group variability. If groups truly have different means, the between-group variation should exceed within-group variation. The F-statistic formalizes this comparison, testing whether observed differences are likely due to chance or reflect real group differences.

While conceptually simple, ANOVA extends to complex experimental designs with multiple factors, repeated measurements, and hierarchical structures. Understanding both basic and advanced ANOVA enables appropriate analysis across diverse research situations.

One-Way ANOVA

One-way ANOVA compares means across three or more groups defined by a single categorical factor. This design extends the two-sample t-test to multiple groups while controlling overall error rate.

The F-Test Framework

One-way ANOVA partitions total variability into between-group and within-group components. Total sum of squares (SST) measures overall variability around the grand mean. Between-group sum of squares (SSB) measures variability due to group differences. Within-group sum of squares (SSW) measures variability within groups.

Mean squares divide sums of squares by their degrees of freedom. MSB = SSB/(k-1) where k is the number of groups. MSW = SSW/(n-k) where n is total sample size. The F-statistic equals MSB/MSW.

The null hypothesis states all group means equal: μ₁ = μ₂ = ... = μₖ. The alternative states at least one group mean differs. Large F values (relative to the null distribution) provide evidence against H₀.
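The partition described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a replacement for a statistics package: the function name `one_way_anova` and the data are made up for the example, and no p-value is computed because that requires the F distribution.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SSB, SSW, F) for a list of groups, each a list of numbers."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = mean(all_values)
    # Between-group SS: size-weighted squared distance of each group mean from the grand mean
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group SS: squared distance of each observation from its own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msb = ssb / (k - 1)  # MSB = SSB/(k-1)
    msw = ssw / (n - k)  # MSW = SSW/(n-k)
    return ssb, ssw, msb / msw

# Made-up data: three groups of three observations each
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]
ssb, ssw, f_stat = one_way_anova(groups)
print(ssb, ssw, f_stat)  # 54.0 6.0 27.0
```

Here SSB + SSW = 60 equals SST, and the large F reflects group means that are far apart relative to the spread inside each group.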

ANOVA Assumptions

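ANOVA assumes independent observations. This typically requires random sampling or random assignment. Violations, such as clustered or repeated observations analyzed as if they were independent, distort standard errors and undermine the validity of the F-test.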

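ANOVA assumes the outcome is approximately normally distributed within each group (equivalently, that the residuals are normal), or that samples are large enough for the Central Limit Theorem to apply to the group means. Moderate violations are generally tolerable, especially in balanced designs with equal sample sizes.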

ANOVA assumes equal variances across groups (homoscedasticity). The F-test is fairly robust to unequal variances when group sizes are equal, but violations are concerning with unequal sample sizes. Diagnostic tests such as Levene's test check this assumption.

Post-Hoc Comparisons

ANOVA tests overall differences but doesn't identify which groups differ. Post-hoc pairwise comparisons identify specific group differences while controlling family-wise error rate.

The Tukey Honestly Significant Difference (HSD) test provides simultaneous confidence intervals for all pairwise differences. It controls error rate across all comparisons. The Studentized range distribution provides critical values.

The Bonferroni correction divides α by the number of comparisons. This approach is conservative but simple to apply. The Scheffé method provides simultaneous confidence intervals for all possible contrasts; it is very conservative but flexible.
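As a small sketch of the Bonferroni arithmetic, the snippet below lists every pairwise comparison among a set of groups and computes the per-comparison α. The function name `bonferroni_plan` and the group labels are hypothetical, and the uncorrected family-wise rate shown for contrast assumes the tests are independent, which is a simplification.

```python
from itertools import combinations

def bonferroni_plan(group_names, alpha=0.05):
    """List all pairwise comparisons and the Bonferroni per-comparison alpha."""
    pairs = list(combinations(group_names, 2))
    m = len(pairs)              # k(k-1)/2 comparisons for k groups
    per_test_alpha = alpha / m  # Bonferroni: divide alpha by the number of comparisons
    # Family-wise error rate if every test used the full alpha,
    # under the simplifying assumption that the tests are independent
    uncorrected_fwer = 1 - (1 - alpha) ** m
    return pairs, per_test_alpha, uncorrected_fwer

pairs, per_test_alpha, uncorrected_fwer = bonferroni_plan(["A", "B", "C", "D"])
print(len(pairs), round(per_test_alpha, 4), round(uncorrected_fwer, 3))
```

With four groups there are six comparisons, so each is tested at roughly 0.0083 instead of 0.05.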

Two-Way ANOVA

Two-way ANOVA examines effects of two categorical factors simultaneously. This design is more efficient than separate one-way ANOVAs and enables testing interactions.

Main Effects and Interactions

Main effects test each factor separately, averaging over levels of the other factor. The factor A main effect tests whether mean differences exist across A levels regardless of B level.

The interaction effect tests whether the effect of one factor depends on the level of the other factor. Significant interaction indicates that simple effects differ across levels.

When interaction is significant, main effects can be misleading. The interaction pattern should be examined before interpreting main effects. Interaction plots visualize patterns.

Factorial Design Analysis

Factorial designs include all combinations of factor levels. The analysis partitions variability into components for Factor A, Factor B, and their interaction, plus error.

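The total degrees of freedom equals n-1, where n is the total sample size. Factor A, with a levels, has a-1 degrees of freedom. Factor B, with b levels, has b-1 degrees of freedom. The interaction has (a-1)(b-1) degrees of freedom. Error has ab(n̄-1) = n-ab degrees of freedom, where n̄ is the average cell size. These components sum to the total.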
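The degrees-of-freedom bookkeeping can be checked with a short sketch. It assumes a balanced design with the same number of observations in every cell; the function name `two_way_df` and the design sizes are illustrative only.

```python
def two_way_df(a, b, n_per_cell):
    """Degrees of freedom for a balanced two-way factorial design."""
    total_n = a * b * n_per_cell
    return {
        "A": a - 1,
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
        "error": a * b * (n_per_cell - 1),
        "total": total_n - 1,
    }

# Hypothetical 3 x 4 design with 5 observations per cell (60 observations in total)
df = two_way_df(3, 4, 5)
print(df)
# The four components add up to the total degrees of freedom
print(df["A"] + df["B"] + df["AxB"] + df["error"] == df["total"])  # True
```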

Interpretation requires examining both main effects and interaction. Patterns in interaction plots reveal how factors combine to affect outcomes. Significant interactions require careful interpretation.

Interaction Interpretation

Significant interactions mean that simple effects differ across levels. The effect of changing Factor A might be positive at one level of Factor B but negative at another. This pattern would show crossed lines in interaction plots.

Interpretation should describe interaction patterns in context. Which levels show stronger effects? What does the interaction mean for the application? The substantive interpretation matters more than statistical significance.

Simple effects analysis examines the effect of one factor at each level of the other factor. This can clarify interaction patterns when the overall interaction is significant.
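A crossover interaction is easiest to see from cell means. The sketch below uses invented means for a 2x2 design (the level labels `a1`, `a2`, `b1`, `b2` are hypothetical) and computes the simple effect of Factor A at each level of Factor B. It works on means only and performs no significance test.

```python
# Hypothetical cell means for a 2x2 design, keyed by (level of A, level of B)
cell_means = {
    ("a1", "b1"): 10.0, ("a2", "b1"): 14.0,
    ("a1", "b2"): 12.0, ("a2", "b2"): 8.0,
}

def simple_effect_of_a(means, b_level):
    """Effect of moving from a1 to a2 at a fixed level of B."""
    return means[("a2", b_level)] - means[("a1", b_level)]

effect_at_b1 = simple_effect_of_a(cell_means, "b1")  # 4.0
effect_at_b2 = simple_effect_of_a(cell_means, "b2")  # -4.0
main_effect_a = (effect_at_b1 + effect_at_b2) / 2    # 0.0
interaction_contrast = effect_at_b1 - effect_at_b2   # 8.0
print(effect_at_b1, effect_at_b2, main_effect_a, interaction_contrast)
```

The main effect of A averages to zero even though A matters at both levels of B, which is why main effects can mislead when an interaction is present.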

Repeated Measures ANOVA

Repeated measures designs measure the same subjects multiple times. This approach controls for individual differences, often providing more power than between-subject designs.

Within-Subject Designs

Within-subject factors vary within each subject over time or condition. Each subject experiences all levels of the within-subject factor. This creates correlated observations requiring special analysis.

The key advantage is controlling for individual differences. Some subjects are consistently higher or lower regardless of treatment. Repeated measures designs remove this source of variability from error.
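To illustrate how removing subject variability sharpens the test, the sketch below partitions the sums of squares for a one-way repeated measures layout and compares the resulting F ratio with the one obtained if the same numbers were treated as independent groups. The data and the function name `repeated_measures_f` are invented, and the layout is assumed to be balanced with no missing values.

```python
from statistics import mean

def repeated_measures_f(scores):
    """scores[i][j] is the value for subject i under condition j."""
    n, k = len(scores), len(scores[0])
    grand = mean([x for row in scores for x in row])
    cond_means = [mean([row[j] for row in scores]) for j in range(k)]
    subj_means = [mean(row) for row in scores]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Error is what remains after removing condition and subject variability
    ss_error = ss_total - ss_cond - ss_subj
    f_within = (ss_cond / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))
    # For contrast: the F ratio if subject variability stayed in the error term
    f_between = (ss_cond / (k - 1)) / ((ss_subj + ss_error) / (n * k - k))
    return f_within, f_between

# Made-up data: 4 subjects (rows) measured under 3 conditions (columns)
scores = [
    [10.0, 12.0, 14.0],
    [20.0, 21.0, 25.0],
    [30.0, 33.0, 33.0],
    [40.0, 42.0, 44.0],
]
f_within, f_between = repeated_measures_f(scores)
print(round(f_within, 3), round(f_between, 3))
```

Subjects differ widely in overall level, so treating the data as independent groups buries a consistent condition effect (F below 0.1), while the repeated measures partition recovers it (F of about 24).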

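The design requires assumptions about the correlation structure. Sphericity assumes that the variances of the differences between all pairs of conditions are equal. Violation inflates the Type I error rate of the F-test. Corrections such as Greenhouse-Geisser and Huynh-Feldt adjust the degrees of freedom downward.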
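An informal way to inspect sphericity is to compute the variance of the difference scores for every pair of conditions and see whether they are roughly equal. The sketch below does only that; it is not a formal test such as Mauchly's, and the data and the function name `difference_variances` are made up.

```python
from itertools import combinations
from statistics import variance

def difference_variances(scores):
    """Sample variance of the difference scores for each pair of conditions."""
    k = len(scores[0])
    result = {}
    for i, j in combinations(range(k), 2):
        diffs = [row[i] - row[j] for row in scores]
        result[(i, j)] = variance(diffs)
    return result

# Made-up data: 4 subjects (rows) measured under 3 conditions (columns)
scores = [
    [10.0, 12.0, 14.0],
    [20.0, 21.0, 25.0],
    [30.0, 33.0, 33.0],
    [40.0, 42.0, 44.0],
]
diff_vars = difference_variances(scores)
for pair, v in diff_vars.items():
    print(pair, round(v, 3))
```

In this toy data the variance for conditions 1 and 2 is four times the other two, the kind of imbalance that would motivate a degrees-of-freedom correction.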

Mixed Designs

Mixed designs include both within-subject and between-subject factors. Some factors vary across subjects (between-subject), others vary within subjects (within-subject).

Analysis separates within-subject effects, between-subject effects, and their interaction. Within-subject effects and the interaction are tested against an error term based on subject-by-condition variability. Between-subject effects are tested against variability among subjects within groups.

Interpretation requires attending to both within-subject and between-subject effects. Each has appropriate error terms and interpretations.

Analysis of Covariance (ANCOVA)

ANCOVA combines ANOVA and regression, comparing group means while controlling for other variables. This approach increases precision by accounting for pre-existing differences.

Model Specification

ANCOVA adds covariate(s) to the ANOVA model. The model includes both categorical group factors and continuous covariates. Group comparisons adjust for covariate differences.

The analysis partitions variability into groups, covariates, and error. Covariates explain residual variability, increasing power to detect group effects. Group effects are tested after controlling for covariates.

Interpretation focuses on group effects after adjusting for covariates. The estimated group difference represents the expected difference controlling for covariate values.
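The adjustment can be sketched with the classical adjusted-means calculation: estimate a pooled within-group slope, then shift each group mean to the value expected at the overall covariate mean. The function name `ancova_adjusted_means`, the group labels, and the data are hypothetical, and the sketch assumes a single covariate with a common slope across groups.

```python
from statistics import mean

def ancova_adjusted_means(groups):
    """groups maps a group name to a list of (covariate, outcome) pairs."""
    grand_x = mean([x for pts in groups.values() for x, _ in pts])
    sxy = sxx = 0.0
    summaries = {}
    for name, pts in groups.items():
        x_bar = mean([x for x, _ in pts])
        y_bar = mean([y for _, y in pts])
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pts)
        sxx += sum((x - x_bar) ** 2 for x, _ in pts)
        summaries[name] = (x_bar, y_bar)
    slope = sxy / sxx  # pooled within-group slope, assumed common to all groups
    adjusted = {
        name: y_bar - slope * (x_bar - grand_x)
        for name, (x_bar, y_bar) in summaries.items()
    }
    return slope, adjusted

# Made-up data: the treatment group starts with higher covariate values
data = {
    "control": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "treatment": [(3.0, 9.0), (4.0, 11.0), (5.0, 13.0)],
}
slope, adjusted = ancova_adjusted_means(data)
print(slope, adjusted)  # 2.0 {'control': 6.0, 'treatment': 9.0}
```

The raw group means differ by 7 (4 versus 11), but after adjusting for the covariate the difference shrinks to 3, because part of the raw gap reflects the groups' different covariate levels.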

Assumptions and Diagnostics

ANCOVA assumes the covariate is measured without error (or measurement error is negligible). It assumes the covariate is unrelated to group assignment (if violated, adjustment might create bias).

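Homogeneity of regression slopes assumes the covariate has the same relationship with the outcome in every group. If slopes differ, the group difference depends on the covariate value, so a single adjusted comparison is no longer meaningful. Testing the interaction between group and covariate evaluates this assumption.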
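A quick descriptive check is to fit a separate least-squares slope in each group and compare them. The sketch below does this for invented data; the function name `within_group_slopes` is hypothetical, and a formal check would test the group-by-covariate interaction in a fitted model rather than compare slopes by eye.

```python
from statistics import mean

def within_group_slopes(groups):
    """Least-squares slope of outcome on covariate, fitted separately per group."""
    slopes = {}
    for name, pts in groups.items():
        x_bar = mean([x for x, _ in pts])
        y_bar = mean([y for _, y in pts])
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in pts)
        sxx = sum((x - x_bar) ** 2 for x, _ in pts)
        slopes[name] = sxy / sxx
    return slopes

# Made-up (covariate, outcome) pairs with clearly different slopes
data = {
    "g1": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "g2": [(1.0, 5.0), (2.0, 5.5), (3.0, 6.0)],
}
slopes = within_group_slopes(data)
print(slopes)  # {'g1': 2.0, 'g2': 0.5}
```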

When homogeneity fails, alternatives include analyzing subgroups, using nonparametric methods, or explicitly modeling different slopes.

MANOVA: Multivariate ANOVA

MANOVA extends ANOVA to multiple outcomes simultaneously. This approach tests whether groups differ across a set of dependent variables considered together.

Multivariate Testing

MANOVA tests whether group centroids (multivariate means) differ across groups. Several test statistics are available: Wilks' Lambda, Pillai's Trace, Roy's Largest Root, and Hotelling-Lawley Trace. They lead to similar conclusions in most situations.

The null hypothesis states all group mean vectors equal. Significant MANOVA indicates groups differ on the multivariate combination of outcomes. Follow-up analyses identify which outcomes contribute.

MANOVA is appropriate when outcomes are conceptually related and correlated. It controls overall Type I error across multiple outcomes better than separate ANOVAs.

Assumptions

MANOVA assumes multivariate normality within groups. It assumes equal covariance matrices across groups, which Box's M test evaluates. It assumes linear relationships among outcomes.

Violations are more serious than in univariate ANOVA because multivariate tests are more sensitive. Transformation might address non-normality. The equal covariance assumption is particularly important.

Effect Size Measures

Effect size measures quantify the magnitude of effects, complementing significance tests that only indicate whether effects are likely due to chance.

Eta Squared

Eta squared (η²) equals SSB/SST, representing the proportion of total variance attributable to groups. Values range from 0 to 1, with larger values indicating stronger effects.

Small effects might achieve statistical significance with large samples even when explaining little variance. Effect sizes help interpret practical importance.

General guidelines: η² = 0.01 is small, 0.06 is medium, 0.14 is large. These are approximate, and context matters for interpretation.

Omega Squared

Omega squared (ω²) provides a less biased estimate of effect size than η². It corrects for upward bias in η², especially with small samples.

Cohen's f is another effect size measure, related to η² by f = √(η²/(1-η²)). This provides the same information on a different scale.

In designs with multiple factors, partial η² and partial ω² isolate a single effect. Partial η² equals the effect's sum of squares divided by (SS effect + SS error), so variability attributable to the other factors is excluded from the denominator.
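The effect size measures above can be computed directly from an ANOVA table. In the sketch below, the ω² expression is the standard one-way formula, (SSB - (k-1)·MSW)/(SST + MSW), which is not spelled out in the text; the function names and input values are invented for illustration.

```python
from math import sqrt

def effect_sizes(ssb, ssw, k, n):
    """Eta squared, omega squared, and Cohen's f for a one-way design."""
    sst = ssb + ssw
    msw = ssw / (n - k)
    eta_sq = ssb / sst
    # Standard one-way omega squared (an assumption of this sketch)
    omega_sq = (ssb - (k - 1) * msw) / (sst + msw)
    cohens_f = sqrt(eta_sq / (1 - eta_sq))
    return eta_sq, omega_sq, cohens_f

def partial_eta_squared(ss_effect, ss_error):
    """Share of (effect + error) variability attributable to the effect."""
    return ss_effect / (ss_effect + ss_error)

# Made-up sums of squares: 3 groups, 23 observations
eta_sq, omega_sq, cohens_f = effect_sizes(ssb=20.0, ssw=80.0, k=3, n=23)
partial = partial_eta_squared(ss_effect=30.0, ss_error=60.0)
print(round(eta_sq, 3), round(omega_sq, 3), round(cohens_f, 3), round(partial, 3))
```

Here ω² (about 0.115) is noticeably smaller than η² (0.2), illustrating the upward bias that ω² corrects.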

Non-Parametric Alternatives

When ANOVA assumptions are seriously violated, non-parametric alternatives provide valid inference without distributional assumptions.

Kruskal-Wallis Test

The Kruskal-Wallis test is the non-parametric alternative to one-way ANOVA. It tests whether group distributions differ using ranks rather than raw values.

Under the null hypothesis, the test statistic H approximately follows a chi-square distribution with k-1 degrees of freedom, where k is the number of groups. It tests whether mean ranks differ across groups. Significant results indicate groups likely differ in distribution.
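A minimal sketch of the rank-based statistic follows. It pools the observations, assigns average ranks, and computes H from the rank sums. The helper names are hypothetical, the data are made up, and the sketch omits the tie correction that full implementations apply.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic without tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Made-up data with completely separated groups
h_stat = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
print(round(h_stat, 3))  # 7.2
```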

Post-hoc pairwise comparisons using rank-based methods extend the test.

Friedman Test

The Friedman test is the non-parametric alternative to repeated measures ANOVA. It uses ranks within subjects to test for differences across conditions.

The test is appropriate for ordinal outcomes or when normality is severely violated. It is less powerful than parametric tests when assumptions are met but more robust when they are not.
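The within-subject ranking can be sketched as follows. Each subject's scores are ranked across conditions, the ranks are summed per condition, and the usual Friedman chi-square statistic is computed. The function name and data are invented, and the sketch assumes no ties within a subject.

```python
def friedman_chi_square(scores):
    """scores[i][j] is subject i under condition j; assumes no within-subject ties."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        # Rank this subject's scores across conditions (1 = smallest)
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Made-up data: every subject increases across the three conditions
scores = [
    [10.0, 12.0, 14.0],
    [20.0, 21.0, 25.0],
    [30.0, 32.0, 34.0],
    [40.0, 42.0, 44.0],
]
chi_sq = friedman_chi_square(scores)
print(chi_sq)  # 8.0
```

Because every subject shows the same ordering, the statistic reaches its maximum of n(k-1) = 8 for four subjects and three conditions, even though the subjects differ greatly in overall level.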

Diagnostics and Model Checking

Checking ANOVA assumptions ensures valid results. Several diagnostic approaches reveal potential problems.

Residual Analysis

Residuals should be randomly distributed with constant variance. Patterns in residual plots indicate assumption violations. Non-random patterns might indicate non-linear relationships or missing factors.

Q-Q plots check normality of residuals. Deviations from the line indicate non-normality. Severe deviations might require transformation or non-parametric analysis.

Variance Homogeneity Tests

Levene's test evaluates equality of variances across groups. Significant results indicate heteroscedasticity. The test is less sensitive to non-normality than alternatives such as Bartlett's test.
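Levene's statistic is a one-way ANOVA F ratio computed on absolute deviations from each group's center. The sketch below uses deviations from the group mean; a common variant uses the median instead. The function name `levene_w` and the data are made up, and no p-value is computed.

```python
from statistics import mean

def levene_w(groups):
    """Levene's W: one-way ANOVA F on absolute deviations from group means."""
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    all_z = [v for zg in z for v in zg]
    n, k = len(all_z), len(z)
    grand = mean(all_z)
    ss_between = sum(len(zg) * (mean(zg) - grand) ** 2 for zg in z)
    ss_within = sum((v - mean(zg)) ** 2 for zg in z for v in zg)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up groups with the same mean but different spread
w_stat = levene_w([[9.0, 10.0, 11.0], [5.0, 10.0, 15.0], [8.0, 10.0, 12.0]])
print(round(w_stat, 4))
```

W is compared against an F distribution with k-1 and n-k degrees of freedom; larger values suggest the group variances differ.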

Box's M test evaluates multivariate homogeneity for MANOVA. Significant results indicate unequal covariance matrices. This is serious for MANOVA and might require alternative approaches.

Transformations

Log transformations can address heteroscedasticity and non-normality simultaneously. Square root transformations help with count data. Arcsine transformations help with proportion data.

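Transformation changes interpretation. Back-transformation is needed for predicted values and intervals. With a log transformation, interpretation becomes multiplicative rather than additive: a difference in means on the log scale corresponds to a ratio on the original scale.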
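The effect of a log transformation on interpretation can be seen in a short sketch. Back-transforming the mean of the logs yields the geometric mean rather than the arithmetic mean, and a difference between groups on the log scale becomes a ratio. The data and names are invented for illustration.

```python
from math import exp, log
from statistics import mean

def mean_log(values):
    """Mean on the natural-log scale (values must be positive)."""
    return mean([log(v) for v in values])

# Made-up positive, right-skewed data
control = [1.0, 10.0, 100.0]
treated = [2.0, 20.0, 200.0]

# Back-transformed mean of logs is the geometric mean, not the arithmetic mean
gm_control = exp(mean_log(control))
arithmetic_control = mean(control)

# A difference on the log scale back-transforms to a ratio
ratio = exp(mean_log(treated) - mean_log(control))
print(round(gm_control, 3), arithmetic_control, round(ratio, 3))  # 10.0 37.0 2.0
```

On the log scale the treatment effect is an additive shift, which reads on the original scale as "treated values are about twice the control values".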

Key Takeaways

ANOVA compares means across groups by partitioning variance into components
One-way ANOVA handles single factor designs; two-way ANOVA handles two factors
Repeated measures designs measure subjects multiple times, increasing power
Effect sizes complement significance tests by quantifying magnitude
Non-parametric alternatives apply when assumptions are seriously violated
Diagnostics verify assumptions and identify problems requiring attention