Covariance and Correlation | ChatWhole Learn

Covariance $\text{Cov}(X,Y) = E[XY] - E[X]E[Y]$ measures the joint variability of two variables. A positive value means they tend to move together; a negative value means they tend to move in opposite directions; zero means there is no linear relationship.
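The product form of the covariance is equivalent to the centered definition, which is often easier to interpret. A short derivation using linearity of expectation, with $\mu_X = E[X]$ and $\mu_Y = E[Y]$ as shorthand introduced here:

$$\text{Cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y = E[XY] - E[X]E[Y]$$

Setting $Y = X$ gives $\text{Cov}(X,X) = E[X^2] - E[X]^2 = \text{Var}(X)$, so variance is the special case of a variable's covariance with itself.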
Correlation $\rho = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y} \in [-1, 1]$ normalizes covariance to a unitless scale, making it comparable across different variable pairs. $|\rho| = 1$ indicates a perfect linear relationship.
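As a minimal sketch of the two formulas, assuming NumPy and a small made-up data set, the following computes the population covariance and correlation directly from their definitions. The function names `covariance` and `correlation` are illustrative, not part of any library.

```python
import numpy as np


def covariance(x, y):
    """Population covariance computed as E[XY] - E[X]E[Y]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean(x * y) - np.mean(x) * np.mean(y)


def correlation(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.std defaults to the population standard deviation (ddof=0),
    # which matches the population covariance above.
    return covariance(x, y) / (np.std(x) * np.std(y))


# Made-up example data: y is an exact linear function of x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

cov_xy = covariance(x, y)                  # depends on the units of x and y
rho_xy = correlation(x, y)                 # unitless
cov_rescaled = covariance(100.0 * x, y)    # rescaling x rescales the covariance
rho_rescaled = correlation(100.0 * x, y)   # but leaves the correlation unchanged
```

Rescaling `x` by 100 multiplies the covariance by 100 while the correlation stays the same, which is what makes correlation comparable across variable pairs.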
Correlation ≠ Causation: Correlation measures association, not causation. Confounding variables, reverse causation, and coincidence can all produce spurious correlations.
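Confounding can be illustrated with a small simulation. In this sketch, which assumes NumPy and entirely synthetic data, a hidden variable `z` drives both `x` and `y`, and neither variable affects the other. The variable names and noise levels are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic setup: z is a confounder; x and y each depend on z plus
# independent noise, and there is no causal link between x and y.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

# x and y are clearly correlated even though neither causes the other.
rho_xy = np.corrcoef(x, y)[0, 1]


def residual(v, z):
    """Remove the linear effect of z from v (simple least-squares fit)."""
    beta = np.cov(v, z)[0, 1] / np.var(z, ddof=1)
    return (v - v.mean()) - beta * (z - z.mean())


# After removing z's linear effect from both, the association largely vanishes.
rho_given_z = np.corrcoef(residual(x, z), residual(y, z))[0, 1]
```

Controlling for the confounder in this way is only possible when it is observed; in practice the confounder is often unknown, which is why correlation alone cannot establish causation.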
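Uncorrelated ≠ Independent: Zero correlation only rules out linear dependence. Non-linear dependencies can exist even when $\rho = 0$; for example, $Y = X^2$ with $X$ distributed symmetrically about zero is fully determined by $X$ yet uncorrelated with it. Independence always implies zero correlation, but the converse needs extra assumptions: if $X$ and $Y$ are jointly (bivariate) normal, then $\rho = 0$ does imply independence.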
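The $Y = X^2$ case can be checked numerically. This is a minimal sketch assuming NumPy and a deterministic grid of `x` values symmetric about zero, chosen here so that the result does not depend on random sampling.

```python
import numpy as np

# x values placed symmetrically around zero (an illustrative choice).
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2  # y is completely determined by x

# Pearson correlation between x and y: zero up to floating-point error,
# because the positive and negative halves cancel.
rho_xy = np.corrcoef(x, y)[0, 1]

# The dependence becomes visible after a non-linear transform of x.
rho_x2_y = np.corrcoef(x ** 2, y)[0, 1]
```

Here `rho_xy` is zero up to rounding while `rho_x2_y` is one, showing that a correlation of zero says nothing about non-linear structure.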
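Covariance Matrix: The covariance matrix $\Sigma$ is symmetric and positive semi-definite. Its diagonal entries are the variances of the individual variables, and its off-diagonal entries are the pairwise covariances. The eigenvalue decomposition of $\Sigma$ is the foundation of PCA: the eigenvectors give the principal directions and the eigenvalues give the variance along each of them.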
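A minimal sketch of these properties and of PCA via eigendecomposition, assuming NumPy and a synthetic three-feature data set in which the second feature is partly built from the first. The data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 500 samples, 3 features; feature 1 is correlated with feature 0.
z = rng.normal(size=(500, 3))
data = np.column_stack([z[:, 0], 0.8 * z[:, 0] + 0.2 * z[:, 1], z[:, 2]])

# Sample covariance matrix (rows are observations, columns are features).
sigma = np.cov(data, rowvar=False)

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(sigma)

# Reorder so the direction with the largest variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of the total variance captured by each principal component.
explained = eigvals / eigvals.sum()

# PCA projection: center the data, then project onto the top two eigenvectors.
centered = data - data.mean(axis=0)
projected = centered @ eigvecs[:, :2]
```

The projected components are uncorrelated with each other, and their variances equal the two largest eigenvalues of `sigma`. Library implementations of PCA typically use the singular value decomposition of the centered data instead, which gives the same directions with better numerical behavior.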
Applications: Feature selection (detecting multicollinearity), PCA (dimensionality reduction), portfolio optimization (Markowitz), Gaussian distributions, attention mechanisms, and natural gradient methods all draw on covariance and correlation.