Statistics 101: Mean, Median, Variance and Distributions

Why Statistics Matters for Data Science

Statistics is the mathematical foundation of machine learning. Most ML algorithms can be viewed as statistical models that optimize an objective function. Without statistics, it is hard to understand why a model works or fails.

The Statistics → ML Pipeline

1. Measures of Central Tendency

Central tendency describes the "center" or "typical value" of a dataset.

Mean (Arithmetic Average)

Formula:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Median (Middle Value)

The median is the value separating the higher half from the lower half of a data sample.

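Formula (with the data sorted so that $x_{(k)}$ is the k-th smallest value):

$$\text{median} = \begin{cases} x_{((n+1)/2)} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right) & \text{if } n \text{ is even} \end{cases}$$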

Mode (Most Frequent)
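The three measures can be compared side by side on a small sample. The sketch below uses Python's standard `statistics` module; the `salaries` list is hypothetical and was chosen only to show how a single outlier pulls the mean away from the median and mode.

```python
import statistics

# Hypothetical sample with one large outlier (values are for illustration only).
salaries = [30, 32, 35, 35, 38, 40, 200]

mean = statistics.mean(salaries)      # sum of the values divided by their count
median = statistics.median(salaries)  # middle value of the sorted data
mode = statistics.mode(salaries)      # most frequent value

print(f"mean={mean:.2f}, median={median}, mode={mode}")
```

Here the median and mode stay at a typical value while the mean is dragged upward by the outlier, which is why the median is often preferred for skewed data.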

2. Measures of Spread (Dispersion)

Variance and Standard Deviation

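Population Variance:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$$

Sample Variance (Bessel's Correction):

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

Standard Deviation:

$$\sigma = \sqrt{\sigma^2}$$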

σ² = (16 + 4 + 0 + 4 + 16 + 0) / 6 = 40/6 ≈ 6.67

σ = √6.67 ≈ 2.58
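The arithmetic above can be checked in a few lines. The original data points are not shown, so the `data` list below is an assumption: six hypothetical values picked so that their squared deviations from the mean are exactly 16, 4, 0, 4, 16, 0. The sketch also shows how Bessel's correction changes the result.

```python
import math
import statistics

# Hypothetical data chosen so the squared deviations are 16, 4, 0, 4, 16, 0.
data = [2, 4, 6, 8, 10, 6]

mu = statistics.mean(data)
squared_devs = [(x - mu) ** 2 for x in data]

pop_var = sum(squared_devs) / len(data)           # divide by N
sample_var = sum(squared_devs) / (len(data) - 1)  # divide by n - 1 (Bessel's correction)
pop_std = math.sqrt(pop_var)

print(squared_devs)
print(round(pop_var, 2), round(pop_std, 2), round(sample_var, 2))
```

The standard library offers the same quantities directly as `statistics.pvariance`, `statistics.pstdev`, `statistics.variance`, and `statistics.stdev`.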

Figure legend: negative deviation, positive deviation, on the mean.

3. The Normal Distribution (Gaussian)

The normal distribution is arguably the most important probability distribution in statistics and ML.

Probability Density Function (PDF):

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

Where:

μ = mean (location parameter)
σ = standard deviation (scale parameter)
σ² = variance
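To make the density concrete, here is a minimal sketch that writes the normal PDF out term by term and compares it with `statistics.NormalDist` from the standard library. The helper name `normal_pdf` is an assumption introduced for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, written out term by term."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

peak = normal_pdf(0.0)                          # height of the standard normal at its mean
check = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)  # same value from the standard library

print(round(peak, 4), round(check, 4))
```

The density is symmetric around the mean, so `normal_pdf(mu - d)` and `normal_pdf(mu + d)` agree for any offset `d`.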

Standard Normal Distribution (Z-Score)

Z-Score Transformation:

$$Z = \frac{X - \mu}{\sigma}$$

This transforms any normal distribution to the standard normal with μ = 0 and σ = 1.

Z-Score: How Many Standard Deviations from Mean?
IQ scores follow N(100, 15²), with 70, 100 and 130 marked on the axis.
Z = (X − μ)/σ maps them onto the standard normal N(0, 1) at −2, 0 and +2.
IQ = 130 → Z = (130 − 100)/15 = +2.0 (top 2.3%)
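The IQ example can be reproduced with the standard library. This is a small sketch; the helper name `z_score` is an assumption, and the upper-tail probability comes from the standard normal CDF in `statistics.NormalDist`.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mu) / sigma

z = z_score(130, mu=100, sigma=15)
upper_tail = 1.0 - NormalDist().cdf(z)  # share of the population scoring above 130

print(z, round(upper_tail * 100, 1))
```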

4. Central Limit Theorem (CLT)

Often called the most important theorem in statistics, the CLT explains why the normal distribution appears so often in practice.

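Central Limit Theorem:

Given a population with mean μ and finite standard deviation σ, the sampling distribution of the sample mean x̄ approaches a normal distribution as the sample size n increases:

$$\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right)$$

Standard Error:

$$SE = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$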

Central Limit Theorem in Action

Figure: a population of any shape; draw samples of size n and compute x̄; the resulting sampling distributions are shown for n = 5, n = 30, and n = 100.

Key Insight: As n increases, the sampling distribution becomes more normal.

σ_x̄ = σ/√n → Standard error decreases as √n grows

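A short simulation makes the effect of n on the standard error visible. The sketch below is an illustration under stated assumptions: the population is an exponential distribution with rate 1 (mean 1, standard deviation 1), chosen only because it is clearly non-normal, and 5000 repetitions per sample size is an arbitrary choice.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SIGMA = 1.0  # an exponential distribution with rate 1 has standard deviation 1

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential population."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

results = {}
for n in (5, 30, 100):
    means = [sample_mean(n) for _ in range(5000)]
    observed_se = statistics.stdev(means)  # spread of the simulated sample means
    theoretical_se = SIGMA / math.sqrt(n)  # sigma / sqrt(n)
    results[n] = (observed_se, theoretical_se)
    print(f"n={n:>3}  observed SE={observed_se:.3f}  sigma/sqrt(n)={theoretical_se:.3f}")
```

The observed spread of the sample means should track σ/√n closely, so quadrupling the sample size roughly halves the standard error.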

5. Skewness and Kurtosis

Skewness (Asymmetry)

Mean > Median → typically Right Skewed | Mean < Median → typically Left Skewed

Kurtosis (Tail Weight)

Fisher's Kurtosis (excess kurtosis):

$$\gamma_2 = \frac{E\left[(X - \mu)^4\right]}{\sigma^4} - 3$$

Mesokurtic (γ₂ = 0): Normal distribution
Leptokurtic (γ₂ > 0): Heavy tails, sharp peak
Platykurtic (γ₂ < 0): Light tails, flat peak
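Both shape measures are standardized moments, so they can be sketched with one helper. The function names and the two small datasets below are assumptions made for illustration; the formulas are the population (biased) versions, which differ slightly from the bias-corrected estimators some libraries report.

```python
import statistics

def standardized_moment(data, k):
    """k-th central moment divided by sigma^k (population form)."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum((x - mu) ** k for x in data) / (len(data) * sigma ** k)

def skewness(data):
    """Third standardized moment: positive for a long right tail."""
    return standardized_moment(data, 3)

def excess_kurtosis(data):
    """Fisher's kurtosis: fourth standardized moment minus 3."""
    return standardized_moment(data, 4) - 3.0

# Hypothetical samples: one with a long right tail, one symmetric and flat.
right_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 6, 12]
symmetric = [1, 2, 3, 4, 5]

for name, sample in (("right_skewed", right_skewed), ("symmetric", symmetric)):
    print(name, round(skewness(sample), 3), round(excess_kurtosis(sample), 3))
```

For the right-skewed sample the mean (3.6) exceeds the median (2.5) and the skewness is positive, while the evenly spaced symmetric sample has zero skewness and negative excess kurtosis (platykurtic).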

Figure: Kurtosis: Comparing Tail Behavior. Platykurtic (μ₄ = 2), Normal (μ₄ = 3), Leptokurtic (μ₄ = 8), where μ₄ here appears to denote the standardized fourth moment, so that γ₂ = μ₄ − 3. Higher kurtosis = more outliers, fatter tails, sharper peak.

6. Covariance and Correlation

Covariance

Covariance (sample form):

$$\mathrm{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

Cov > 0: Variables move together (positive)
Cov < 0: Variables move opposite (negative)
Cov = 0: No linear relationship

Pearson Correlation Coefficient

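Pearson's r:

$$r = \frac{\mathrm{Cov}(X, Y)}{s_X \, s_Y}$$

where s_X and s_Y are the sample standard deviations; r always lies between −1 and +1.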

Note: Correlation measures LINEAR relationship only. Non-linear patterns need other metrics.
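The note above is easy to demonstrate. In this sketch the helper names `sample_covariance` and `pearson_r` and the toy data are assumptions for illustration: a perfectly linear relationship gives r close to 1, while a perfectly quadratic one over a symmetric range gives r equal to 0 even though y is fully determined by x.

```python
import statistics

def sample_covariance(xs, ys):
    """Average product of paired deviations, using the n - 1 denominator."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

def pearson_r(xs, ys):
    """Covariance rescaled by both sample standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]   # y rises by 2 for every unit of x
quadratic = [v ** 2 for v in x]   # y depends on x, but not linearly

r_linear = pearson_r(x, linear)
r_quadratic = pearson_r(x, quadratic)
print(round(r_linear, 6), round(r_quadratic, 6))
```

Python 3.10 and later also provide `statistics.covariance` and `statistics.correlation`; the manual versions are used here only to mirror the formulas.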

7. Confidence Intervals

Confidence Interval for Mean:

$$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$

Common Confidence Levels:

Level	z-score	Area in Tails
90%	1.645	5% each side
95%	1.960	2.5% each side
99%	2.576	0.5% each side

Example interpretation: we are 95% confident that the true mean lies between 38 and 62.
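A z-based interval can be computed directly from the standard normal quantile. The sketch below assumes the population standard deviation is known; the function name `z_confidence_interval` and the inputs (sample mean 50, σ = 10, n = 25) are hypothetical and are not the data behind the 38-to-62 interval quoted above.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Interval sample_mean ± z * sigma / sqrt(n), assuming sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. the 0.975 quantile for 95%
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs chosen for illustration.
low, high = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=25)
print(round(low, 2), round(high, 2))

# The critical values in the table can be recovered the same way.
critical = {level: NormalDist().inv_cdf(0.5 + level / 2.0) for level in (0.90, 0.95, 0.99)}
for level, z in critical.items():
    print(f"{level:.0%}: z = {z:.3f}")
```

When σ is unknown and estimated from a small sample, a t-distribution critical value is normally used in place of z.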

Key Takeaways

Mean, Median, Mode: describe central tendency; choose based on data shape
Variance/Standard Deviation: quantify spread; the basis of loss functions such as mean squared error
Normal Distribution: the bell curve; underlies hypothesis testing and the CLT
CLT: sample means are approximately normal regardless of population shape (rule of thumb: n ≥ 30)
Correlation ≠ Causation: always consider confounding variables
Confidence Intervals: quantify uncertainty in estimates

Next: Probability, Bayes' Theorem and PDF/CDF

Build on these foundations with probability theory and Bayesian inference.
