Arithmetic Mean

Why the Average Isn't Always Right

A neighbourhood has 9 households earning $50k and one billionaire earning $50M. The mean income is about $5M, yet nobody earns anywhere near that. The mean is pulled by extremes.
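
The neighbourhood example can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative, and the incomes are the ones stated in the example (nine households at $50k, one at $50M).

```python
import statistics

# Nine households at $50k and one at $50M, as in the example
incomes = [50_000] * 9 + [50_000_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"Mean:   ${mean_income:,.0f}")    # $5,045,000
print(f"Median: ${median_income:,.0f}")  # $50,000
```

The median stays at the typical household's income, while the mean lands roughly a hundred times higher than what nine of the ten households earn.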

Core Insight: The mean minimises the sum of squared deviations. It's the least-squares centre — powerful but sensitive to outliers.
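
The least-squares property can be illustrated with a brute-force search. This is only a sketch: the sample values and the helper name `sum_squared_deviations` are hypothetical, chosen so the arithmetic is easy to follow.

```python
def sum_squared_deviations(values, c):
    """Sum of squared deviations of the values from a candidate centre c."""
    return sum((x - c) ** 2 for x in values)

# Hypothetical sample chosen for illustration
sample = [2, 4, 4, 5, 10]
mean = sum(sample) / len(sample)  # 5.0

# Evaluate the sum of squares on a grid of candidate centres 0.0, 0.1, ..., 10.0
candidates = [i / 10 for i in range(0, 101)]
best = min(candidates, key=lambda c: sum_squared_deviations(sample, c))

print(f"Mean:           {mean}")  # 5.0
print(f"Best candidate: {best}")  # 5.0
```

The grid search lands on the mean because setting the derivative of $\sum (x_i - c)^2$ with respect to $c$ to zero gives $\sum (x_i - c) = 0$, that is, $c = \bar{x}$. Squaring is also why a single far-away value has so much pull on the result.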


Formula

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

Grouped data: $\bar{x} = \sum f_i m_i / \sum f_i$, where $f_i$ is the frequency and $m_i$ the midpoint of class $i$.
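
As a rough sketch of the grouped-data formula, the snippet below starts from class intervals rather than ready-made midpoints. The interval bounds are an assumption made for illustration (classes of width 10 starting at 10); only the frequencies and the formula come from the lesson.

```python
# Hypothetical frequency table: (lower bound, upper bound, frequency)
classes = [
    (10, 20, 3),
    (20, 30, 8),
    (30, 40, 12),
    (40, 50, 5),
    (50, 60, 2),
]

# The midpoint m_i of each class stands in for every observation in that class
weighted_sum = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
total_count = sum(f for _, _, f in classes)

grouped_mean = weighted_sum / total_count
print(f"Grouped mean: {grouped_mean:.2f}")  # 33.33
```

Because every observation is treated as if it sat at its class midpoint, the grouped mean is an approximation of the mean of the raw data.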


Worked Example

Scores: 72, 85, 90, 68, 78, 95, 88, 74

$$\bar{x} = \frac{72+85+90+68+78+95+88+74}{8} = \frac{650}{8} = 81.25$$

Sorted:  68  72  74  78  85  88  90  95
                      ↑
                 Mean = 81.25

Python Implementation

import numpy as np
import pandas as pd

data = [72, 85, 90, 68, 78, 95, 88, 74]

print(f"Mean:       {np.mean(data):.2f}")     # 81.25
print(f"Pandas:     {pd.Series(data).mean():.2f}")  # 81.25

# Grouped data
midpoints   = [15, 25, 35, 45, 55]
frequencies = [3,   8,  12,  5,  2]
grouped_mean = np.average(midpoints, weights=frequencies)
print(f"Grouped:    {grouped_mean:.2f}")      # 33.33

# Outlier effect
data_out = data + [500]
print(f"Normal mean:  {np.mean(data):.2f}")      # 81.25
print(f"Outlier mean: {np.mean(data_out):.2f}")  # 127.78

R Implementation

data <- c(72, 85, 90, 68, 78, 95, 88, 74)
cat("Mean:", mean(data), "\n")  # 81.25

# Grouped
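cat("Grouped:", weighted.mean(c(15,25,35,45,55), c(3,8,12,5,2)), "\n")  # 33.33333

# Trimmed mean (robust)
# With n = 8, trim = 0.1 drops floor(0.8) = 0 values per end, so this prints
# the plain mean; trim >= 0.125 is needed to drop anything here.
cat("Trimmed (10%):", mean(data, trim=0.1), "\n")  # 81.25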
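
For comparison, here is a dependency-free Python sketch of a trimmed mean. The function `trimmed_mean` is a hypothetical helper, not a library call; it is intended to mirror R's `mean(x, trim=...)` by dropping `floor(n * trim)` values from each end, and it assumes `trim` is below 0.5.

```python
import math

def trimmed_mean(values, trim):
    """Mean after dropping floor(n * trim) values from each end of the sorted data."""
    ordered = sorted(values)
    k = math.floor(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

scores = [72, 85, 90, 68, 78, 95, 88, 74]
with_outlier = scores + [500]

print(f"{trimmed_mean(scores, 0.1):.2f}")        # 81.25 (nothing dropped)
print(f"{trimmed_mean(with_outlier, 0.2):.2f}")  # 83.14 (drops 68 and 500)
```

With the outlier of 500 added, the plain mean jumps to 127.78, whereas the 20% trimmed mean stays close to the original scores.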

When to Use Mean vs Median

| Situation | Use |
|---|---|
| Symmetric, no outliers | Mean |
| Skewed data (income, prices) | Median |
| Categorical data | Mode |
| Need SD / variance | Mean |
| Outliers present | Median |

Key Takeaways

  1. Sum ÷ count: $\bar{x} = \sum x_i / n$; always check for outliers first
  2. Outlier sensitivity: one extreme value shifts the mean significantly
  3. Least-squares: the mean is the value of the constant $c$ that minimises $\sum (x_i - c)^2$
  4. Grouped data: use $\sum f_i m_i / \sum f_i$ when only a frequency table is available
  5. $\mu$ vs $\bar{x}$: population vs sample symbol; computation is identical
  6. Prefer median for skewed distributions like income or survival times
