Weighted Mean — Formula, Applications, and Python


Weighted Mean

The weighted mean assigns different weights to different observations, allowing some values to have more influence on the average than others.

$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
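
The formula translates almost word for word into plain Python. The sketch below is only an illustration: the helper name `weighted_mean` and the two-course data are assumptions made up for this example, not part of any library.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i). Hypothetical helper for illustration."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Made-up data: a 4.0 in a 3-credit course and a 3.0 in a 1-credit course
result = weighted_mean([4.0, 3.0], [3, 1])
print(result)  # (3*4.0 + 1*3.0) / (3 + 1) = 3.75
```

The numerator adds up each value scaled by its weight, and the denominator rescales by the total weight, so the result always lies between the smallest and largest value when the weights are non-negative.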

When to Use Weighted Mean

| Situation | Why Weights Matter |
|---|---|
| GPA (grade point average) | Courses have different credit hours |
| Portfolio returns | Assets have different dollar weights |
| Survey analysis | Respondents represent different group sizes |
| Grouped frequency data | Each midpoint represents many observations |
| Price indices (CPI, etc.) | Goods have different consumption weights |
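
The grouped frequency case can be sketched as follows. The class intervals and frequencies are made-up numbers for illustration, and the result is only an approximation of the true mean because every observation in a class is treated as if it sat at the class midpoint.

```python
import numpy as np

# Made-up grouped data: class intervals and how many observations fall in each
lower = np.array([0, 10, 20, 30])
upper = np.array([10, 20, 30, 40])
freq = np.array([5, 12, 8, 5])

# Each class is represented by its midpoint: 5, 15, 25, 35
midpoints = (lower + upper) / 2

# Frequencies act as weights: 580 / 30
grouped_mean = np.average(midpoints, weights=freq)

print(f"Approximate mean from grouped data: {grouped_mean:.2f}")  # 19.33
```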

Python Implementation

import numpy as np
import pandas as pd

# ========================================
# Example 1: GPA Calculation
# ========================================
courses = pd.DataFrame({
    'Course': ['Statistics', 'Linear Algebra', 'Machine Learning', 'Databases', 'Ethics'],
    'Grade_Points': [4.0, 3.7, 3.3, 4.0, 3.0],
    'Credits': [4, 3, 4, 3, 1]
})

weighted_gpa = np.average(courses['Grade_Points'], weights=courses['Credits'])
simple_gpa = courses['Grade_Points'].mean()

print("Courses:")
print(courses.to_string(index=False))
print(f"\nWeighted GPA: {weighted_gpa:.4f}")
print(f"Unweighted GPA: {simple_gpa:.4f}")
print(f"Difference: {weighted_gpa - simple_gpa:+.4f} (credits matter!)")

# ========================================
# Example 2: Portfolio Return
# ========================================
portfolio = pd.DataFrame({
    'Asset': ['Stock A', 'Stock B', 'Bonds', 'Cash'],
    'Weight_pct': [40, 35, 20, 5],
    'Return_pct': [12.5, -3.2, 4.1, 0.5]
})

portfolio_return = np.average(portfolio['Return_pct'], 
                               weights=portfolio['Weight_pct'])
simple_avg_return = portfolio['Return_pct'].mean()

print("\nPortfolio:")
print(portfolio.to_string(index=False))
print(f"\nWeighted portfolio return: {portfolio_return:.2f}%")
print(f"Simple average return:     {simple_avg_return:.2f}%")

# ========================================
# Example 3: Survey Weighting (post-stratification)
# ========================================
survey = pd.DataFrame({
    'Group': ['18-34', '35-54', '55+'],
    'Survey_pct': [15, 50, 35],    # % in survey sample
    'Pop_pct': [30, 40, 30],       # % in true population
    'Support_pct': [75, 55, 40]    # % supporting the policy
})

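# Raw sample mean: each group counts by its share of the sample
# (biased: the 35-54 group is overrepresented, the 18-34 group underrepresented)
unweighted = np.average(survey['Support_pct'], weights=survey['Survey_pct'])
# Post-stratified mean: each group counts by its share of the population
weighted = np.average(survey['Support_pct'], weights=survey['Pop_pct'])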

print("\nSurvey with nonrepresentative sample:")
print(survey.to_string(index=False))
print(f"\nUnweighted mean: {unweighted:.1f}% support (biased!)")
print(f"Population-weighted mean: {weighted:.1f}% support (correct)")
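
In practice, post-stratification is often expressed as a per-respondent adjustment factor rather than as population shares. The sketch below reuses the survey numbers from Example 3; the variable names `adjustment` and `effective_weights` are illustrative assumptions, not a standard API.

```python
import numpy as np

survey_pct = np.array([15, 50, 35])   # share of each age group in the sample
pop_pct = np.array([30, 40, 30])      # share of each age group in the population
support_pct = np.array([75, 55, 40])  # support within each group

# Adjustment factor: how much each respondent in a group should count.
# Greater than 1 for underrepresented groups, less than 1 for overrepresented ones.
adjustment = pop_pct / survey_pct

# Scaling the sample shares by the adjustment recovers the population shares
effective_weights = survey_pct * adjustment
poststratified = np.average(support_pct, weights=effective_weights)

print("Adjustment factors:", np.round(adjustment, 3))
print(f"Post-stratified support: {poststratified:.1f}%")  # 56.5%
```

Respondents aged 18-34 each count double here because that group makes up 15% of the sample but 30% of the population.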

Properties of the Weighted Mean

# Property 1: Reduces to arithmetic mean when all weights equal
equal_weights = [1, 1, 1, 1, 1]
data = [10, 20, 30, 40, 50]
print(f"Equal weights → weighted mean = {np.average(data, weights=equal_weights):.1f}")
print(f"Simple mean = {np.mean(data):.1f}")

# Property 2: A very large weight pulls the weighted mean toward that observation's value
extreme_weights = [1, 1, 1, 1, 100]
print(f"Extreme weight on last → weighted mean = {np.average(data, weights=extreme_weights):.1f}")
print(f"Last value: {data[-1]}")  # Should be close to 50

# Property 3: Weights can be normalized to sum to 1 (proportions); rescaling does not change the weighted mean
weights = [4, 3, 4, 3, 1]
norm_weights = [w/sum(weights) for w in weights]
print(f"Normalized weights: {[f'{w:.3f}' for w in norm_weights]}")
print(f"Sum: {sum(norm_weights):.4f}")
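
A quick check that only the relative sizes of the weights matter: the sketch below, which reuses the grade points and credits from the GPA example, computes the weighted mean with raw, normalized, and arbitrarily rescaled weights. The variable names are illustrative.

```python
import numpy as np

grades = [4.0, 3.7, 3.3, 4.0, 3.0]
credits = np.array([4, 3, 4, 3, 1])

raw = np.average(grades, weights=credits)                         # weights sum to 15
normalized = np.average(grades, weights=credits / credits.sum())  # weights sum to 1
scaled = np.average(grades, weights=credits * 100)                # weights sum to 1500

# All three agree up to floating-point rounding
print(np.isclose(raw, normalized), np.isclose(raw, scaled))  # True True
```

Because the denominator of the weighted mean is the weight total, multiplying every weight by the same positive constant cancels out.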

Key Takeaways

  1. Weighted mean is the correct tool when observations differ in importance
  2. np.average(data, weights=w) from NumPy computes it directly
  3. GPA, portfolio returns, and price indices are all weighted means in disguise
  4. Survey post-stratification weighting corrects for samples whose group proportions differ from the population's
  5. Equal weights → arithmetic mean (special case of weighted mean)
  6. A weight of zero excludes an observation; very large weights dominate the result
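
The zero-weight behavior in the last takeaway can be checked with a minimal sketch. The data reuses the values from the properties section; the variable names are illustrative.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# A zero weight on the last value gives the same result as dropping it
with_zero = np.average(data, weights=[1, 1, 1, 1, 0])
dropped = np.mean(data[:-1])

print(with_zero, dropped)  # 25.0 25.0
```

This makes zero weights a convenient way to exclude an observation without changing the shape of the data.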
