Skewness — Measuring Asymmetry of Distributions

Skewness

Skewness quantifies the asymmetry of a distribution around its mean.

$$\text{skewness} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3}{s^3}$$

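Positive → longer right tail. Negative → longer left tail. Zero → consistent with symmetry (a symmetric distribution has zero skewness, but zero skewness alone does not prove symmetry).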
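To make the formula concrete, here is a minimal sketch that computes skewness by hand and compares it with `scipy.stats.skew`. It assumes that s is the standard deviation computed with a 1/n divisor (the lesson does not say which divisor is meant); the function name `manual_skewness` and the exponential sample are illustrative choices, not part of the lesson.

```python
import numpy as np
from scipy import stats

def manual_skewness(x):
    """Third central moment divided by the cubed standard deviation (1/n divisor)."""
    x = np.asarray(x, dtype=float)
    deviations = x - x.mean()
    m3 = np.mean(deviations ** 3)          # (1/n) * sum of cubed deviations
    s = np.sqrt(np.mean(deviations ** 2))  # assumption: s uses 1/n, not 1/(n-1)
    return m3 / s ** 3

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=1000)  # illustrative right-skewed sample

manual = manual_skewness(sample)
library = stats.skew(sample)  # default bias=True uses the same 1/n moments
print(f"manual={manual:+.4f}, scipy={library:+.4f}")
```

Under that assumption the two values should agree to floating-point precision. Passing `bias=False` to `stats.skew` applies a small-sample correction and would give a slightly different number.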

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

np.random.seed(42)
right_skew = np.random.lognormal(0, 0.8, 2000)   # income-like
symmetric  = np.random.normal(0, 1, 2000)
left_skew  = -np.random.lognormal(0, 0.8, 2000)

for name, data in [("Right-Skewed", right_skew),
                    ("Symmetric", symmetric),
                    ("Left-Skewed", left_skew)]:
    sk = stats.skew(data)
    print(f"{name:<15}: skew={sk:+.4f}, mean={np.mean(data):.3f}, median={np.median(data):.3f}")

Mean vs Median Under Skewness

For unimodal distributions, the typical ordering is:

Right-Skewed:   Mode < Median < Mean
Symmetric:      Mode ≈ Median ≈ Mean
Left-Skewed:    Mean < Median < Mode

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
datasets = [("Right-Skewed", right_skew, '#f8d7da'),
            ("Symmetric",    symmetric,   '#d4edda'),
            ("Left-Skewed",  left_skew,   '#d1ecf1')]

for ax, (name, data, color) in zip(axes, datasets):
    ax.hist(data, bins=50, density=True, color=color, edgecolor='gray', alpha=0.7)
    ax.axvline(np.mean(data), color='red', lw=2, ls='--', label=f'Mean={np.mean(data):.2f}')
    ax.axvline(np.median(data), color='blue', lw=2, ls='-', label=f'Median={np.median(data):.2f}')
    ax.set_title(f'{name}\nskewness={stats.skew(data):.3f}')
    ax.legend(fontsize=8)
plt.tight_layout()
plt.savefig('skewness.png', dpi=150)
plt.show()
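The way a longer tail pulls the mean away from the median can also be checked numerically. The sketch below is an illustration under assumed settings: the lognormal shape values, the sample size, and the seed are arbitrary choices, not taken from the lesson. For a lognormal with location 0, the median is 1 and the mean is exp(σ²/2), so the gap should widen as σ grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
gaps = []
for sigma in (0.25, 0.5, 1.0):  # illustrative shape values; larger sigma means a heavier right tail
    data = rng.lognormal(mean=0.0, sigma=sigma, size=20000)
    gap = np.mean(data) - np.median(data)  # positive when the mean sits right of the median
    gaps.append(gap)
    print(f"sigma={sigma:.2f}: skew={stats.skew(data):+.3f}, mean-median={gap:+.3f}")
```

Both the skewness and the mean-median gap should increase together across the three runs.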

Interpretation Guide

Absolute Skewness    Interpretation
< 0.5                Approximately symmetric
0.5–1.0              Moderately skewed
> 1.0                Highly skewed; consider transformation
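The guide above can be sketched as a small helper. The function name `describe_skewness` is hypothetical, and the handling of the exact boundary values (0.5 and 1.0 both fall into "moderately skewed" here) is an assumption, since the table does not specify it. The thresholds are rules of thumb rather than formal cutoffs.

```python
def describe_skewness(skew_value):
    """Map a skewness value to the rule-of-thumb labels from the guide."""
    magnitude = abs(skew_value)  # direction is ignored; only the size matters here
    if magnitude < 0.5:
        return "approximately symmetric"
    if magnitude <= 1.0:         # assumption: boundaries 0.5 and 1.0 count as moderate
        return "moderately skewed"
    return "highly skewed"

for value in (0.1, -0.7, 2.3):
    print(f"{value:+.1f}: {describe_skewness(value)}")
```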

Fixing Skewness with Transformations

skewed = np.random.lognormal(0, 1, 500)
print(f"Original skewness: {stats.skew(skewed):.4f}")

# Log transform (works for positive right-skewed data)
log_transformed = np.log(skewed)
print(f"Log-transformed skewness: {stats.skew(log_transformed):.4f}")

# Square root (moderate right skew)
sqrt_transformed = np.sqrt(skewed)
print(f"Sqrt-transformed skewness: {stats.skew(sqrt_transformed):.4f}")
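When it is unclear whether a log or a square root is the better choice, one option the lesson does not cover is the Box-Cox transform, which estimates a power parameter from the data. The sketch below uses `scipy.stats.boxcox`, which requires strictly positive input. The sample, seed, and variable names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
skewed_sample = rng.lognormal(mean=0.0, sigma=1.0, size=500)  # strictly positive, right-skewed

# With no lambda supplied, boxcox returns the transformed data and the
# lambda that maximizes the log-likelihood under a normality assumption.
transformed, fitted_lambda = stats.boxcox(skewed_sample)

print(f"Original skewness:   {stats.skew(skewed_sample):+.4f}")
print(f"Box-Cox skewness:    {stats.skew(transformed):+.4f}")
print(f"Fitted lambda:       {fitted_lambda:+.4f}")
```

A fitted lambda near 0 corresponds to a log transform and a value near 0.5 to a square root, so the estimate can serve as a rough guide to which of the two simpler transforms fits the data.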

Key Takeaways

  1. Positive skew = right tail: mean > median in typical cases (the tail pulls the mean rightward)
  2. Negative skew = left tail: mean < median in typical cases
  3. |skew| > 1: strongly skewed; consider non-parametric methods or a transformation
  4. Log transformation often reduces right skewness in positive data such as income, prices, and reaction times
  5. Always visualize: skewness alone doesn't tell you the full story
  6. Income and house prices are classically right-skewed; stock returns are less clear-cut and can be skewed in either direction
