Range and IQR — Measures of Spread Explained


Range and Interquartile Range (IQR)

Measures of spread (dispersion) tell us how scattered the data are. Range and IQR are the simplest of these measures: each depends on only a few order statistics (the minimum and maximum for the range, the first and third quartiles for the IQR).

Range

$$\text{Range} = x_{\max} - x_{\min}$$

Simple but highly sensitive to outliers — one extreme value changes it completely.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(42)
data = np.array([12, 15, 14, 10, 18, 20, 16, 11, 13, 17])
data_with_outlier = np.append(data, 100)

print(f"Data: {sorted(data)}")
print(f"Range = {data.max()} - {data.min()} = {data.max() - data.min()}")
print(f"\nWith outlier (100 added):")
print(f"Range = {data_with_outlier.max()} - {data_with_outlier.min()} = {data_with_outlier.max() - data_with_outlier.min()}")
print("Range grew ninefold (from 10 to 90) because of a single outlier!")

Interquartile Range (IQR)

$$\text{IQR} = Q_3 - Q_1$$

The range of the middle 50% of the data. Robust to outliers.
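To make Q1 and Q3 concrete, here is a minimal sketch of how a quartile can be obtained by linear interpolation between order statistics. The helper name `quantile_linear` is hypothetical and exists only for illustration. The rule it applies (position = (n - 1) × p in the sorted data) matches the default behaviour of `np.percentile`. Other tools use slightly different conventions, so quartiles may differ a little between packages.

```python
import math

def quantile_linear(values, p):
    """Quantile at fraction p (0 <= p <= 1) by linear interpolation.

    Hypothetical helper for illustration: the target position in the
    sorted data is (n - 1) * p, and we interpolate between its neighbours.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p
    lo = math.floor(pos)          # index of the value just below the position
    hi = math.ceil(pos)           # index of the value just above the position
    frac = pos - lo               # how far we are between the two neighbours
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])

sample = [12, 15, 14, 10, 18, 20, 16, 11, 13, 17]
q1 = quantile_linear(sample, 0.25)
q3 = quantile_linear(sample, 0.75)
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {q3 - q1}")
# prints: Q1 = 12.25, Q3 = 16.75, IQR = 4.5
```

For this sample, Q1 sits a quarter of the way from 12 to 13 and Q3 three quarters of the way from 16 to 17, which is where the values 12.25 and 16.75 come from.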

# Computing quartiles and IQR
def five_number_summary(data):
    q1, q2, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    
    print(f"Min:    {data.min():.2f}")
    print(f"Q1:     {q1:.2f}")
    print(f"Median: {q2:.2f}")
    print(f"Q3:     {q3:.2f}")
    print(f"Max:    {data.max():.2f}")
    print(f"IQR:    {iqr:.2f}")
    print(f"Lower fence (Q1 - 1.5×IQR): {lower_fence:.2f}")
    print(f"Upper fence (Q3 + 1.5×IQR): {upper_fence:.2f}")
    return q1, q2, q3, iqr

print("=== Normal data ===")
five_number_summary(data)
print("\n=== Data with outlier ===")
five_number_summary(data_with_outlier)
print("IQR barely changed (4.50 to 5.00): robust!")
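The summary function prints the fences but does not use them. As a minimal sketch, the fences can be turned into an outlier filter as follows. The function name `iqr_outliers` and its parameter `k` are assumptions made for this example, not part of any library.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: k=1.5 is the conventional multiplier.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Boolean mask: True where a point falls beyond either fence
    mask = (values < lower_fence) | (values > upper_fence)
    return values[mask]

sample = np.array([12, 15, 14, 10, 18, 20, 16, 11, 13, 17, 100])
flagged = iqr_outliers(sample)
print(flagged)
```

With the value 100 included, the fences are 5 and 25, so only 100 is flagged. The same call on the original ten values returns an empty array.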

Visualizing Range and IQR

# Two datasets with a similar center (about 50) but very different range and IQR
np.random.seed(0)
dataset_a = np.random.uniform(0, 100, 200)  # Uniform: large IQR
dataset_b = np.random.normal(50, 10, 200)   # Normal: smaller IQR

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Use `values` as the loop variable so the earlier `data` array is not overwritten
for ax, values, label, color in zip(axes,
                                    [dataset_a, dataset_b],
                                    ['Uniform', 'Normal'],
                                    ['steelblue', 'coral']):
    ax.hist(values, bins=30, color=color, edgecolor='black', alpha=0.7, density=True)
    q1, q2, q3 = np.percentile(values, [25, 50, 75])
    ax.axvline(values.min(), color='gray', linestyle=':', label=f'Min={values.min():.0f}')
    ax.axvline(q1, color='blue', linestyle='--', label=f'Q1={q1:.0f}')
    ax.axvline(q2, color='red', linestyle='-', linewidth=2, label=f'Median={q2:.0f}')
    ax.axvline(q3, color='blue', linestyle='--', label=f'Q3={q3:.0f}')
    ax.axvline(values.max(), color='gray', linestyle=':', label=f'Max={values.max():.0f}')
    # Shade the middle 50% of the data (Q1 to Q3) over the full axis height
    ax.axvspan(q1, q3, alpha=0.2, color='yellow', label=f'IQR={q3-q1:.0f}')
    ax.set_title(f'{label} Distribution\nRange={values.max()-values.min():.0f}, IQR={q3-q1:.0f}')
    ax.legend(fontsize=7)

plt.tight_layout()
plt.savefig('range_iqr.png', dpi=150)
plt.show()
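As a sanity check on the plot, the theoretical IQRs of the two distributions can be computed directly. This is a sketch using only the Python standard library (`statistics.NormalDist`). The sample IQRs in the figure should land near these values but will not match them exactly, because each dataset has only 200 points.

```python
from statistics import NormalDist

# Uniform(0, 100): quantiles are linear in p, so Q1 = 25 and Q3 = 75.
uniform_iqr = 0.75 * 100 - 0.25 * 100

# Normal(mean=50, sd=10): take Q1 and Q3 from the inverse CDF.
normal = NormalDist(mu=50, sigma=10)
normal_iqr = normal.inv_cdf(0.75) - normal.inv_cdf(0.25)

print(f"Uniform(0, 100) IQR: {uniform_iqr:.2f}")   # 50.00
print(f"Normal(50, 10) IQR:  {normal_iqr:.2f}")    # 13.49
```

For a normal distribution the IQR is about 1.349 standard deviations, which is why the normal sample shows a much narrower shaded band than the uniform one.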

Comparing Spread Measures

| Measure | Formula | Breakdown Point | Outlier Sensitivity |
|---|---|---|---|
| Range | Max - Min | 0% | Very sensitive |
| IQR | Q3 - Q1 | 25% | Robust |
| Std Dev | √(Σ(xᵢ - x̄)²/(n-1)) | 0% | Sensitive |
| MAD | Median(abs(xᵢ - Median)) | 50% | Very robust |
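The comparison can be checked numerically. The sketch below computes all four measures on the ten-value sample used earlier, with and without the outlier 100. The function name `spread_summary` is hypothetical, and MAD is taken here as the unscaled median absolute deviation from the median.

```python
import numpy as np

def spread_summary(values):
    """Range, IQR, sample standard deviation, and (unscaled) MAD."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    median = np.median(values)
    return {
        "range": values.max() - values.min(),
        "iqr": q3 - q1,
        "std": values.std(ddof=1),                    # n - 1 in the denominator
        "mad": np.median(np.abs(values - median)),    # median absolute deviation
    }

clean = [12, 15, 14, 10, 18, 20, 16, 11, 13, 17]
contaminated = clean + [100]

before = spread_summary(clean)
after = spread_summary(contaminated)
for name in before:
    print(f"{name:>5}: {before[name]:6.2f} -> {after[name]:6.2f}")
```

On this sample the range goes from 10 to 90 and the standard deviation grows roughly eightfold, while the IQR moves from 4.5 to 5.0 and the MAD from 2.5 to 3.0.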

Key Takeaways

  1. Range is simple but fragile: a single extreme value can dominate it
  2. IQR is a simple, robust spread measure: it covers the middle 50% of the data and ignores the tails
  3. The 1.5×IQR rule for outlier detection is built into most box plot implementations
  4. For symmetric data without outliers, standard deviation is more informative than IQR
  5. For skewed data or data with outliers, report IQR instead of (or alongside) standard deviation
  6. IQR = 0 means Q1 = Q3, so roughly half the data or more share a single value; this is common in count data
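The last point is easy to overlook, so here is a small sketch with made-up count data (the ticket counts are hypothetical, chosen only to illustrate the effect). When most observations are identical, Q1 and Q3 coincide and the IQR collapses to zero.

```python
import numpy as np

# Hypothetical count data: support tickets per customer, mostly zeros.
counts = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 5])

q1, q3 = np.percentile(counts, [25, 75])
iqr = q3 - q1
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"Share of zeros: {np.mean(counts == 0):.0%}")

# With IQR = 0 both fences equal the quartile value, so the 1.5 x IQR
# rule would flag every observation that differs from it.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
flagged = counts[(counts < lower_fence) | (counts > upper_fence)]
print(f"Flagged by the 1.5 x IQR rule: {flagged}")
```

In this situation the 1.5×IQR rule is not a useful outlier detector, since both nonzero counts are flagged regardless of how extreme they are.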
