Levels of Measurement — Nominal, Ordinal, Interval, Ratio

Levels of Measurement

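In 1946, psychologist Stanley Smith Stevens proposed a taxonomy of four levels of measurement that remains standard in introductory statistics. The level of a variable determines which statistical operations give meaningful results, that is, results that do not change when the data are re-expressed on another equally acceptable scale.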


The Four Levels

1. Nominal Scale

The weakest level. Data is placed into named categories with no meaningful order or distance between them.

Properties:

  • Identity: each value belongs to a distinct category
  • No order, no distance, no meaningful zero

Examples: Gender, blood type, nationality, color, product ID, political party

Valid statistics: Frequency, mode, chi-square test
Invalid: Mean, median, standard deviation
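
To see why the mean is listed as invalid for nominal data, consider a minimal sketch in which the same blood types are stored under two integer codings. Both codings (`coding_1`, `coding_2`) and the helper `mode_via_codes` are hypothetical names chosen for illustration. Because nominal labels carry no order, either coding is equally legitimate, yet the mean of the codes changes while the mode does not.

```python
from collections import Counter
from statistics import mean

blood_types = ['A', 'O', 'B', 'AB', 'O', 'A', 'O', 'A', 'B', 'O']

# Two equally arbitrary integer codings (hypothetical, for illustration only)
coding_1 = {'A': 0, 'B': 1, 'AB': 2, 'O': 3}
coding_2 = {'O': 0, 'AB': 1, 'B': 2, 'A': 3}

def mode_via_codes(labels, coding):
    """Encode labels, find the most frequent code, and decode it back."""
    codes = [coding[label] for label in labels]
    most_common_code = Counter(codes).most_common(1)[0][0]
    inverse = {code: label for label, code in coding.items()}
    return inverse[most_common_code]

mean_1 = mean([coding_1[b] for b in blood_types])
mean_2 = mean([coding_2[b] for b in blood_types])
mode_1 = mode_via_codes(blood_types, coding_1)
mode_2 = mode_via_codes(blood_types, coding_2)

print(f"Mean of codes: {mean_1} vs {mean_2}  (depends on the coding)")
print(f"Mode: {mode_1} vs {mode_2}  (same category either way)")
```

The mean answers a question about the coding scheme rather than about the data, which is the practical reason to avoid it at this level.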

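import pandas as pd
from scipy.stats import chi2_contingency

# Nominal: blood type distribution
blood_types = pd.Series(['A', 'O', 'B', 'AB', 'O', 'A', 'O', 'A', 'B', 'O'])
print("Mode:", blood_types.mode()[0])
print(blood_types.value_counts())

# Chi-square test of independence (nominal vs nominal).
# The Rh factor values below are made-up illustration data.
rh_factor = pd.Series(['+', '+', '-', '+', '+', '-', '+', '+', '+', '-'])
table = pd.crosstab(blood_types, rh_factor)   # contingency table of counts
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, dof = {dof}")
# With only 10 observations the expected counts are too small for a
# trustworthy p-value; this shows the mechanics only.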

2. Ordinal Scale

Categories have a meaningful order, but the intervals between categories are unknown or unequal.

Properties:

  • Identity + Order
  • No distance, no meaningful zero

Examples:

  • Survey Likert scales (Strongly Disagree → Strongly Agree)
  • Education level (High School < Bachelor's < Master's < PhD)
  • Race finishing position (1st, 2nd, 3rd)
  • Socioeconomic status (Low, Middle, High)

Valid statistics: Median, IQR, percentiles, Spearman rank correlation, Mann-Whitney test
Invalid: Arithmetic mean (debated), standard deviation, Pearson r

import numpy as np
from scipy.stats import spearmanr

# Ordinal: finishing positions of the same four runners in two races
race_1 = [1, 3, 5, 7]   # positions in the first race
race_2 = [2, 4, 6, 8]   # positions in the second race

# Spearman correlation (rank-based, so appropriate for ordinal data).
# It needs paired observations: one pair of positions per runner.
rho, p = spearmanr(race_1, race_2)
print(f"Spearman ρ = {rho:.3f}, p = {p:.4f}")

# Median is appropriate for ordinal
satisfaction = [3, 4, 2, 5, 4, 3, 4, 5, 2, 4]  # 1–5 scale
print(f"Median satisfaction: {np.median(satisfaction)}")
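
The "(debated)" label on the arithmetic mean can be made concrete with a small sketch. On an ordinal scale only the order of the codes matters, so any order-preserving recoding is as legitimate as the original 1 to 5 coding. The two groups and the `recode` mapping below are made-up illustration data, not results from a real survey. Under the recoding, the comparison of means reverses while the comparison of medians stays the same.

```python
from statistics import mean, median

# Made-up ratings on a 1-5 ordinal scale
group_a = [3, 3, 3, 3, 3]
group_b = [1, 1, 2, 5, 5]

# Hypothetical order-preserving recoding: only the top category moves
recode = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}
recoded_a = [recode[x] for x in group_a]
recoded_b = [recode[x] for x in group_b]

mean_a, mean_b = mean(group_a), mean(group_b)
mean_a_rec, mean_b_rec = mean(recoded_a), mean(recoded_b)
median_a, median_b = median(group_a), median(group_b)
median_a_rec, median_b_rec = median(recoded_a), median(recoded_b)

print(f"Means before recoding:   A = {mean_a}, B = {mean_b}")
print(f"Means after recoding:    A = {mean_a_rec}, B = {mean_b_rec}")
print(f"Medians before recoding: A = {median_a}, B = {median_b}")
print(f"Medians after recoding:  A = {median_a_rec}, B = {median_b_rec}")
```

Group A has the higher mean under the original coding and the lower mean under the recoding, whereas its median is higher in both cases. In practice many analysts still average Likert items, on the assumption that the categories are roughly evenly spaced.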

3. Interval Scale

Equal intervals between values, but no true zero — zero is arbitrary, not the absence of the quantity.

Properties:

  • Identity + Order + Equal Intervals
  • No true zero (ratios meaningless)

Examples:

  • Temperature in Celsius or Fahrenheit (0°C ≠ "no temperature")
  • IQ scores (IQ 0 doesn't mean no intelligence)
  • Calendar years (Year 0 is arbitrary)
  • Likert scales (when treated as interval — common in practice)

Valid statistics: Mean, standard deviation, Pearson r, t-tests, ANOVA
Invalid: Ratios ("twice as hot" is not meaningful in Celsius)

# Temperature conversion — shows why ratios fail for interval data
celsius_a = 20
celsius_b = 40

# It is NOT true that 40°C is "twice as hot" as 20°C
# Convert to Kelvin (ratio scale) to see why:
kelvin_a = celsius_a + 273.15  # 293.15 K
kelvin_b = celsius_b + 273.15  # 313.15 K

ratio_celsius = celsius_b / celsius_a        # 2.0 — misleading!
ratio_kelvin  = kelvin_b / kelvin_a          # 1.068 — true ratio

print(f"Celsius ratio: {ratio_celsius:.3f}  ← NOT meaningful")
print(f"Kelvin ratio:  {ratio_kelvin:.3f}  ← Meaningful thermodynamic ratio")
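
A complementary sketch shows what does survive a change of interval scale. Converting Celsius to Fahrenheit is a linear rescaling with a shifted zero, so ratios of differences and Pearson r are unchanged, while ratios of raw values are not. The `ice_cream_sales` array is made-up illustration data, and `np.corrcoef` is used here as one of several ways to compute Pearson r.

```python
import numpy as np

celsius = np.array([10.0, 15.0, 20.0, 30.0, 40.0])
fahrenheit = celsius * 9 / 5 + 32          # same temperatures, different zero and unit

# Made-up paired variable, for illustration only
ice_cream_sales = np.array([12.0, 18.0, 21.0, 33.0, 39.0])

# Ratio of raw values: changes with the scale
value_ratio_c = celsius[4] / celsius[2]
value_ratio_f = fahrenheit[4] / fahrenheit[2]

# Ratio of differences: the same on both scales
diff_ratio_c = (celsius[4] - celsius[2]) / (celsius[2] - celsius[0])
diff_ratio_f = (fahrenheit[4] - fahrenheit[2]) / (fahrenheit[2] - fahrenheit[0])

# Pearson r: the same on both scales
r_c = np.corrcoef(celsius, ice_cream_sales)[0, 1]
r_f = np.corrcoef(fahrenheit, ice_cream_sales)[0, 1]

print(f"Value ratio:      {value_ratio_c:.3f} (°C) vs {value_ratio_f:.3f} (°F)")
print(f"Difference ratio: {diff_ratio_c:.3f} (°C) vs {diff_ratio_f:.3f} (°F)")
print(f"Pearson r:        {r_c:.4f} (°C) vs {r_f:.4f} (°F)")
```

This is why statements such as "the rise from 20°C to 40°C is twice the rise from 10°C to 20°C" are meaningful on an interval scale even though "40°C is twice as hot as 20°C" is not.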

4. Ratio Scale

The strongest level. Has all properties of interval scale plus a true absolute zero (zero means absence of the attribute).

Properties:

  • Identity + Order + Equal Intervals + True Zero

Examples:

  • Height, weight, length (0 kg = no mass)
  • Age, time duration
  • Income (0 = no income)
  • Temperature in Kelvin
  • Number of items (count data)

Valid statistics: All statistics including geometric mean, coefficient of variation, and ratio comparisons.

import numpy as np

heights_m = np.array([1.65, 1.72, 1.80, 1.58, 1.90])

print(f"Mean: {np.mean(heights_m):.3f} m")
print(f"Ratio (tallest/shortest): {heights_m.max()/heights_m.min():.3f}")
print(f"Geometric mean: {np.exp(np.log(heights_m).mean()):.3f} m")
print(f"CV (coeff of variation): {(np.std(heights_m, ddof=1)/np.mean(heights_m)*100):.1f}%")  # ddof=1: sample SD
# All valid because height is ratio scale
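
The coefficient of variation is a good test case for the true-zero requirement. In the sketch below, the helper `cv` and the temperature readings are illustrative assumptions. Changing units on a ratio scale (metres to centimetres) leaves the CV unchanged, whereas changing between two interval scales (Celsius to Fahrenheit) alters it, because the shifted zero changes the mean without changing the spread proportionally.

```python
import numpy as np

def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return np.std(values, ddof=1) / np.mean(values)

# Ratio scale: a unit change is a pure multiplication
heights_m = np.array([1.65, 1.72, 1.80, 1.58, 1.90])
heights_cm = heights_m * 100

# Interval scale: a unit change also shifts the zero point
temps_c = np.array([10.0, 15.0, 20.0, 30.0, 40.0])   # made-up readings
temps_f = temps_c * 9 / 5 + 32

cv_m, cv_cm = cv(heights_m), cv(heights_cm)
cv_c, cv_f = cv(temps_c), cv(temps_f)

print(f"CV of height:      {cv_m:.4f} (m)  vs {cv_cm:.4f} (cm)")
print(f"CV of temperature: {cv_c:.4f} (°C) vs {cv_f:.4f} (°F)")
```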

Summary Table

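Level    | Order | Equal Intervals | True Zero | Example             | Appropriate Measure of Center
Nominal  | No    | No              | No        | Eye color           | Mode
Ordinal  | Yes   | No              | No        | Satisfaction rating | Median
Interval | Yes   | Yes             | No        | Temperature (°C)    | Arithmetic mean
Ratio    | Yes   | Yes             | Yes       | Height, weight      | Geometric mean possible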
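
The levels are cumulative: each one permits everything the levels below it permit, plus something new. One way to sketch that structure in code is an ordered enumeration. The names `Level`, `NEW_AT_LEVEL`, and `permitted` are hypothetical, and the statistics listed are a small sample drawn from the sections above rather than a complete catalogue.

```python
from enum import IntEnum

class Level(IntEnum):
    """Levels of measurement, ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Statistics that first become meaningful at each level (illustrative subset)
NEW_AT_LEVEL = {
    Level.NOMINAL: {"frequency", "mode"},
    Level.ORDINAL: {"median", "percentiles"},
    Level.INTERVAL: {"mean", "standard deviation"},
    Level.RATIO: {"geometric mean", "coefficient of variation"},
}

def permitted(level):
    """Return the statistics allowed at `level` and at every lower level."""
    allowed = set()
    for lower in Level:
        if lower <= level:
            allowed |= NEW_AT_LEVEL[lower]
    return allowed

print(sorted(permitted(Level.ORDINAL)))
print(sorted(permitted(Level.RATIO)))
```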

Choosing the Right Statistical Test

def suggest_test(level_of_measurement, n_groups, paired=False):
    """Suggest an appropriate statistical test based on measurement level."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if level_of_measurement == 'nominal':
        return "Chi-square test (categories) or Fisher's exact test (small samples)"
    elif level_of_measurement == 'ordinal':
        if n_groups == 1:
            return "One-sample sign test"
        elif n_groups == 2:
            return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"
        else:
            return "Friedman" if paired else "Kruskal-Wallis"
    elif level_of_measurement in ('interval', 'ratio'):
        if n_groups == 1:
            return "One-sample t-test"
        elif n_groups == 2:
            return "Paired t-test" if paired else "Independent t-test"
        else:
            return "Repeated measures ANOVA" if paired else "One-way ANOVA"
    # Unknown level: fail loudly instead of silently returning None
    raise ValueError(f"Unknown level of measurement: {level_of_measurement!r}")

# Examples
print(suggest_test('nominal', 2))
print(suggest_test('ordinal', 2))
print(suggest_test('ratio', 2, paired=False))
print(suggest_test('ratio', 3))
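
As a rough illustration of where such suggestions lead, the sketch below runs the two-group, independent-samples case at two levels using SciPy's `mannwhitneyu` and `ttest_ind`. All four samples are made-up illustration data, and using Welch's variant of the t-test (`equal_var=False`) is an assumption made here, not something the lesson prescribes.

```python
from scipy.stats import mannwhitneyu, ttest_ind

# Ordinal: made-up satisfaction ratings (1-5) from two independent groups
ratings_a = [3, 4, 2, 5, 4, 3, 4]
ratings_b = [2, 3, 1, 3, 2, 4, 2]
u_stat, p_u = mannwhitneyu(ratings_a, ratings_b, alternative='two-sided')
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_u:.4f}")

# Ratio: made-up heights in metres from two independent groups
heights_a = [1.65, 1.72, 1.80, 1.58, 1.90]
heights_b = [1.60, 1.68, 1.75, 1.55, 1.82]
t_stat, p_t = ttest_ind(heights_a, heights_b, equal_var=False)  # Welch's t-test
print(f"t = {t_stat:.3f}, p = {p_t:.4f}")
```

The measurement level narrows the choice of test but does not settle it: sample size, distribution shape, and study design still need to be checked.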

Key Takeaways

  1. Nominal: categories only — use frequency, mode, chi-square
  2. Ordinal: ordered categories — use median, rank-based tests
  3. Interval: equal gaps, no true zero — use mean, t-test, Pearson r
  4. Ratio: everything interval + true zero — use all statistics including ratios and CV
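  5. Applying higher-level statistics to lower-level data (for example, averaging nominal codes) gives results that depend on arbitrary coding choices; the practice is nonetheless common, and for ordinal data it remains debated
  6. When in doubt, use methods designed for the lower level, such as rank-based tests: they make fewer assumptions about the scale, at some cost in statistical power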
