Time Series Basics: Trend, Seasonality

What is a Time Series?

Definition: Time Series

A sequence of data points indexed in temporal order, where each observation is associated with a timestamp. Formally, a time series is a stochastic process

$$Y = \{y_1, y_2, \ldots, y_T\}$$

where $y_t$ is the observation at time $t$.

Time Series Example (Daily Temperature):

Temp(°F)
  95│            ╭─╮     ╭─╮
  90│     ╭─╮   ╭╯ ╰╮   ╭╯ ╰╮
  85│    ╭╯ ╰╮ ╭╯   ╰╮ ╭╯   ╰╮
  80│   ╭╯   ╰─╯     ╰─╯     ╰╮
  75│───╯                       ╰────
  70│
    └────────────────────────────────→ Time
     Jan  Feb  Mar  Apr  May  Jun  Jul

  Components: Upward trend + Seasonal pattern + Random noise
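
As a minimal sketch of the definition above, a time series can be stored in a pandas Series whose index holds the timestamps. The temperature values and the name `temps` are made up for illustration.

```python
import pandas as pd

# Hypothetical daily temperatures (°F); the numbers are illustrative only
dates = pd.date_range(start='2023-01-01', periods=7, freq='D')
temps = pd.Series([75.0, 76.5, 78.0, 80.5, 79.0, 82.0, 84.5], index=dates)

# y_t is the observation attached to timestamp t
y_first = temps.iloc[0]                          # first observation, y_1
y_jan_04 = temps.loc[pd.Timestamp('2023-01-04')]  # observation on a given date

# Temporal order is part of the data: the index is sorted in time
print(temps.index.is_monotonic_increasing)
print(len(temps), y_first, y_jan_04)
```

Keeping the timestamps in the index is what lets later steps (differencing, decomposition, resampling) respect the ordering of the observations.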

Time Series Components

Definition: Time Series Decomposition

The process of separating a time series into its constituent components: trend, seasonality, cyclicity, and noise. The two fundamental models are additive (components sum) and multiplicative (components multiply).

A time series is commonly modeled as a combination of these components:

Additive Decomposition Model

$$Y_t = T_t + S_t + C_t + \epsilon_t$$

Here,

  • $T_t$ = Trend: long-term direction
  • $S_t$ = Seasonality: fixed-period patterns
  • $C_t$ = Cyclicity: irregular long-term fluctuations
  • $\epsilon_t$ = Noise: random, unpredictable variation

or

Multiplicative Decomposition Model

$$Y_t = T_t \times S_t \times C_t \times \epsilon_t$$

Here,

  • $T_t$ = Trend component
  • $S_t$ = Seasonal component (multiplicative factor)
  • $C_t$ = Cyclical component
  • $\epsilon_t$ = Noise component

Component Breakdown:
┌─────────────────────────────────────────────────────────────┐
│  Y_t = T_t + S_t + C_t + ε_t                               │
│                                                             │
│  T_t = Trend        → Long-term direction                   │
│  S_t = Seasonality  → Fixed-period patterns                 │
│  C_t = Cyclicity    → Irregular long-term fluctuations      │
│  ε_t = Noise        → Random, unpredictable variation       │
└─────────────────────────────────────────────────────────────┘

Use the additive model when the seasonal amplitude is roughly constant over time. Use the multiplicative model when the seasonal amplitude grows with the level of the series (e.g., airline passengers — the seasonal swing is larger when overall traffic is higher). A log transform can convert a multiplicative relationship into an additive one.
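
The log relationship can be checked numerically. The sketch below builds a small multiplicative series from made-up components (the trend, seasonal factor, and noise level are assumptions chosen for illustration) and confirms that the log of the product equals the sum of the logs.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(48)                                   # 4 years of monthly data

trend = 100 + 2.0 * t                               # T_t
seasonal = 1 + 0.2 * np.sin(2 * np.pi * t / 12)     # S_t as a factor around 1
noise = np.exp(rng.normal(0, 0.02, size=t.size))    # positive noise factor

y_mul = trend * seasonal * noise                    # multiplicative model

# Taking logs turns the product into a sum of log-components
log_sum = np.log(trend) + np.log(seasonal) + np.log(noise)
print(np.allclose(np.log(y_mul), log_sum))

# On the raw scale the seasonal swing grows with the level of the series
swing_first = y_mul[:12].max() - y_mul[:12].min()
swing_last = y_mul[-12:].max() - y_mul[-12:].min()
print(f"first-year swing: {swing_first:.1f}, last-year swing: {swing_last:.1f}")
```

The growing swing between the first and last year is the visual cue that points toward a multiplicative model (or an additive model on the log scale).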

1. Trend

The long-term direction of the series (upward, downward, or flat).

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Create synthetic time series
np.random.seed(42)
n = 365
time = np.arange(n)

trend = 0.05 * time                    # Linear upward trend
seasonal = 10 * np.sin(2 * np.pi * time / 365)  # Yearly cycle
noise = np.random.randn(n) * 2

y = trend + seasonal + noise

fig, axes = plt.subplots(4, 1, figsize=(12, 8), sharex=True)

axes[0].plot(time, y, 'b-', linewidth=0.8, alpha=0.7)
axes[0].set_title('Original Series')

axes[1].plot(time, trend, 'r-', linewidth=2)
axes[1].set_title('Trend Component')

axes[2].plot(time, seasonal, 'g-', linewidth=1)
axes[2].set_title('Seasonal Component')

axes[3].plot(time, noise, 'gray', linewidth=0.5, alpha=0.5)
axes[3].set_title('Noise Component')

plt.tight_layout()
plt.savefig('ts_components.png', dpi=150)
plt.show()
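
When the components are not known in advance, the trend has to be estimated. One simple approach, sketched here under the assumption of three years of synthetic daily data with a yearly period, is a centered moving average whose window spans exactly one seasonal period, so the seasonal term averages out.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 3 * 365
time = np.arange(n)
y = pd.Series(
    0.05 * time + 10 * np.sin(2 * np.pi * time / 365) + rng.normal(0, 2, n)
)

# A centered moving average over one full period cancels the seasonal term
trend_est = y.rolling(window=365, center=True).mean()

# Fit a straight line to the smoothed values to summarize the trend
valid = trend_est.dropna()
slope = np.polyfit(valid.index.to_numpy(), valid.to_numpy(), deg=1)[0]
print(f"Estimated slope: {slope:.4f} (slope used to generate the data: 0.05)")
```

A straight-line fit to the raw one-year series would be misleading here: over a single period the sine term is itself correlated with time, so it distorts the fitted slope. Smoothing over a full period first avoids that.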

2. Seasonality

Repeating patterns at fixed intervals (daily, weekly, monthly, yearly).

# Types of seasonality
fig, axes = plt.subplots(3, 1, figsize=(12, 6))

t = np.arange(0, 7*24, 1)  # 7 days of hourly data

# Daily pattern (phase-shifted so the peak falls at noon, hour 12 of each day)
daily = 5 * np.sin(2 * np.pi * (t - 6) / 24)
axes[0].plot(t, daily, 'b-')
axes[0].set_title('Daily Seasonality (24-hour cycle)')
axes[0].set_ylabel('Effect')

# Weekly pattern (one smooth cycle per 168 hours; the phase shift is illustrative)
weekly = 3 * np.sin(2 * np.pi * t / (24*7) + np.pi/4)
axes[1].plot(t, weekly, 'r-')
axes[1].set_title('Weekly Seasonality (168-hour cycle)')
axes[1].set_ylabel('Effect')

# Combined
axes[2].plot(t, daily + weekly, 'g-', linewidth=0.8)
axes[2].set_title('Combined Seasonality')
axes[2].set_xlabel('Hours')
axes[2].set_ylabel('Effect')

plt.tight_layout()
plt.savefig('seasonality_types.png', dpi=150)
plt.show()
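
A fixed-period pattern can be recovered by averaging all observations that share the same position in the cycle. The sketch below assumes four weeks of synthetic hourly data with a daily and a weekly term; the amplitudes and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(4 * 7 * 24)                         # 4 weeks of hourly observations
daily = 5 * np.sin(2 * np.pi * (t - 6) / 24)      # daily term, peak at hour 12
weekly = 3 * np.sin(2 * np.pi * t / (24 * 7))     # weekly term
y = daily + weekly + rng.normal(0, 0.5, t.size)

# Average all observations that share the same hour of the day.
# Over whole weeks the weekly term averages out, leaving the daily profile.
hour_of_day = t % 24
daily_profile = np.array([y[hour_of_day == h].mean() for h in range(24)])

peak_hour = int(daily_profile.argmax())
print(f"Estimated peak hour: {peak_hour}")
```

The same idea (group by position in the cycle, then average) underlies the seasonal estimate in classical decomposition.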

3. Cyclicity

Long-term oscillations without fixed periods (economic cycles, climate patterns).

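# Illustrative cyclical component: two superimposed sinusoids with periods of
# about 91 and 146 days. Real cycles have no fixed period; fixed sinusoids are
# used here only to keep the example simple.
cycle = (15 * np.sin(2 * np.pi * time / 365 * 4)
         + 8 * np.sin(2 * np.pi * time / 365 * 2.5))

plt.figure(figsize=(12, 4))
plt.plot(time, cycle, 'purple', linewidth=1.5)
plt.title('Cyclical Component (illustrative)')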
plt.xlabel('Days')
plt.ylabel('Effect')
plt.savefig('cyclical_component.png', dpi=150)
plt.show()

Stationarity

Definition: Stationarity

A time series is (weakly) stationary if its statistical properties do not change over time: constant mean, constant variance, and autocovariance that depends only on the lag, not on the absolute time index. Many classical models (ARMA, for example) assume stationarity; ARIMA reaches it by differencing the series first.

$$E[y_t] = \mu \;\forall\, t, \quad \text{Var}(y_t) = \sigma^2 \;\forall\, t, \quad \text{Cov}(y_t, y_{t+h}) = \gamma(h)$$

Stationary vs Non-Stationary:

Stationary:                    Non-Stationary:
  y ↑    ────────              y ↑         ╱
    │  ───╱╲──╱╲──               │       ╱
    │ ──╱╲──╱╲──╱╲──             │     ╱
    │╱╲──╱╲──╱╲──╱╲──            │   ╱
    └──────────────→ t           │ ╱
    Mean & variance constant     └──────────────→ t
                                  Increasing trend → non-stationary
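
An informal way to see the difference is to compare summary statistics over different stretches of the series. This sketch uses two synthetic series (the slope and noise level are assumptions) and compares the mean of the first half with the mean of the second half. It is a quick visual-style check, not a substitute for a formal test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)

stationary = rng.normal(0, 1, n)               # constant mean and variance
trending = 0.01 * t + rng.normal(0, 1, n)      # mean drifts upward over time

def half_means(x):
    """Mean of the first and second half of a series."""
    half = len(x) // 2
    return x[:half].mean(), x[half:].mean()

m1_s, m2_s = half_means(stationary)
m1_t, m2_t = half_means(trending)
print(f"stationary: {m1_s:.2f} vs {m2_s:.2f}")
print(f"trending:   {m1_t:.2f} vs {m2_t:.2f}")
```

For the stationary series the two halves have nearly the same mean; for the trending series the second half sits clearly higher, which violates the constant-mean condition.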

Augmented Dickey-Fuller (ADF) Test

Definition: Augmented Dickey-Fuller Test

A statistical test for the presence of a unit root in a time series. A unit root indicates the series is non-stationary — shocks persist indefinitely. The ADF test regresses the first difference on lagged levels and lagged differences, testing whether the coefficient on the lagged level is significantly negative.

The ADF test equation:

Augmented Dickey-Fuller Regression

$$\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^{p} \delta_i \Delta y_{t-i} + \epsilon_t$$

Here,

  • $\Delta y_t$ = First difference of the series
  • $\gamma$ = Coefficient tested (unit root if $\gamma = 0$)
  • $p$ = Number of lagged difference terms
  • $\alpha$ = Constant (drift)
  • $\beta$ = Time trend coefficient

Hypotheses:

  • $H_0$: the series has a unit root (non-stationary), $\gamma = 0$
  • $H_1$: the series is stationary, $\gamma < 0$

Decision rule: If p-value < 0.05, reject H0 and treat the series as stationary. Failing to reject does not prove a unit root; it only means the test found no evidence against one.

The ADF test's power depends on the number of lags $p$. Too few lags leave autocorrelation in the residuals; too many reduce power. Use information criteria (AIC, BIC) to select the lag order, or let adfuller choose automatically via autolag='AIC'. Note that adfuller includes only the constant by default (regression='c'); pass regression='ct' to include the time trend term $\beta t$ from the equation above.
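
To make the role of $\gamma$ concrete, the sketch below estimates it by ordinary least squares in a deliberately simplified version of the regression: constant only, no time trend, and no lagged differences ($p = 0$). The helper name `df_gamma` and the simulated series are assumptions for illustration; a real test must also compare the t-statistic against Dickey-Fuller critical values, which is what adfuller takes care of.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
eps = rng.normal(0, 1, n)

random_walk = np.cumsum(eps)          # y_t = y_{t-1} + eps_t  (unit root)
ar1 = np.zeros(n)                     # y_t = 0.5 * y_{t-1} + eps_t (stationary)
for i in range(1, n):
    ar1[i] = 0.5 * ar1[i - 1] + eps[i]

def df_gamma(y):
    """Estimate gamma in  dy_t = alpha + gamma * y_{t-1} + e_t  by least squares."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    return coef[1]

gamma_rw = df_gamma(random_walk)
gamma_ar1 = df_gamma(ar1)
print(f"random walk: gamma = {gamma_rw:.4f}")
print(f"AR(1), phi = 0.5: gamma = {gamma_ar1:.4f}")
```

For the random walk the estimate sits close to zero (consistent with a unit root), while for the stationary AR(1) series it is close to $\phi - 1 = -0.5$.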

import pandas as pd
from statsmodels.tsa.stattools import adfuller

def adf_test(series, title=''):
    """Perform ADF test, print results, and return the p-value."""
    # Accept NumPy arrays as well as pandas Series (arrays have no .dropna())
    series = pd.Series(series).dropna()
    result = adfuller(series, autolag='AIC')

    print(f'ADF Test: {title}')
    print('=' * 50)
    print(f'ADF Statistic:  {result[0]:.4f}')
    print(f'p-value:        {result[1]:.4f}')
    print(f'Lags Used:      {result[2]}')
    print(f'Observations:   {result[3]}')

    for key, value in result[4].items():
        print(f'Critical Value ({key}): {value:.4f}')

    if result[1] < 0.05:
        print('\n✓ Stationary (reject H0 at 5% level)')
    else:
        print('\n✗ Non-stationary (fail to reject H0)')

    return result[1]

# Test on original series
adf_test(y, 'Original Series')

# Test on differenced series
y_diff = np.diff(y)
adf_test(y_diff, 'Differenced Series')

Decomposition Methods

Classical Decomposition

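import pandas as pd

# seasonal_decompose needs at least two full seasonal cycles, so a single
# year of daily data with period=365 raises a ValueError. Extend the synthetic
# series to three years before decomposing.
n_long = 3 * 365
t_long = np.arange(n_long)
y_long = (0.05 * t_long
          + 10 * np.sin(2 * np.pi * t_long / 365)
          + np.random.randn(n_long) * 2)

# Create a date-indexed Series
dates = pd.date_range(start='2023-01-01', periods=n_long, freq='D')
ts = pd.Series(y_long, index=dates)

# Additive decomposition
decomposition_add = seasonal_decompose(
    ts, model='additive', period=365
)

# Multiplicative decomposition (requires strictly positive data)
ts_positive = ts + 20  # Shift to positive values
decomposition_mul = seasonal_decompose(
    ts_positive, model='multiplicative', period=365
)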

# Plot additive decomposition
fig = decomposition_add.plot()
fig.set_size_inches(12, 8)
plt.suptitle('Additive Decomposition', fontsize=14, y=1.02)
plt.tight_layout()
plt.savefig('additive_decomposition.png', dpi=150)
plt.show()

STL Decomposition (Seasonal-Trend using LOESS)

Definition: STL Decomposition

Seasonal and Trend decomposition using LOESS (Locally Estimated Scatterplot Smoothing). STL is more flexible than classical decomposition: it handles any seasonal period, can be made robust to outliers, and allows the seasonal component to change over time.

from statsmodels.tsa.seasonal import STL

# STL decomposition
stl = STL(ts, period=365, robust=True)
result = stl.fit()

fig = result.plot()
fig.set_size_inches(12, 8)
plt.suptitle('STL Decomposition', fontsize=14, y=1.02)
plt.tight_layout()
plt.savefig('stl_decomposition.png', dpi=150)
plt.show()

# Access components
trend_stl = result.trend
seasonal_stl = result.seasonal
residual_stl = result.resid

Choosing Between Additive and Multiplicative

def check_model_type(series, period):
    """Rough heuristic for choosing additive vs. multiplicative decomposition.

    Compares the seasonal amplitude with the mean level. The 0.3 threshold is
    arbitrary, and a large ratio alone does not prove a multiplicative
    structure; also check whether the seasonal swing grows with the level.
    """
    decomposition = seasonal_decompose(series, model='additive', period=period)

    seasonal_amplitude = decomposition.seasonal.max() - decomposition.seasonal.min()

    print(f"Seasonal amplitude: {seasonal_amplitude:.4f}")
    print(f"Series mean:        {series.mean():.4f}")

    ratio = seasonal_amplitude / series.mean()
    print(f"Amplitude/Mean:     {ratio:.4f}")

    if ratio > 0.3:
        print("→ Large seasonal variation relative to level → consider Multiplicative")
    else:
        print("→ Small seasonal variation relative to level → Additive")

    return 'multiplicative' if ratio > 0.3 else 'additive'

model_type = check_model_type(ts, 365)
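
A complementary check looks at whether the seasonal swing changes with the level, which is the criterion that actually separates the two models. The helper `swing_vs_level_slope` below is a hypothetical sketch, not a standard library function: it splits the series into whole cycles and regresses each cycle's peak-to-trough range on its mean level.

```python
import numpy as np

def swing_vs_level_slope(values, period):
    """Slope of per-cycle peak-to-trough range regressed on per-cycle mean level.

    Near 0   -> swing does not depend on level (additive-looking)
    Positive -> swing grows with level (multiplicative-looking)
    """
    values = np.asarray(values, dtype=float)
    n_cycles = len(values) // period
    cycles = values[: n_cycles * period].reshape(n_cycles, period)
    levels = cycles.mean(axis=1)
    ranges = cycles.max(axis=1) - cycles.min(axis=1)
    return np.polyfit(levels, ranges, deg=1)[0]

t = np.arange(8 * 12)                              # 8 years of monthly data
trend = 100 + 2.0 * t
season = np.sin(2 * np.pi * t / 12)

additive = trend + 20 * season                     # constant seasonal swing
multiplicative = trend * (1 + 0.2 * season)        # swing scales with the level

slope_add = swing_vs_level_slope(additive, 12)
slope_mul = swing_vs_level_slope(multiplicative, 12)
print(f"additive: {slope_add:.3f}, multiplicative: {slope_mul:.3f}")
```

On noisy real data the slope will not be exactly zero for an additive series, so it is best read alongside a plot rather than against a hard threshold.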

Making a Series Stationary

1. Differencing

First-Order Differencing

$$y'_t = y_t - y_{t-1}$$

Here,

  • $y'_t$ = Differenced series at time t
  • $y_t$ = Original series at time t

Second-order differencing:

$$y''_t = y'_t - y'_{t-1} = y_t - 2y_{t-1} + y_{t-2}$$

# First-order differencing
y_diff1 = ts.diff().dropna()
adf_test(y_diff1, 'First Differencing')

# Seasonal differencing (needs more than one full period of observations)
y_diff_seasonal = ts.diff(365).dropna()
adf_test(y_diff_seasonal, 'Seasonal Differencing (period=365)')

# Combined differencing
y_diff_combined = ts.diff().diff(365).dropna()
adf_test(y_diff_combined, 'Combined Differencing')
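
The two forms of the second-order difference can be verified on a toy array. The values below are made up; the point is only that differencing twice gives the same result as the expanded formula.

```python
import numpy as np

y = np.array([3.0, 5.0, 8.0, 12.0, 17.0, 23.0])   # toy data with a curved trend

d1 = y[1:] - y[:-1]                        # y'_t  = y_t - y_{t-1}
d2_from_d1 = d1[1:] - d1[:-1]              # y''_t = y'_t - y'_{t-1}
d2_formula = y[2:] - 2 * y[1:-1] + y[:-2]  # y_t - 2*y_{t-1} + y_{t-2}

print(d1)            # [2. 3. 4. 5. 6.]
print(d2_formula)    # [1. 1. 1. 1.]
print(np.array_equal(d2_from_d1, d2_formula))
```

The first difference still trends upward, while the second difference is constant: one round of differencing removes a linear trend, two rounds remove a quadratic one.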

2. Log Transformation

Stabilizes variance when it changes with level:

Log Transformation

$$y'_t = \log(y_t)$$

Here,

  • $y'_t$ = Transformed series (log scale)
  • $y_t$ = Original series (positive values)

# Log transformation (for positive series)
y_log = np.log(ts_positive)
y_log_diff = y_log.diff().dropna()

fig, axes = plt.subplots(1, 2, figsize=(14, 4))
ts_positive.plot(ax=axes[0], title='Original (Positive)')
y_log_diff.plot(ax=axes[1], title='Log-Transformed + Differenced')
plt.tight_layout()
plt.savefig('log_transform.png', dpi=150)
plt.show()
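
The variance-stabilizing effect is easiest to see on a series whose noise scales with its level. The sketch below uses a synthetic exponentially growing series (growth rate and noise level are assumptions) and compares the spread of the differences in the first and second half, before and after the log.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(400)
level = np.exp(0.01 * t)                           # exponentially growing level
y = level * np.exp(rng.normal(0, 0.1, t.size))     # noise proportional to level

raw_diff = np.diff(y)
log_diff = np.diff(np.log(y))
half = len(raw_diff) // 2

# Ratio of second-half spread to first-half spread (1.0 means stable variance)
raw_ratio = raw_diff[half:].std() / raw_diff[:half].std()
log_ratio = log_diff[half:].std() / log_diff[:half].std()
print(f"raw differences: {raw_ratio:.2f}")
print(f"log differences: {log_ratio:.2f}")
```

The raw differences become several times more volatile as the level rises, while the log differences keep roughly the same spread throughout.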

3. Box-Cox Transformation

Power transformation that stabilizes variance:

Box-Cox Transformation

$$y^{(\lambda)} = \begin{cases} \frac{y^\lambda - 1}{\lambda} & \lambda \neq 0 \\ \ln(y) & \lambda = 0 \end{cases}$$

Here,

  • $\lambda$ = Transformation parameter (optimized via MLE)
  • $y$ = Original series (positive values)

The optimal $\lambda$ is chosen by maximum likelihood estimation. Common values: $\lambda = 1$ (no transform apart from a shift), $\lambda = 0.5$ (square root), $\lambda = 0$ (log). The Box-Cox transform requires strictly positive data. For non-positive data, use the Yeo-Johnson transform.

from scipy.stats import boxcox
from scipy.special import inv_boxcox

# Apply Box-Cox
y_bc, fitted_lambda = boxcox(ts_positive.values)
print(f"Optimal lambda: {fitted_lambda:.4f}")

# Inverse transform
y_original = inv_boxcox(y_bc, fitted_lambda)

fig, axes = plt.subplots(1, 2, figsize=(14, 4))
axes[0].plot(ts_positive.values, label='Original')
axes[0].set_title('Original Series')
axes[1].plot(y_bc, label='Box-Cox', color='orange')
axes[1].set_title(f'Box-Cox Transformed (λ={fitted_lambda:.2f})')
plt.tight_layout()
plt.savefig('boxcox_transform.png', dpi=150)
plt.show()
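
The piecewise formula can be written out directly and compared with SciPy for a fixed $\lambda$. The helper `boxcox_manual` is a hypothetical name used only for this sketch; the input values are arbitrary positive numbers.

```python
import numpy as np
from scipy.stats import boxcox

def boxcox_manual(y, lam):
    """Box-Cox transform for a fixed lambda, following the piecewise formula."""
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1) / lam

y = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# With lmbda given, scipy.stats.boxcox returns only the transformed array
print(np.allclose(boxcox_manual(y, 0.5), boxcox(y, lmbda=0.5)))
print(np.allclose(boxcox_manual(y, 0.0), np.log(y)))
print(np.allclose(boxcox_manual(y, 1.0), y - 1))
```

The last line shows why $\lambda = 1$ leaves the shape of the series unchanged: the transform only subtracts 1 from every value.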

Autocorrelation Analysis

ACF (Autocorrelation Function)

Measures correlation between $y_t$ and $y_{t-k}$:

Autocorrelation Function

$$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\text{Cov}(y_t, y_{t-k})}{\text{Var}(y_t)}$$

Here,

  • $\rho_k$ = Autocorrelation at lag k
  • $\gamma_k$ = Autocovariance at lag k
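
The formula translates into a few lines of NumPy. The function name `acf_manual` is an assumption for this sketch, and the estimator divides both autocovariances by the full sample size, which is one common convention.

```python
import numpy as np

def acf_manual(y, k):
    """Sample autocorrelation at lag k: gamma_k / gamma_0."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()                                  # deviations from the mean
    gamma_0 = np.sum(d * d) / len(y)                  # sample variance
    gamma_k = np.sum(d[k:] * d[: len(y) - k]) / len(y)  # sample autocovariance
    return gamma_k / gamma_0

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
for k in range(3):
    print(k, round(acf_manual(y, k), 3))
```

At lag 0 the autocorrelation is always 1, since a series is perfectly correlated with itself.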

PACF (Partial Autocorrelation Function)

Measures correlation between $y_t$ and $y_{t-k}$ after removing the effects of the intermediate lags $y_{t-1}, \ldots, y_{t-k+1}$.

ACF vs PACF Intuition:

  ACF: "How correlated is y_t with y_{t-k}?"
       Includes BOTH direct and indirect correlations.

  PACF: "How correlated is y_t with y_{t-k} DIRECTLY?"
         Removes the effect of y_{t-1}, y_{t-2}, ..., y_{t-k+1}

  Example:  y_t depends on y_{t-1}, and y_{t-1} depends on y_{t-2}
    ACF(2) = correlation(y_t, y_{t-2})
      = HIGH (because y_t -> y_{t-1} -> y_{t-2})

    PACF(2) = correlation(y_t, y_{t-2} | y_{t-1})
      = LOW (because the indirect path is removed)
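
The example above can be reproduced on simulated data. This sketch generates an AR(1) series with an assumed coefficient of 0.8, then estimates ACF(2) directly and PACF(2) as the coefficient on $y_{t-2}$ in a regression of $y_t$ on $y_{t-1}$ and $y_{t-2}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi = 5000, 0.8
y = np.zeros(n)
for i in range(1, n):
    y[i] = phi * y[i - 1] + rng.normal()   # y_t depends only on y_{t-1}

# ACF(2): plain correlation between y_t and y_{t-2}
d = y - y.mean()
acf2 = np.sum(d[2:] * d[:-2]) / np.sum(d * d)

# PACF(2): coefficient on y_{t-2} once y_{t-1} is already in the regression
X = np.column_stack([np.ones(n - 2), y[1:-1], y[:-2]])
coef, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
pacf2 = coef[2]

print(f"ACF(2)  = {acf2:.3f}  (theory: phi^2 = {phi ** 2:.2f})")
print(f"PACF(2) = {pacf2:.3f}  (theory: 0)")
```

ACF(2) is large because the dependence is passed along through $y_{t-1}$; PACF(2) is near zero because nothing links $y_t$ to $y_{t-2}$ directly.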

ACF/PACF Visual Rules for Model Selection

AR(p) - Autoregressive:
  "Current value depends on p past values"
  Equation: y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + epsilon_t

  ACF:  Tails off (gradual decay)
  PACF: Cuts off after lag p
  → Use PACF to determine p (the order)

MA(q) - Moving Average:
  "Current value depends on q past errors"
  Equation: y_t = c + epsilon_t + theta_1*epsilon_{t-1} + ... + theta_q*epsilon_{t-q}

  ACF:  Cuts off after lag q
  PACF: Tails off (gradual decay)
  → Use ACF to determine q (the order)

ARMA(p,q) - Combined:
  "Both AR and MA components"
  ACF:  Tails off
  PACF: Tails off
  → Use information criteria (AIC/BIC) to find p and q

ARIMA(p,d,q) - With Differencing:
  "Apply differencing d times, then fit ARMA(p,q)"
  If series is non-stationary, difference until stationary (d=1 or d=2)
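
The MA(q) cut-off rule can be checked on simulated data. The sketch below generates an MA(1) series with an assumed coefficient $\theta = 0.6$ and computes the first few sample autocorrelations; the helper `acf` is defined locally for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 5000, 0.6
eps = rng.normal(size=n + 1)
y = eps[1:] + theta * eps[:-1]          # MA(1): y_t = eps_t + theta * eps_{t-1}

def acf(x, k):
    """Sample autocorrelation of x at lag k."""
    d = x - x.mean()
    return np.sum(d[k:] * d[: len(x) - k]) / np.sum(d * d)

rho = [acf(y, k) for k in range(4)]
theory_rho1 = theta / (1 + theta ** 2)  # lag-1 autocorrelation of an MA(1)

for k, r in enumerate(rho):
    print(f"lag {k}: {r:.3f}")
print(f"theoretical lag-1 value: {theory_rho1:.3f}")
```

Only lag 1 shows a clear autocorrelation; lags 2 and 3 are close to zero, which is the "cuts off after lag q" pattern with q = 1.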

Real-World Example: Air Passengers

Air Passengers Time Series Analysis

from statsmodels.datasets import get_rdataset

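# Load Air Passengers dataset (get_rdataset downloads it, so this needs network access)
airline = get_rdataset('AirPassengers').data
# 'MS' = month start; the 'M' alias is deprecated in recent pandas releases
airline['time'] = pd.date_range('1949-01', periods=len(airline), freq='MS')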
airline.set_index('time', inplace=True)
ts_air = airline['value']

# Visualize
fig, axes = plt.subplots(2, 1, figsize=(12, 6))
ts_air.plot(ax=axes[0], title='Monthly Airline Passengers')
axes[0].set_ylabel('Passengers (thousands)')

# Log transform
np.log(ts_air).plot(ax=axes[1], title='Log-Transformed')
axes[1].set_ylabel('Log(Passengers)')
plt.tight_layout()
plt.savefig('airpassengers.png', dpi=150)
plt.show()

# Decomposition
decomp = seasonal_decompose(ts_air, model='multiplicative', period=12)
fig = decomp.plot()
fig.set_size_inches(12, 8)
plt.suptitle('Air Passengers - Multiplicative Decomposition', y=1.02)
plt.tight_layout()
plt.savefig('airpassengers_decomp.png', dpi=150)
plt.show()

# ADF tests
print("Original:")
adf_test(ts_air)

print("\nLog-transformed:")
adf_test(np.log(ts_air))

print("\nLog + First differencing:")
adf_test(np.log(ts_air).diff().dropna())

print("\nLog + Seasonal differencing:")
adf_test(np.log(ts_air).diff(12).dropna())

Key Takeaways

  1. Decompose first: Understand trend, seasonality, and noise before modeling
  2. Stationarity is key: Most time series models assume stationarity — use ADF test
  3. Differencing removes trends; seasonal differencing removes seasonal patterns
  4. Multiplicative models when seasonal amplitude grows with level; additive otherwise
  5. STL decomposition is more robust than classical methods
  6. Log transforms stabilize variance and convert multiplicative to additive patterns
  7. ACF/PACF guide model selection for ARIMA models (next lesson)

Summary of Time Series Basics: Trend, Seasonality

  1. A time series $Y = \{y_1, \ldots, y_T\}$ is decomposed into trend, seasonality, cyclicity, and noise.
  2. The additive model $Y_t = T_t + S_t + C_t + \epsilon_t$ applies when seasonal amplitude is constant; the multiplicative model $Y_t = T_t \times S_t \times C_t \times \epsilon_t$ applies when amplitude grows with level.
  3. Stationarity requires constant mean, constant variance, and lag-dependent autocovariance: $E[y_t] = \mu, \; \text{Var}(y_t) = \sigma^2, \; \text{Cov}(y_t, y_{t+h}) = \gamma(h)$.
  4. The ADF test checks for unit roots (non-stationarity). Reject H0 (p < 0.05) to conclude the series is stationary.
  5. Differencing ($y'_t = y_t - y_{t-1}$) removes trends; seasonal differencing removes seasonal patterns.
  6. Log transforms and Box-Cox transforms stabilize variance. Box-Cox optimizes $\lambda$ via MLE.
  7. STL decomposition uses LOESS smoothing and is more robust to outliers and changing seasonal patterns than classical methods.
  8. The ACF measures total correlation at lag k (direct + indirect); the PACF measures only direct correlation.
  9. ACF/PACF patterns guide ARIMA order selection: AR(p) cuts off at lag p in PACF; MA(q) cuts off at lag q in ACF.
  10. Always test stationarity before modeling — non-stationary series must be transformed or differenced.

Practice Exercises

  1. Decomposition: Decompose a daily temperature dataset. What period is most appropriate? Is the model additive or multiplicative?
  2. Stationarity: Take a non-stationary stock price series. How many differencing steps are needed to achieve stationarity?
  3. Seasonal Pattern: Identify the seasonal period in hourly electricity demand data. Plot ACF to confirm.
  4. Transformation: Compare log, Box-Cox, and square root transforms on a series with increasing variance. Which is most effective?
