Time Series Basics: Trend, Seasonality

What is a Time Series?

Definition: Time Series

A sequence of data points indexed in temporal order, where each observation is associated with a timestamp. Formally, an observed time series $Y = \{y_1, y_2, \ldots, y_T\}$ is one realization of a stochastic process, where $y_t$ is the observation at time $t$.


Time Series Example (Daily Temperature):

Temp(°F)
  95│            ╭─╮     ╭─╮
  90│     ╭─╮   ╭╯ ╰╮   ╭╯ ╰╮
  85│    ╭╯ ╰╮ ╭╯   ╰╮ ╭╯   ╰╮
  80│   ╭╯   ╰─╯     ╰─╯     ╰╮
  75│───╯                       ╰────
  70│
    └────────────────────────────────→ Time
     Jan  Feb  Mar  Apr  May  Jun  Jul

  Components: Upward trend + Seasonal pattern + Random noise
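
As a minimal sketch of the definition, the snippet below pairs each observation with a timestamp using plain NumPy arrays. The dates and temperature readings are made up for illustration.

```python
import numpy as np

# Seven daily timestamps (hypothetical dates) and made-up temperature readings
timestamps = np.arange('2023-01-01', '2023-01-08', dtype='datetime64[D]')
values = np.array([75.2, 76.1, 74.8, 77.3, 78.0, 76.5, 79.1])

# Temporal order: one value per timestamp, timestamps strictly increasing
assert len(timestamps) == len(values)
assert np.all(np.diff(timestamps) > np.timedelta64(0, 'D'))

# y_t for t = 3 (1-based), i.e. the third observation
t = 3
print(timestamps[t - 1], values[t - 1])
```

In practice a pandas Series with a DatetimeIndex plays the same role, as the later examples show.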

Time Series Components

Definition: Time Series Decomposition

The process of separating a time series into its constituent components: trend, seasonality, cyclicity, and noise. The two fundamental models are additive (components sum) and multiplicative (components multiply).

A time series is commonly modeled as a combination of these components:

Additive Decomposition Model

Y_t = T_t + S_t + C_t + \epsilon_t

Here,

$T_t$ = Trend: long-term direction
$S_t$ = Seasonality: fixed-period patterns
$C_t$ = Cyclicity: irregular long-term fluctuations
$\epsilon_t$ = Noise: random, unpredictable variation

Multiplicative Decomposition Model

Y_t = T_t \times S_t \times C_t \times \epsilon_t

Here,

$T_t$ = Trend component
$S_t$ = Seasonal component (multiplicative factor)
$C_t$ = Cyclical component
$\epsilon_t$ = Noise component


Component Breakdown:
┌─────────────────────────────────────────────────────────────┐
│  Y_t = T_t + S_t + C_t + ε_t                               │
│                                                             │
│  T_t = Trend        → Long-term direction                   │
│  S_t = Seasonality  → Fixed-period patterns                 │
│  C_t = Cyclicity    → Irregular long-term fluctuations      │
│  ε_t = Noise        → Random, unpredictable variation       │
└─────────────────────────────────────────────────────────────┘

Use the additive model when the seasonal amplitude is roughly constant over time. Use the multiplicative model when the seasonal amplitude grows with the level of the series (e.g., airline passengers — the seasonal swing is larger when overall traffic is higher). A log transform can convert a multiplicative relationship into an additive one.
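
The log-transform remark can be made explicit. Taking logarithms of the multiplicative model, under the assumption that every component is strictly positive, gives a model that is additive in the logged components:

```latex
\log Y_t = \log T_t + \log S_t + \log C_t + \log \epsilon_t
```

An additive decomposition applied to $\log Y_t$ therefore estimates $\log T_t$, $\log S_t$, and so on; exponentiating each estimate returns the corresponding multiplicative factor.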

1. Trend

The long-term direction of the series (upward, downward, or flat).

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Create synthetic time series
np.random.seed(42)
n = 365
time = np.arange(n)

trend = 0.05 * time                    # Linear upward trend
seasonal = 10 * np.sin(2 * np.pi * time / 365)  # Yearly cycle
noise = np.random.randn(n) * 2

y = trend + seasonal + noise

fig, axes = plt.subplots(4, 1, figsize=(12, 8), sharex=True)

axes[0].plot(time, y, 'b-', linewidth=0.8, alpha=0.7)
axes[0].set_title('Original Series')

axes[1].plot(time, trend, 'r-', linewidth=2)
axes[1].set_title('Trend Component')

axes[2].plot(time, seasonal, 'g-', linewidth=1)
axes[2].set_title('Seasonal Component')

axes[3].plot(time, noise, 'gray', linewidth=0.5, alpha=0.5)
axes[3].set_title('Noise Component')

plt.tight_layout()
plt.savefig('ts_components.png', dpi=150)
plt.show()

2. Seasonality

Repeating patterns at fixed intervals (daily, weekly, monthly, yearly).

# Types of seasonality
fig, axes = plt.subplots(3, 1, figsize=(12, 6))

t = np.arange(0, 7*24, 1)  # 7 days of hourly data

# Daily pattern (peaks at noon: the sine is shifted by 6 hours)
daily = 5 * np.sin(2 * np.pi * (t - 6) / 24)
axes[0].plot(t, daily, 'b-')
axes[0].set_title('Daily Seasonality (24-hour cycle)')
axes[0].set_ylabel('Effect')

# Weekly pattern (one cycle per 168 hours; the phase shift is illustrative)
weekly = 3 * np.sin(2 * np.pi * t / (24*7) + np.pi/4)
axes[1].plot(t, weekly, 'r-')
axes[1].set_title('Weekly Seasonality (168-hour cycle)')
axes[1].set_ylabel('Effect')

# Combined
axes[2].plot(t, daily + weekly, 'g-', linewidth=0.8)
axes[2].set_title('Combined Seasonality')
axes[2].set_xlabel('Hours')
axes[2].set_ylabel('Effect')

plt.tight_layout()
plt.savefig('seasonality_types.png', dpi=150)
plt.show()

3. Cyclicity

Long-term oscillations without fixed periods (economic cycles, climate patterns).

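# Cyclical component (illustrative only: real cycles have no fixed period,
# so these two sinusoids with periods of about 91 and 146 days are a stand-in)
cycle = 15 * np.sin(2 * np.pi * time / 365 * 4) + \
       8 * np.sin(2 * np.pi * time / 365 * 2.5)

plt.figure(figsize=(12, 4))
plt.plot(time, cycle, 'purple', linewidth=1.5)
plt.title('Cyclical Component (illustrative)')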
plt.xlabel('Days')
plt.ylabel('Effect')
plt.savefig('cyclical_component.png', dpi=150)
plt.show()

Stationarity

Definition: Stationarity

A time series is (weakly) stationary if its statistical properties do not change over time: constant mean, constant finite variance, and autocovariance that depends only on the lag, not on the absolute time index. Many classical models (ARMA, for example) assume stationarity; ARIMA handles non-stationary series by differencing them first.

E[y_t] = \mu \;\forall\, t, \quad \text{Var}(y_t) = \sigma^2 \;\forall\, t, \quad \text{Cov}(y_t, y_{t+h}) = \gamma(h)


Stationary vs Non-Stationary:

Stationary:                    Non-Stationary:
  y ↑    ────────              y ↑         ╱
    │  ───╱╲──╱╲──               │       ╱
    │ ──╱╲──╱╲──╱╲──             │     ╱
    │╱╲──╱╲──╱╲──╱╲──            │   ╱
    └──────────────→ t           │ ╱
    Mean & variance constant     └──────────────→ t
                                  Increasing trend → non-stationary
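
A rough way to see the difference numerically is to compare the mean and variance of the first and second half of a series. The sketch below does this for two synthetic series; the series, the seed, and the helper `half_stats` are assumptions made for illustration, and the comparison is an informal check rather than a statistical test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
t = np.arange(n)

series = {
    "white noise": rng.normal(0.0, 1.0, n),               # stationary
    "trend + noise": 0.01 * t + rng.normal(0.0, 1.0, n),  # mean grows with t
}

def half_stats(x):
    """Mean and variance of the first and second half of x."""
    first, second = x[: len(x) // 2], x[len(x) // 2:]
    return (first.mean(), second.mean()), (first.var(), second.var())

mean_shift = {}
for name, x in series.items():
    (m1, m2), (v1, v2) = half_stats(x)
    mean_shift[name] = abs(m2 - m1)
    print(f"{name:14s} mean: {m1:6.2f} -> {m2:6.2f}   var: {v1:5.2f} -> {v2:5.2f}")
```

The white-noise halves should have nearly the same mean, while the trending series shifts by roughly 10 units between halves.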

Augmented Dickey-Fuller (ADF) Test

Definition: Augmented Dickey-Fuller Test

A statistical test for the presence of a unit root in a time series. A unit root indicates the series is non-stationary — shocks persist indefinitely. The ADF test regresses the first difference on lagged levels and lagged differences, testing whether the coefficient on the lagged level is significantly negative.

The ADF test equation:

Augmented Dickey-Fuller Regression

\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^{p} \delta_i \Delta y_{t-i} + \epsilon_t

Here,

$\Delta y_t$ = First difference of the series
$\gamma$ = Coefficient tested (unit root if $\gamma = 0$)
$p$ = Number of lagged difference terms
$\alpha$ = Constant (drift)
$\beta$ = Time trend coefficient

Hypotheses:

$H_0$: the series has a unit root (non-stationary), $\gamma = 0$
$H_1$: the series is stationary, $\gamma < 0$

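Decision rule: if the p-value is below 0.05, reject $H_0$ and treat the series as stationary. A larger p-value means the test found no evidence against a unit root, which is not proof that one exists.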

The ADF test's power depends on the number of lags $p$ . Too few lags leave autocorrelation in the residuals; too many reduces power. Use information criteria (AIC, BIC) to select the lag order, or let adfuller choose automatically via autolag='AIC'.
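
To make the regression concrete, here is a minimal NumPy sketch that fits the ADF equation by ordinary least squares and reports the estimate of $\gamma$ with its t-statistic. The helper `adf_regression` is hypothetical and only covers the regression step: turning the t-statistic into a p-value requires Dickey-Fuller critical values, which `adfuller` supplies.

```python
import numpy as np

def adf_regression(y, p=1):
    """Fit the ADF regression by OLS and return (gamma_hat, t_statistic).

    Hypothetical helper for illustration; not part of statsmodels.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                            # dy[j] = y[j+1] - y[j]
    target = dy[p:]                            # left-hand side: delta y_t
    n = len(target)
    columns = [
        np.ones(n),                            # alpha (constant)
        np.arange(1, n + 1, dtype=float),      # beta * t (time trend)
        y[p:-1],                               # gamma * y_{t-1} (lagged level)
    ]
    for i in range(1, p + 1):
        columns.append(dy[p - i:len(dy) - i])  # delta_i * delta y_{t-i}
    X = np.column_stack(columns)

    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)      # OLS coefficient covariance
    return coef[2], coef[2] / np.sqrt(cov[2, 2])

rng = np.random.default_rng(1)
eps = rng.normal(size=500)
random_walk = np.cumsum(eps)                   # unit root: gamma should be near 0
ar1 = np.zeros(500)
for i in range(1, 500):
    ar1[i] = 0.5 * ar1[i - 1] + eps[i]         # stationary: gamma near 0.5 - 1 = -0.5

gamma_rw, t_rw = adf_regression(random_walk, p=1)
gamma_ar, t_ar = adf_regression(ar1, p=1)
print(f"random walk: gamma = {gamma_rw:.3f}, t = {t_rw:.2f}")
print(f"AR(1):       gamma = {gamma_ar:.3f}, t = {t_ar:.2f}")
```

The stationary AR(1) series should give a clearly negative $\gamma$ with a large negative t-statistic, while the random walk should give an estimate close to zero.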

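import pandas as pd
from statsmodels.tsa.stattools import adfuller

def adf_test(series, title=''):
    """Perform ADF test and print results."""
    # Accept NumPy arrays as well as pandas Series before dropping NaNs
    series = pd.Series(series).dropna()
    result = adfuller(series, autolag='AIC')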

    print(f'ADF Test: {title}')
    print('=' * 50)
    print(f'ADF Statistic:  {result[0]:.4f}')
    print(f'p-value:        {result[1]:.4f}')
    print(f'Lags Used:      {result[2]}')
    print(f'Observations:   {result[3]}')

    for key, value in result[4].items():
        print(f'Critical Value ({key}): {value:.4f}')

    if result[1] <= 0.05:
        print(f'\n✓ Stationary (reject H0 at 5% level)')
    else:
        print(f'\n✗ Non-stationary (fail to reject H0)')

    return result[1]

# Test on original series
adf_test(y, 'Original Series')

# Test on differenced series
y_diff = np.diff(y)
adf_test(y_diff, 'Differenced Series')

Decomposition Methods

Classical Decomposition

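import pandas as pd

# seasonal_decompose needs at least two full cycles, so rebuild the
# synthetic series over three years (the 365-point `y` above has only one)
n_long = 3 * 365
time_long = np.arange(n_long)
y_long = (0.05 * time_long
          + 10 * np.sin(2 * np.pi * time_long / 365)
          + np.random.randn(n_long) * 2)

# Create a date-indexed Series
dates = pd.date_range(start='2023-01-01', periods=n_long, freq='D')
ts = pd.Series(y_long, index=dates)

# Additive decomposition
decomposition_add = seasonal_decompose(
    ts, model='additive', period=365
)

# Multiplicative decomposition (requires strictly positive values)
ts_positive = ts + 20  # Shift to positive values
decomposition_mul = seasonal_decompose(
    ts_positive, model='multiplicative', period=365
)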

# Plot additive decomposition
fig = decomposition_add.plot()
fig.set_size_inches(12, 8)
plt.suptitle('Additive Decomposition', fontsize=14, y=1.02)
plt.tight_layout()
plt.savefig('additive_decomposition.png', dpi=150)
plt.show()
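
For intuition, the additive classical method can be sketched in a few lines of NumPy: estimate the trend with a centered moving average, average the detrended values at each position in the cycle to get the seasonal component, and keep what is left as the residual. The function `classical_additive` is a hypothetical helper written for this illustration; `seasonal_decompose` follows the same general idea but may differ in details.

```python
import numpy as np

def classical_additive(y, period):
    """Additive decomposition via centered moving average (illustrative sketch)."""
    y = np.asarray(y, dtype=float)
    # Centered moving-average weights; even periods use half weights at both ends
    if period % 2 == 0:
        w = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        w = np.ones(period) / period
    half = len(w) // 2

    # Trend is undefined (NaN) for the first and last `half` points
    trend = np.full(len(y), np.nan)
    trend[half:len(y) - half] = np.convolve(y, w, mode='valid')

    # Seasonal effect: mean detrended value at each position in the cycle
    detrended = y - trend
    seasonal_means = np.array(
        [np.nanmean(detrended[i::period]) for i in range(period)]
    )
    seasonal_means -= seasonal_means.mean()        # center the effects on zero
    seasonal = np.tile(seasonal_means, len(y) // period + 1)[:len(y)]

    resid = y - trend - seasonal
    return trend, seasonal, resid

# Noise-free monthly-style example: linear trend plus a 12-step sine wave
t = np.arange(120)
y_demo = 0.1 * t + 5 * np.sin(2 * np.pi * t / 12)
trend_hat, seasonal_hat, resid_hat = classical_additive(y_demo, period=12)
print(np.round(seasonal_hat[:12], 3))
```

On this noise-free input the moving average recovers the linear trend exactly away from the edges, so the residual is zero up to floating-point error.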

STL Decomposition (Seasonal-Trend using LOESS)

Definition: STL Decomposition

Seasonal and Trend decomposition using LOESS (Locally Estimated Scatterplot Smoothing). STL is more flexible than classical decomposition: it handles any seasonal period, can be made robust to outliers, and allows the seasonal component to change over time.

from statsmodels.tsa.seasonal import STL

# STL decomposition
stl = STL(ts, period=365, robust=True)
result = stl.fit()

fig = result.plot()
fig.set_size_inches(12, 8)
plt.suptitle('STL Decomposition', fontsize=14, y=1.02)
plt.tight_layout()
plt.savefig('stl_decomposition.png', dpi=150)
plt.show()

# Access components
trend_stl = result.trend
seasonal_stl = result.seasonal
residual_stl = result.resid

Choosing Between Additive and Multiplicative

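def check_model_type(series, period):
    """Heuristic: is the seasonal swing constant (additive) or proportional to the level (multiplicative)?"""
    values = np.asarray(series, dtype=float)
    n_cycles = len(values) // period
    if n_cycles < 2:
        raise ValueError("Need at least two full cycles to compare seasonal swings")
    cycles = values[:n_cycles * period].reshape(n_cycles, period)

    level = cycles.mean(axis=1)                      # average level of each cycle
    swing = cycles.max(axis=1) - cycles.min(axis=1)  # peak-to-trough range of each cycle

    # Relative spread across cycles of the raw swing and of swing / level
    cv_swing = swing.std() / swing.mean()
    ratio = swing / level
    cv_ratio = ratio.std() / abs(ratio.mean())

    print(f"Swing per cycle:     {np.round(swing, 2)}")
    print(f"Level per cycle:     {np.round(level, 2)}")
    print(f"CV of swing:         {cv_swing:.4f}")
    print(f"CV of swing/level:   {cv_ratio:.4f}")

    # Rough rule of thumb, only meaningful for a positive-valued series:
    # whichever quantity is steadier across cycles indicates the model
    if cv_ratio < cv_swing:
        print("→ Swing scales with the level → Multiplicative")
        return 'multiplicative'
    print("→ Swing is roughly constant across cycles → Additive")
    return 'additive'

model_type = check_model_type(ts, 365)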

Making a Series Stationary

1. Differencing

First-Order Differencing

y'_t = y_t - y_{t-1}

Here,

$y'_t$ = Differenced series at time t
$y_t$ = Original series at time t

Second-order differencing:

y''_t = y'_t - y'_{t-1} = y_t - 2y_{t-1} + y_{t-2}
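
A small numeric check of these formulas, using made-up deterministic series: one difference turns a linear trend into a constant, a quadratic trend needs two, and a seasonal difference at lag $m$ removes a pattern that repeats every $m$ steps.

```python
import numpy as np

t = np.arange(10, dtype=float)
linear = 3.0 + 2.0 * t           # linear trend
quadratic = 1.0 + 0.5 * t**2     # quadratic trend

# First-order differencing: y'_t = y_t - y_{t-1}
d1 = np.diff(linear)

# Second-order differencing, via np.diff and via the expanded formula
d2 = np.diff(quadratic, n=2)
d2_manual = quadratic[2:] - 2 * quadratic[1:-1] + quadratic[:-2]

# Seasonal differencing with period m: y_t - y_{t-m}
m = 4
pattern = np.tile([1.0, -1.0, 2.0, -2.0], 3)    # repeats every 4 steps
y_seasonal = 2.0 * np.arange(12) + pattern       # trend + seasonal pattern
seasonal_diff = y_seasonal[m:] - y_seasonal[:-m]

print(d1)             # constant slope of the linear trend
print(d2)             # constant after two differences
print(seasonal_diff)  # seasonal pattern cancels, leaving m * slope
```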

# First-order differencing
y_diff1 = ts.diff().dropna()
adf_test(y_diff1, 'First Differencing')

# Seasonal differencing
y_diff_seasonal = ts.diff(365).dropna()
adf_test(y_diff_seasonal, 'Seasonal Differencing (period=365)')

# Combined differencing
y_diff_combined = ts.diff().diff(365).dropna()
adf_test(y_diff_combined, 'Combined Differencing')

2. Log Transformation

Stabilizes variance when it changes with level:

Log Transformation

y'_t = \log(y_t)

Here,

$y'_t$ = Transformed series (log scale)
$y_t$ = Original series (positive values)

# Log transformation (for positive series)
y_log = np.log(ts_positive)
y_log_diff = y_log.diff().dropna()

fig, axes = plt.subplots(1, 2, figsize=(14, 4))
ts_positive.plot(ax=axes[0], title='Original (Positive)')
y_log_diff.plot(ax=axes[1], title='Log-Transformed + Differenced')
plt.tight_layout()
plt.savefig('log_transform.png', dpi=150)
plt.show()
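
The variance-stabilizing effect can also be checked numerically. The sketch below builds a synthetic series whose noise is proportional to its level (an assumption chosen to match the multiplicative case) and compares the spread of the deviations in the second half with the first half, before and after taking logs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
level = np.linspace(10.0, 100.0, n)            # known, rising level
y = level * np.exp(rng.normal(0.0, 0.1, n))    # noise scales with the level

def spread_ratio(deviation):
    """Std of the second half divided by std of the first half."""
    half = len(deviation) // 2
    return deviation[half:].std() / deviation[:half].std()

raw_ratio = spread_ratio(y - level)                   # original scale
log_ratio = spread_ratio(np.log(y) - np.log(level))   # log scale

print(f"Spread ratio, original scale: {raw_ratio:.2f}")
print(f"Spread ratio, log scale:      {log_ratio:.2f}")
```

A ratio well above 1 on the original scale and close to 1 on the log scale indicates that the log transform has evened out the variance.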

3. Box-Cox Transformation

Power transformation that stabilizes variance:

Box-Cox Transformation

y^{(\lambda)} = \begin{cases} \frac{y^\lambda - 1}{\lambda} & \lambda \neq 0 \\ \ln(y) & \lambda = 0 \end{cases}

Here,

$\lambda$ = Transformation parameter (optimized via MLE)
$y$ = Original series (positive values)

The optimal $\lambda$ is chosen by maximum likelihood estimation. Common values: $\lambda = 1$ (no transform), $\lambda = 0.5$ (square root), $\lambda = 0$ (log). The Box-Cox transform requires strictly positive data. For non-positive data, use the Yeo-Johnson transform.
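
The formula itself is short enough to write out directly. The following sketch implements the transform and its inverse for a fixed $\lambda$; the function names `box_cox` and `inv_box_cox` are hypothetical, and no likelihood-based search for $\lambda$ is attempted here.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform for a given lambda (illustrative sketch)."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return np.log(y)
    return (y**lam - 1) / lam

def inv_box_cox(z, lam):
    """Inverse of box_cox for the same lambda."""
    z = np.asarray(z, dtype=float)
    if lam == 0:
        return np.exp(z)
    return (lam * z + 1) ** (1 / lam)

y_demo = np.array([1.0, 4.0, 9.0, 25.0, 100.0])

shifted = box_cox(y_demo, 1.0)        # lambda = 1: y - 1, shape unchanged
root = box_cox(y_demo, 0.5)           # lambda = 0.5: 2 * (sqrt(y) - 1)
logged = box_cox(y_demo, 0.0)         # lambda = 0: natural log
near_log = box_cox(y_demo, 1e-6)      # small lambda approaches the log
recovered = inv_box_cox(root, 0.5)    # round trip back to y_demo

print(shifted, root, logged, sep="\n")
```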

from scipy.stats import boxcox
from scipy.special import inv_boxcox

# Apply Box-Cox
y_bc, fitted_lambda = boxcox(ts_positive.values)
print(f"Optimal lambda: {fitted_lambda:.4f}")

# Inverse transform
y_original = inv_boxcox(y_bc, fitted_lambda)

fig, axes = plt.subplots(1, 2, figsize=(14, 4))
axes[0].plot(ts_positive.values, label='Original')
axes[0].set_title('Original Series')
axes[1].plot(y_bc, label='Box-Cox', color='orange')
axes[1].set_title(f'Box-Cox Transformed (λ={fitted_lambda:.2f})')
plt.tight_layout()
plt.savefig('boxcox_transform.png', dpi=150)
plt.show()

Autocorrelation Analysis

ACF (Autocorrelation Function)

Measures the correlation between $y_t$ and $y_{t-k}$, the value $k$ steps earlier:

Autocorrelation Function

\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\text{Cov}(y_t, y_{t-k})}{\text{Var}(y_t)}

Here,

$\rho_k$ = Autocorrelation at lag k
$\gamma_k$ = Autocovariance at lag k

PACF (Partial Autocorrelation Function)

Measures the correlation between $y_t$ and $y_{t-k}$ after removing the linear effect of the intermediate lags $y_{t-1}, \ldots, y_{t-k+1}$.


ACF vs PACF Intuition:

  ACF: "How correlated is y_t with y_{t-k}?"
       Includes BOTH direct and indirect correlations.

  PACF: "How correlated is y_t with y_{t-k} DIRECTLY?"
         Removes the effect of y_{t-1}, y_{t-2}, ..., y_{t-k+1}

  Example:  y_t depends on y_{t-1}, and y_{t-1} depends on y_{t-2}
    ACF(2) = correlation(y_t, y_{t-2})
      = HIGH (because y_t -> y_{t-1} -> y_{t-2})

    PACF(2) = correlation(y_t, y_{t-2} | y_{t-1})
      = LOW (because the indirect path is removed)
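
The example above can be reproduced on simulated data. This sketch simulates an AR(1) series with an assumed coefficient of 0.7, estimates the ACF from the sample autocovariance, and estimates the PACF at lag k as the last coefficient of a regression on the first k lags (one of several equivalent estimators). The helper names are hypothetical.

```python
import numpy as np

def sample_acf(y, k):
    """Sample autocorrelation at lag k >= 1."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    return np.dot(y[k:], y[:-k]) / np.dot(y, y)

def sample_pacf(y, k):
    """Coefficient on y_{t-k} when y_t is regressed on y_{t-1}, ..., y_{t-k}."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    X = np.column_stack([y[k - i:len(y) - i] for i in range(1, k + 1)])
    coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return coef[-1]

# Simulate y_t = phi * y_{t-1} + eps_t
rng = np.random.default_rng(0)
phi, n = 0.7, 5000
eps = rng.normal(size=n)
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = phi * ar1[t - 1] + eps[t]

acf2 = sample_acf(ar1, 2)     # theory: phi**2 = 0.49 (indirect path included)
pacf1 = sample_pacf(ar1, 1)   # theory: phi = 0.7
pacf2 = sample_pacf(ar1, 2)   # theory: 0 (indirect path removed)
print(f"ACF(2) = {acf2:.3f}, PACF(1) = {pacf1:.3f}, PACF(2) = {pacf2:.3f}")
```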

ACF/PACF Visual Rules for Model Selection


AR(p) - Autoregressive:
  "Current value depends on p past values"
  Equation: y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + epsilon_t

  ACF:  Tails off (gradual decay)
  PACF: Cuts off after lag p
  → Use PACF to determine p (the order)

MA(q) - Moving Average:
  "Current value depends on q past errors"
  Equation: y_t = c + epsilon_t + theta_1*epsilon_{t-1} + ... + theta_q*epsilon_{t-q}

  ACF:  Cuts off after lag q
  PACF: Tails off (gradual decay)
  → Use ACF to determine q (the order)

ARMA(p,q) - Combined:
  "Both AR and MA components"
  ACF:  Tails off
  PACF: Tails off
  → Use information criteria (AIC/BIC) to find p and q

ARIMA(p,d,q) - With Differencing:
  "Apply differencing d times, then fit ARMA(p,q)"
  If series is non-stationary, difference until stationary (d=1 or d=2)
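
The MA(q) cut-off rule can be checked on simulated data. The sketch below generates an MA(1) series with an assumed $\theta_1 = 0.6$ and compares the sample ACF with the theoretical values: $\rho_1 = \theta_1 / (1 + \theta_1^2)$ and zero at every higher lag. The helper `sample_acf` is hypothetical.

```python
import numpy as np

def sample_acf(y, k):
    """Sample autocorrelation at lag k >= 1."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    return np.dot(y[k:], y[:-k]) / np.dot(y, y)

# Simulate y_t = eps_t + theta * eps_{t-1}
rng = np.random.default_rng(0)
n, theta = 5000, 0.6
eps = rng.normal(size=n + 1)
ma1 = eps[1:] + theta * eps[:-1]

acf = [sample_acf(ma1, k) for k in range(1, 5)]
rho1_theory = theta / (1 + theta**2)

print(f"Theoretical rho_1: {rho1_theory:.3f}")
for k, r in enumerate(acf, start=1):
    print(f"Sample ACF({k}) = {r:.3f}")
```

Only the lag-1 value should stand clearly away from zero, which is the pattern used to read off q = 1 from an ACF plot.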

Real-World Example: Air Passengers

📝Air Passengers Time Series Analysis

from statsmodels.datasets import get_rdataset

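# Load Air Passengers dataset (downloaded from the Rdatasets repository, so this needs internet access)
airline = get_rdataset('AirPassengers').data
# Monthly timestamps; 'MS' (month start) works across pandas versions, whereas 'M' is deprecated in pandas 2.2+
airline['time'] = pd.date_range('1949-01', periods=len(airline), freq='MS')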
airline.set_index('time', inplace=True)
ts_air = airline['value']

# Visualize
fig, axes = plt.subplots(2, 1, figsize=(12, 6))
ts_air.plot(ax=axes[0], title='Monthly Airline Passengers')
axes[0].set_ylabel('Passengers (thousands)')

# Log transform
np.log(ts_air).plot(ax=axes[1], title='Log-Transformed')
axes[1].set_ylabel('Log(Passengers)')
plt.tight_layout()
plt.savefig('airpassengers.png', dpi=150)
plt.show()

# Decomposition
decomp = seasonal_decompose(ts_air, model='multiplicative', period=12)
fig = decomp.plot()
fig.set_size_inches(12, 8)
plt.suptitle('Air Passengers - Multiplicative Decomposition', y=1.02)
plt.tight_layout()
plt.savefig('airpassengers_decomp.png', dpi=150)
plt.show()

# ADF tests
print("Original:")
adf_test(ts_air)

print("\nLog-transformed:")
adf_test(np.log(ts_air))

print("\nLog + First differencing:")
adf_test(np.log(ts_air).diff().dropna())

print("\nLog + Seasonal differencing:")
adf_test(np.log(ts_air).diff(12).dropna())

Key Takeaways

Decompose first: Understand trend, seasonality, and noise before modeling
Stationarity is key: Most time series models assume stationarity — use ADF test
Differencing removes trends; seasonal differencing removes seasonal patterns
Multiplicative models when seasonal amplitude grows with level; additive otherwise
STL decomposition is more robust than classical methods
Log transforms stabilize variance and convert multiplicative to additive patterns
ACF/PACF guide model selection for ARIMA models (next lesson)

📋Summary: Time Series Basics — Trend, Seasonality

A time series $Y = \{y_1, \ldots, y_T\}$ is decomposed into trend, seasonality, cyclicity, and noise.
The additive model $Y_t = T_t + S_t + C_t + \epsilon_t$ applies when seasonal amplitude is constant; the multiplicative model $Y_t = T_t \times S_t \times C_t \times \epsilon_t$ applies when amplitude grows with level.
Stationarity requires constant mean, constant variance, and lag-dependent autocovariance: $E[y_t] = \mu, \; \text{Var}(y_t) = \sigma^2, \; \text{Cov}(y_t, y_{t+h}) = \gamma(h)$ .
The ADF test checks for unit roots (non-stationarity). Reject H0 (p < 0.05) to conclude the series is stationary.
Differencing ( $y'_t = y_t - y_{t-1}$ ) removes trends; seasonal differencing removes seasonal patterns.
Log transforms and Box-Cox transforms stabilize variance. Box-Cox optimizes $\lambda$ via MLE.
STL decomposition uses LOESS smoothing and is more robust to outliers and changing seasonal patterns than classical methods.
The ACF measures total correlation at lag k (direct + indirect); the PACF measures only direct correlation.
ACF/PACF patterns guide ARIMA order selection: AR(p) cuts off at lag p in PACF; MA(q) cuts off at lag q in ACF.
Always test stationarity before modeling — non-stationary series must be transformed or differenced.

Practice Exercises

Decomposition: Decompose a daily temperature dataset. What period is most appropriate? Is the model additive or multiplicative?
Stationarity: Take a non-stationary stock price series. How many differencing steps are needed to achieve stationarity?
Seasonal Pattern: Identify the seasonal period in hourly electricity demand data. Plot ACF to confirm.
Transformation: Compare log, Box-Cox, and square root transforms on a series with increasing variance. Which is most effective?
