dbt Unit Testing

Unit Testing Architecture

Unit Testing Pipeline

Formal Definitions

Detailed Explanation

What is dbt Unit Testing?

dbt unit testing validates that individual models transform data correctly by isolating them from upstream dependencies. Unlike integration tests that run against real data, unit tests use predefined fixtures to ensure deterministic, repeatable results.

Why Unit Testing Matters

Benefit	Description
Isolation	Tests business logic independently of data quality
Speed	Runs in seconds without querying large tables
Determinism	Same fixtures always produce same results
Debugging	Pinpoint exactly which transformation logic fails
Regression	Catch unintended changes to model behavior

Key Takeaway: Unit tests validate transformation logic in isolation using mock fixtures. They complement data tests — use both for comprehensive coverage.

Test Coverage Strategy

Code Examples

Basic Unit Test Configuration

# models/marts/fct_orders.yml
version: 2

models:
  - name: fct_orders
    description: "Fact table for orders"
    data_tests:
      - unique:
          column_name: "order_id"
      - not_null:
          column_name: "order_id"

unit_tests:
  - name: test_fct_orders_basic
    model: fct_orders
    description: "Test basic order transformation logic"
    
    # Mock input for stg_orders
    inputs:
      - name: stg_orders
        rows:
          - { order_id: 1, customer_id: 100, order_date: "2024-01-15", status: "completed", amount: 50.00 }
          - { order_id: 2, customer_id: 101, order_date: "2024-01-16", status: "pending", amount: 75.50 }
      
      - name: stg_customers
        rows:
          - { customer_id: 100, customer_name: "Alice Johnson", segment: "premium" }
          - { customer_id: 101, customer_name: "Bob Smith", segment: "standard" }
    
    # Expected output
    expected_output:
      rows:
        - { order_id: 1, customer_id: 100, customer_name: "Alice Johnson", segment: "premium", amount: 50.00, order_date: "2024-01-15" }
        - { order_id: 2, customer_id: 101, customer_name: "Bob Smith", segment: "standard", amount: 75.50, order_date: "2024-01-16" }

Unit Test for Calculated Fields

# models/marts/fct_revenue.yml
unit_tests:
  - name: test_revenue_calculation
    model: fct_revenue
    description: "Test revenue calculation with discounts and tax"
    
    inputs:
      - name: stg_orders
        rows:
          - { order_id: 1, subtotal: 100.00, discount_pct: 10, tax_rate: 8 }
          - { order_id: 2, subtotal: 200.00, discount_pct: 0, tax_rate: 8 }
          - { order_id: 3, subtotal: 50.00, discount_pct: 25, tax_rate: 0 }
    
    expected_output:
      rows:
        - { order_id: 1, subtotal: 100.00, discount_amount: 10.00, tax_amount: 7.20, total: 97.20 }
        - { order_id: 2, subtotal: 200.00, discount_amount: 0.00, tax_amount: 16.00, total: 216.00 }
        - { order_id: 3, subtotal: 50.00, discount_amount: 12.50, tax_amount: 0.00, total: 37.50 }

Unit Test for Aggregation Logic

# models/marts/fct_daily_summary.yml
unit_tests:
  - name: test_daily_aggregation
    model: fct_daily_summary
    description: "Test daily aggregation with multiple orders per day"
    
    inputs:
      - name: stg_orders
        rows:
          - { order_id: 1, order_date: "2024-01-15", amount: 50.00 }
          - { order_id: 2, order_date: "2024-01-15", amount: 30.00 }
          - { order_id: 3, order_date: "2024-01-16", amount: 75.00 }
    
    expected_output:
      rows:
        - { order_date: "2024-01-15", total_amount: 80.00, order_count: 2 }
        - { order_date: "2024-01-16", total_amount: 75.00, order_count: 1 }

Unit Test for Null Handling

# models/marts/dim_customers.yml
unit_tests:
  - name: test_null_handling
    model: dim_customers
    description: "Test null value handling in customer dimension"
    
    inputs:
      - name: stg_customers
        rows:
          - { customer_id: 1, first_name: "Alice", last_name: "Johnson", email: "alice@example.com" }
          - { customer_id: 2, first_name: null, last_name: "Smith", email: null }
          - { customer_id: 3, first_name: "Charlie", last_name: null, email: "charlie@example.com" }
    
    expected_output:
      rows:
        - { customer_id: 1, full_name: "Alice Johnson", email_domain: "example.com" }
        - { customer_id: 2, full_name: "Smith", email_domain: null }
        - { customer_id: 3, full_name: "Charlie", email_domain: "example.com" }

Model Under Test

-- models/marts/fct_orders.sql
{{
    config(
        materialized='incremental',
        unique_key='order_id',
        tags=['finance', 'unit-tested']
    )
}}

with orders as (
    select * from {{ ref('stg_orders') }}
),

customers as (
    select * from {{ ref('stg_customers') }}
),

final as (
    select
        o.order_id,
        o.customer_id,
        c.customer_name,
        c.segment,
        o.amount,
        o.order_date,
        case
            when o.amount >= 100 then 'high_value'
            when o.amount >= 50 then 'medium_value'
            else 'low_value'
        end as order_tier
    from orders o
    left join customers c on o.customer_id = c.customer_id
)

select * from final

Running Unit Tests

# Run all unit tests
dbt test --select unit_tests

# Run specific unit test
dbt test --select test_fct_orders_basic

# Run with verbose output
dbt test --select unit_tests --vars '{"debug": true}'

# Run unit tests for a specific model
dbt test --select fct_orders+ --resource-type unit_test

Comparison: Unit Tests vs Data Tests

Aspect	Unit Tests	Data Tests
Scope	Single model logic	Data quality rules
Input	Mock fixtures	Real database tables
Speed	Milliseconds	Seconds to minutes
Determinism	Always deterministic	May vary with data
Coverage	Business logic	Schema constraints
Maintenance	Fixtures need updating	Tests need updating
Use Case	Transformation validation	Data integrity checks

Test Fixture Best Practices

Practice	Description	Example
Minimal Rows	Use 3-10 rows per fixture	One row per edge case
Cover Edge Cases	Nulls, zeros, boundary values	Include null customer_id
Descriptive Names	Clear test case descriptions	"test_high_value_order"
Isolated Tests	Each test validates one behavior	Separate join vs calculation tests
Version Control	Track fixture changes in Git	Review fixture updates in PRs
Consistent Format	YAML for fixtures, not CSV	Inline YAML is more readable

Common Test Patterns

Pattern: Testing Joins

unit_tests:
  - name: test_order_customer_join
    model: fct_orders
    inputs:
      - name: stg_orders
        rows:
          - { order_id: 1, customer_id: 100 }
          - { order_id: 2, customer_id: 999 }  # Non-existent customer
      - name: stg_customers
        rows:
          - { customer_id: 100, name: "Alice" }
    expected_output:
      rows:
        - { order_id: 1, customer_id: 100, customer_name: "Alice" }
        - { order_id: 2, customer_id: 999, customer_name: null }  # Left join preserves row

Pattern: Testing Window Functions

unit_tests:
  - name: test_ranking_logic
    model: fct_ranked_orders
    inputs:
      - name: stg_orders
        rows:
          - { customer_id: 100, order_date: "2024-01-15", amount: 100 }
          - { customer_id: 100, order_date: "2024-01-16", amount: 200 }
          - { customer_id: 100, order_date: "2024-01-17", amount: 150 }
    expected_output:
      rows:
        - { customer_id: 100, order_date: "2024-01-15", amount: 100, rank: 3 }
        - { customer_id: 100, order_date: "2024-01-16", amount: 200, rank: 1 }
        - { customer_id: 100, order_date: "2024-01-17", amount: 150, rank: 2 }

Pattern: Testing Conditional Logic

unit_tests:
  - name: test_status_mapping
    model: fct_orders_enriched
    inputs:
      - name: stg_orders
        rows:
          - { order_id: 1, status: "completed" }
          - { order_id: 2, status: "cancelled" }
          - { order_id: 3, status: "returned" }
          - { order_id: 4, status: "unknown" }
    expected_output:
      rows:
        - { order_id: 1, status: "completed", category: "success" }
        - { order_id: 2, status: "cancelled", category: "failure" }
        - { order_id: 3, status: "returned", category: "failure" }
        - { order_id: 4, status: "unknown", category: "unknown" }

Performance Metrics

Metric	Description	Target
Test Execution Time	Time per unit test	< 1 second
Fixture Size	Number of rows per fixture	3-20 rows
Test Coverage	Models with unit tests	> 80% of mart models
Pass Rate	Percentage of passing tests	100%
Maintenance Effort	Time to update fixtures	< 5 min per test

Best Practices

Start with critical models - Unit test mart models first
Test edge cases - Nulls, boundary values, empty results
Keep fixtures small - 3-10 rows is sufficient
Descriptive test names - Clearly state what is being tested
Separate concerns - One test per logical behavior
Version control fixtures - Track changes alongside model code
Run in CI/CD - Execute unit tests on every PR
Complement with data tests - Use both unit and data tests

dbt Unit Testing

dbt Unit Testing

Unit Testing Architecture

Unit Testing Pipeline

Formal Definitions

Detailed Explanation

What is dbt Unit Testing?

Why Unit Testing Matters

Test Coverage Strategy

Code Examples

Basic Unit Test Configuration

Unit Test for Calculated Fields

Unit Test for Aggregation Logic

Unit Test for Null Handling

Model Under Test

Running Unit Tests

Comparison: Unit Tests vs Data Tests

Test Fixture Best Practices

Common Test Patterns

Pattern: Testing Joins

Pattern: Testing Window Functions

Pattern: Testing Conditional Logic

Performance Metrics

Best Practices

See Also

Need Expert dbt Help?