dbt Testing Framework
Testing Architecture
Architecture Diagram
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β DBT TESTING FRAMEWORK β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β TEST TYPES β β
β β β β
β β ββββββββββββββββ ββββββββββββββββ ββββββββββββββββββββββββββββ β β
β β β SCHEMA TESTS β β DATA TESTS β β CUSTOM TESTS β β β
β β β β β β β β β β
β β β β’ unique β β β’ SQL queriesβ β β’ Generic tests β β β
β β β β’ not_null β β β’ Assertions β β β’ Singular tests β β β
β β β β’ accepted_ β β β’ Business β β β’ Package tests β β β
β β β values β β rules β β β’ Custom macros β β β
β β β β’ relation- β β β β β β β
β β β ships β β β β β β β
β β ββββββββββββββββ ββββββββββββββββ ββββββββββββββββββββββββββββ β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β β
β βΌ β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β TEST EXECUTION β β
β β β β
β β βββββββββββ βββββββββββ βββββββββββ βββββββββββ β β
β β β PARSE βββββΆβ COMPILE βββββΆβ EXECUTE βββββΆβ RESULTS β β β
β β β TESTS β β SQL β β QUERIES β β OUTPUT β β β
β β βββββββββββ βββββββββββ βββββββββββ βββββββββββ β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Test Execution Flow
Architecture Diagram
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β TEST EXECUTION PIPELINE β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β 1. SCHEMA VALIDATION β β
β β β β
β β Test: unique β β
β β SQL: SELECT COUNT(*) FROM ( β β
β β SELECT column, COUNT(*) as cnt β β
β β FROM model β β
β β GROUP BY column β β
β β HAVING COUNT(*) > 1 β β
β β ) β β
β β Result: PASS (0 rows) β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β β
β βΌ β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β 2. DATA VALIDATION β β
β β β β
β β Test: not_null β β
β β SQL: SELECT COUNT(*) FROM model WHERE column IS NULL β β
β β Result: PASS (0 rows) β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β β
β βΌ β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β 3. BUSINESS RULES β β
β β β β
β β Test: accepted_values β β
β β SQL: SELECT DISTINCT status FROM model β β
β β WHERE status NOT IN ('active', 'inactive', 'pending') β β
β β Result: PASS (0 rows) β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Test Result Dashboard
Architecture Diagram
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β TEST RESULTS DASHBOARD β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β MODEL: fct_orders β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β Test Name β Status β Rows Failed β Duration β Details β
β βββββββββββββββββββββΌββββββββββΌββββββββββββββΌβββββββββββΌββββββββββββββββββ β
β unique_order_id β β PASS β 0 β 0.23s β All unique β
β not_null_order_id β β PASS β 0 β 0.18s β No nulls β
β not_null_customer_idβ β PASS β 0 β 0.19s β No nulls β
β valid_status β β PASS β 0 β 0.21s β All valid β
β positive_amount β β FAIL β 142 β 0.25s β Negative values β
β recent_dates β β PASS β 0 β 0.31s β All recent β
β βββββββββββββββββββββΌββββββββββΌββββββββββββββΌβββββββββββΌββββββββββββββββββ β
β TOTAL: 5 passed, 1 failed β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Detailed Explanation
dbt's testing framework is a powerful tool for ensuring data quality and validating business logic. Tests are assertions about your data that help catch issues early in the transformation process.
Schema Tests
Schema tests are defined in YAML files and validate the structure and constraints of your data:
- unique: Ensures all values in a column are unique
- not_null: Ensures no null values exist in a column
- accepted_values: Ensures column values match a predefined list
- relationships: Ensures foreign key relationships are valid
Data Tests
Data tests are SQL queries that return failing rows when conditions are not met:
- Custom SQL assertions
- Business rule validation
- Cross-table comparisons
- Aggregation checks
Custom Tests
Custom tests are reusable test definitions that can be shared across projects:
- Generic tests with parameters
- Custom test macros
- Package-based tests
Test Configuration
Tests can be configured at multiple levels:
- Column level: In YAML definitions
- Model level: In model configurations
- Project level: In dbt_project.yml
- Package level: In installed packages
Test Execution
When you run dbt test, dbt:
- Parses all test definitions
- Compiles test SQL queries
- Executes tests against the target warehouse
- Reports pass/fail results
- Optionally fails the build on test failures
Test Best Practices
- Test early and often - Run tests on every commit
- Test data quality - Not just structure, but content
- Use relationships tests - Validate foreign keys
- Test business rules - Custom assertions for domain logic
- Monitor freshness - Ensure data is up to date
- Use packages - Leverage community test libraries
- Document tests - Explain what each test validates
- Fail fast - Configure tests to fail the build on errors
Code Examples
Schema Test Definitions
# models/staging/_sources.yml
version: 2
sources:
- name: raw
database: raw_data
schema: public
tables:
- name: orders
columns:
- name: id
description: "Unique order identifier"
data_tests:
- unique
- not_null
- name: customer_id
description: "Foreign key to customers"
data_tests:
- not_null
- relationships:
to: ref('stg_customers')
field: customer_id
- name: status
description: "Order status"
data_tests:
- not_null
- accepted_values:
values: ['pending', 'shipped', 'delivered', 'cancelled']
- name: amount
description: "Order amount in USD"
data_tests:
- not_null
Custom Data Tests
-- tests/assert_order_amount_positive.sql
-- Test that all order amounts are positive
select
order_id,
amount,
customer_id
from {{ ref('fct_orders') }}
where amount < 0
Advanced Custom Test
-- tests/test_monthly_revenue_growth.sql
-- Test that monthly revenue is growing
with monthly_revenue as (
select
date_trunc('month', order_date) as month,
sum(amount) as revenue
from {{ ref('fct_orders') }}
group by 1
),
revenue_with_growth as (
select
month,
revenue,
lag(revenue) over (order by month) as prev_month_revenue,
(revenue - lag(revenue) over (order by month)) /
nullif(lag(revenue) over (order by month), 0) as growth_rate
from monthly_revenue
)
select
month,
revenue,
growth_rate
from revenue_with_growth
where growth_rate < -0.1 -- Flag months with >10% revenue decline
Generic Test Macro
-- macros/generic_tests/test_freshness.sql
{% test freshness(model, column_name, interval) %}
with source_data as (
select
max({{ column_name }}) as last_record,
{{ dbt_utils.current_timestamp() }} as current_time
from {{ model }}
),
validation as (
select
last_record,
current_time,
datediff('hour', last_record, current_time) as hours_since_last_record
from source_data
)
select *
from validation
where hours_since_last_record > {{ interval }}
{% endtest %}
Test Configuration in dbt_project.yml
# dbt_project.yml
name: 'my_project'
version: '1.0.0'
tests:
my_project:
+severity: error
+store_failures: true
+schema: test_results
staging:
+severity: warn
marts:
+severity: error
+fail_calc: "count(*)"
+warn_if: ">10"
+error_if: ">100"
Test with Store Failures
# models/marts/fct_orders.yml
version: 2
models:
- name: fct_orders
description: "Fact table for orders"
columns:
- name: order_id
data_tests:
- unique:
config:
store_failures: true
schema: test_failures
- not_null
- name: amount
data_tests:
- not_null
- dbt_utils.accepted_range:
min_value: 0
max_value: 1000000
config:
store_failures: true
Performance Metrics
| Test Type | Execution Time | Resource Usage | Coverage |
|---|---|---|---|
| unique | Low | Low | High |
| not_null | Low | Low | High |
| accepted_values | Medium | Low | Medium |
| relationships | Medium | Medium | High |
| Custom SQL | Variable | Variable | Variable |
| Package tests | Variable | Variable | High |
Best Practices
- Test all critical columns - uniqueness, not_null, relationships
- Use relationships tests - validate foreign key integrity
- Test business rules - custom assertions for domain logic
- Store test failures - for debugging and analysis
- Use severity levels - warn vs error based on impact
- Test freshness - ensure data is current
- Use packages - dbt_expectations for advanced tests
- Document tests - explain what each test validates
- Run tests in CI/CD - catch issues before production
- Monitor test results - track quality metrics over time