dbt Package Development
Package Architecture
Package Development Pipeline
Formal Definitions
Dfdbt Package
A dbt package is a reusable collection of macros, models, tests, sources, and documentation that can be shared across dbt projects. Packages are installed via packages.yml or the dbt_project.yml dependencies section. They enable code reuse, standardization, and community-driven development. Formally, a package P = {M, T, S, D} where M is macros, T is tests, S is sources, and D is documentation.
DfSemantic Versioning
Semantic versioning (SemVer) is a versioning scheme using the format MAJOR.MINOR.PATCH. MAJOR indicates breaking changes, MINOR indicates new features (backward-compatible), and PATCH indicates bug fixes. dbt packages follow SemVer: 1.0.0 means major version 1, minor version 0, patch version 0. The compatible_with field in packages.yml specifies version ranges.
DfPackage Namespace
The package namespace is the prefix used to avoid naming conflicts between packages. When a project installs a package, macros and models from the package are accessed with the package name prefix (e.g., dbt_utils.generate_surrogate_key). The namespace prevents collisions between macros with the same name in different packages.
Detailed Explanation
dbt packages are the primary mechanism for code reuse and standardization. They enable organizations to share common transformations, tests, and documentation across multiple projects.
Package Types
- Public packages - Published to dbt Hub (e.g., dbt-utils, dbt-expectations)
- Private packages - Hosted in private Git repositories
- Local packages - Referenced via local file path
- Internal packages - Organization-specific shared code
Package Components
| Component | Purpose | Location |
|---|---|---|
| Macros | Reusable SQL/Jinja logic | macros/ |
| Models | Reusable data transformations | models/ |
| Tests | Custom data test logic | macros/ (test macros) |
| Sources | Source definitions | models/sources/ |
| Documentation | Package descriptions | docs/ |
| Variables | Configurable parameters | dbt_project.yml |
Packages can include both macros and models. Macros are accessed via the package namespace (e.g., package_name.macro_name()), while models are accessed via ref('package_name.model_name'). This distinction is important for understanding how packages integrate with your project.
When developing packages, keep the interface small and well-documented. Public packages should expose a minimal API surface to reduce breaking changes. Use variables for configuration and provide sensible defaults. Always test your package against multiple dbt versions and database platforms.
Code Examples
Package Project Structure
my-dbt-package/
āāā dbt_project.yml
āāā README.md
āāā LICENSE
āāā macros/
ā āāā generate_schema_name.sql
ā āāā cast_column.sql
ā āāā tests/
ā āāā test_accepted_range.sql
āāā models/
ā āāā staging/
ā ā āāā stg_utils_dates.sql
ā āāā marts/
ā āāā dim_date.sql
āāā integration_tests/
ā āāā dbt_project.yml
ā āāā models/
ā ā āāā test_model.sql
ā āāā macros/
ā āāā run_tests.sql
āāā docs/
āāā package_overview.md
Package Configuration
# dbt_project.yml (package)
name: 'my_utils'
version: '1.2.0'
config-version: 2
require-dbt-version: [">=1.0.0", "<2.0.0"]
models:
my_utils:
staging:
+materialized: view
+schema: staging
marts:
+materialized: table
+schema: analytics
vars:
date_start: '2020-01-01'
date_end: '2025-12-31'
default_schema: 'analytics'
Consuming a Package
# packages.yml (consuming project)
packages:
- package: dbt-labs/dbt_utils
version: [">=1.0.0", "<2.0.0"]
- package: dbt-labs/dbt_expectations
version: [">=0.10.0", "<1.0.0"]
- git: "https://github.com/my-org/my_utils.git"
revision: "1.2.0"
- local: "../my_local_package"
Custom Package Macro
-- macros/generate_schema_name.sql (package)
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}
{{ default_schema }}
{%- else -%}
{{ var('default_schema', default_schema) }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- endmacro %}
Package with Models
-- models/staging/stg_utils_dates.sql
{{
config(materialized='view')
}}
with date_spine as (
{{ dbt_utils.date_spine(
start_date="'" ~ var('date_start') ~ "'",
end_date="'" ~ var('date_end') ~ "'",
datepart="day"
)}}
),
formatted as (
select
date_day as date_value,
extract(year from date_day) as year,
extract(month from date_day) as month,
extract(day from date_day) as day,
extract(dayofweek from date_day) as day_of_week,
case
when extract(dayofweek from date_day) in (0, 6) then 'weekend'
else 'weekday'
end as day_type
from date_spine
)
select * from formatted
Package Test Macro
-- macros/tests/test_accepted_range.sql
{% test accepted_range(model, column_name, min_value=none, max_value=none, inclusive=true) %}
{%- set min_clause = "and " ~ column_name ~ " >= " ~ min_value if min_value is not none else "" -%}
{%- set max_clause = "and " ~ column_name ~ " <= " ~ max_value if max_value is not none else "" -%}
select
'{{ model }}' as model_name,
'{{ column_name }}' as column_name,
count(*) as failures
from {{ model }}
where {{ column_name }} is not null
and (
{{ "1=1" if min_value is none else column_name ~ " < " ~ min_value if not inclusive else column_name ~ " < " ~ min_value }}
or
{{ "1=1" if max_value is none else column_name ~ " > " ~ max_value if not inclusive else column_name ~ " > " ~ max_value }}
)
{% endtest %}
Versioning and Changelog
# CHANGELOG.md
## [1.2.0] - 2024-06-15
### Added
- `test_accepted_range` generic test with inclusive/exclusive options
- `date_spine` wrapper macro for cross-database compatibility
### Changed
- Updated `generate_schema_name` to support custom schema prefix
### Fixed
- Fixed null handling in `cast_column` macro
## [1.1.0] - 2024-05-01
### Added
- `dim_date` model for calendar dimensions
- `stg_utils_dates` staging model
## [1.0.0] - 2024-04-01
### Added
- Initial release
- `generate_schema_name` macro
- `cast_column` macro
Integration Tests
# integration_tests/dbt_project.yml
name: 'my_utils_integration_tests'
version: '1.0.0'
require-dbt-version: [">=1.0.0"]
models:
my_utils_integration_tests:
+schema: test_results
vars:
date_start: '2024-01-01'
date_end: '2024-01-31'
# Reference the parent package
packages:
- local: "../"
-- integration_tests/models/test_model.sql
{{
config(materialized='ephemeral')
}}
with test_dates as (
select * from {{ ref('dim_date') }}
),
test_acceptance as (
select
count(*) as total_rows,
count(distinct date_value) as unique_dates,
min(date_value) as min_date,
max(date_value) as max_date
from test_dates
)
select
case
when total_rows = 31 then 'PASS'
else 'FAIL: expected 31 rows, got ' || total_rows
end as test_result
from test_acceptance
Package Comparison
| Feature | dbt Hub | Private Git | Local |
|---|---|---|---|
| Visibility | Public | Private | Private |
| Versioning | SemVer tags | Git tags | Manual |
| Installation | package | git | local |
| Updates | dbt deps | dbt deps | dbt deps |
| Documentation | Auto-generated | Manual | Manual |
| Testing | CI/CD required | CI/CD required | Manual |
Best Practices
- Semantic versioning - Follow MAJOR.MINOR.PATCH strictly
- Backward compatibility - Avoid breaking changes in MINOR/PATCH
- Comprehensive testing - Include integration tests in the package
- Clear documentation - Document all macros, variables, and models
- Small API surface - Expose minimal public interface
- Configurable variables - Use
var()for customization - Multi-database support - Test against all target databases
- Changelog maintenance - Keep CHANGELOG.md updated
See Also
- dbt Projects ā Project configuration and structure
- Custom Macros ā Building reusable macros
- dbt Best Practices ā Code organization patterns
- Advanced Jinja ā Advanced templating techniques