CW

dbt Package Development

Free Lesson

Advertisement

dbt Package Development

Package Architecture

Package Development Pipeline

Formal Definitions

Dfdbt Package

A dbt package is a reusable collection of macros, models, tests, sources, and documentation that can be shared across dbt projects. Packages are installed via packages.yml or the dbt_project.yml dependencies section. They enable code reuse, standardization, and community-driven development. Formally, a package P = {M, T, S, D} where M is macros, T is tests, S is sources, and D is documentation.

DfSemantic Versioning

Semantic versioning (SemVer) is a versioning scheme using the format MAJOR.MINOR.PATCH. MAJOR indicates breaking changes, MINOR indicates new features (backward-compatible), and PATCH indicates bug fixes. dbt packages follow SemVer: 1.0.0 means major version 1, minor version 0, patch version 0. The compatible_with field in packages.yml specifies version ranges.

DfPackage Namespace

The package namespace is the prefix used to avoid naming conflicts between packages. When a project installs a package, macros and models from the package are accessed with the package name prefix (e.g., dbt_utils.generate_surrogate_key). The namespace prevents collisions between macros with the same name in different packages.

Detailed Explanation

dbt packages are the primary mechanism for code reuse and standardization. They enable organizations to share common transformations, tests, and documentation across multiple projects.

Package Types

  1. Public packages - Published to dbt Hub (e.g., dbt-utils, dbt-expectations)
  2. Private packages - Hosted in private Git repositories
  3. Local packages - Referenced via local file path
  4. Internal packages - Organization-specific shared code

Package Components

ComponentPurposeLocation
MacrosReusable SQL/Jinja logicmacros/
ModelsReusable data transformationsmodels/
TestsCustom data test logicmacros/ (test macros)
SourcesSource definitionsmodels/sources/
DocumentationPackage descriptionsdocs/
VariablesConfigurable parametersdbt_project.yml

Packages can include both macros and models. Macros are accessed via the package namespace (e.g., package_name.macro_name()), while models are accessed via ref('package_name.model_name'). This distinction is important for understanding how packages integrate with your project.

When developing packages, keep the interface small and well-documented. Public packages should expose a minimal API surface to reduce breaking changes. Use variables for configuration and provide sensible defaults. Always test your package against multiple dbt versions and database platforms.

Code Examples

Package Project Structure

Architecture Diagram
my-dbt-package/
ā”œā”€ā”€ dbt_project.yml
ā”œā”€ā”€ README.md
ā”œā”€ā”€ LICENSE
ā”œā”€ā”€ macros/
│   ā”œā”€ā”€ generate_schema_name.sql
│   ā”œā”€ā”€ cast_column.sql
│   └── tests/
│       └── test_accepted_range.sql
ā”œā”€ā”€ models/
│   ā”œā”€ā”€ staging/
│   │   └── stg_utils_dates.sql
│   └── marts/
│       └── dim_date.sql
ā”œā”€ā”€ integration_tests/
│   ā”œā”€ā”€ dbt_project.yml
│   ā”œā”€ā”€ models/
│   │   └── test_model.sql
│   └── macros/
│       └── run_tests.sql
└── docs/
    └── package_overview.md

Package Configuration

# dbt_project.yml (package)
name: 'my_utils'
version: '1.2.0'
config-version: 2

require-dbt-version: [">=1.0.0", "<2.0.0"]

models:
  my_utils:
    staging:
      +materialized: view
      +schema: staging
    marts:
      +materialized: table
      +schema: analytics

vars:
  date_start: '2020-01-01'
  date_end: '2025-12-31'
  default_schema: 'analytics'

Consuming a Package

# packages.yml (consuming project)
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  
  - package: dbt-labs/dbt_expectations
    version: [">=0.10.0", "<1.0.0"]
  
  - git: "https://github.com/my-org/my_utils.git"
    revision: "1.2.0"
  
  - local: "../my_local_package"

Custom Package Macro

-- macros/generate_schema_name.sql (package)
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ var('default_schema', default_schema) }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}

Package with Models

-- models/staging/stg_utils_dates.sql
{{
    config(materialized='view')
}}

with date_spine as (
    {{ dbt_utils.date_spine(
        start_date="'" ~ var('date_start') ~ "'",
        end_date="'" ~ var('date_end') ~ "'",
        datepart="day"
    )}}
),

formatted as (
    select
        date_day as date_value,
        extract(year from date_day) as year,
        extract(month from date_day) as month,
        extract(day from date_day) as day,
        extract(dayofweek from date_day) as day_of_week,
        case
            when extract(dayofweek from date_day) in (0, 6) then 'weekend'
            else 'weekday'
        end as day_type
    from date_spine
)

select * from formatted

Package Test Macro

-- macros/tests/test_accepted_range.sql
{% test accepted_range(model, column_name, min_value=none, max_value=none, inclusive=true) %}
    {%- set min_clause = "and " ~ column_name ~ " >= " ~ min_value if min_value is not none else "" -%}
    {%- set max_clause = "and " ~ column_name ~ " <= " ~ max_value if max_value is not none else "" -%}
    
    select
        '{{ model }}' as model_name,
        '{{ column_name }}' as column_name,
        count(*) as failures
    from {{ model }}
    where {{ column_name }} is not null
      and (
          {{ "1=1" if min_value is none else column_name ~ " < " ~ min_value if not inclusive else column_name ~ " < " ~ min_value }}
          or
          {{ "1=1" if max_value is none else column_name ~ " > " ~ max_value if not inclusive else column_name ~ " > " ~ max_value }}
      )
{% endtest %}

Versioning and Changelog

# CHANGELOG.md

## [1.2.0] - 2024-06-15

### Added
- `test_accepted_range` generic test with inclusive/exclusive options
- `date_spine` wrapper macro for cross-database compatibility

### Changed
- Updated `generate_schema_name` to support custom schema prefix

### Fixed
- Fixed null handling in `cast_column` macro

## [1.1.0] - 2024-05-01

### Added
- `dim_date` model for calendar dimensions
- `stg_utils_dates` staging model

## [1.0.0] - 2024-04-01

### Added
- Initial release
- `generate_schema_name` macro
- `cast_column` macro

Integration Tests

# integration_tests/dbt_project.yml
name: 'my_utils_integration_tests'
version: '1.0.0'

require-dbt-version: [">=1.0.0"]

models:
  my_utils_integration_tests:
    +schema: test_results

vars:
  date_start: '2024-01-01'
  date_end: '2024-01-31'

# Reference the parent package
packages:
  - local: "../"
-- integration_tests/models/test_model.sql
{{
    config(materialized='ephemeral')
}}

with test_dates as (
    select * from {{ ref('dim_date') }}
),

test_acceptance as (
    select
        count(*) as total_rows,
        count(distinct date_value) as unique_dates,
        min(date_value) as min_date,
        max(date_value) as max_date
    from test_dates
)

select
    case
        when total_rows = 31 then 'PASS'
        else 'FAIL: expected 31 rows, got ' || total_rows
    end as test_result
from test_acceptance

Package Comparison

Featuredbt HubPrivate GitLocal
VisibilityPublicPrivatePrivate
VersioningSemVer tagsGit tagsManual
Installationpackagegitlocal
Updatesdbt depsdbt depsdbt deps
DocumentationAuto-generatedManualManual
TestingCI/CD requiredCI/CD requiredManual

Best Practices

  1. Semantic versioning - Follow MAJOR.MINOR.PATCH strictly
  2. Backward compatibility - Avoid breaking changes in MINOR/PATCH
  3. Comprehensive testing - Include integration tests in the package
  4. Clear documentation - Document all macros, variables, and models
  5. Small API surface - Expose minimal public interface
  6. Configurable variables - Use var() for customization
  7. Multi-database support - Test against all target databases
  8. Changelog maintenance - Keep CHANGELOG.md updated

See Also

Advertisement

Need Expert dbt Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement