dbt Multi-Project Deployments

Multi-Project Architecture

Multi-Project Pipeline

Formal Definitions

Cross-Project Reference

A cross-project reference is a ref() call that references a model from a different dbt project. Cross-project refs use the syntax ref('project_name', 'model_name') and enable building downstream models on top of public models from other projects. dbt resolves these references at compile time, replacing them with the appropriate database object reference.

Model Access Control

Model access control defines which other models, groups, and projects may reference a model. A model can be public (referenceable from any group or project), protected (the default; referenceable from anywhere within the same project), or private (referenceable only by models in the same group). Access is configured via the access field in model configuration.
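
As an illustration of how the three levels fit together, a properties file along the following lines would define a group and assign one model to each access level. This is a minimal sketch, not a prescribed layout: the file name, the `finance` group, and the owner details are placeholders, and the model names are borrowed from the examples later in this lesson.

```yaml
# models/_access.yml (hypothetical file name)
version: 2

groups:
  - name: finance                      # placeholder group name
    owner:
      name: Finance Data Team          # placeholder owner
      email: finance-data@example.com  # placeholder address

models:
  - name: int_revenue_calculation
    config:
      group: finance
      access: private      # only models in the finance group can ref() it

  - name: stg_orders
    config:
      access: protected    # the default: any model in this project can ref() it

  - name: fct_orders
    config:
      group: finance
      access: public       # models in other projects can ref() it
```

Because a private model is tied to its group, it only makes sense to mark a model private once it has been assigned to one.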

Hub and Spoke Architecture

The hub and spoke architecture organizes multiple dbt projects around a central hub project that contains shared models. The hub exposes public models that downstream projects consume. This pattern enables domain teams to maintain autonomy while sharing common data models and business logic.

Detailed Explanation

Multi-project deployments enable organizations to scale dbt across teams and domains while maintaining code ownership and access control. This architecture is essential for large organizations with multiple data teams.

When to Use Multi-Project

  1. Team autonomy - Different teams own different data domains
  2. Access control - Restrict visibility of sensitive models
  3. Deployment independence - Deploy projects on different schedules
  4. Code organization - Separate concerns by business domain
  5. Performance - Reduce build scope for individual teams

Project Roles

Role       | Description            | Example
-----------+------------------------+----------------------------
Hub        | Central shared models  | Core analytics, dimensions
Domain     | Team-specific models   | Marketing, Finance, Product
Downstream | Consumer models        | BI tools, ML pipelines
Shared     | Cross-cutting concerns | Logging, auditing

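Cross-project references that resolve from metadata, without installing the upstream project's source code, are a dbt Cloud feature (part of dbt Mesh, on Enterprise-tier plans). dbt Core has no command-line flag that enables them; with dbt Core, the usual alternative is to install the upstream project as a package, after which the same two-argument ref() syntax works. For dbt Cloud project dependencies, the referenced model must be configured as public, and dbt resolves the reference from the metadata of the upstream project's most recent successful production run, so the hub must be deployed before downstream projects can build on it.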
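
For illustration, a downstream project in dbt Cloud would typically declare its upstream project in a dependencies.yml file at the project root. The sketch below assumes the `marketing_project` directory and the `hub_analytics` project name used elsewhere in this lesson; the `dbt_utils` package and its version are only an example of a pinned package dependency and do not come from the lesson itself.

```yaml
# marketing_project/dependencies.yml
packages:
  # Ordinary package dependency, pinned to an exact version (example only)
  - package: dbt-labs/dbt_utils
    version: 1.1.1

projects:
  # Upstream dbt project whose public models this project may ref()
  - name: hub_analytics
```

With this in place, a model in the marketing project can call ref('hub_analytics', 'dim_customers'), provided dim_customers is public in the hub.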

Start with a single project and extract shared models into a hub project once three or more teams contribute or you need access control across team boundaries. Splitting into multiple projects too early adds unnecessary complexity. Use the access configuration to control which models are exposed.

Code Examples

Hub Project Configuration

# hub_project/dbt_project.yml
name: 'hub_analytics'
version: '1.0.0'
config-version: 2

profile: 'hub_analytics'

models:
  hub_analytics:
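    staging:
      +materialized: view
      # protected (the default) keeps staging models inside this project;
      # private would additionally require assigning them to a group
      +access: protected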
    intermediate:
      +materialized: ephemeral
    marts:
      +materialized: table
      +access: public
      +schema: analytics

Public Model Configuration

-- hub_project/models/marts/dim_customers.sql
{{
    config(
        materialized='table',
        access='public',
        tags=['public', 'dimensions']
    )
}}

with customers as (
    select * from {{ source('erp', 'customers') }}
),

final as (
    select
        customer_id,
        customer_name,
        email,
        segment,
        created_at,
        updated_at
    from customers
)

select * from final

Domain Project Consuming Hub

# marketing_project/dbt_project.yml
name: 'marketing_analytics'
version: '1.0.0'
config-version: 2

profile: 'marketing_analytics'

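# Upstream projects are declared in marketing_project/dependencies.yml,
# not in dbt_project.yml:
#   projects:
#     - name: hub_analytics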

models:
  marketing_analytics:
    +materialized: table
    +schema: marketing

Cross-Project Reference

-- marketing_project/models/fct_campaign_performance.sql
{{
    config(
        materialized='incremental',
        unique_key=['campaign_id', 'customer_segment']
    )
}}

-- Output grain is one row per campaign and customer segment,
-- so both columns make up the incremental unique key.

with campaigns as (
    select * from {{ source('marketing', 'campaigns') }}
),

-- Cross-project reference to hub project
customers as (
    select * from {{ ref('hub_analytics', 'dim_customers') }}
),

campaign_metrics as (
    select
        c.campaign_id,
        c.campaign_name,
        c.channel,
        c.budget,
        -- unmatched customers get a non-null segment so the key is never null
        coalesce(cust.segment, 'unknown') as customer_segment,
        count(distinct cust.customer_id) as unique_customers,
        sum(c.spend) as total_spend
    from campaigns c
    left join customers cust on c.customer_id = cust.customer_id
    group by 1, 2, 3, 4, 5
)

select * from campaign_metrics

Access Control Configuration

# models/marts/fct_orders.yml
version: 2

models:
  - name: fct_orders
    description: "Order fact table"
    config:
      access: public
      tags: ['public', 'finance']

    columns:
      - name: order_id
        description: "Primary key"
      - name: customer_id
        description: "Foreign key to dim_customers"

Private Model (Internal Only)

-- models/internal/int_revenue_calculation.sql
{{
    config(
        materialized='ephemeral',
        access='private',
        group='finance'
    )
}}

{#- Private models can be referenced only by models in the same group. -#}
{#- This sketch assumes a `finance` group is defined in a properties file. -#}
{#- It contains proprietary business logic. -#}

with orders as (
    select * from {{ ref('stg_orders') }}
),

revenue_calc as (
    select
        order_id,
        sum(amount * 0.85) as adjusted_revenue,
        sum(amount * 0.15) as platform_fee
    from orders
    group by 1
)

select * from revenue_calc

Multi-Project CI/CD

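# .github/workflows/dbt-multi-project.yml
# Assumes a monorepo with one directory per project (hub_project, marketing_project, ...)
# and a profiles.yml in each directory that reads credentials from repository secrets.
name: dbt Multi-Project CI/CD

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  hub-project:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      # Assumption: a Postgres warehouse; swap in the adapter for your platform
      - name: Install dbt
        run: pip install dbt-core==1.7.0 dbt-postgres==1.7.0

      - name: Install hub dependencies
        working-directory: hub_project
        run: dbt deps

      # dbt build runs models and their tests together, so no separate dbt test step is needed
      - name: Build and test hub models
        working-directory: hub_project
        run: dbt build --target prod

  domain-projects:
    needs: hub-project
    runs-on: ubuntu-latest
    strategy:
      matrix:
        project: [marketing, finance, product]

    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dbt
        run: pip install dbt-core==1.7.0 dbt-postgres==1.7.0

      - name: Install domain dependencies
        working-directory: ${{ matrix.project }}_project
        run: dbt deps

      - name: Build and test domain models
        working-directory: ${{ matrix.project }}_project
        run: dbt build --target prod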

Project Dependency Graph

Architecture Diagram
+---------------------------------------------------------------------------+
|                             MULTI-PROJECT DAG                             |
+---------------------------------------------------------------------------+
|                                                                           |
|  +---------------------------------------------------------------------+  |
|  |                             HUB PROJECT                             |  |
|  |  hub_analytics                                                      |  |
|  |  +-- staging/ (protected)                                           |  |
|  |  |   +-- stg_customers                                              |  |
|  |  |   +-- stg_orders                                                 |  |
|  |  +-- marts/ (public)                                                |  |
|  |      +-- dim_customers (public)                                     |  |
|  |      +-- fct_orders (public)                                        |  |
|  +---------------------------------------------------------------------+  |
|             |                       |                       |             |
|             v                       v                       v             |
|  +---------------------+ +---------------------+ +---------------------+  |
|  | MARKETING PROJECT   | | FINANCE PROJECT     | | PRODUCT PROJECT     |  |
|  | marketing_analytics | | finance_analytics   | | product_analytics   |  |
|  | +-- fct_campaigns   | | +-- fct_revenue     | | +-- fct_events      |  |
|  | |   (cross-ref)     | | |   (cross-ref)     | | |   (cross-ref)     |  |
|  | +-- dim_campaigns   | | +-- dim_products    | | +-- dim_users       |  |
|  +---------------------+ +---------------------+ +---------------------+  |
|                                                                           |
+---------------------------------------------------------------------------+

Comparison: Single vs Multi-Project

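Aspect             | Single Project          | Multi-Project
-------------------+-------------------------+----------------------------
Complexity         | Low                     | High
Deployment         | Atomic                  | Independent
Access Control     | Within project (groups) | Within and across projects
Code Ownership     | Shared                  | Domain-specific
Cross-Project Refs | N/A                     | Required
CI/CD              | Simple                  | Coordinated
Testing            | Unified                 | Distributed
Scalability        | Limited                 | Excellent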

Best Practices

  1. Start simple - Begin with one project, extract when needed
  2. Clear boundaries - Define which models belong to which project
  3. Use access control - Mark public models explicitly
  4. Document dependencies - Maintain clear dependency graphs
  5. Coordinate deployments - Use CI/CD to manage cross-project builds
  6. Version compatibility - Pin dependency versions in packages.yml
  7. Monitor cross-project refs - Alert on breaking changes
  8. Shared testing - Run integration tests across project boundaries
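
Practices 3 and 7 can be backed by a model contract on each public model, so that an accidental change to a shared model's columns fails in the hub's own build rather than in a downstream project. The sketch below is one way this might look for dim_customers; the file name is hypothetical and the data_type values are assumptions that should be replaced with your warehouse's actual type names.

```yaml
# hub_project/models/marts/dim_customers.yml (hypothetical properties file)
version: 2

models:
  - name: dim_customers
    config:
      access: public
      contract:
        enforced: true   # the build fails if columns or types differ from this list
    columns:
      # data_type values are assumptions; use your warehouse's type names
      - name: customer_id
        data_type: integer
        constraints:
          - type: not_null
      - name: customer_name
        data_type: varchar
      - name: email
        data_type: varchar
      - name: segment
        data_type: varchar
      - name: created_at
        data_type: timestamp
      - name: updated_at
        data_type: timestamp
```

An enforced contract requires every column the model returns to be listed with a data type, which is why all six columns of dim_customers appear here.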
