Mesh and Data Collaboration
Mesh Architecture
Architecture Diagram
DBT MESH ARCHITECTURE

  DOMAIN-BASED PROJECTS
  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
  │ FINANCE         │   │ MARKETING       │   │ PRODUCT         │
  │ DOMAIN          │   │ DOMAIN          │   │ DOMAIN          │
  │                 │   │                 │   │                 │
  │ • Revenue       │   │ • Campaigns     │   │ • User events   │
  │ • Costs         │   │ • Attribution   │   │ • Features      │
  │ • Budget        │   │ • ROI           │   │ • Engagement    │
  └─────────────────┘   └─────────────────┘   └─────────────────┘
                                 │
                                 ▼
  CROSS-PROJECT REFERENCES
    Finance project:    {{ ref('finance', 'fct_revenue') }}
    Marketing project:  {{ ref('marketing', 'fct_campaigns') }}
Data Contract Architecture
Architecture Diagram
DATA CONTRACT ARCHITECTURE

  CONTRACT DEFINITION
    {
      "model": "fct_orders",
      "version": "1.0",
      "schema": {
        "columns": [
          {
            "name": "order_id",
            "type": "integer",
            "description": "Unique order identifier",
            "constraints": ["not_null", "unique"]
          },
          {
            "name": "amount",
            "type": "decimal(18,2)",
            "description": "Order amount in USD",
            "constraints": ["not_null", "positive"]
          }
        ]
      },
      "freshness": {
        "max_delay": "4 hours",
        "check_frequency": "hourly"
      }
    }
                                 │
                                 ▼
  CONTRACT VALIDATION
  ┌───────────┐    ┌───────────┐    ┌───────────┐    ┌───────────┐
  │ SCHEMA    │───▶│ FRESHNESS │───▶│ QUALITY   │───▶│ APPROVE   │
  │ CHECK     │    │ CHECK     │    │ CHECK     │    │           │
  └───────────┘    └───────────┘    └───────────┘    └───────────┘
Governance Flow
Architecture Diagram
GOVERNANCE AND DISCOVERY

  DATA CATALOG
  ┌─────────────────┐   ┌─────────────────┐   ┌───────────────────────┐
  │ DISCOVERY       │   │ LINEAGE         │   │ DOCUMENTATION         │
  │                 │   │                 │   │                       │
  │ • Search        │   │ • Column        │   │ • Descriptions        │
  │ • Browse        │   │ • Model         │   │ • Metrics             │
  │ • Tags          │   │ • Project       │   │ • Examples            │
  └─────────────────┘   └─────────────────┘   └───────────────────────┘
                                 │
                                 ▼
  ACCESS CONTROL
  ┌─────────────────┐   ┌─────────────────┐   ┌───────────────────────┐
  │ RBAC            │   │ SCHEMA          │   │ POLICY                │
  │                 │   │ ACCESS          │   │                       │
  │ • Roles         │   │ • Read          │   │ • Row-level security  │
  │ • Permissions   │   │ • Write         │   │ • Column masking      │
  │ • Users         │   │ • Own           │   │ • Data classification │
  └─────────────────┘   └─────────────────┘   └───────────────────────┘
Detailed Explanation
Data mesh is an architectural pattern that decentralizes data ownership to domain-specific teams. dbt supports this pattern with multi-project architectures, data contracts, and governance features.
Domain-Based Organization
In a data mesh architecture:
- Domain Teams: Own their data products
- Self-Serve Platform: dbt Cloud as the platform
- Data as a Product: Published interfaces
- Federated Governance: Shared standards
Cross-Project References
dbt enables cross-project references for data mesh:
- Project references: {{ ref('project', 'model') }}
- Package references: Shared semantic models
- Metric stores: Centralized metric definitions
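To make the two-argument form concrete, here is a minimal Python sketch of how a cross-project reference could be resolved against a registry of published models. The `PublishedModel` type, the `REGISTRY` contents, the relation names, the `int_ledger` model, and the `resolve_ref` helper are all hypothetical. The sketch only illustrates the idea that another project can reference a model when it is marked public; it is not how dbt implements `ref` internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedModel:
    project: str
    name: str
    relation: str  # fully qualified table name (hypothetical)
    access: str    # "public", "protected", or "private"


# Hypothetical registry of models exposed by each domain project.
REGISTRY = {
    ("finance", "fct_revenue"): PublishedModel(
        "finance", "fct_revenue", "analytics.finance.fct_revenue", "public"
    ),
    ("finance", "int_ledger"): PublishedModel(
        "finance", "int_ledger", "analytics.finance.int_ledger", "private"
    ),
    ("marketing", "fct_campaigns"): PublishedModel(
        "marketing", "fct_campaigns", "analytics.marketing.fct_campaigns", "public"
    ),
}


def resolve_ref(project: str, model: str, caller_project: str) -> str:
    """Return the relation behind ref(project, model), or raise if it is unavailable."""
    try:
        target = REGISTRY[(project, model)]
    except KeyError:
        raise LookupError(f"unknown model {project}.{model}") from None
    # A model can always be referenced inside its own project;
    # other projects may only reference public models.
    if target.project != caller_project and target.access != "public":
        raise PermissionError(f"{project}.{model} is not public")
    return target.relation
```

For example, `resolve_ref("finance", "fct_revenue", "marketing")` returns the finance relation name, while the same call for `int_ledger` raises `PermissionError`.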
Data Contracts
Data contracts define the interface between producers and consumers:
- Schema contracts: Column names, types, constraints
- Freshness contracts: SLA for data availability
- Quality contracts: Acceptable data quality levels
- Cost contracts: Resource usage limits
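A schema contract can be checked mechanically. The following Python sketch validates a batch of rows against column types and constraints. The `CONTRACT` mapping and the `validate_rows` helper are hypothetical names chosen for illustration, and the rule names (`not_null`, `unique`, `positive`) simply mirror the constraint names used in this document.

```python
from decimal import Decimal

# Hypothetical contract: column name -> (expected Python type, constraint names).
CONTRACT = {
    "order_id": (int, {"not_null", "unique"}),
    "amount": (Decimal, {"not_null", "positive"}),
}


def validate_rows(rows, contract=CONTRACT):
    """Return a list of violation messages; an empty list means the rows conform."""
    violations = []
    # Track values already seen for every column that must be unique.
    seen = {col: set() for col, (_, rules) in contract.items() if "unique" in rules}
    for index, row in enumerate(rows):
        for column, (expected_type, rules) in contract.items():
            value = row.get(column)
            if value is None:
                if "not_null" in rules:
                    violations.append(f"row {index}: {column} is null")
                continue
            if not isinstance(value, expected_type):
                violations.append(f"row {index}: {column} is not {expected_type.__name__}")
                continue
            if "positive" in rules and value <= 0:
                violations.append(f"row {index}: {column} is not positive")
            if "unique" in rules:
                if value in seen[column]:
                    violations.append(f"row {index}: {column} is duplicated")
                seen[column].add(value)
    return violations
```

A producer could run a check like this before publishing a new batch, so that consumers never see rows that break the agreed interface.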
Governance Features
- Access Control: RBAC and schema-level permissions
- Data Classification: PII, sensitive, public
- Lineage Tracking: End-to-end data lineage
- Audit Logging: Track all data access and changes
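The interplay between role-based access control and audit logging can be sketched in a few lines of Python. The role names, team names, and the `is_allowed` helper below are hypothetical, and the in-memory `AUDIT_LOG` list stands in for whatever audit store a real platform would use.

```python
# Hypothetical role definitions; "*" grants every permission.
ROLE_PERMISSIONS = {
    "data_reader": {"model:read", "source:read", "metric:read"},
    "data_writer": {"model:read", "model:write", "source:read", "source:write"},
    "data_admin": {"*"},
}

# Hypothetical assignment of roles to teams.
TEAM_ROLES = {
    "analytics": {"data_reader"},
    "data-engineering": {"data_writer"},
    "platform-team": {"data_admin"},
}

AUDIT_LOG = []


def is_allowed(team: str, permission: str) -> bool:
    """Check the team's roles for a permission and record the decision."""
    granted = set()
    for role in TEAM_ROLES.get(team, set()):
        granted |= ROLE_PERMISSIONS.get(role, set())
    allowed = "*" in granted or permission in granted
    # Every decision is logged, including denials.
    AUDIT_LOG.append({"team": team, "permission": permission, "allowed": allowed})
    return allowed
```

Logging denials as well as grants is a deliberate choice in this sketch: failed attempts are often the more interesting entries during an access review.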
Code Examples
Cross-Project Reference
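-- models/marts/fct_company_metrics.sql
{{
    config(
        materialized='incremental',
        unique_key='company_id'
    )
}}

-- Aggregate each upstream model to one row per company before joining.
-- Joining the two fact tables row by row would repeat revenue rows for every
-- matching campaign row and inflate both sums.
with revenue as (

    select
        company_id,
        company_name,
        sum(amount) as total_revenue
    from {{ ref('finance', 'fct_revenue') }}
    group by 1, 2

),

marketing as (

    select
        company_id,
        sum(spend) as total_marketing_spend
    from {{ ref('marketing', 'fct_campaigns') }}
    group by 1

),

final as (

    select
        revenue.company_id,
        revenue.company_name,
        revenue.total_revenue,
        -- Companies without campaigns count as zero spend rather than null
        coalesce(marketing.total_marketing_spend, 0) as total_marketing_spend,
        revenue.total_revenue - coalesce(marketing.total_marketing_spend, 0) as profit,
        current_timestamp() as updated_at
    from revenue
    left join marketing
        on revenue.company_id = marketing.company_id

)

select * from final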
Data Contract Definition
# contracts/fct_orders_contract.yml
# Illustrative contract specification for a custom validation process.
# dbt's built-in model contracts are enabled with `contract: {enforced: true}` in a model's config.
version: 2

contracts:
  - name: fct_orders_contract
    description: "Data contract for orders fact table"
    model: ref('fct_orders')

    schema:
      columns:
        - name: order_id
          type: integer
          description: "Unique order identifier"
          constraints:
            - not_null
            - unique
        - name: customer_id
          type: integer
          description: "Foreign key to customers"
          constraints:
            - not_null
            - relationships:
                to: ref('dim_customers')
                field: customer_id
        - name: amount
          type: decimal(18,2)
          description: "Order amount in USD"
          constraints:
            - not_null
            - positive
        - name: order_date
          type: date
          description: "Date order was placed"
          constraints:
            - not_null
            - recent:
                period: 30 days

    freshness:
      max_delay: 4 hours
      check_frequency: hourly
      alert_on_delay: true

    quality:
      - type: uniqueness
        column: order_id
      - type: completeness
        columns: [order_id, customer_id, amount, order_date]
      - type: consistency
        check: "amount > 0"
        threshold: 99.9

    access:
      - team: analytics
        permissions: [read]
      - team: finance
        permissions: [read, write]
      - team: data-science
        permissions: [read]
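The freshness and quality sections of the contract above translate into simple checks. This Python sketch assumes that `threshold: 99.9` means the percentage of rows that must satisfy `amount > 0`, and that an empty batch passes; both are assumptions, since the contract does not say. The helper names `is_fresh` and `passes_consistency` are hypothetical.

```python
from datetime import datetime, timedelta

MAX_DELAY = timedelta(hours=4)  # freshness.max_delay in the contract
THRESHOLD = 99.9                # quality threshold, read here as a percentage


def is_fresh(last_loaded_at: datetime, now: datetime,
             max_delay: timedelta = MAX_DELAY) -> bool:
    """True when the most recent load is no older than max_delay."""
    return now - last_loaded_at <= max_delay


def passes_consistency(amounts, threshold: float = THRESHOLD) -> bool:
    """True when the share of rows with amount > 0 meets the threshold."""
    if not amounts:
        return True  # assumption: an empty batch is treated as passing
    passing = sum(1 for amount in amounts if amount is not None and amount > 0)
    return 100.0 * passing / len(amounts) >= threshold
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the freshness check deterministic and easy to test.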
Governance Configuration
# governance/access_control.yml
version: 2

access_control:
  roles:
    - name: data_reader
      description: "Read-only access to data"
      permissions:
        - model:read
        - source:read
        - metric:read
      grant_to:
        - team: analytics
        - team: business
    - name: data_writer
      description: "Read and write access to data"
      permissions:
        - model:read
        - model:write
        - source:read
        - source:write
      grant_to:
        - team: data-engineering
    - name: data_admin
      description: "Full access to data platform"
      permissions:
        - "*"
      grant_to:
        - team: platform-team

  schema_access:
    - schema: raw
      permissions:
        - role: data_writer
          access: full
        - role: data_reader
          access: none
    - schema: staging
      permissions:
        - role: data_writer
          access: full
        - role: data_reader
          access: read
    - schema: analytics
      permissions:
        - role: data_writer
          access: full
        - role: data_reader
          access: read

  data_classification:
    - level: public
      description: "Non-sensitive data"
      mask: false
    - level: internal
      description: "Internal business data"
      mask: false
      access: [data_reader, data_writer]
    - level: confidential
      description: "Sensitive business data"
      mask: true
      access: [data_writer]
      masking_policy: "hash"
    - level: restricted
      description: "PII and regulated data"
      mask: true
      access: [data_admin]
      masking_policy: "full_mask"
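The classification levels above imply a decision about what each role actually sees. The Python sketch below is one possible reading, not a definitive one: roles listed under `access` see the clear value, other roles see a masked value when `mask` is true, and are refused otherwise. It also assumes that `data_admin` can read every level because of its `"*"` permission, that `hash` means a SHA-256 digest, and that `full_mask` replaces every character with `*`. The `present_value` helper is hypothetical.

```python
import hashlib

ADMIN_ROLE = "data_admin"  # assumption: the "*" permission covers every level

# Mirrors the data_classification entries; access=None means unrestricted.
CLASSIFICATION = {
    "public": {"mask": False, "access": None, "policy": None},
    "internal": {"mask": False, "access": {"data_reader", "data_writer"}, "policy": None},
    "confidential": {"mask": True, "access": {"data_writer"}, "policy": "hash"},
    "restricted": {"mask": True, "access": {"data_admin"}, "policy": "full_mask"},
}


def present_value(value: str, level: str, role: str) -> str:
    """Return the value as the given role should see it at this classification level."""
    rule = CLASSIFICATION[level]
    allowed_roles = rule["access"]
    if role == ADMIN_ROLE or allowed_roles is None or role in allowed_roles:
        return value
    if not rule["mask"]:
        # No masking policy exists for this level, so access is refused outright.
        raise PermissionError(f"{role} may not read {level} data")
    if rule["policy"] == "hash":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    return "*" * len(value)
```

Hashing keeps values joinable across tables without exposing them, whereas full masking removes even that, which is why the sketch reserves it for the most restricted level.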
Semantic Layer Configuration
# semantic/company_metrics.yml
version: 2

semantic_models:
  - name: company_metrics
    description: "Unified company metrics"
    model: ref('fct_company_metrics')
    defaults:
      agg_time_dimension: metric_date
    entities:
      - name: company_id
        type: primary
        expr: company_id
    dimensions:
      - name: metric_date
        type: time
        # fct_company_metrics exposes updated_at as its only timestamp column
        expr: updated_at
        type_params:
          time_granularity: day
      - name: company_name
        type: categorical
        expr: company_name
    measures:
      - name: total_revenue
        agg: sum
        expr: total_revenue
      - name: total_marketing_spend
        agg: sum
        expr: total_marketing_spend
      - name: profit
        agg: sum
        expr: profit

metrics:
  - name: revenue
    label: Revenue
    type: simple
    type_params:
      measure: total_revenue
  - name: marketing_spend
    label: Marketing spend
    type: simple
    type_params:
      measure: total_marketing_spend
  - name: profit
    label: Profit
    type: simple
    type_params:
      measure: profit
  - name: profit_margin
    label: Profit margin
    type: derived
    type_params:
      # A derived metric references other metrics, which must be listed below
      expr: profit / revenue
      metrics:
        - name: profit
        - name: revenue

exposures:
  - name: company_dashboard
    type: dashboard
    description: "Executive company dashboard"
    depends_on:
      - ref('fct_company_metrics')
    owner:
      name: Executive Team
      email: exec@company.com
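The `profit_margin` metric is a ratio of two aggregated values, which is different from averaging a per-company ratio. The Python sketch below shows the calculation under that reading; the `profit_margin` function and its handling of zero revenue are hypothetical, and the rows stand in for records of `fct_company_metrics`.

```python
def profit_margin(rows):
    """Sum the measures first, then apply the derived expression profit / revenue."""
    revenue = sum(row["total_revenue"] for row in rows)
    profit = sum(row["profit"] for row in rows)
    if revenue == 0:
        return None  # assumption: an undefined margin is reported as None
    return profit / revenue


rows = [
    {"company_id": 1, "total_revenue": 100, "profit": 20},
    {"company_id": 2, "total_revenue": 300, "profit": 180},
]
margin = profit_margin(rows)  # (20 + 180) / (100 + 300)
```

Averaging the two per-company margins (0.2 and 0.6) would give 0.4, whereas aggregating first weights each company by its revenue.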
Performance Metrics
| Metric | Description | Target |
|---|---|---|
| Cross-project ref time | Time to resolve cross-project refs | <5s |
| Contract validation | Time to validate contracts | <30s |
| Governance audit | Time to run governance checks | <1min |
| Discovery search | Time to search data catalog | <2s |
| Lineage generation | Time to generate lineage graph | <10s |
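Targets like these are only useful if they are measured. As a rough sketch, the Python helper below times an arbitrary operation and compares it with the target from the table. The `TARGETS_SECONDS` keys and the `within_target` helper are hypothetical names, and the operation passed in is whatever callable wraps the real work.

```python
import time

# Targets from the table above, expressed in seconds.
TARGETS_SECONDS = {
    "cross_project_ref": 5,
    "contract_validation": 30,
    "governance_audit": 60,
    "discovery_search": 2,
    "lineage_generation": 10,
}


def within_target(name, operation, targets=TARGETS_SECONDS):
    """Run operation() and return (met_target, elapsed_seconds)."""
    start = time.perf_counter()
    operation()
    elapsed = time.perf_counter() - start
    return elapsed < targets[name], elapsed
```

For example, `within_target("discovery_search", run_search)` would report whether a hypothetical `run_search` callable finished in under two seconds.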
Best Practices
- Define clear domain boundaries - Each team owns their data
- Use cross-project refs - Enable data mesh architecture
- Implement data contracts - Define clear interfaces
- Govern access - Use RBAC and schema permissions
- Classify data - Mark PII and sensitive data
- Track lineage - End-to-end data lineage
- Document everything - Clear descriptions and examples
- Monitor usage - Track data access and consumption