dbt Cloud Features
Cloud Architecture
Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────────┐
│                           DBT CLOUD ARCHITECTURE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                         DBT CLOUD PLATFORM                          │   │
│   │                                                                     │   │
│   │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │   │
│   │  │ IDE          │  │ JOBS         │  │ MONITORING               │   │   │
│   │  │              │  │              │  │                          │   │   │
│   │  │ • Web-based  │  │ • Scheduled  │  │ • Run history            │   │   │
│   │  │ • Git        │  │ • On-demand  │  │ • Performance            │   │   │
│   │  │ • SQL editor │  │ • PR triggers│  │ • Costs                  │   │   │
│   │  └──────────────┘  └──────────────┘  └──────────────────────────┘   │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                      │
│                                      ▼                                      │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                           INFRASTRUCTURE                            │   │
│   │                                                                     │   │
│   │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │   │
│   │  │ RUNNERS      │  │ SECRETS      │  │ ARTIFACTS                │   │   │
│   │  │              │  │              │  │                          │   │   │
│   │  │ • Managed    │  │ • Encrypted  │  │ • Manifest               │   │   │
│   │  │ • Scalable   │  │ • Rotated    │  │ • Run results            │   │   │
│   │  │ • Multi-env  │  │ • Audited    │  │ • Catalog                │   │   │
│   │  └──────────────┘  └──────────────┘  └──────────────────────────┘   │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
Slim CI Architecture
Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────────┐
│                              SLIM CI PIPELINE                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                            TRIGGER EVENT                            │   │
│   │                                                                     │   │
│   │                          ┌──────────────┐                           │   │
│   │                          │  Git Push /  │                           │   │
│   │                          │  PR Created  │                           │   │
│   │                          └──────────────┘                           │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                      │
│                                      ▼                                      │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                          1. COMPARE STATE                           │   │
│   │                                                                     │   │
│   │  dbt run --select state:modified+ --state prod-artifacts/           │   │
│   │  (--state: directory containing the production manifest.json)       │   │
│   │                                                                     │   │
│   │  Model status:                                                      │   │
│   │  ├── fct_orders (changed)                                           │   │
│   │  ├── dim_customers (new column)                                     │   │
│   │  └── stg_payments (unchanged - skipped)                             │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                      │
│                                      ▼                                      │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                       2. SELECTIVE EXECUTION                        │   │
│   │                                                                     │   │
│   │  dbt run --select fct_orders+ dim_customers+                        │   │
│   │  dbt test --select fct_orders dim_customers                         │   │
│   │                                                                     │   │
│   │  Execution time: 5 minutes (vs 45 minutes full run)                 │   │
│   │  Cost savings: ~89% (if cost scales with run time)                  │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
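The state comparison in step 1 can be pictured as a diff between two manifests followed by a walk down the dependency graph. The sketch below is a simplified model of that idea, not dbt's implementation: the manifests are plain dictionaries with hypothetical `checksum` and `depends_on` keys, `select_modified_plus` is an illustrative helper name, and `stg_orders` and `rpt_revenue` are made-up models added to give the graph a downstream node.

```python
from collections import deque


def select_modified_plus(prod_manifest, pr_manifest):
    """Return models that are new or changed in the PR, plus everything downstream.

    Each manifest maps model name -> {"checksum": str, "depends_on": [names]}.
    This mirrors the idea behind `state:modified+`, not dbt's actual algorithm.
    """
    # A model counts as modified when it is new or its checksum differs.
    modified = {
        name
        for name, node in pr_manifest.items()
        if name not in prod_manifest
        or prod_manifest[name]["checksum"] != node["checksum"]
    }

    # Invert depends_on so we can walk from a parent to its children.
    children = {name: [] for name in pr_manifest}
    for name, node in pr_manifest.items():
        for parent in node["depends_on"]:
            children.setdefault(parent, []).append(name)

    # Breadth-first walk to collect downstream models.
    selected = set(modified)
    queue = deque(modified)
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return sorted(selected)


prod = {
    "stg_payments": {"checksum": "a1", "depends_on": []},
    "stg_orders": {"checksum": "b1", "depends_on": []},
    "fct_orders": {"checksum": "c1", "depends_on": ["stg_orders", "stg_payments"]},
    "dim_customers": {"checksum": "d1", "depends_on": ["stg_orders"]},
    "rpt_revenue": {"checksum": "e1", "depends_on": ["fct_orders"]},
}
# Copy prod and change two checksums to simulate the PR branch.
pr = {name: dict(node) for name, node in prod.items()}
pr["fct_orders"]["checksum"] = "c2"
pr["dim_customers"]["checksum"] = "d2"

print(select_modified_plus(prod, pr))
# ['dim_customers', 'fct_orders', 'rpt_revenue']
```

The unchanged staging models are left out of the selection, which is where the shorter CI run comes from.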
Job Scheduling
Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────────┐
│                         JOB SCHEDULING ARCHITECTURE                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                          JOB CONFIGURATION                          │   │
│   │                                                                     │   │
│   │  Job: Production Run                                                │   │
│   │  ├── Trigger: Scheduled (Daily 2:00 AM UTC)                         │   │
│   │  ├── Environment: Production                                        │   │
│   │  ├── Commands:                                                      │   │
│   │  │   ├── dbt deps                                                   │   │
│   │  │   ├── dbt seed                                                   │   │
│   │  │   ├── dbt run --full-refresh                                     │   │
│   │  │   └── dbt test                                                   │   │
│   │  ├── Notifications:                                                 │   │
│   │  │   ├── Slack: #data-engineering                                   │   │
│   │  │   └── Email: team@company.com                                    │   │
│   │  └── Alert on: failure                                              │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                      │
│                                      ▼                                      │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                          SCHEDULE PATTERNS                          │   │
│   │                                                                     │   │
│   │  ┌─────────────┬──────────────────────────────────────────────┐     │   │
│   │  │ Pattern     │ Configuration                                │     │   │
│   │  ├─────────────┼──────────────────────────────────────────────┤     │   │
│   │  │ Hourly      │ "0 * * * *"                                  │     │   │
│   │  │ Daily       │ "0 2 * * *"                                  │     │   │
│   │  │ Weekly      │ "0 2 * * 1"                                  │     │   │
│   │  │ Monthly     │ "0 2 1 * *"                                  │     │   │
│   │  │ Custom      │ "0 2 * * 1-5" (weekdays only)                │     │   │
│   │  └─────────────┴──────────────────────────────────────────────┘     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
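To make the schedule patterns concrete, here is a minimal sketch of how a five-field cron expression can be checked against a timestamp. It is an illustration, not the scheduler dbt Cloud uses: `cron_matches` is a hypothetical helper that supports only `*`, single numbers, ranges, and comma lists, and it requires every field to match, which is enough for the patterns in the table.

```python
from datetime import datetime


def _field_matches(field, value):
    """Check one cron field (supports '*', 'a', 'a-b', and comma lists)."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression, moment):
    """Return True if `moment` falls on the schedule described by `expression`."""
    minute, hour, day, month, weekday = expression.split()
    # Cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(day, moment.day)
        and _field_matches(month, moment.month)
        and _field_matches(weekday, cron_weekday)
    )


monday_2am = datetime(2024, 1, 1, 2, 0)    # 1 January 2024 was a Monday
saturday_2am = datetime(2024, 1, 6, 2, 0)  # 6 January 2024 was a Saturday

print(cron_matches("0 2 * * 1-5", monday_2am))    # True: weekday at 02:00
print(cron_matches("0 2 * * 1-5", saturday_2am))  # False: weekend
print(cron_matches("0 2 1 * *", monday_2am))      # True: first day of the month
```

Step values such as `*/15` and named days are deliberately left out to keep the sketch short.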
Detailed Explanation
dbt Cloud is dbt Labs' hosted platform for running dbt. On top of the open-source dbt Core, it adds a browser-based IDE, managed infrastructure, job scheduling, and monitoring.
Core Features
1. Web-based IDE
- Write SQL and Jinja in the browser
- Git integration with visual diff
- Auto-completion and syntax highlighting
- Interactive documentation viewer
2. Job Scheduling
- Cron-based scheduling
- Event-driven triggers (Git, API)
- Dependency chains
- Alert notifications
3. Slim CI
- Selective execution of modified models
- State comparison between the PR branch and the artifacts of a previous production run
- Cost optimization for CI/CD
- Fast feedback loops
4. Monitoring and Observability
- Run history and logs
- Performance metrics
- Cost tracking
- Error alerting
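The API trigger listed under Job Scheduling boils down to one authenticated POST request. The sketch below builds such a request for the dbt Cloud v2 "trigger job run" endpoint without sending it. The account ID, job ID, and token are placeholders, `build_trigger_request` is an illustrative helper name, and the default host is an assumption that differs for accounts in other regions or on single-tenant deployments.

```python
import json
import urllib.request


def build_trigger_request(account_id, job_id, token, cause,
                          host="cloud.getdbt.com"):
    """Build (but do not send) a request that triggers a dbt Cloud job run."""
    url = f"https://{host}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    # The endpoint expects a JSON body with a "cause" describing the trigger.
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder IDs and token; replace with real values before sending.
request = build_trigger_request(12345, 67890, "YOUR_API_TOKEN", "Triggered via API")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the call easy to inspect or unit test before it touches a real account.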
Enterprise Features
1. SSO and Authentication
- SAML 2.0 integration
- RBAC (Role-Based Access Control)
- Audit logging
- IP allowlisting
2. Multi-tenant Architecture
- Environment isolation
- Resource quotas
- Cost allocation
- Compliance controls
3. Semantic Layer
- Centralized metric definitions
- Version control for metrics
- API access
- BI tool integration
4. dbt Mesh
- Cross-project references
- Data contracts
- Shared semantic models
- Governance controls
Best Practices for dbt Cloud
- Use Slim CI - Build and test only modified models and their downstream dependents
- Set up alerts - Notify on failures
- Monitor costs - Track warehouse usage
- Use environments - Separate dev/staging/prod
- Version control - All configurations in Git
- Document jobs - Clear naming and descriptions
- Test regularly - Automated quality checks
- Review logs - Monitor execution details
Code Examples
The YAML files in this section are illustrative layouts rather than a native dbt Cloud format; jobs, CI, and notifications are normally configured through the dbt Cloud UI, its API, or the Terraform provider.
Job Configuration (YAML)
# .dbt_cloud/job_config.yml
jobs:
  - name: "Production Run"
    description: "Daily production run for all models"
    environment: "Production"
    triggers:
      - type: "scheduled"
        cron: "0 2 * * *"
      - type: "git_push"
        branches: ["main"]
    steps:
      - command: "dbt deps"
      - command: "dbt seed"
      - command: "dbt run --full-refresh"
      - command: "dbt test"
    notifications:
      - type: "slack"
        channel: "#data-engineering"
      - type: "email"
        recipients:
          - "data-eng@company.com"
    settings:
      warehouse: "ANALYTICS_WH"
      schema: "production"
      threads: 8
    alert_on:
      - "failure"
      - "warning"
Slim CI Configuration
# .dbt_cloud/ci_config.yml
ci:
  enabled: true
  state:
    compare:
      - "manifest.json"
      - "run_results.json"
  selection:
    strategy: "modified"
    include:
      - "state:modified+"
      - "state:new+"
    exclude:
      - "tag:deprecated"
  optimization:
    enabled: true
    max_run_time: "30m"
    cost_threshold: 100
  notifications:
    on_success:
      - type: "slack"
        channel: "#ci-results"
    on_failure:
      - type: "slack"
        channel: "#ci-alerts"
      - type: "pagerduty"
        severity: "critical"
Semantic Layer Configuration
# .dbt_cloud/semantic_layer.yml
semantic_layer:
  enabled: true
  metrics:
    - name: "total_revenue"
      type: "simple"
      expression: "sum(amount)"
      description: "Total revenue from all orders"
    - name: "order_count"
      type: "simple"
      expression: "count(*)"
      description: "Total number of orders"
    - name: "avg_order_value"
      type: "derived"
      expression: "total_revenue / order_count"
      description: "Average order value"
  dimensions:
    - name: "order_date"
      type: "time"
      granularity: "day"
    - name: "customer_segment"
      type: "categorical"
  access:
    - type: "bi_tool"
      name: "Looker"
      permissions: ["read"]
    - type: "application"
      name: "Feature Store"
      permissions: ["read", "query"]
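To show what the three metric definitions compute, the sketch below evaluates them over a few in-memory order rows. The rows are made-up sample data, and `compute_metrics` and `metrics_by_dimension` are hypothetical helper names; a real semantic layer would generate warehouse SQL instead of looping in Python.

```python
from collections import defaultdict


def compute_metrics(rows):
    """Compute the three metrics from the configuration for a list of order rows."""
    total_revenue = sum(row["amount"] for row in rows)  # sum(amount)
    order_count = len(rows)                             # count(*)
    # Derived metric; guard against dividing by zero when there are no orders.
    avg_order_value = total_revenue / order_count if order_count else None
    return {
        "total_revenue": total_revenue,
        "order_count": order_count,
        "avg_order_value": avg_order_value,
    }


def metrics_by_dimension(rows, dimension):
    """Group rows by a categorical dimension and compute metrics per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row)
    return {value: compute_metrics(group) for value, group in groups.items()}


# Hypothetical sample orders.
orders = [
    {"amount": 120.0, "customer_segment": "retail"},
    {"amount": 80.0, "customer_segment": "retail"},
    {"amount": 400.0, "customer_segment": "enterprise"},
]

print(compute_metrics(orders))
print(metrics_by_dimension(orders, "customer_segment"))
```

Because `avg_order_value` is derived from the other two metrics rather than redefined, every consumer that slices by `customer_segment` gets the same answer.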
Monitoring Configuration
# .dbt_cloud/monitoring.yml
monitoring:
  enabled: true
  metrics:
    - name: "run_duration"
      type: "histogram"
      alert_threshold: "30m"
    - name: "cost_per_model"
      type: "gauge"
      alert_threshold: 10
  alerts:
    - name: "Long Running Job"
      condition: "run_duration > 30m"
      severity: "warning"
      channels:
        - "slack:#data-engineering"
        - "email:data-eng@company.com"
    - name: "High Cost Job"
      condition: "total_cost > 500"
      severity: "critical"
      channels:
        - "slack:#data-engineering"
        - "pagerduty:data-eng"
  dashboards:
    - name: "Job Performance"
      metrics: ["run_duration", "success_rate", "cost"]
      refresh: "1h"
    - name: "Cost Tracking"
      metrics: ["cost_per_model", "cost_per_job"]
      refresh: "1d"
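The alert conditions above follow a simple `metric > threshold` shape, so evaluating them can be sketched in a few lines. This is an assumed interpretation of the condition strings, not a documented dbt Cloud feature: `parse_threshold` and `fired_alerts` are hypothetical helpers, durations are converted to seconds, and the run statistics are made-up values.

```python
import re

# Seconds per unit for duration strings such as "30m" or "1h".
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_threshold(text):
    """Turn '30m' into 1800.0 seconds and a plain number like '500' into 500.0."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match:
        return float(int(match.group(1)) * _UNIT_SECONDS[match.group(2)])
    return float(text)


def fired_alerts(alerts, run_stats):
    """Return the names of alerts whose 'metric > threshold' condition holds."""
    fired = []
    for alert in alerts:
        metric, operator, threshold = alert["condition"].split()
        if operator != ">":
            raise ValueError(f"unsupported operator: {operator}")
        if run_stats[metric] > parse_threshold(threshold):
            fired.append(alert["name"])
    return fired


alerts = [
    {"name": "Long Running Job", "condition": "run_duration > 30m"},
    {"name": "High Cost Job", "condition": "total_cost > 500"},
]
# Hypothetical run: 45 minutes (in seconds) and a cost of 120 units.
run_stats = {"run_duration": 45 * 60, "total_cost": 120.0}

print(fired_alerts(alerts, run_stats))
# ['Long Running Job']
```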
Performance Metrics
| Feature | Description | Illustrative impact |
|---|---|---|
| Slim CI | Selective execution | 80-90% shorter CI runs |
| Caching | Result caching | 50-70% shorter runs |
| Parallelism | Concurrent execution | 2-3x faster |
| Monitoring | Real-time insights | Proactive alerts |
| Semantic Layer | Metric consistency | Improved governance |
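The table mixes percentage reductions with "x times faster" figures, and the two are easy to confuse. The short sketch below converts between them using the 5-minute versus 45-minute run from the Slim CI diagram; the function names are illustrative.

```python
def time_reduction_pct(baseline_minutes, new_minutes):
    """Percentage of run time saved relative to the baseline."""
    return (1 - new_minutes / baseline_minutes) * 100


def speedup_factor(baseline_minutes, new_minutes):
    """How many times faster the new run is."""
    return baseline_minutes / new_minutes


def speedup_from_reduction(reduction_pct):
    """Convert a percentage reduction in run time into a speedup factor."""
    return 1 / (1 - reduction_pct / 100)


print(round(time_reduction_pct(45, 5), 1))      # 88.9
print(speedup_factor(45, 5))                    # 9.0
print(round(speedup_from_reduction(80), 2))     # 5.0
```

An 80% shorter run is therefore a 5x speedup, while "2-3x faster" corresponds to a 50-67% reduction in run time.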