Virtual Warehouse Management: Multi-Cluster, Auto-Suspend & Scaling
Architecture Diagram 1: Warehouse Hierarchy & Sizing
┌─────────────────────────────────────────────────────────────────────────────┐
│ VIRTUAL WAREHOUSE HIERARCHY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ACCOUNT LEVEL │
│ ══════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Resource Monitor: account_credit_monitor │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Total Credit Quota: 100,000 credits/month │ │ │
│ │ │ Current Usage: 67,500 credits (67.5%) │ │ │
│ │ │ Alert Threshold: 80% (80,000 credits) │ │ │
│ │ │ Suspension Threshold: 95% (95,000 credits) │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ WAREHOUSE GROUPS: │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────────────┐ ┌─────────────────┐ ┌──────────────┐│ │ │
│ │ │ │ Analytics │ │ ETL │ │ Ad-Hoc ││ │ │
│ │ │ │ Group │ │ Group │ │ Group ││ │ │
│ │ │ │ │ │ │ │ ││ │ │
│ │ │ │ • analytics_wh │ │ • etl_wh │ │ • dev_wh ││ │ │
│ │ │ │ • reporting_wh │ │ • transform_wh │ │ • test_wh ││ │ │
│ │ │ │ • bi_wh │ │ • load_wh │ │ • explore_wh││ │ │
│ │ │ └─────────────────┘ └─────────────────┘ └──────────────┘│ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ SCHEMA LEVEL (Database: ANALYTICS) │
│ ═══════════════════════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Warehouse: ANALYTICS_WH │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ │ │ │
│ │ │ Configuration: │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ WAREHOUSE_SIZE = 'XLARGE' │ │ │ │
│ │ │ │ MIN_CLUSTER_COUNT = 1 │ │ │ │
│ │ │ │ MAX_CLUSTER_COUNT = 5 │ │ │ │
│ │ │ │ SCALING_POLICY = 'ECONOMY' │ │ │ │
│ │ │ │ AUTO_SUSPEND = 300 (5 minutes) │ │ │ │
│ │ │ │ AUTO_RESUME = TRUE │ │ │ │
│ │ │ │ INITIALLY_SUSPENDED = FALSE │ │ │ │
│ │ │ │ STATEMENT_TIMEOUT = 3600 (1 hour) │ │ │ │
│ │ │ │ STATEMENT_QUEUED_TIMEOUT = 600 (10 minutes) │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ Cluster Configuration: │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Cluster 1: Active (Primary) │ │ │ │
│ │ │ │ ├─ Node 1: 16 vCPU, 32 GB RAM │ │ │ │
│ │ │ │ ├─ Node 2: 16 vCPU, 32 GB RAM │ │ │ │
│ │ │ │ ├─ Node 3: 16 vCPU, 32 GB RAM │ │ │ │
│ │ │ │ └─ Node 4: 16 vCPU, 32 GB RAM │ │ │ │
│ │ │ │ Total: 64 vCPU, 128 GB RAM │ │ │ │
│ │ │ │ │ │ │ │
│ │ │ │ Cluster 2: Standby (Auto-start if needed) │ │ │ │
│ │ │ │ Cluster 3: Standby (Auto-start if needed) │ │ │ │
│ │ │ │ Cluster 4: Standby (Auto-start if needed) │ │ │ │
│ │ │ │ Cluster 5: Standby (Auto-start if needed) │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Architecture Diagram 2: Multi-Cluster Scaling Behavior
┌─────────────────────────────────────────────────────────────────────────────┐
│ MULTI-CLUSTER SCALING BEHAVIOR │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ SCENARIO: Concurrent user queries arriving at different times │
│ │
│ TIME 0:00 (Initial State) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Cluster 1: ACTIVE │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Running Queries: 2/8 │ │ │
│ │ │ Queue Depth: 0 │ │ │
│ │ │ CPU Utilization: 25% │ │ │
│ │ │ Status: HEALTHY │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Clusters 2-5: SUSPENDED │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIME 0:05 (Load Increase) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Cluster 1: ACTIVE (Overloaded) │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Running Queries: 8/8 (Full) │ │ │
│ │ │ Queue Depth: 5 (Growing) │ │ │
│ │ │ CPU Utilization: 95% │ │ │
│ │ │ Status: SCALING TRIGGERED │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Cluster 2: STARTING (Auto-launched) │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Status: INITIALIZING │ │ │
│ │ │ Expected Ready: 30-60 seconds │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIME 0:10 (Peak Load) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Cluster 1: ACTIVE │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Running Queries: 8/8 │ │ │
│ │ │ Queue Depth: 2 (Decreasing) │ │ │
│ │ │ CPU Utilization: 80% │ │ │
│ │ │ Status: HEALTHY │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Cluster 2: ACTIVE │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Running Queries: 4/8 │ │ │
│ │ │ Queue Depth: 0 │ │ │
│ │ │ CPU Utilization: 50% │ │ │
│ │ │ Status: HEALTHY │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Cluster 3: STARTING (Auto-launched due to sustained queue) │ │
│ │ Clusters 4-5: SUSPENDED │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIME 0:30 (Load Decrease) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Cluster 1: ACTIVE │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Running Queries: 2/8 │ │ │
│ │ │ Queue Depth: 0 │ │ │
│ │ │ CPU Utilization: 25% │ │ │
│ │ │ Status: HEALTHY │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Cluster 2: SUSPENDING (No queries for > 3 minutes) │ │
│ │ Cluster 3: SUSPENDING (No queries for > 3 minutes) │ │
│ │ Clusters 4-5: SUSPENDED │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIME 1:00 (Idle State) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Cluster 1: ACTIVE (Primary - Never fully suspends) │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Running Queries: 0/8 │ │ │
│ │ │ Queue Depth: 0 │ │ │
│ │ │ CPU Utilization: 5% │ │ │
│ │ │ Status: IDLE (Waiting for AUTO_SUSPEND timeout) │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Clusters 2-5: SUSPENDED │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIME 1:05 (Auto-Suspend Triggered) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ALL CLUSTERS: SUSPENDED │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Status: SUSPENDED │ │ │
│ │ │ Auto-Resume: TRUE (Will resume on next query) │ │ │
│ │ │ Last Resume Time: 2024-01-15 09:00:00 │ │ │
│ │ │ Last Suspend Time: 2024-01-15 10:05:00 │ │ │
│ │ │ Total Active Time: 65 minutes │ │ │
│ │ │ Credits Used: 65 credits (X-Large rate) │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Architecture Diagram 3: Scaling Policies Comparison
┌─────────────────────────────────────────────────────────────────────────────┐
│ SCALING POLICIES COMPARISON │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ POLICY: STANDARD │
│ ═════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Behavior: Scale out aggressively to maintain queue depth < 1 │ │
│ │ │ │
│ │ Queue Depth │ Clusters │ Action │ │
│ │ ───────────────┼──────────┼────────────────────────────────────── │ │
│ │ 0 │ 1 │ No change │ │
│ │ 1 │ 1-2 │ Start Cluster 2 │ │
│ │ 2 │ 2-3 │ Start Cluster 3 │ │
│ │ 3 │ 3-4 │ Start Cluster 4 │ │
│ │ 4+ │ 4-5 │ Start Cluster 5 │ │
│ │ │ │
│ │ Scale-Down: After queue = 0 for 3 minutes │ │
│ │ │ │
│ │ Use Case: Interactive BI, ad-hoc queries │ │
│ │ Cost Impact: Higher (scales quickly) │ │
│ │ Performance: Optimal (minimal queueing) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ POLICY: ECONOMY │
│ ════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Behavior: Scale out conservatively, allow some queueing │ │
│ │ │ │
│ │ Queue Depth │ Clusters │ Action │ │
│ │ ───────────────┼──────────┼────────────────────────────────────── │ │
│ │ 0-5 │ 1 │ No change │ │
│ │ 6-10 │ 1-2 │ Start Cluster 2 │ │
│ │ 11-15 │ 2-3 │ Start Cluster 3 │ │
│ │ 16-20 │ 3-4 │ Start Cluster 4 │ │
│ │ 21+ │ 4-5 │ Start Cluster 5 │ │
│ │ │ │
│ │ Scale-Down: After queue = 0 for 10 minutes │ │
│ │ │ │
│ │ Use Case: Scheduled ETL, batch processing │ │
│ │ Cost Impact: Lower (scales slowly) │ │
│ │ Performance: Acceptable (some queueing) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ AUTO-SUSPEND TIMELINE: │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Time: 0 5 10 15 20 25 30 35 40 45 50 │ │
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │
│ │ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ Status: ACT ACT ACT ACT IDL IDL IDL IDL IDL SUS │ │ │
│ │ │ Queries: 8 6 4 1 0 0 0 0 0 - │ │ │
│ │ │ Credits: 8 6 4 1 0 0 0 0 0 0 │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ AUTO_SUSPEND = 300 seconds (5 minutes) │ │
│ │ Last query at T=15, suspended at T=20 (15+5=20) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
A multi-cluster warehouse automatically manages multiple independent compute clusters under a logical warehouse. When the primary cluster becomes overloaded (queue depth exceeds threshold), Snowflake starts additional clusters. When load decreases, clusters are suspended to reduce costs.
Scaling policies control how aggressively warehouses add clusters. STANDARD scales quickly (queue depth ≥ 1 triggers new cluster), prioritizing performance. ECONOMY scales conservatively (queue depth ≥ 6 triggers), prioritizing cost savings with some queueing tolerance.
Theorem: Auto-suspend after T seconds of inactivity guarantees zero credit consumption for idle periods, with resume latency bounded by cloud provider cold-start time (typically 30–60 seconds). Proof: Warehouse suspends all compute nodes when idle time ≥ T. No queries execute during suspension. Auto-resume re-initializes nodes on next query submission. The 60-second minimum billing granularity ensures no partial-second charges.
Set AUTO_SUSPEND=60 for development warehouses (frequent idle periods), AUTO_SUSPEND=300 for production (balance between responsiveness and cost). Use SCALING_POLICY=ECONOMY for batch ETL where some queueing is acceptable.
- Monitor queue depth: Target < 5 queries waiting; > 10 indicates under-provisioning
- Check CPU utilization: 60–80% is optimal; > 95% indicates need to scale up
- Review auto-scaling events: Frequent ADD_CLUSTER suggests under-provisioned MAX_CLUSTER_COUNT
- Analyze credit consumption: Compare credits_used vs. queries_completed for efficiency
- Right-size: Scale down if utilization consistently < 40%; scale up if consistently > 90%
- Test: Run representative workload at new size and verify queue depth remains < 5
- Multi-cluster warehouses auto-scale 1–10 clusters based on queue depth
- Credit billing: Per-second (60s minimum) based on warehouse size × time
- STANDARD policy: Fast scaling for interactive workloads
- ECONOMY policy: Conservative scaling for batch ETL (cost-optimized)
- Auto-suspend: 60–86400 seconds; balance responsiveness vs. cost
Detailed Explanation
Virtual Warehouse Architecture
Snowflake's virtual warehouses are independent compute clusters that execute SQL queries and DML operations. Unlike traditional data warehouse appliances with fixed compute resources, virtual warehouses are elastic and can dynamically scale based on workload demands. Each warehouse consists of one or more nodes, where each node contains CPU cores, RAM, and local SSD cache. The number of nodes and their size determine the warehouse's compute capacity.
Warehouses operate in complete isolation from each other, meaning multiple warehouses can access the same data simultaneously without resource contention. This isolation enables different departments or workloads to use dedicated warehouses without impacting each other's performance. The cloud services layer manages query routing, metadata operations, and security enforcement independently of the compute layer.
Multi-Cluster Warehouse Operations
Multi-cluster warehouses extend the single-cluster concept by automatically managing multiple independent clusters under a logical warehouse. When the primary cluster becomes overloaded (measured by queue depth exceeding thresholds), Snowflake automatically starts additional clusters to handle the load. When the load decreases, clusters are automatically suspended to reduce costs.
The scaling behavior depends on the configured scaling policy. The STANDARD policy scales out aggressively to maintain optimal performance, while the ECONOMY policy scales more conservatively to reduce costs. Both policies use queue depth as the primary scaling metric, but with different thresholds and scaling speeds.
Each cluster in a multi-cluster warehouse operates independently, processing queries in parallel. Queries are routed to the cluster with the lowest queue depth, ensuring optimal load distribution. When a cluster is suspended, any running queries complete before the cluster shuts down, ensuring no query interruption.
Auto-Suspend and Auto-Resume
Auto-suspend automatically suspends a warehouse after a configurable period of inactivity (measured in seconds). This feature prevents unnecessary credit consumption when warehouses are not actively processing queries. The suspension timeout can be set from 60 seconds (1 minute) to 86400 seconds (24 hours), with 300 seconds (5 minutes) being a common default.
Auto-resume automatically resumes a suspended warehouse when new queries arrive. This feature ensures seamless query execution without manual intervention. When a warehouse resumes, it initializes compute resources, loads cached data, and begins processing queued queries. The resume process typically completes within 30-60 seconds, depending on warehouse size and cloud provider availability.
Resource Monitoring and Cost Control
Resource monitors provide account-level and warehouse-level credit consumption tracking and alerting. Monitors can be configured to send alerts at specific thresholds (warning, suspend, and suspend immediately) and can automatically suspend warehouses or accounts when credit limits are reached.
Warehouse-level resource monitors track individual warehouse consumption, enabling per-workload cost attribution. Account-level monitors provide aggregate consumption visibility and can enforce overall credit budgets. Both monitor types support email notifications and integration with external monitoring systems.
Performance Optimization Strategies
Warehouse performance depends on several factors including size, scaling policy, and configuration settings. Larger warehouses provide more compute resources per node, enabling faster query execution for complex analytics. Smaller warehouses are cost-effective for simple queries and development workloads.
Scaling policies control how aggressively warehouses add clusters in response to load. STANDARD policy prioritizes performance by scaling quickly, while ECONOMY policy prioritizes cost by allowing some queueing. The choice depends on workload characteristics and performance requirements.
Statement timeouts prevent long-running queries from consuming resources indefinitely. The STATEMENT_TIMEOUT parameter sets the maximum execution time, while STATEMENT_QUEUED_TIMEOUT sets the maximum wait time in the queue. Both parameters help maintain warehouse responsiveness and prevent resource starvation.
Key Concepts Table
| Parameter | Range | Default | Description |
|---|---|---|---|
| WAREHOUSE_SIZE | XS-4XL | M | Compute capacity per node |
| MIN_CLUSTER_COUNT | 1-10 | 1 | Minimum active clusters |
| MAX_CLUSTER_COUNT | 1-10 | 1 | Maximum auto-scaled clusters |
| SCALING_POLICY | STANDARD/ECONOMY | STANDARD | Scaling aggressiveness |
| AUTO_SUSPEND | 60-86400 | 600 | Inactivity timeout (seconds) |
| AUTO_RESUME | TRUE/FALSE | TRUE | Auto-resume on query |
| STATEMENT_TIMEOUT | 0-86400 | 0 | Max execution time (seconds) |
| Warehouse Size | vCPU | Memory | Max Concurrent | Cost/Hour |
|---|---|---|---|---|
| X-Small | 1 | 2 GB | 1 | ~$0.00036/s |
| Small | 2 | 4 GB | 2 | ~$0.00072/s |
| Medium | 4 | 8 GB | 4 | ~$0.00144/s |
| Large | 8 | 16 GB | 8 | ~$0.00288/s |
| X-Large | 16 | 32 GB | 16 | ~$0.00576/s |
| 2X-Large | 32 | 64 GB | 32 | ~$0.01152/s |
| 3X-Large | 64 | 128 GB | 64 | ~$0.02304/s |
| 4X-Large | 128 | 256 GB | 128 | ~$0.04608/s |
| Metric | Target | Warning | Critical |
|---|---|---|---|
| Queue Depth | < 5 | 5-10 | > 10 |
| CPU Utilization | 60-80% | 80-95% | > 95% |
| Auto-Suspend Time | 60-300s | 300-600s | > 600s |
| Credit Consumption | < 80% quota | 80-95% | > 95% |
Code Examples
-- Example 1: Create warehouse with comprehensive settings
CREATE WAREHOUSE analytics_wh
WAREHOUSE_SIZE = 'xlarge'
MIN_CLUSTER_COUNT = 2
MAX_CLUSTER_COUNT = 8
SCALING_POLICY = 'STANDARD'
AUTO_SUSPEND = 300
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = FALSE
STATEMENT_TIMEOUT_IN_SECONDS = 3600
STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600
RESOURCE_MONITOR = 'analytics_monitor'
COMMENT = 'Multi-cluster warehouse for analytics workloads';
-- Example 2: Create resource monitor
CREATE RESOURCE MONITOR account_monitor
WITH
CREDIT_QUOTA = 10000
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
TRIGGERS
ON 80% DO NOTIFY
ON 90% DO SUSPEND
ON 95% DO SUSPEND_IMMEDIATELY;
-- Example 3: Create warehouse-specific monitor
CREATE RESOURCE MONITOR etl_monitor
WITH
CREDIT_QUOTA = 5000
FREQUENCY = MONTHLY
TRIGGERS
ON 75% DO NOTIFY
ON 90% DO SUSPEND;
-- Assign monitor to warehouse
ALTER WAREHOUSE etl_wh SET RESOURCE_MONITOR = etl_monitor;
-- Example 4: Monitor warehouse performance
SELECT
warehouse_name,
warehouse_size,
cluster_number,
AVG(queries_completed) as avg_queries,
AVG(queued_overload_queries) as avg_queued,
AVG(execution_time_ms) / 1000 as avg_exec_seconds,
SUM(credits_used) as total_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2, 3
ORDER BY total_credits DESC;
-- Example 5: Analyze auto-scaling events
SELECT
warehouse_name,
event_name,
event_timestamp,
event_reason,
cluster_number
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_LOAD_HISTORY(
START_TIME => DATEADD(hour, -24, CURRENT_TIMESTAMP())
))
WHERE event_name IN ('RESUME', 'SUSPEND', 'ADD_CLUSTER', 'REMOVE_CLUSTER')
ORDER BY event_timestamp DESC;
-- Example 6: Monitor current warehouse status
SELECT
warehouse_name,
state,
num_queries,
num_queued_queries,
time_active,
time_suspended
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_LOAD_HISTORY(
START_TIME => DATEADD(hour, -1, CURRENT_TIMESTAMP())
))
ORDER BY time_active DESC;
-- Example 7: Dynamic warehouse scaling
-- Scale up warehouse for batch processing
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = '2xlarge';
-- Scale out warehouse for peak load
ALTER WAREHOUSE analytics_wh SET MAX_CLUSTER_COUNT = 10;
-- Scale in warehouse during off-hours
ALTER WAREHOUSE analytics_wh SET MAX_CLUSTER_COUNT = 2;
-- Example 8: Warehouse usage analysis
SELECT
warehouse_name,
DATE_TRUNC('day', start_time) as usage_date,
SUM(credits_used) as daily_credits,
SUM(credits_used) * 3 as daily_cost_usd, -- ~$3/credit
AVG(queries_completed) as avg_queries_per_hour
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -90, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY usage_date DESC, daily_credits DESC;
-- Example 9: Optimize warehouse for specific workload
-- Create warehouse optimized for large scans
CREATE WAREHOUSE scan_wh
WAREHOUSE_SIZE = '3xlarge'
AUTO_SUSPEND = 600
STATEMENT_TIMEOUT_IN_SECONDS = 7200
COMMENT = 'Optimized for large table scans';
-- Create warehouse for interactive queries
CREATE WAREHOUSE interactive_wh
WAREHOUSE_SIZE = 'medium'
AUTO_SUSPEND = 60
STATEMENT_TIMEOUT_IN_SECONDS = 300
STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 60
COMMENT = 'Optimized for interactive queries';
-- Example 10: Monitor query performance by warehouse
SELECT
warehouse_name,
AVG(execution_time_ms) / 1000 as avg_exec_seconds,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY execution_time_ms) / 1000 as median_exec_seconds,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY execution_time_ms) / 1000 as p95_exec_seconds,
COUNT(*) as total_queries,
SUM(CASE WHEN execution_time_ms > 60000 THEN 1 ELSE 0 END) as slow_queries
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND execution_status = 'SUCCESS'
GROUP BY 1
ORDER BY avg_exec_seconds DESC;
Performance Metrics
| Metric | Target | Warning | Critical | Description |
|---|---|---|---|---|
| Query Queue Depth | < 5 | 5-10 | > 10 | Queries waiting to execute |
| Cluster Scaling Time | < 30s | 30-60s | > 60s | Time to start new cluster |
| Auto-Suspend Delay | 60-300s | 300-600s | > 600s | Time before warehouse suspends |
| Auto-Resume Time | < 60s | 60-120s | > 120s | Time to resume suspended warehouse |
| Credit Utilization | 60-80% | 80-95% | > 95% | Active time vs total time |
Best Practices
-
Right-size warehouses: Start with Medium for ad-hoc queries, scale up for batch ETL. Monitor queue depth to identify under-provisioned warehouses.
-
Use multi-cluster warehouses: Set MIN_CLUSTER_COUNT=1 and MAX_CLUSTER_COUNT=3-5 for concurrent workloads. Use SCALING_POLICY=ECONOMY for cost optimization.
-
Implement warehouse isolation: Create separate warehouses for different workloads (ETL, reporting, ad-hoc) to prevent resource contention and enable independent scaling.
-
Configure appropriate timeouts: Set STATEMENT_QUEUED_TIMEOUT for queries that shouldn't run indefinitely and STATEMENT_TIMEOUT to kill long-running queries.
-
Leverage resource monitors: Set up alerts for credit consumption to prevent unexpected costs. Use MAX_CREDIT_QUOTA per warehouse for budget control.
-
Optimize auto-suspend: Set AUTO_SUSPEND=60 for development warehouses, AUTO_SUSPEND=300 for production, and AUTO_SUSPEND=60 for infrequently used warehouses.
-
Monitor scaling events: Track warehouse load history to identify scaling patterns and optimize MIN/MAX_CLUSTER_COUNT settings.
-
Use warehouse groups: Organize warehouses by workload type (analytics, ETL, development) to simplify management and cost attribution.
-
Implement cost allocation: Use resource monitors and tags to attribute warehouse costs to specific departments or projects.
-
Regular performance reviews: Analyze warehouse performance metrics weekly to identify optimization opportunities and right-sizing needs.
See Also
- PySpark Iceberg - Compute optimization patterns
- Delta Lake on Databricks - Delta Lake compute model
- Data Warehouse Concepts - Data warehouse design principles