CW

Snowflake Multi-Cluster Warehouses

Free Lesson

Advertisement

Snowflake Multi-Cluster Warehouses

Multi-cluster warehouses in Snowflake automatically scale compute resources based on workload demands, ensuring consistent performance while optimizing costs.

Multi-Cluster Configuration

Creating Multi-Cluster Warehouses

-- Create multi-cluster warehouse
CREATE WAREHOUSE analytics_mcw
  WAREHOUSE_SIZE = 'LARGE'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;

-- Configure for specific workload
CREATE WAREHOUSE reporting_mcw
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 2
  MAX_CLUSTER_COUNT = 8
  SCALING_POLICY = 'ECONOMY'
  AUTO_SUSPEND = 300;

Scaling Policies

PolicyBehaviorUse Case
STANDARDScale up quickly, scale down slowlyInteractive analytics
ECONOMYScale up slowly, scale down quicklyCost-sensitive workloads
-- Set scaling policy
ALTER WAREHOUSE analytics_mcw SET SCALING_POLICY = 'STANDARD';

-- Monitor scaling behavior
SELECT
  warehouse_name,
  cluster_number,
  start_time,
  end_time,
  credits_used,
  TIMESTAMPDIFF('minute', start_time, end_time) as duration_minutes
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_METERING_HISTORY(
  START_TIME => DATEADD('day', -1, CURRENT_TIMESTAMP())
))
WHERE warehouse_name = 'ANALYTICS_MCW'
ORDER BY start_time;

Query Routing

Automatic Query Distribution

-- Queries are automatically routed to available clusters
-- No additional configuration needed

-- Monitor query routing
SELECT
  query_id,
  warehouse_name,
  cluster_number,
  start_time,
  end_time,
  execution_time_ms
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(
  START_TIME => DATEADD('hour', -1, CURRENT_TIMESTAMP())
))
WHERE warehouse_name = 'ANALYTICS_MCW'
ORDER BY start_time;

Query Queue Management

-- Monitor query queue
SELECT
  query_id,
  query_text,
  warehouse_name,
  queue_start_time,
  queue_end_time,
  TIMESTAMPDIFF('second', queue_start_time, queue_end_time) as queue_time_seconds
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(
  START_TIME => DATEADD('hour', -1, CURRENT_TIMESTAMP())
))
WHERE queue_start_time IS NOT NULL
ORDER BY queue_time_seconds DESC;

-- Configure queue settings
ALTER WAREHOUSE analytics_mcw SET
  STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 3600,
  STATEMENT_TIMEOUT_IN_SECONDS = 86400;

Workload Management

Resource Monitors

-- Create resource monitor
CREATE RESOURCE MONITOR mcw_monitor
  WITH
    CREDIT_QUOTA = 500,
    TRIGGERS
      ON 50% DO NOTIFY
      ON 75% DO NOTIFY
      ON 100% DO SUSPEND;

-- Apply to warehouse
ALTER WAREHOUSE analytics_mcw SET RESOURCE_MONITOR = 'MCW_MONITOR';

-- Monitor resource usage
SELECT
  monitor_name,
  credit_quota,
  credits_used,
  ROUND(credits_used / credit_quota * 100, 2) as usage_pct
FROM INFORMATION_SCHEMA.RESOURCE_MONITORS
WHERE monitor_name = 'MCW_MONITOR';

Warehouse Assignment

-- Create role-based warehouse assignment
CREATE ROLE analyst_role;
CREATE ROLE data_scientist_role;

-- Assign warehouses to roles
GRANT USAGE ON WAREHOUSE analytics_mcw TO ROLE analyst_role;
GRANT USAGE ON WAREHOUSE ml_mcw TO ROLE data_scientist_role;

-- Set default warehouses
ALTER ROLE analyst_role SET DEFAULT_WAREHOUSE = 'ANALYTICS_MCW';
ALTER ROLE data_scientist_role SET DEFAULT_WAREHOUSE = 'ML_MCW';

Performance Monitoring

-- Monitor cluster utilization
SELECT
  warehouse_name,
  cluster_number,
  AVG(query_count) as avg_queries,
  AVG(credits_used) as avg_credits,
  AVG(utilization_percentage) as avg_utilization
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_METERING_HISTORY(
  START_TIME => DATEADD('day', -7, CURRENT_TIMESTAMP())
))
WHERE warehouse_name = 'ANALYTICS_MCW'
GROUP BY warehouse_name, cluster_number
ORDER BY cluster_number;

-- Monitor scaling events
SELECT
  warehouse_name,
  event_type,
  event_time,
  cluster_count,
  reason
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_EVENTS_HISTORY(
  START_TIME => DATEADD('day', -1, CURRENT_TIMESTAMP())
))
WHERE warehouse_name = 'ANALYTICS_MCW'
ORDER BY event_time;

Performance Metrics

-- Calculate performance metrics
SELECT
  DATE_TRUNC('hour', start_time) as hour,
  warehouse_name,
  COUNT(*) as query_count,
  AVG(execution_time_ms) as avg_execution_time,
  AVG(queue_time_ms) as avg_queue_time,
  SUM(credits_used) as total_credits
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(
  START_TIME => DATEADD('day', -1, CURRENT_TIMESTAMP())
))
WHERE warehouse_name = 'ANALYTICS_MCW'
GROUP BY 1, 2
ORDER BY hour DESC;

Monitor cluster utilization to ensure optimal scaling. If clusters are consistently under-utilized, consider reducing MAX_CLUSTER_COUNT. If queries are frequently queued, increase MAX_CLUSTER_COUNT or cluster size.

Cost Optimization

Right-Sizing Clusters

-- Analyze cluster usage patterns
SELECT
  cluster_number,
  COUNT(*) as usage_count,
  AVG(credits_used) as avg_credits,
  MAX(credits_used) as max_credits
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_METERING_HISTORY(
  START_TIME => DATEADD('day', -30, CURRENT_TIMESTAMP())
))
WHERE warehouse_name = 'ANALYTICS_MCW'
GROUP BY cluster_number
ORDER BY cluster_number;

-- Adjust based on usage
ALTER WAREHOUSE analytics_mcw SET
  MAX_CLUSTER_COUNT = 4;  -- Reduce if clusters 3-4 rarely used

Cost Analysis

-- Analyze cost per cluster
SELECT
  cluster_number,
  SUM(credits_used) as total_credits,
  SUM(credits_used * 10) as total_cost_usd,
  COUNT(*) as query_count,
  SUM(credits_used * 10) / COUNT(*) as cost_per_query
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_METERING_HISTORY(
  START_TIME => DATEADD('day', -30, CURRENT_TIMESTAMP())
))
WHERE warehouse_name = 'ANALYTICS_MCW'
GROUP BY cluster_number
ORDER BY cluster_number;

Best Practices

PracticeImplementationBenefit
Set appropriate MIN/MAXBased on workload patternsBalanced cost/performance
Use scaling policiesSTANDARD for interactive, ECONOMY for batchOptimal resource allocation
Monitor utilizationTrack cluster usage metricsRight-sizing decisions
Implement resource monitorsSet credit quotas and alertsCost control
Assign warehouses by roleMatch workload to warehouseWorkload isolation
-- Recommended configuration for different workloads

-- Interactive analytics (Standard policy)
CREATE WAREHOUSE interactive_mcw
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 60;

-- Batch processing (Economy policy)
CREATE WAREHOUSE batch_mcw
  WAREHOUSE_SIZE = 'LARGE'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 8
  SCALING_POLICY = 'ECONOMY'
  AUTO_SUSPEND = 300;

-- ML workloads (High memory)
CREATE WAREHOUSE ml_mcw
  WAREHOUSE_SIZE = 'X-LARGE'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 2
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 120;

Key Takeaways:

  • Multi-cluster warehouses auto-scale based on workload
  • Scaling policies (STANDARD vs ECONOMY) control scaling behavior
  • Query routing distributes work across available clusters
  • Resource monitors provide cost control and alerts
  • Regular monitoring enables right-sizing decisions
  • Different workloads benefit from different configurations

Advertisement

Need Expert Snowflake Help?

Get personalized warehouse optimization, data modeling, or Snowflake platform consulting.

Advertisement