Snowflake Advanced Β· Interview Prep
Virtual Warehouses & Auto-Suspend
Difficulty: Medium-Hard Β· Commonly asked at Netflix, Meta, Amazon
Interview Question
"How do you optimize Snowflake warehouse configurations for a mixed workload environment with both interactive BI queries and heavy ETL jobs? Walk me through auto-suspend strategies and cost implications."
βΉοΈ
Companies Asking This: Netflix (Senior Data Engineer), Meta (Data Platform Engineer), Amazon (L6 Data Engineer), Apple (Staff Data Engineer)
Virtual Warehouse Fundamentals
A virtual warehouse is an MPP compute cluster that Snowflake provisions on demand. Each warehouse is independent and executes queries against your data.
Warehouse Sizes & Compute Power
| Size | vCPUs (Approx) | Memory (GB) | Credits/Hour | Best For |
|---|---|---|---|---|
| X-Small | 1 | 2 | 1 | Dev/Test, light queries |
| Small | 2 | 4 | 2 | Interactive BI, small ETL |
| Medium | 4 | 8 | 4 | Standard workloads |
| Large | 8 | 16 | 8 | Heavy analytics, large joins |
| X-Large | 16 | 32 | 16 | Large ETL, data science |
| 2X-Large | 32 | 64 | 32 | Very large transformations |
| 3X-Large | 64 | 128 | 64 | Massively parallel processing |
| 4X-Large | 128 | 256 | 128 | Enterprise-scale workloads |
-- Create a warehouse with full configuration
CREATE WAREHOUSE analytics_wh
WAREHOUSE_SIZE = 'MEDIUM'
AUTO_SUSPEND = 120 -- Suspend after 120 seconds of inactivity
AUTO_RESUME = TRUE -- Automatically resume when queries arrive
INITIALLY_SUSPENDED = FALSE -- Start in running state
MIN_CLUSTER_COUNT = 1 -- Minimum clusters for multi-cluster
MAX_CLUSTER_COUNT = 4 -- Maximum clusters for auto-scaling
SCALING_POLICY = 'STANDARD'
STATEMENT_TIMEOUT_IN_SECONDS = 3600
STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600
MAX_CONCURRENCY_LEVEL = 8;
-- Alter warehouse settings
ALTER WAREHOUSE analytics_wh
SET WAREHOUSE_SIZE = 'LARGE'
AUTO_SUSPEND = 60
SCALING_POLICY = 'ECONOMY';
-- Suspend a warehouse manually
ALTER WAREHOUSE analytics_wh SUSPEND;
-- Resume a warehouse
ALTER WAREHOUSE analytics_wh RESUME;
Auto-Suspend Strategies
Auto-suspend is one of the most important cost optimization levers in Snowflake. It automatically suspends a warehouse after a configurable period of inactivity.
Strategy Matrix
-- Strategy 1: Interactive BI workloads (short auto-suspend)
CREATE WAREHOUSE bi_dashboard
WAREHOUSE_SIZE = 'MEDIUM'
AUTO_SUSPEND = 60 -- 1 minute - rapid response for users
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 2
MAX_CLUSTER_COUNT = 8;
-- Strategy 2: ETL batch processing (longer auto-suspend)
CREATE WAREHOUSE etl_batch
WAREHOUSE_SIZE = 'LARGE'
AUTO_SUSPEND = 600 -- 10 minutes - ETL jobs run longer
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 2;
-- Strategy 3: Data Science workloads (aggressive auto-suspend)
CREATE WAREHOUSE data_science_dev
WAREHOUSE_SIZE = 'X-LARGE'
AUTO_SUSPEND = 60 -- 1 minute - expensive, suspend fast
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 1;
-- Strategy 4: Always-on for critical dashboards
CREATE WAREHOUSE critical_dashboard
WAREHOUSE_SIZE = 'MEDIUM'
AUTO_SUSPEND = 3600 -- 1 hour - keep warm for SLA
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 2
MAX_CLUSTER_COUNT = 6;
βΉοΈ
Cost Impact: A Large warehouse costs 8 credits/hour. If it sits idle for 10 hours/day with AUTO_SUSPEND = 0 (never suspend), that's 80 wasted credits/day = ~2,400 credits/month = ~10/credit. Setting AUTO_SUSPEND = 300 can save 70-80% of idle costs.
Real-World Scenario: Netflix
Question: "Netflix has thousands of analysts running dashboards during business hours, but heavy ETL jobs run overnight. How do you architect warehouse isolation?"
Solution: Workload Isolation Pattern
-- 1. Create separate warehouses for each workload type
-- BI/Interactive warehouse
CREATE WAREHOUSE bi_interactive
WAREHOUSE_SIZE = 'MEDIUM'
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 2
MAX_CLUSTER_COUNT = 10
SCALING_POLICY = 'STANDARD'
STATEMENT_TIMEOUT_IN_SECONDS = 900;
-- ETL warehouse (scheduled overnight)
CREATE WAREHOUSE etl_overnight
WAREHOUSE_SIZE = 'X-LARGE'
AUTO_SUSPEND = 300
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 3
SCALING_POLICY = 'ECONOMY'
STATEMENT_TIMEOUT_IN_SECONDS = 7200;
-- 2. Create a resource monitor per warehouse
CREATE RESOURCE MONITOR bi_budget
WITH CREDIT_QUOTA = 1000
TRIGGERS
ON 80% DO NOTIFY
ON 100% DO SUSPEND
ON 110% DO SUSPEND_IMMEDIATE;
CREATE RESOURCE MONITOR etl_budget
WITH CREDIT_QUOTA = 2000
TRIGGERS
ON 80% DO NOTIFY
ON 100% DO SUSPEND
ON 110% DO SUSPEND_IMMEDIATE;
-- 3. Assign warehouses to roles for access control
GRANT USAGE ON WAREHOUSE bi_interactive TO ROLE analyst_role;
GRANT USAGE ON WAREHOUSE etl_overnight TO ROLE etl_role;
Dynamic Scaling in Action
-- Simulate burst workload to trigger auto-scaling
-- Query queue depth triggers multi-cluster scaling
WITH burst_data AS (
SELECT
seq,
RANDOM() AS val
FROM TABLE(GENERATOR(ROWCOUNT => 10000000))
)
SELECT
b1.seq,
b1.val,
AVG(b2.val) OVER (ORDER BY b1.seq ROWS BETWEEN 1000 PRECEDING AND CURRENT ROW) AS moving_avg
FROM burst_data b1
CROSS JOIN burst_data b2
LIMIT 1000000;
-- Monitor scaling during this workload
SELECT
warehouse_name,
cluster_number,
queries_queued,
queries_executing,
credits_used,
start_time
FROM snowflake.account_usage.warehouse_load_history
WHERE warehouse_name = 'bi_interactive'
AND start_time >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;
Real-World Scenario: Meta
Question: "How do you handle a scenario where a single expensive query from one analyst blocks all other analysts from running queries?"
Solution: Query Queue Management & Isolation
-- 1. Use statement timeout to kill long-running queries
ALTER WAREHOUSE bi_interactive
SET STATEMENT_TIMEOUT_IN_SECONDS = 300; -- 5 minute max
-- 2. Use statement queued timeout to fail queued queries quickly
ALTER WAREHOUSE bi_interactive
SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 60; -- 1 minute max queue
-- 3. Monitor query queue depth
SELECT
warehouse_name,
query_id,
query_text,
user_name,
start_time,
queue_time_ms,
execution_time_ms,
total_elapsed_time_ms,
CASE
WHEN queue_time_ms > 60000 THEN 'HIGH LATENCY'
WHEN queue_time_ms > 10000 THEN 'MODERATE LATENCY'
ELSE 'NORMAL'
END AS queue_status
FROM snowflake.account_usage.query_history
WHERE warehouse_name = 'bi_interactive'
AND start_time >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY queue_time_ms DESC;
-- 4. Create a dedicated warehouse for power users
CREATE WAREHOUSE power_user_wh
WAREHOUSE_SIZE = 'LARGE'
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 2
STATEMENT_TIMEOUT_IN_SECONDS = 600
STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 30;
-- 5. Route expensive queries to a separate warehouse
-- In your application layer, route based on query complexity
ALTER SESSION SET QUERY_TAG = 'power_user';
-- 6. Use query tags to categorize and route
ALTER SESSION SET QUERY_TAG = '{"workload": "exploratory", "priority": "low"}';
SELECT * FROM large_table WHERE complex_condition;
β οΈ
Critical Anti-Pattern: Never use a single warehouse for all workloads. A 4X-Large warehouse running BI queries wastes money (queries only use 2-4 cores), while a Small warehouse running ETL takes hours instead of minutes.
Warehouse Monitoring & Optimization
Credit Consumption Analysis
-- Daily credit consumption by warehouse
SELECT
DATE(start_time) AS usage_date,
warehouse_name,
warehouse_size,
SUM(credits_used) AS total_credits,
SUM(credits_used) * 10 AS estimated_cost_usd,
AVG(credits_used_per_second) AS avg_credits_per_second
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2, 3
ORDER BY 4 DESC;
-- Identify underutilized warehouses
SELECT
w.warehouse_name,
w.warehouse_size,
COALESCE(q.queries_last_7d, 0) AS queries_last_7d,
COALESCE(c.credits_last_7d, 0) AS credits_last_7d,
CASE
WHEN COALESCE(q.queries_last_7d, 0) = 0 THEN 'UNUSED - CONSIDER DROPPING'
WHEN COALESCE(c.credits_last_7d, 0) / NULLIF(q.queries_last_7d, 0) > 0.5 THEN 'HIGH COST PER QUERY'
ELSE 'HEALTHY'
END AS recommendation
FROM information_schema.warehouses w
LEFT JOIN (
SELECT warehouse_name, COUNT(*) AS queries_last_7d
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
) q ON w.warehouse_name = q.warehouse_name
LEFT JOIN (
SELECT warehouse_name, SUM(credits_used) AS credits_last_7d
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
) c ON w.warehouse_name = c.warehouse_name
ORDER BY c.credits_last_7d DESC NULLS LAST;
-- Analyze auto-suspend effectiveness
SELECT
warehouse_name,
warehouse_size,
DATEDIFF('minute', MIN(start_time), MAX(end_time)) AS active_minutes,
SUM(credits_used) AS total_credits,
COUNT(DISTINCT DATE(start_time)) AS days_active,
ROUND(SUM(credits_used) / COUNT(DISTINCT DATE(start_time)), 2) AS credits_per_day
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 4 DESC;
Edge Cases & Advanced Patterns
Pattern: Query-Based Warehouse Routing
-- Create a function to suggest warehouse based on query complexity
CREATE OR REPLACE PROCEDURE route_query(query_text VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
IF (LENGTH(query_text) > 5000 OR LOWER(query_text) LIKE '%cross join%') THEN
RETURN 'POWER_USER_WH';
ELSEIF (LOWER(query_text) LIKE '%insert%' OR LOWER(query_text) LIKE '%merge%') THEN
RETURN 'ETL_WH';
ELSE
RETURN 'BI_INTERACTIVE';
END IF;
END;
$$;
-- Use query tags for routing decisions
ALTER SESSION SET QUERY_TAG = 'auto_route';
Pattern: Warehouse Resume Optimization
-- Pre-warm warehouses for predictable workloads
-- Schedule a lightweight query to keep warehouse warm
CREATE OR REPLACE TASK warm_up_warehouse
WAREHOUSE = admin_wh
SCHEDULE = 'USING CRON 0 8 * * 1-5 America/New_York' -- 8 AM weekdays
AS
SELECT 1; -- Lightweight keep-alive query
-- Check task status
SELECT
task_name,
state,
schedule,
last_committed_on
FROM information_schema.task_history
WHERE task_name = 'WARM_UP_WAREHOUSE';
Best Practices Summary
| Scenario | Warehouse Config | Auto-Suspend | Scaling |
|---|---|---|---|
| Interactive BI | MEDIUM, SCALING=STANDARD | 60s | MIN=2, MAX=8 |
| Batch ETL | LARGE, SCALING=ECONOMY | 300-600s | MIN=1, MAX=2 |
| Data Science | X-LARGE | 60s | MIN=1, MAX=1 |
| Always-on dashboards | MEDIUM | 3600s | MIN=2, MAX=6 |
| Dev/Test | X-SMALL | 60s | MIN=1, MAX=1 |
β οΈ
Cost Traps to Avoid:
- AUTO_SUSPEND = 0 β Never suspend, burns credits 24/7
- Over-sized warehouses β A 4X-Large for 100-row queries wastes 127x compute
- No resource monitors β Unchecked credit consumption
- Ignoring query tags β Can't route or categorize workloads