Snowflake Advanced · Interview Prep

Virtual Warehouses & Auto-Suspend

Difficulty: Medium-Hard · Commonly asked at Netflix, Meta, Amazon

Interview Question

"How do you optimize Snowflake warehouse configurations for a mixed workload environment with both interactive BI queries and heavy ETL jobs? Walk me through auto-suspend strategies and cost implications."

ℹ️

Companies Asking This: Netflix (Senior Data Engineer), Meta (Data Platform Engineer), Amazon (L6 Data Engineer), Apple (Staff Data Engineer)

Virtual Warehouse Fundamentals

A virtual warehouse is an MPP compute cluster that Snowflake provisions on demand. Each warehouse is independent and executes queries against your data.

Warehouse Sizes & Compute Power

Size	vCPUs (Approx)	Memory (GB)	Credits/Hour	Best For
X-Small	1	2	1	Dev/Test, light queries
Small	2	4	2	Interactive BI, small ETL
Medium	4	8	4	Standard workloads
Large	8	16	8	Heavy analytics, large joins
X-Large	16	32	16	Large ETL, data science
2X-Large	32	64	32	Very large transformations
3X-Large	64	128	64	Massively parallel processing
4X-Large	128	256	128	Enterprise-scale workloads

-- Create a warehouse with full configuration
CREATE WAREHOUSE analytics_wh
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 120          -- Suspend after 120 seconds of inactivity
    AUTO_RESUME = TRUE          -- Automatically resume when queries arrive
    INITIALLY_SUSPENDED = FALSE -- Start in running state
    MIN_CLUSTER_COUNT = 1       -- Minimum clusters for multi-cluster
    MAX_CLUSTER_COUNT = 4       -- Maximum clusters for auto-scaling
    SCALING_POLICY = 'STANDARD'
    STATEMENT_TIMEOUT_IN_SECONDS = 3600
    STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600
    MAX_CONCURRENCY_LEVEL = 8;

-- Alter warehouse settings
ALTER WAREHOUSE analytics_wh
    SET WAREHOUSE_SIZE = 'LARGE'
        AUTO_SUSPEND = 60
        SCALING_POLICY = 'ECONOMY';

-- Suspend a warehouse manually
ALTER WAREHOUSE analytics_wh SUSPEND;

-- Resume a warehouse
ALTER WAREHOUSE analytics_wh RESUME;

Auto-Suspend Strategies

Auto-suspend is one of the most important cost optimization levers in Snowflake. It automatically suspends a warehouse after a configurable period of inactivity.

Strategy Matrix

-- Strategy 1: Interactive BI workloads (short auto-suspend)
CREATE WAREHOUSE bi_dashboard
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 60              -- 1 minute - rapid response for users
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 2
    MAX_CLUSTER_COUNT = 8;

-- Strategy 2: ETL batch processing (longer auto-suspend)
CREATE WAREHOUSE etl_batch
    WAREHOUSE_SIZE = 'LARGE'
    AUTO_SUSPEND = 600             -- 10 minutes - ETL jobs run longer
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 2;

-- Strategy 3: Data Science workloads (aggressive auto-suspend)
CREATE WAREHOUSE data_science_dev
    WAREHOUSE_SIZE = 'X-LARGE'
    AUTO_SUSPEND = 60              -- 1 minute - expensive, suspend fast
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 1;

-- Strategy 4: Always-on for critical dashboards
CREATE WAREHOUSE critical_dashboard
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 3600            -- 1 hour - keep warm for SLA
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 2
    MAX_CLUSTER_COUNT = 6;

ℹ️

Cost Impact: A Large warehouse costs 8 credits/hour. If it sits idle for 10 hours/day with AUTO_SUSPEND = 0 (never suspend), that's 80 wasted credits/day = ~2,400 credits/month = ~ $24,000 at$ 10/credit. Setting AUTO_SUSPEND = 300 can save 70-80% of idle costs.

Real-World Scenario: Netflix

Question: "Netflix has thousands of analysts running dashboards during business hours, but heavy ETL jobs run overnight. How do you architect warehouse isolation?"

Solution: Workload Isolation Pattern

-- 1. Create separate warehouses for each workload type
-- BI/Interactive warehouse
CREATE WAREHOUSE bi_interactive
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 2
    MAX_CLUSTER_COUNT = 10
    SCALING_POLICY = 'STANDARD'
    STATEMENT_TIMEOUT_IN_SECONDS = 900;

-- ETL warehouse (scheduled overnight)
CREATE WAREHOUSE etl_overnight
    WAREHOUSE_SIZE = 'X-LARGE'
    AUTO_SUSPEND = 300
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 3
    SCALING_POLICY = 'ECONOMY'
    STATEMENT_TIMEOUT_IN_SECONDS = 7200;

-- 2. Create a resource monitor per warehouse
CREATE RESOURCE MONITOR bi_budget
    WITH CREDIT_QUOTA = 1000
    TRIGGERS 
        ON 80% DO NOTIFY
        ON 100% DO SUSPEND
        ON 110% DO SUSPEND_IMMEDIATE;

CREATE RESOURCE MONITOR etl_budget
    WITH CREDIT_QUOTA = 2000
    TRIGGERS 
        ON 80% DO NOTIFY
        ON 100% DO SUSPEND
        ON 110% DO SUSPEND_IMMEDIATE;

-- 3. Assign warehouses to roles for access control
GRANT USAGE ON WAREHOUSE bi_interactive TO ROLE analyst_role;
GRANT USAGE ON WAREHOUSE etl_overnight TO ROLE etl_role;

Dynamic Scaling in Action

-- Simulate burst workload to trigger auto-scaling
-- Query queue depth triggers multi-cluster scaling
WITH burst_data AS (
    SELECT 
        seq,
        RANDOM() AS val
    FROM TABLE(GENERATOR(ROWCOUNT => 10000000))
)
SELECT 
    b1.seq,
    b1.val,
    AVG(b2.val) OVER (ORDER BY b1.seq ROWS BETWEEN 1000 PRECEDING AND CURRENT ROW) AS moving_avg
FROM burst_data b1
CROSS JOIN burst_data b2
LIMIT 1000000;

-- Monitor scaling during this workload
SELECT 
    warehouse_name,
    cluster_number,
    queries_queued,
    queries_executing,
    credits_used,
    start_time
FROM snowflake.account_usage.warehouse_load_history
WHERE warehouse_name = 'bi_interactive'
  AND start_time >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;

Real-World Scenario: Meta

Question: "How do you handle a scenario where a single expensive query from one analyst blocks all other analysts from running queries?"

Solution: Query Queue Management & Isolation

-- 1. Use statement timeout to kill long-running queries
ALTER WAREHOUSE bi_interactive 
    SET STATEMENT_TIMEOUT_IN_SECONDS = 300;  -- 5 minute max

-- 2. Use statement queued timeout to fail queued queries quickly
ALTER WAREHOUSE bi_interactive 
    SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 60;  -- 1 minute max queue

-- 3. Monitor query queue depth
SELECT 
    warehouse_name,
    query_id,
    query_text,
    user_name,
    start_time,
    queue_time_ms,
    execution_time_ms,
    total_elapsed_time_ms,
    CASE 
        WHEN queue_time_ms > 60000 THEN 'HIGH LATENCY'
        WHEN queue_time_ms > 10000 THEN 'MODERATE LATENCY'
        ELSE 'NORMAL'
    END AS queue_status
FROM snowflake.account_usage.query_history
WHERE warehouse_name = 'bi_interactive'
  AND start_time >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY queue_time_ms DESC;

-- 4. Create a dedicated warehouse for power users
CREATE WAREHOUSE power_user_wh
    WAREHOUSE_SIZE = 'LARGE'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 2
    STATEMENT_TIMEOUT_IN_SECONDS = 600
    STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 30;

-- 5. Route expensive queries to a separate warehouse
-- In your application layer, route based on query complexity
ALTER SESSION SET QUERY_TAG = 'power_user';

-- 6. Use query tags to categorize and route
ALTER SESSION SET QUERY_TAG = '{"workload": "exploratory", "priority": "low"}';

SELECT * FROM large_table WHERE complex_condition;

⚠️

Critical Anti-Pattern: Never use a single warehouse for all workloads. A 4X-Large warehouse running BI queries wastes money (queries only use 2-4 cores), while a Small warehouse running ETL takes hours instead of minutes.

Warehouse Monitoring & Optimization

Credit Consumption Analysis

-- Daily credit consumption by warehouse
SELECT 
    DATE(start_time) AS usage_date,
    warehouse_name,
    warehouse_size,
    SUM(credits_used) AS total_credits,
    SUM(credits_used) * 10 AS estimated_cost_usd,
    AVG(credits_used_per_second) AS avg_credits_per_second
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2, 3
ORDER BY 4 DESC;

-- Identify underutilized warehouses
SELECT 
    w.warehouse_name,
    w.warehouse_size,
    COALESCE(q.queries_last_7d, 0) AS queries_last_7d,
    COALESCE(c.credits_last_7d, 0) AS credits_last_7d,
    CASE 
        WHEN COALESCE(q.queries_last_7d, 0) = 0 THEN 'UNUSED - CONSIDER DROPPING'
        WHEN COALESCE(c.credits_last_7d, 0) / NULLIF(q.queries_last_7d, 0) > 0.5 THEN 'HIGH COST PER QUERY'
        ELSE 'HEALTHY'
    END AS recommendation
FROM information_schema.warehouses w
LEFT JOIN (
    SELECT warehouse_name, COUNT(*) AS queries_last_7d
    FROM snowflake.account_usage.query_history
    WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
) q ON w.warehouse_name = q.warehouse_name
LEFT JOIN (
    SELECT warehouse_name, SUM(credits_used) AS credits_last_7d
    FROM snowflake.account_usage.warehouse_metering_history
    WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
) c ON w.warehouse_name = c.warehouse_name
ORDER BY c.credits_last_7d DESC NULLS LAST;

-- Analyze auto-suspend effectiveness
SELECT 
    warehouse_name,
    warehouse_size,
    DATEDIFF('minute', MIN(start_time), MAX(end_time)) AS active_minutes,
    SUM(credits_used) AS total_credits,
    COUNT(DISTINCT DATE(start_time)) AS days_active,
    ROUND(SUM(credits_used) / COUNT(DISTINCT DATE(start_time)), 2) AS credits_per_day
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 4 DESC;

Edge Cases & Advanced Patterns

Pattern: Query-Based Warehouse Routing

-- Create a function to suggest warehouse based on query complexity
CREATE OR REPLACE PROCEDURE route_query(query_text VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
    IF (LENGTH(query_text) > 5000 OR LOWER(query_text) LIKE '%cross join%') THEN
        RETURN 'POWER_USER_WH';
    ELSEIF (LOWER(query_text) LIKE '%insert%' OR LOWER(query_text) LIKE '%merge%') THEN
        RETURN 'ETL_WH';
    ELSE
        RETURN 'BI_INTERACTIVE';
    END IF;
END;
$$;

-- Use query tags for routing decisions
ALTER SESSION SET QUERY_TAG = 'auto_route';

Pattern: Warehouse Resume Optimization

-- Pre-warm warehouses for predictable workloads
-- Schedule a lightweight query to keep warehouse warm
CREATE OR REPLACE TASK warm_up_warehouse
    WAREHOUSE = admin_wh
    SCHEDULE = 'USING CRON 0 8 * * 1-5 America/New_York'  -- 8 AM weekdays
AS
    SELECT 1;  -- Lightweight keep-alive query

-- Check task status
SELECT 
    task_name,
    state,
    schedule,
    last_committed_on
FROM information_schema.task_history
WHERE task_name = 'WARM_UP_WAREHOUSE';

Best Practices Summary

Scenario	Warehouse Config	Auto-Suspend	Scaling
Interactive BI	MEDIUM, SCALING=STANDARD	60s	MIN=2, MAX=8
Batch ETL	LARGE, SCALING=ECONOMY	300-600s	MIN=1, MAX=2
Data Science	X-LARGE	60s	MIN=1, MAX=1
Always-on dashboards	MEDIUM	3600s	MIN=2, MAX=6
Dev/Test	X-SMALL	60s	MIN=1, MAX=1

⚠️

Cost Traps to Avoid:

AUTO_SUSPEND = 0 — Never suspend, burns credits 24/7
Over-sized warehouses — A 4X-Large for 100-row queries wastes 127x compute
No resource monitors — Unchecked credit consumption
Ignoring query tags — Can't route or categorize workloads

Topic: Virtual Warehouses & Auto-Suspend Deep Dive