πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Topic: Pricing Model & Cost Optimization

Snowflake AdvancedPricing⭐ Premium

Advertisement

Snowflake Advanced Β· Interview Prep

Pricing Model & Cost Optimization

Difficulty: Medium Β· Commonly asked at Google, Meta, Amazon

Interview Question

"How do you calculate the total cost of ownership for a Snowflake deployment? Walk me through the pricing model and identify the top 5 cost optimization strategies."

ℹ️

Companies Asking This: Google (Senior Cloud Architect), Meta (Data Platform Engineer), Amazon (L6 Data Engineer), Microsoft (Principal Data Architect)


Snowflake Pricing Model

Snowflake pricing has two components: compute (credits) and storage.

Compute Costs (Credits)

Warehouse SizeCredits/HourCredits/SecondTypical Use Case
X-Small10.000278Dev/Test
Small20.000556Interactive BI
Medium40.001111Standard workloads
Large80.002222Heavy analytics
X-Large160.004444ETL/Data Science
2X-Large320.008889Large ETL
3X-Large640.017778Enterprise-scale
4X-Large1280.035556Massively parallel

Storage Costs

-- Check current storage usage
SELECT 
    table_name,
    row_count,
    bytes / (1024*1024*1024) AS storage_gb,
    retention_time,
    CASE 
        WHEN retention_time > 30 THEN 'HIGH RETENTION - COST IMPACT'
        WHEN retention_time > 7 THEN 'MODERATE RETENTION'
        ELSE 'LOW RETENTION'
    END AS retention_impact
FROM information_schema.tables
WHERE table_schema = 'PUBLIC'
ORDER BY bytes DESC;

-- Calculate total storage cost
SELECT 
    SUM(bytes) / (1024*1024*1024*1024) AS total_storage_tb,
    SUM(bytes) / (1024*1024*1024*1024) * 23 AS estimated_monthly_cost_usd  -- $23/TB/month
FROM information_schema.tables;

-- Breakdown by database
SELECT 
    table_database,
    SUM(bytes) / (1024*1024*1024) AS storage_gb,
    SUM(bytes) / (1024*1024*1024) * 23 / 30 AS daily_cost_usd
FROM information_schema.tables
GROUP BY 1
ORDER BY 2 DESC;

Cost Optimization Strategies

Strategy 1: Auto-Suspend Optimization

-- Identify warehouses with poor auto-suspend settings
SELECT 
    warehouse_name,
    warehouse_size,
    auto_suspend,
    CASE 
        WHEN auto_suspend = 0 THEN 'CRITICAL: Never suspends'
        WHEN auto_suspend < 60 THEN 'WARNING: Very short'
        WHEN auto_suspend <= 300 THEN 'GOOD'
        ELSE 'CONSIDER REDUCING'
    END AS recommendation
FROM information_schema.warehouses;

-- Calculate idle costs
SELECT 
    w.warehouse_name,
    w.warehouse_size,
    w.auto_suspend,
    m.credits_used,
    m.start_time,
    m.end_time,
    DATEDIFF('minute', m.start_time, m.end_time) AS active_minutes,
    CASE 
        WHEN w.auto_suspend = 0 THEN 
            (24 * 60 - DATEDIFF('minute', m.start_time, m.end_time)) * 
            (CASE w.warehouse_size 
                WHEN 'X-Small' THEN 1/60 
                WHEN 'Small' THEN 2/60 
                WHEN 'Medium' THEN 4/60 
                WHEN 'Large' THEN 8/60 
                WHEN 'X-Large' THEN 16/60 
            END)
        ELSE 0
    END AS wasted_credits
FROM information_schema.warehouses w
JOIN snowflake.account_usage.warehouse_metering_history m
    ON w.warehouse_name = m.warehouse_name
WHERE m.start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP());

Strategy 2: Right-Sizing Warehouses

-- Identify over-provisioned warehouses
SELECT 
    warehouse_name,
    warehouse_size,
    AVG(queries_executed) AS avg_queries_per_hour,
    AVG(credits_used) AS avg_credits_per_hour,
    CASE 
        WHEN warehouse_size IN ('X-Large', '2X-Large', '3X-Large', '4X-Large') 
             AND AVG(queries_executed) < 100 THEN 'DOWNSIZE'
        WHEN warehouse_size IN ('Large', 'X-Large') 
             AND AVG(queries_executed) > 1000 THEN 'UPSIZE'
        ELSE 'APPROPRIATE'
    END AS sizing_recommendation
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2;

-- Analyze query complexity vs warehouse size
SELECT 
    warehouse_name,
    warehouse_size,
    AVG(total_elapsed_time_ms) / 1000 AS avg_query_seconds,
    AVG(bytes_scanned / (1024*1024*1024)) AS avg_gb_scanned,
    CASE 
        WHEN AVG(total_elapsed_time_ms) < 1000 AND warehouse_size IN ('Large', 'X-Large') 
        THEN 'OVER-PROVISIONED'
        WHEN AVG(total_elapsed_time_ms) > 10000 AND warehouse_size IN ('Small', 'Medium') 
        THEN 'UNDER-PROVISIONED'
        ELSE 'APPROPRIATE'
    END AS analysis
FROM snowflake.account_usage.query_history
WHERE query_type = 'SELECT'
  AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2;

Strategy 3: Resource Monitors

-- Create resource monitors for cost control
CREATE RESOURCE MONITOR monthly_budget
    WITH CREDIT_QUOTA = 5000
    TRIGGERS 
        ON 50% DO NOTIFY
        ON 80% DO NOTIFY
        ON 100% DO SUSPEND
        ON 110% DO SUSPEND_IMMEDIATE;

-- Create per-warehouse monitors
CREATE RESOURCE MONITOR etl_monitor
    WITH CREDIT_QUOTA = 2000
    TRIGGERS 
        ON 80% DO NOTIFY
        ON 100% DO SUSPEND;

-- Assign monitors to warehouses
ALTER WAREHOUSE etl_wh SET RESOURCE_MONITOR = etl_monitor;
ALTER WAREHOUSE bi_wh SET RESOURCE_MONITOR = monthly_budget;

-- Check monitor status
SELECT 
    name,
    credit_quota,
    credits_used,
    credits_remaining,
    suspend_at,
    suspend_immediately_at,
    notify_at
FROM information_schema.resource_monitors;

Strategy 4: Time Travel & Storage Optimization

-- Reduce retention for non-critical tables
ALTER TABLE dev_test_data SET DATA_RETENTION_TIME_IN_DAYS = 1;
ALTER TABLE staging_data SET DATA_RETENTION_TIME_IN_DAYS = 3;

-- Keep longer retention only for critical tables
ALTER TABLE production_financials SET DATA_RETENTION_TIME_IN_DAYS = 90;

-- Calculate storage cost impact of retention
SELECT 
    table_name,
    retention_time,
    bytes / (1024*1024*1024) AS current_storage_gb,
    bytes / (1024*1024*1024) * retention_time / 30 AS estimated_total_storage_gb,
    bytes / (1024*1024*1024) * retention_time / 30 * 23 / 30 AS daily_storage_cost_usd
FROM information_schema.tables
WHERE retention_time > 7
ORDER BY 5 DESC;

Strategy 5: Query Optimization for Cost

-- Identify expensive queries
SELECT 
    query_id,
    query_text,
    warehouse_name,
    credits_used,
    total_elapsed_time_ms / 1000 AS execution_seconds,
    bytes_scanned / (1024*1024*1024) AS gb_scanned,
    CASE 
        WHEN credits_used > 10 THEN 'HIGH COST'
        WHEN credits_used > 1 THEN 'MODERATE COST'
        ELSE 'LOW COST'
    END AS cost_category
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
  AND credits_used > 1
ORDER BY credits_used DESC
LIMIT 20;

-- Calculate cost per query pattern
SELECT 
    CASE 
        WHEN query_text LIKE '%SELECT%JOIN%' THEN 'JOIN'
        WHEN query_text LIKE '%SELECT%GROUP BY%' THEN 'AGGREGATION'
        WHEN query_text LIKE '%INSERT%' THEN 'INSERT'
        WHEN query_text LIKE '%UPDATE%' THEN 'UPDATE'
        WHEN query_text LIKE '%DELETE%' THEN 'DELETE'
        ELSE 'OTHER'
    END AS query_pattern,
    COUNT(*) AS query_count,
    SUM(credits_used) AS total_credits,
    AVG(credits_used) AS avg_credits_per_query,
    SUM(credits_used) * 10 AS estimated_cost_usd
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 2 DESC;

Cost Monitoring Dashboard

-- Daily cost trend
SELECT 
    DATE(start_time) AS usage_date,
    SUM(credits_used) AS total_credits,
    SUM(credits_used) * 10 AS estimated_cost_usd,
    COUNT(DISTINCT warehouse_name) AS warehouses_used,
    COUNT(*) AS total_queries
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 1;

-- Cost by warehouse
SELECT 
    warehouse_name,
    warehouse_size,
    SUM(credits_used) AS total_credits,
    SUM(credits_used) * 10 AS estimated_cost_usd,
    ROUND(SUM(credits_used) * 100.0 / (SELECT SUM(credits_used) * 10 
        FROM snowflake.account_usage.warehouse_metering_history 
        WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())), 2) AS pct_of_total
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 3 DESC;

-- Cost by user
SELECT 
    user_name,
    SUM(credits_used) AS total_credits,
    SUM(credits_used) * 10 AS estimated_cost_usd,
    COUNT(*) AS query_count,
    AVG(total_elapsed_time_ms) / 1000 AS avg_query_seconds
FROM snowflake.account_usage.query_history q
JOIN snowflake.account_usage.warehouse_metering_history w
    ON q.warehouse_name = w.warehouse_name
WHERE q.start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 2 DESC;

Best Practices

StrategyExpected Savings
Auto-suspend optimization30-50%
Right-sizing warehouses20-40%
Resource monitorsPrevents overruns
Time travel reduction10-30% storage
Query optimization20-50% compute

⚠️

Cost Traps:

  1. AUTO_SUSPEND = 0 β€” Burns credits 24/7
  2. Over-sized warehouses β€” 4X-Large for small queries wastes money
  3. High retention β€” 90-day retention on dev tables is expensive
  4. No resource monitors β€” Unchecked credit consumption
  5. Ignoring idle time β€” Warehouses running without queries

Advertisement