CW

Snowflake External Functions

Free Lesson

Advertisement

Snowflake External Functions

External functions in Snowflake allow you to call custom code running as serverless functions in cloud providers, extending Snowflake's capabilities with custom APIs and third-party services.

Architecture Overview

<svg width="800" height="500" viewBox="0 0 800 500" xmlns="http://www.w3.org/2000/svg">
  <defs>
    <linearGradient id="apiGrad" x1="0%" y1="0%" x2="100%" y2="0%">
      <stop offset="0%" style="stop-color:#FF6B6B;stop-opacity:1" />
      <stop offset="100%" style="stop-color:#FF8E8E;stop-opacity:1" />
    </linearGradient>
    <linearGradient id="snowGrad" x1="0%" y1="0%" x2="100%" y2="0%">
      <stop offset="0%" style="stop-color:#29B5E8;stop-opacity:1" />
      <stop offset="100%" style="stop-color:#1E90FF;stop-opacity:1" />
    </linearGradient>
  </defs>
  <text x="400" y="35" text-anchor="middle" font-size="18" font-weight="bold" fill="#333">Snowflake External Function Architecture</text>
  <rect x="50" y="60" width="300" height="380" rx="15" fill="url(#snowGrad)" opacity="0.9"/>
  <text x="200" y="90" text-anchor="middle" font-size="14" fill="white" font-weight="bold">Snowflake</text>
  <rect x="80" y="110" width="240" height="50" rx="8" fill="white" opacity="0.9"/>
  <text x="200" y="140" text-anchor="middle" font-size="12" fill="#333">SQL Query Processing</text>
  <rect x="80" y="180" width="240" height="50" rx="8" fill="white" opacity="0.9"/>
  <text x="200" y="210" text-anchor="middle" font-size="12" fill="#333">API Integration Layer</text>
  <rect x="80" y="250" width="240" height="50" rx="8" fill="white" opacity="0.9"/>
  <text x="200" y="280" text-anchor="middle" font-size="12" fill="#333">CREATE EXTERNAL FUNCTION</text>
  <path d="M320 280 L380 280" stroke="#333" stroke-width="2" fill="none" marker-end="url(#arrowExt)"/>
  <text x="350" y="270" text-anchor="middle" font-size="10" fill="#666">HTTP/S</text>
  <rect x="420" y="60" width="340" height="380" rx="15" fill="url(#apiGrad)" opacity="0.9"/>
  <text x="590" y="90" text-anchor="middle" font-size="14" fill="white" font-weight="bold">Cloud Provider</text>
  <rect x="450" y="110" width="280" height="50" rx="8" fill="white" opacity="0.9"/>
  <text x="590" y="140" text-anchor="middle" font-size="12" fill="#333">API Gateway / Load Balancer</text>
  <rect x="450" y="180" width="130" height="50" rx="8" fill="white" opacity="0.9"/>
  <text x="515" y="210" text-anchor="middle" font-size="10" fill="#333">Lambda 1</text>
  <rect x="600" y="180" width="130" height="50" rx="8" fill="white" opacity="0.9"/>
  <text x="665" y="210" text-anchor="middle" font-size="10" fill="#333">Lambda 2</text>
  <rect x="450" y="250" width="280" height="60" rx="8" fill="white" opacity="0.9"/>
  <text x="590" y="275" text-anchor="middle" font-size="12" fill="#333">Custom Business Logic</text>
  <text x="590" y="295" text-anchor="middle" font-size="10" fill="#666">APIs | ML Models | Transforms</text>
  <defs>
    <marker id="arrowExt" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto" markerUnits="strokeWidth">
      <path d="M0,0 L0,6 L9,3 z" fill="#333"/>
    </marker>
  </defs>
</svg>

Key Concepts

DfExternal Function

DfBatching

API Integration Setup

AWS API Gateway + Lambda

-- Create API integration
CREATE OR REPLACE API INTEGRATION aws_api_integration
  API_PROVIDER = aws_api_gateway
  API_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-api-role'
  API_ALLOWED_PREFIXES = ('https://abc123.execute-api.us-east-1.amazonaws.com/prod')
  ENABLED = TRUE;

-- Check integration status
DESC INTEGRATION aws_api_integration;

Azure API Management + Functions

CREATE OR REPLACE API INTEGRATION azure_api_integration
  API_PROVIDER = azure_api_management
  AZURE_TENANT_ID = 'your-tenant-id'
  AZURE_AD_APPLICATION_ID = 'your-app-id'
  API_ALLOWED_PREFIXES = ('https://myapim.azure-api.net/')
  ENABLED = TRUE;

GCP API Gateway + Cloud Functions

CREATE OR REPLACE API INTEGRATION gcp_api_integration
  API_PROVIDER = google_cloud
  GCP_SERVICE_ACCOUNT = 'snowflake@my-project.iam.gserviceaccount.com'
  API_ALLOWED_PREFIXES = ('https://my-gateway-abc123.uc.r.appspot.com/')
  ENABLED = TRUE;

Creating External Functions

Simple External Function

CREATE OR REPLACE EXTERNAL FUNCTION sentiment_analysis(text VARCHAR)
  RETURNS VARCHAR
  API_INTEGRATION = aws_api_integration
  MAX_BATCH_ROWS = 100
  AS 'https://abc123.execute-api.us-east-1.amazonaws.com/prod/sentiment';

Complex External Function with Multiple Returns

CREATE OR REPLACE EXTERNAL FUNCTION translate_text(
  input_text VARCHAR,
  source_lang VARCHAR,
  target_lang VARCHAR
)
  RETURNS TABLE (translated_text VARCHAR, confidence DECIMAL(5,4))
  API_INTEGRATION = azure_api_integration
  MAX_BATCH_ROWS = 50
  AS 'https://myapim.azure-api.net/translate';

Using External Functions

-- Simple call
SELECT
  customer_feedback,
  sentiment_analysis(customer_feedback) as sentiment
FROM feedback_table;

-- Table function call
SELECT
  f.translated_text,
  f.confidence
FROM my_table,
LATERAL TABLE(translate_text(text, 'en', 'es')) f;

Lambda Function Example (Python)

import json
import boto3
from datetime import datetime

def lambda_handler(event, context):
    """
    Snowflake External Function Handler
    Receives batch of rows and returns results
    """
    # Parse Snowflake request
    rows = event['data']
    results = []
    
    for row in rows:
        row_number = row[0]
        input_data = row[1:]
        
        # Custom processing logic
        result = process_data(input_data)
        
        results.append({
            'row_number': row_number,
            'data': [result]
        })
    
    return {'data': results}

def process_data(data):
    # Custom business logic here
    # Example: sentiment analysis, translation, etc.
    return {"processed": True, "timestamp": datetime.utcnow().isoformat()}

Performance Considerations

External function performance depends on API response time, network latency, and batch size. Optimal batch sizes typically range from 100-500 rows. Monitor function duration to avoid timeouts (default 10 minutes).

Batching Strategy

Batch SizeLatencyThroughputUse Case
10-50LowModerateReal-time APIs
100-250MediumHighBatch processing
500-1000HighMaximumBulk transformations

Error Handling

-- Create function with error handling
CREATE OR REPLACE EXTERNAL FUNCTION safe_api_call(input VARCHAR)
  RETURNS VARCHAR
  API_INTEGRATION = aws_api_integration
  MAX_BATCH_ROWS = 100
  AS 'https://api.example.com/process';

-- Handle errors in queries
SELECT
  input_data,
  safe_api_call(input_data) as result
FROM my_table
WHERE safe_api_call(input_data) IS NOT NULL;

Monitoring and Logging

-- Check function invocations
SELECT *
FROM TABLE(INFORMATION_SCHEMA.EXTERNAL_FUNCTION_INVOCATION_HISTORY(
  START_TIME => DATEADD(hour, -24, CURRENT_TIMESTAMP()),
  END_TIME => CURRENT_TIMESTAMP()
))
ORDER BY START_TIME DESC;
  • External functions extend Snowflake with custom cloud-native code
  • API Integrations configure secure connections to cloud services
  • Batch processing optimizes performance and reduces API calls
  • Support for AWS, Azure, and GCP serverless functions
  • Monitor invocation history for performance optimization

Advertisement

Need Expert Snowflake Help?

Get personalized warehouse optimization, data modeling, or Snowflake platform consulting.

Advertisement