Snowflake External Functions
External functions in Snowflake allow you to call custom code running as serverless functions in cloud providers, extending Snowflake's capabilities with custom APIs and third-party services.
Architecture Overview
<svg width="800" height="500" viewBox="0 0 800 500" xmlns="http://www.w3.org/2000/svg">
<defs>
<linearGradient id="apiGrad" x1="0%" y1="0%" x2="100%" y2="0%">
<stop offset="0%" style="stop-color:#FF6B6B;stop-opacity:1" />
<stop offset="100%" style="stop-color:#FF8E8E;stop-opacity:1" />
</linearGradient>
<linearGradient id="snowGrad" x1="0%" y1="0%" x2="100%" y2="0%">
<stop offset="0%" style="stop-color:#29B5E8;stop-opacity:1" />
<stop offset="100%" style="stop-color:#1E90FF;stop-opacity:1" />
</linearGradient>
</defs>
<text x="400" y="35" text-anchor="middle" font-size="18" font-weight="bold" fill="#333">Snowflake External Function Architecture</text>
<rect x="50" y="60" width="300" height="380" rx="15" fill="url(#snowGrad)" opacity="0.9"/>
<text x="200" y="90" text-anchor="middle" font-size="14" fill="white" font-weight="bold">Snowflake</text>
<rect x="80" y="110" width="240" height="50" rx="8" fill="white" opacity="0.9"/>
<text x="200" y="140" text-anchor="middle" font-size="12" fill="#333">SQL Query Processing</text>
<rect x="80" y="180" width="240" height="50" rx="8" fill="white" opacity="0.9"/>
<text x="200" y="210" text-anchor="middle" font-size="12" fill="#333">API Integration Layer</text>
<rect x="80" y="250" width="240" height="50" rx="8" fill="white" opacity="0.9"/>
<text x="200" y="280" text-anchor="middle" font-size="12" fill="#333">CREATE EXTERNAL FUNCTION</text>
<path d="M320 280 L380 280" stroke="#333" stroke-width="2" fill="none" marker-end="url(#arrowExt)"/>
<text x="350" y="270" text-anchor="middle" font-size="10" fill="#666">HTTP/S</text>
<rect x="420" y="60" width="340" height="380" rx="15" fill="url(#apiGrad)" opacity="0.9"/>
<text x="590" y="90" text-anchor="middle" font-size="14" fill="white" font-weight="bold">Cloud Provider</text>
<rect x="450" y="110" width="280" height="50" rx="8" fill="white" opacity="0.9"/>
<text x="590" y="140" text-anchor="middle" font-size="12" fill="#333">API Gateway / Load Balancer</text>
<rect x="450" y="180" width="130" height="50" rx="8" fill="white" opacity="0.9"/>
<text x="515" y="210" text-anchor="middle" font-size="10" fill="#333">Lambda 1</text>
<rect x="600" y="180" width="130" height="50" rx="8" fill="white" opacity="0.9"/>
<text x="665" y="210" text-anchor="middle" font-size="10" fill="#333">Lambda 2</text>
<rect x="450" y="250" width="280" height="60" rx="8" fill="white" opacity="0.9"/>
<text x="590" y="275" text-anchor="middle" font-size="12" fill="#333">Custom Business Logic</text>
<text x="590" y="295" text-anchor="middle" font-size="10" fill="#666">APIs | ML Models | Transforms</text>
<defs>
<marker id="arrowExt" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto" markerUnits="strokeWidth">
<path d="M0,0 L0,6 L9,3 z" fill="#333"/>
</marker>
</defs>
</svg>
Key Concepts
DfExternal Function
DfBatching
API Integration Setup
AWS API Gateway + Lambda
-- Create API integration
CREATE OR REPLACE API INTEGRATION aws_api_integration
API_PROVIDER = aws_api_gateway
API_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-api-role'
API_ALLOWED_PREFIXES = ('https://abc123.execute-api.us-east-1.amazonaws.com/prod')
ENABLED = TRUE;
-- Check integration status
DESC INTEGRATION aws_api_integration;
Azure API Management + Functions
CREATE OR REPLACE API INTEGRATION azure_api_integration
API_PROVIDER = azure_api_management
AZURE_TENANT_ID = 'your-tenant-id'
AZURE_AD_APPLICATION_ID = 'your-app-id'
API_ALLOWED_PREFIXES = ('https://myapim.azure-api.net/')
ENABLED = TRUE;
GCP API Gateway + Cloud Functions
CREATE OR REPLACE API INTEGRATION gcp_api_integration
API_PROVIDER = google_cloud
GCP_SERVICE_ACCOUNT = 'snowflake@my-project.iam.gserviceaccount.com'
API_ALLOWED_PREFIXES = ('https://my-gateway-abc123.uc.r.appspot.com/')
ENABLED = TRUE;
Creating External Functions
Simple External Function
CREATE OR REPLACE EXTERNAL FUNCTION sentiment_analysis(text VARCHAR)
RETURNS VARCHAR
API_INTEGRATION = aws_api_integration
MAX_BATCH_ROWS = 100
AS 'https://abc123.execute-api.us-east-1.amazonaws.com/prod/sentiment';
Complex External Function with Multiple Returns
CREATE OR REPLACE EXTERNAL FUNCTION translate_text(
input_text VARCHAR,
source_lang VARCHAR,
target_lang VARCHAR
)
RETURNS TABLE (translated_text VARCHAR, confidence DECIMAL(5,4))
API_INTEGRATION = azure_api_integration
MAX_BATCH_ROWS = 50
AS 'https://myapim.azure-api.net/translate';
Using External Functions
-- Simple call
SELECT
customer_feedback,
sentiment_analysis(customer_feedback) as sentiment
FROM feedback_table;
-- Table function call
SELECT
f.translated_text,
f.confidence
FROM my_table,
LATERAL TABLE(translate_text(text, 'en', 'es')) f;
Lambda Function Example (Python)
import json
import boto3
from datetime import datetime
def lambda_handler(event, context):
"""
Snowflake External Function Handler
Receives batch of rows and returns results
"""
# Parse Snowflake request
rows = event['data']
results = []
for row in rows:
row_number = row[0]
input_data = row[1:]
# Custom processing logic
result = process_data(input_data)
results.append({
'row_number': row_number,
'data': [result]
})
return {'data': results}
def process_data(data):
# Custom business logic here
# Example: sentiment analysis, translation, etc.
return {"processed": True, "timestamp": datetime.utcnow().isoformat()}
Performance Considerations
External function performance depends on API response time, network latency, and batch size. Optimal batch sizes typically range from 100-500 rows. Monitor function duration to avoid timeouts (default 10 minutes).
Batching Strategy
| Batch Size | Latency | Throughput | Use Case |
|---|---|---|---|
| 10-50 | Low | Moderate | Real-time APIs |
| 100-250 | Medium | High | Batch processing |
| 500-1000 | High | Maximum | Bulk transformations |
Error Handling
-- Create function with error handling
CREATE OR REPLACE EXTERNAL FUNCTION safe_api_call(input VARCHAR)
RETURNS VARCHAR
API_INTEGRATION = aws_api_integration
MAX_BATCH_ROWS = 100
AS 'https://api.example.com/process';
-- Handle errors in queries
SELECT
input_data,
safe_api_call(input_data) as result
FROM my_table
WHERE safe_api_call(input_data) IS NOT NULL;
Monitoring and Logging
-- Check function invocations
SELECT *
FROM TABLE(INFORMATION_SCHEMA.EXTERNAL_FUNCTION_INVOCATION_HISTORY(
START_TIME => DATEADD(hour, -24, CURRENT_TIMESTAMP()),
END_TIME => CURRENT_TIMESTAMP()
))
ORDER BY START_TIME DESC;
- External functions extend Snowflake with custom cloud-native code
- API Integrations configure secure connections to cloud services
- Batch processing optimizes performance and reduces API calls
- Support for AWS, Azure, and GCP serverless functions
- Monitor invocation history for performance optimization