Cortex AI: LLM Integration, Cortex Functions & AI Features
Architecture Diagram 1: Cortex AI Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ SNOWFLAKE CORTEX AI ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ USER INTERFACE │
│ ═══════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
│ │ │ Snowflake │ │ SQL │ │ Cortex Playground │ │ │
│ │ │ Worksheets │ │ Client │ │ (Web Interface) │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ SNOWFLAKE CLOUD SERVICES │
│ ════════════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Cortex AI Service Layer │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Request Router │ │ │ │
│ │ │ │ • Parse AI function calls │ │ │ │
│ │ │ │ • Validate permissions │ │ │ │
│ │ │ │ • Route to appropriate model │ │ │ │
│ │ │ │ • Manage rate limiting │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Model Registry │ │ │ │
│ │ │ │ • Built-in models (Cortex LLM functions) │ │ │ │
│ │ │ │ • Custom models (Snowpark ML) │ │ │ │
│ │ │ │ • Model versioning and deployment │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ CORTEX AI FUNCTIONS │
│ ═══════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ LLM Functions: │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ COMPLETE(prompt, model, options) │ │ │ │
│ │ │ │ • Generate text completion │ │ │ │
│ │ │ │ • Supports multiple LLM models │ │ │ │
│ │ │ │ • Configurable temperature, max tokens │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ TRANSLATE(text, source_lang, target_lang) │ │ │ │
│ │ │ │ • Translate text between languages │ │ │ │
│ │ │ │ • Supports 100+ languages │ │ │ │
│ │ │ │ • Batch translation capability │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ SENTIMENT(text) │ │ │ │
│ │ │ │ • Analyze text sentiment │ │ │ │
│ │ │ │ • Returns positive/negative/neutral │ │ │ │
│ │ │ │ • Confidence scores │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ EXTRACT(text, entities) │ │ │ │
│ │ │ │ • Named entity recognition │ │ │ │
│ │ │ │ • Extract people, places, organizations │ │ │ │
│ │ │ │ • Custom entity extraction │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ CLASSIFY(text, categories) │ │ │ │
│ │ │ │ • Text classification │ │ │ │
│ │ │ │ • Zero-shot classification │ │ │ │
│ │ │ │ • Custom category support │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Vector Functions: │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ EMBED(text, model) │ │ │ │
│ │ │ │ • Generate text embeddings │ │ │ │
│ │ │ │ • Multiple embedding models │ │ │ │
│ │ │ │ • Dimension selection │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ SIMILARITY(text1, text2) │ │ │ │
│ │ │ │ • Compute cosine similarity │ │ │ │
│ │ │ │ • Vector distance metrics │ │ │ │
│ │ │ │ • Batch similarity computation │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Compute Layer │
│ ▼ │
│ CORTEX COMPUTE │
│ ═══════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Model Inference Infrastructure: │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Built-in Models: │ │ │ │
│ │ │ │ • Llama-2-7B (General purpose) │ │ │ │
│ │ │ │ • Llama-2-70B (Complex reasoning) │ │ │ │
│ │ │ │ • Mistral-7B (Fast inference) │ │ │ │
│ │ │ │ • Embedding models (text-embedding-3-small) │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Custom Models: │ │ │ │
│ │ │ │ • Snowpark ML models │ │ │ │
│ │ │ │ • Imported ONNX models │ │ │ │
│ │ │ │ • Fine-tuned models │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Architecture Diagram 2: LLM Integration Patterns
┌─────────────────────────────────────────────────────────────────────────────┐
│ LLM INTEGRATION PATTERNS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ PATTERN 1: Text Generation │
│ ════════════════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Input: "Summarize the key findings from Q4 sales report" │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Cortex COMPLETE Function: │ │ │
│ │ │ │ │ │
│ │ │ SELECT SNOWFLAKE.CORTEX.COMPLETE( │ │ │
│ │ │ 'llama2-70b-chat', │ │ │
│ │ │ 'Summarize: ' || sales_summary, │ │ │
│ │ │ { │ │ │
│ │ │ 'temperature': 0.7, │ │ │
│ │ │ 'max_tokens': 500 │ │ │
│ │ │ } │ │ │
│ │ │ ) AS summary; │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ Output: "Q4 sales showed 15% growth driven by..." │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ PATTERN 2: RAG (Retrieval-Augmented Generation) │
│ ══════════════════════════════════════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Step 1: Embed Documents │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ SELECT │ │ │
│ │ │ document_id, │ │ │
│ │ │ content, │ │ │
│ │ │ SNOWFLAKE.CORTEX.EMBED( │ │ │
│ │ │ 'text-embedding-3-small', │ │ │
│ │ │ content │ │ │
│ │ │ ) AS embedding │ │ │
│ │ │ FROM documents; │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ Step 2: Query with Vector Search │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ SELECT │ │ │
│ │ │ document_id, │ │ │
│ │ │ content, │ │ │
│ │ │ similarity_score │ │ │
│ │ │ FROM ( │ │ │
│ │ │ SELECT │ │ │
│ │ │ document_id, │ │ │
│ │ │ content, │ │ │
│ │ │ SNOWFLAKE.CORTEX.SIMILARITY( │ │ │
│ │ │ embedding, │ │ │
│ │ │ SNOWFLAKE.CORTEX.EMBED('text-embedding-3-small', query)│ │ │
│ │ │ ) AS similarity_score │ │ │
│ │ │ FROM documents │ │ │
│ │ │ ) │ │ │
│ │ │ ORDER BY similarity_score DESC │ │ │
│ │ │ LIMIT 5; │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ Step 3: Generate Response with Context │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ SELECT SNOWFLAKE.CORTEX.COMPLETE( │ │ │
│ │ │ 'llama2-70b-chat', │ │ │
│ │ │ 'Context: ' || relevant_documents || │ │ │
│ │ │ '\n\nQuestion: ' || user_query, │ │ │
│ │ │ {'temperature': 0.3} │ │ │
│ │ │ ); │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ PATTERN 3: Text Classification Pipeline │
│ ═════════════════════════════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Input Table: customer_feedback │ │ │
│ │ │ ┌──────┬────────────────────────────────────────────────┐ │ │
│ │ │ │ id │ feedback_text │ │ │
│ │ │ ├──────┼────────────────────────────────────────────────┤ │ │
│ │ │ │ 1 │ "Product quality is excellent..." │ │ │
│ │ │ │ 2 │ "Shipping was delayed..." │ │ │
│ │ │ │ 3 │ "Customer support was helpful..." │ │ │
│ │ │ └──────┴────────────────────────────────────────────────┘ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Classification Query: │ │ │
│ │ │ │ │ │
│ │ │ SELECT │ │ │
│ │ │ id, │ │ │
│ │ │ feedback_text, │ │ │
│ │ │ SNOWFLAKE.CORTEX.CLASSIFY( │ │ │
│ │ │ feedback_text, │ │ │
│ │ │ ['positive', 'negative', 'neutral'] │ │ │
│ │ │ ) AS category, │ │ │
│ │ │ SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) AS sentiment │ │ │
│ │ │ FROM customer_feedback; │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Result: │ │ │
│ │ │ ┌──────┬──────────────────────┬───────────┬─────────────┐ │ │
│ │ │ │ id │ feedback_text │ category │ sentiment │ │ │
│ │ │ ├──────┼──────────────────────┼───────────┼─────────────┤ │ │
│ │ │ │ 1 │ "Product quality..." │ positive │ 0.95 │ │ │
│ │ │ │ 2 │ "Shipping was..." │ negative │ -0.82 │ │ │
│ │ │ │ 3 │ "Customer support..."│ positive │ 0.88 │ │ │
│ │ │ └──────┴──────────────────────┴───────────┴─────────────┘ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Architecture Diagram 3: ML Model Deployment
┌─────────────────────────────────────────────────────────────────────────────┐
│ ML MODEL DEPLOYMENT ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ MODEL DEVELOPMENT │
│ ═════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Development Environment: │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │ │ │
│ │ │ │ Local Dev │ │ Jupyter │ │ VS Code │ │ │ │
│ │ │ │ Environment │ │ Notebooks │ │ + Snowflake │ │ │ │
│ │ │ └─────────────┘ └─────────────┘ └─────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ Libraries: │ │ │
│ │ │ • scikit-learn (Traditional ML) │ │ │
│ │ │ • PyTorch/TensorFlow (Deep Learning) │ │ │
│ │ │ • XGBoost/LightGBM (Gradient Boosting) │ │ │
│ │ │ • Snowpark ML (Snowflake native) │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Package & Upload │
│ ▼ │
│ MODEL REGISTRY │
│ ═══════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Snowflake Model Registry: │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Model: sales_forecast_model │ │ │ │
│ │ │ │ Version: v2.1 │ │ │ │
│ │ │ │ Framework: XGBoost │ │ │ │
│ │ │ │ Metrics: │ │ │ │
│ │ │ │ • RMSE: 0.05 │ │ │ │
│ │ │ │ • R²: 0.92 │ │ │ │
│ │ │ │ • MAE: 0.03 │ │ │ │
│ │ │ │ Status: PRODUCTION │ │ │ │
│ │ │ │ Owner: data-science-team │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ Model Versions: │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ v1.0 (2024-01-01): Initial model (RMSE: 0.08) │ │ │ │
│ │ │ │ v2.0 (2024-03-15): Improved model (RMSE: 0.06) │ │ │ │
│ │ │ │ v2.1 (2024-06-01): Current production (RMSE: 0.05) │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Deploy │
│ ▼ │
│ MODEL INFERENCE │
│ ════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Inference Options: │ │ │
│ │ │ │ │ │
│ │ │ 1. SQL PREDICT Function │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ SELECT PREDICT( │ │ │ │
│ │ │ │ @sales_forecast_model, │ │ │ │
│ │ │ │ features │ │ │ │
│ │ │ │ ) FROM input_data; │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ 2. Snowpark ML Inference │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ from snowflake.ml import Model │ │ │ │
│ │ │ │ model = Model("sales_forecast_model") │ │ │ │
│ │ │ │ predictions = model.predict(input_df) │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ 3. Batch Inference (Scheduled Tasks) │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ CREATE TASK inference_task │ │ │ │
│ │ │ │ AS INSERT INTO predictions │ │ │ │
│ │ │ │ SELECT PREDICT(@model, features) FROM input_data; │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ MONITORING & RETRAINING │
│ ═══════════════════════ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Model Monitoring: │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Drift Detection: │ │ │ │
│ │ │ │ • Input feature distribution monitoring │ │ │ │
│ │ │ │ • Prediction distribution monitoring │ │ │ │
│ │ │ │ • Performance metric tracking │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Automated Retraining: │ │ │ │
│ │ │ │ • Scheduled retraining tasks │ │ │ │
│ │ │ │ • Model performance threshold triggers │ │ │ │
│ │ │ │ • A/B testing for model updates │ │ │ │
│ │ │ └────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Snowflake Cortex is Snowflake's AI/ML service providing built-in LLM inference, fine-tuning, and embeddings directly within Snowflake. It eliminates the need to move data to external AI platforms by executing ML workloads on Snowflake's managed infrastructure.
Snowflake Copilot is a natural language interface that translates English questions into SQL queries. It uses Cortex LLMs to understand intent, generate SQL, and explain results, enabling non-technical users to query data conversationally.
Start with smaller Cortex models (llama-3.1-8b) for cost efficiency; upgrade to larger models (llama-3.1-70b) for complex reasoning. Use batch inference for bulk processing. Combine Copilot with role-based access for governance.
- Cortex: Built-in LLM inference, fine-tuning, embeddings within Snowflake
- Copilot: Natural language → SQL translation for non-technical users
- Cost model: Pay per token; smaller models cheaper, larger models more capable
- Security: Data never leaves Snowflake; model inference runs in Snowflake's VPC
- Use cases: Text analysis, sentiment classification, entity extraction, summarization
Detailed Explanation
Snowflake Cortex Overview
Snowflake Cortex is a fully managed AI service that brings machine learning and LLM capabilities directly to your data. Unlike traditional ML workflows that require extracting data to external systems, Cortex enables AI-powered analytics where your data remains in Snowflake. This approach eliminates data movement, reduces security risks, and simplifies ML operations.
Cortex provides two main categories of functions: LLM functions for natural language processing and traditional ML functions for prediction and classification. LLM functions include text generation (COMPLETE), translation (TRANSLATE), sentiment analysis (SENTIMENT), and text classification (CLASSIFY). Traditional ML functions include embedding generation (EMBED) and similarity computation (SIMILARITY).
LLM Integration Patterns
Large Language Model (LLM) integration in Snowflake follows several patterns depending on the use case. The text generation pattern uses the COMPLETE function to generate text based on prompts. This pattern is suitable for summarization, content generation, and conversational AI applications. The function supports multiple LLM models with configurable parameters like temperature and max tokens.
The Retrieval-Augmented Generation (RAG) pattern combines vector search with LLM generation. First, documents are embedded using the EMBED function and stored in a vector column. When a query arrives, relevant documents are retrieved using similarity search, then the LLM generates a response using the retrieved context. This pattern grounds LLM responses in factual data, reducing hallucinations and improving accuracy.
The text classification pattern uses the CLASSIFY function to categorize text into predefined categories. This pattern is useful for routing customer support tickets, categorizing feedback, and content moderation. The function supports zero-shot classification, where categories are specified at query time without training data.
Embedding and Vector Search
Snowflake Cortex provides embedding generation and vector search capabilities for building semantic search applications. The EMBED function converts text into numerical vectors that capture semantic meaning. These vectors can be stored in Snowflake tables and searched using the SIMILARITY function or vector indexing.
Embedding-based search enables finding semantically similar content rather than exact keyword matches. This is particularly useful for search applications, recommendation systems, and duplicate detection. The function supports multiple embedding models with different dimensions and performance characteristics.
ML Model Deployment
Snowflake supports deploying custom ML models through Snowpark ML and the Model Registry. Models can be trained externally and registered in Snowflake for inference. The Model Registry provides versioning, metadata management, and deployment capabilities. Models can be invoked using SQL PREDICT functions or Snowpark ML APIs.
Model deployment in Snowflake enables inference directly on data without extraction. This approach is particularly valuable for real-time predictions, batch scoring, and embedded ML applications. The compute infrastructure automatically scales based on inference workload, ensuring consistent performance.
Advanced AI Patterns
Advanced AI patterns in Snowflake include multi-step AI pipelines that combine multiple Cortex functions. For example, a pipeline might extract entities from text, classify sentiment, and generate a summary in sequence. These pipelines can be implemented using stored procedures or Snowpark workflows.
AI-powered data enrichment uses Cortex functions to augment existing data. For example, adding sentiment scores to customer feedback, translating product descriptions, or classifying support tickets. This enrichment can be performed as batch ETL processes or real-time streaming.
Conversational AI applications use COMPLETE functions with chat models to build chatbots and virtual assistants. Snowflake can serve as the data backend for these applications, providing access to business data through natural language queries.
Key Concepts Table
| Cortex Function | Purpose | Input | Output |
|---|---|---|---|
| COMPLETE | Text generation | Prompt, model | Generated text |
| TRANSLATE | Translation | Text, languages | Translated text |
| SENTIMENT | Sentiment analysis | Text | Sentiment score |
| CLASSIFY | Text classification | Text, categories | Category label |
| EMBED | Text embedding | Text | Vector embedding |
| SIMILARITY | Vector similarity | Two vectors | Similarity score |
| Model Type | Use Case | Performance | Cost |
|---|---|---|---|
| Llama-2-7B | General purpose | Fast | Low |
| Llama-2-70B | Complex reasoning | Medium | High |
| Mistral-7B | Fast inference | Very fast | Low |
| Embedding models | Vector search | Fast | Low |
| AI Pattern | Components | Complexity | Use Case |
|---|---|---|---|
| Text Generation | COMPLETE | Low | Summarization, content creation |
| RAG | EMBED + SIMILARITY + COMPLETE | High | Knowledge base, Q&A |
| Classification | CLASSIFY | Low | Content moderation, routing |
| Entity Extraction | EXTRACT | Medium | Data enrichment |
Code Examples
-- Example 1: Basic text generation
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'llama2-70b-chat',
'Summarize the following sales data: ' || sales_summary,
{
'temperature': 0.7,
'max_tokens': 500
}
) AS summary
FROM sales_data;
-- Example 2: Text translation
SELECT
product_name,
description,
SNOWFLAKE.CORTEX.TRANSLATE(
description,
'en',
'es'
) AS description_spanish
FROM products;
-- Example 3: Sentiment analysis
SELECT
feedback_id,
feedback_text,
SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) AS sentiment_score,
CASE
WHEN SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) > 0.5 THEN 'Positive'
WHEN SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) < -0.5 THEN 'Negative'
ELSE 'Neutral'
END AS sentiment_label
FROM customer_feedback;
-- Example 4: Text classification
SELECT
ticket_id,
ticket_text,
SNOWFLAKE.CORTEX.CLASSIFY(
ticket_text,
['technical_issue', 'billing', 'general_inquiry', 'complaint']
) AS ticket_category
FROM support_tickets;
-- Example 5: Generate embeddings
SELECT
document_id,
content,
SNOWFLAKE.CORTEX.EMBED(
'text-embedding-3-small',
content
) AS embedding
FROM documents;
-- Example 6: Similarity search
SELECT
d1.document_id,
d1.content,
SNOWFLAKE.CORTEX.SIMILARITY(
d1.embedding,
SNOWFLAKE.CORTEX.EMBED('text-embedding-3-small', 'search query')
) AS similarity_score
FROM documents d1
ORDER BY similarity_score DESC
LIMIT 5;
-- Example 7: RAG implementation
WITH query_embedding AS (
SELECT SNOWFLAKE.CORTEX.EMBED(
'text-embedding-3-small',
'What are the key findings from Q4 sales?'
) AS embedding
),
relevant_docs AS (
SELECT
d.document_id,
d.content,
SNOWFLAKE.CORTEX.SIMILARITY(
d.embedding,
qe.embedding
) AS similarity_score
FROM documents d, query_embedding qe
ORDER BY similarity_score DESC
LIMIT 3
)
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'llama2-70b-chat',
'Context: ' || STRING_AGG(content, '\n') ||
'\n\nQuestion: What are the key findings from Q4 sales?',
{'temperature': 0.3}
)
FROM relevant_docs;
-- Example 8: Batch sentiment analysis
CREATE TABLE feedback_sentiment AS
SELECT
feedback_id,
feedback_text,
SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) AS sentiment,
SNOWFLAKE.CORTEX.CLASSIFY(
feedback_text,
['product_quality', 'shipping', 'customer_service', 'price']
) AS category
FROM customer_feedback;
-- Example 9: AI-powered data enrichment
UPDATE customer_data cd
SET
sentiment_score = (
SELECT SNOWFLAKE.CORTEX.SENTIMENT(cd.recent_feedback)
),
preferred_language = (
SELECT SNOWFLAKE.CORTEX.DETECT_LANGUAGE(cd.communication_history)
)
WHERE cd.customer_id = 'CUST-001';
-- Example 10: Create AI function for batch processing
CREATE OR REPLACE FUNCTION analyze_feedback(feedback_text VARCHAR)
RETURNS VARIANT
LANGUAGE SQL
AS
$$
SELECT OBJECT_CONSTRUCT(
'sentiment', SNOWFLAKE.CORTEX.SENTIMENT(feedback_text),
'category', SNOWFLAKE.CORTEX.CLASSIFY(
feedback_text,
['positive', 'negative', 'neutral']
),
'key_phrases', SNOWFLAKE.CORTEX.EXTRACT(
feedback_text,
['product', 'service', 'price', 'delivery']
)
)
$$;
Performance Metrics
| Metric | Target | Warning | Critical |
|---|---|---|---|
| LLM Response Time | < 2s | 2-5s | > 5s |
| Embedding Generation | < 100ms | 100-500ms | > 500ms |
| Similarity Search | < 50ms | 50-200ms | > 200ms |
| Classification Accuracy | > 85% | 70-85% | < 70% |
| Batch Processing Rate | > 1000 rows/s | 500-1000 rows/s | < 500 rows/s |
Best Practices
-
Choose appropriate models: Use smaller models (Llama-2-7B, Mistral-7B) for simple tasks and larger models (Llama-2-70B) for complex reasoning.
-
Optimize prompts: Craft clear, specific prompts to improve response quality. Include examples and constraints when needed.
-
Implement caching: Cache embeddings and LLM responses to reduce costs and improve latency for repeated queries.
-
Monitor model performance: Track accuracy, latency, and cost metrics. Set up alerts for performance degradation.
-
Use RAG for accuracy: Combine vector search with LLM generation to ground responses in factual data and reduce hallucinations.
-
Batch process when possible: Use batch inference for large datasets to optimize costs and throughput.
-
Implement fallback strategies: Design systems to handle LLM failures gracefully, with fallback to simpler models or rule-based systems.
-
Monitor costs: Cortex functions consume compute resources. Track usage and optimize expensive operations.
-
Secure sensitive data: Be careful with prompts containing sensitive information. Consider data masking before LLM processing.
-
Test thoroughly: Validate AI outputs for accuracy and bias. Implement human review for critical decisions.
See Also
- PySpark Iceberg - ML integration patterns
- Delta Lake on Databricks - Delta Lake ML integration
- Data Warehouse Concepts - Data warehouse design principles