CW

Snowflake Advanced UDFs and UDTFs

Free Lesson

Advertisement

Snowflake Advanced UDFs and UDTFs

Advanced UDFs and UDTFs in Snowflake enable custom data transformations, complex calculations, and reusable business logic that extends Snowflake's built-in capabilities.

Architecture Overview

<svg width="800" height="450" viewBox="0 0 800 450" xmlns="http://www.w3.org/2000/svg">
  <defs>
    <linearGradient id="udfGrad" x1="0%" y1="0%" x2="100%" y2="0%">
      <stop offset="0%" style="stop-color:#E74C3C;stop-opacity:1" />
      <stop offset="100%" style="stop-color:#EC7063;stop-opacity:1" />
    </linearGradient>
    <linearGradient id="udtfGrad" x1="0%" y1="0%" x2="100%" y2="0%">
      <stop offset="0%" style="stop-color:#3498DB;stop-opacity:1" />
      <stop offset="100%" style="stop-color:#5DADE2;stop-opacity:1" />
    </linearGradient>
  </defs>

  <text x="400" y="30" text-anchor="middle" font-size="18" font-weight="bold" fill="#333">Advanced UDFs and UDTFs Architecture</text>
  <rect x="30" y="60" width="350" height="180" rx="10" fill="url(#udfGrad)" opacity="0.9"/>
  <text x="205" y="85" text-anchor="middle" font-size="14" fill="white" font-weight="bold">User-Defined Functions (UDFs)</text>

  <rect x="50" y="100" width="150" height="60" rx="8" fill="white" opacity="0.9"/>
  <text x="125" y="125" text-anchor="middle" font-size="10" fill="#333" font-weight="bold">Scalar UDFs</text>
  <text x="125" y="145" text-anchor="middle" font-size="9" fill="#666">Single value in, single value out</text>

  <rect x="210" y="100" width="150" height="60" rx="8" fill="white" opacity="0.9"/>
  <text x="285" y="125" text-anchor="middle" font-size="10" fill="#333" font-weight="bold">Stored UDFs</text>
  <text x="285" y="145" text-anchor="middle" font-size="9" fill="#666">Persisted results caching</text>

  <text x="205" y="185" text-anchor="middle" font-size="11" fill="white">SQL | JavaScript | Python | Java | Scala</text>
  <text x="205" y="205" text-anchor="middle" font-size="10" fill="white" font-style="italic">Can be used in SELECT, WHERE, GROUP BY</text>
  <text x="205" y="225" text-anchor="middle" font-size="10" fill="white" font-style="italic">Deterministic or non-deterministic</text>
  <rect x="420" y="60" width="350" height="180" rx="10" fill="url(#udtfGrad)" opacity="0.9"/>
  <text x="595" y="85" text-anchor="middle" font-size="14" fill="white" font-weight="bold">UDTFs</text>

  <rect x="440" y="100" width="150" height="60" rx="8" fill="white" opacity="0.9"/>
  <text x="515" y="125" text-anchor="middle" font-size="10" fill="#333" font-weight="bold">Table Functions</text>
  <text x="515" y="145" text-anchor="middle" font-size="9" fill="#666">Multi-row output</text>

  <rect x="600" y="100" width="150" height="60" rx="8" fill="white" opacity="0.9"/>
  <text x="675" y="125" text-anchor="middle" font-size="10" fill="#333" font-weight="bold">Generator Functions</text>
  <text x="675" y="145" text-anchor="middle" font-size="9" fill="#666">Row generation</text>

  <text x="595" y="185" text-anchor="middle" font-size="11" fill="white">Used with LATERAL FLATTEN, TABLE()</text>
  <text x="595" y="205" text-anchor="middle" font-size="10" fill="white" font-style="italic">Returns result set, not single value</text>
  <text x="595" y="225" text-anchor="middle" font-size="10" fill="white" font-style="italic">Can process arrays and objects</text>
  <rect x="30" y="260" width="740" height="170" rx="10" fill="#27AE60" opacity="0.85"/>
  <text x="400" y="285" text-anchor="middle" font-size="14" fill="white" font-weight="bold">UDF vs UDTF Comparison</text>

  <rect x="50" y="300" width="170" height="110" rx="8" fill="white"/>
  <text x="135" y="320" text-anchor="middle" font-size="11" fill="#333" font-weight="bold">UDF</text>
  <text x="135" y="340" text-anchor="middle" font-size="9" fill="#666">Returns single value</text>
  <text x="135" y="355" text-anchor="middle" font-size="9" fill="#666">Used in expressions</text>
  <text x="135" y="370" text-anchor="middle" font-size="9" fill="#666">Inline or stored</text>
  <text x="135" y="385" text-anchor="middle" font-size="9" fill="#666">Deterministic option</text>
  <text x="135" y="400" text-anchor="middle" font-size="9" fill="#666">Performance: Fast</text>

  <rect x="240" y="300" width="170" height="110" rx="8" fill="white"/>
  <text x="325" y="320" text-anchor="middle" font-size="11" fill="#333" font-weight="bold">UDTF</text>
  <text x="325" y="340" text-anchor="middle" font-size="9" fill="#666">Returns multiple rows</text>
  <text x="325" y="355" text-anchor="middle" font-size="9" fill="#666">Used with TABLE()</text>
  <text x="325" y="370" text-anchor="middle" font-size="9" fill="#666">Can be pipelined</text>
  <text x="325" y="385" text-anchor="middle" font-size="9" fill="#666">Row-by-row processing</text>
  <text x="325" y="400" text-anchor="middle" font-size="9" fill="#666">Performance: Medium</text>

  <rect x="430" y="300" width="170" height="110" rx="8" fill="white"/>
  <text x="515" y="320" text-anchor="middle" font-size="11" fill="#333" font-weight="bold">Stored Procedure</text>
  <text x="515" y="340" text-anchor="middle" font-size="9" fill="#666">Can do DML/DDL</text>
  <text x="515" y="355" text-anchor="middle" font-size="9" fill="#666">Side effects allowed</text>
  <text x="515" y="370" text-anchor="middle" font-size="9" fill="#666">Transaction control</text>
  <text x="515" y="385" text-anchor="middle" font-size="9" fill="#666">Complex logic</text>
  <text x="515" y="400" text-anchor="middle" font-size="9" fill="#666">Performance: Varies</text>

  <rect x="620" y="300" width="140" height="110" rx="8" fill="white"/>
  <text x="690" y="320" text-anchor="middle" font-size="11" fill="#333" font-weight="bold">External Fn</text>
  <text x="690" y="340" text-anchor="middle" font-size="9" fill="#666">Calls cloud APIs</text>
  <text x="690" y="355" text-anchor="middle" font-size="9" fill="#666">Network latency</text>
  <text x="690" y="370" text-anchor="middle" font-size="9" fill="#666">Batch processing</text>
  <text x="690" y="385" text-anchor="middle" font-size="9" fill="#666">Custom logic</text>
  <text x="690" y="400" text-anchor="middle" font-size="9" fill="#666">Performance: Variable</text>
</svg>

Scalar UDFs

SQL Scalar UDF

CREATE OR REPLACE FUNCTION calculate_discount(price DECIMAL(10,2), discount_pct DECIMAL(5,2))
RETURNS DECIMAL(10,2)
LANGUAGE SQL
IMMUTABLE
AS
  $$
    SELECT price * (1 - discount_pct / 100)
  $$;

-- Usage
SELECT
  product_name,
  price,
  calculate_discount(price, 15) as discounted_price
FROM products;

JavaScript Scalar UDF

CREATE OR REPLACE FUNCTION parse_email(email VARCHAR)
RETURNS OBJECT
LANGUAGE JAVASCRIPT
IMMUTABLE
AS
  $$
    const parts = email.split('@');
    return {
      username: parts[0],
      domain: parts[1],
      isValid: parts.length === 2 && parts[1].includes('.')
    };
  $$;

-- Usage
SELECT
  email,
  parse_email(email):username::STRING as username,
  parse_email(email):domain::STRING as domain
FROM users;

Python Scalar UDF

CREATE OR REPLACE FUNCTION sentiment_score(text VARCHAR)
RETURNS FLOAT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('textblob')
HANDLER = 'get_sentiment'
AS
  $$
    from textblob import TextBlob
    
    def get_sentiment(text):
      blob = TextBlob(text)
      return blob.sentiment.polarity
  $$;

User-Defined Table Functions (UDTFs)

SQL UDTF

CREATE OR REPLACE FUNCTION generate_date_range(start_date DATE, end_date DATE)
RETURNS TABLE (date DATE, day_name VARCHAR, week_number INTEGER)
LANGUAGE SQL
AS
  $$
    WITH date_series AS (
      SELECT DATEADD(day, SEQ4(), :start_date) as date
      FROM TABLE(GENERATOR(ROWCOUNT => DATEDIFF('day', :start_date, :end_date) + 1))
    )
    SELECT
      date,
      DAYNAME(date) as day_name,
      WEEKOFYEAR(date) as week_number
    FROM date_series
  $$;

-- Usage
SELECT * FROM TABLE(generate_date_range('2024-01-01', '2024-12-31'));

JavaScript UDTF

CREATE OR REPLACE FUNCTION split_string_to_rows(input VARCHAR, delimiter VARCHAR)
RETURNS TABLE (value VARCHAR, position INTEGER)
LANGUAGE JAVASCRIPT
AS
  $$
    const values = INPUT_DELIMITER.split(DELIMITER);
    for (let i = 0; i < values.length; i++) {
      yield {VALUE: values[i], POSITION: i + 1};
    }
  $$
  INPUT_DELIMITER => (delimiter VARCHAR, delimiter VARCHAR);

-- Usage
SELECT * FROM TABLE(split_string_to_rows('a,b,c,d', ','));

Python UDTF for Time Series

CREATE OR REPLACE FUNCTION moving_average(values ARRAY, window_size INTEGER)
RETURNS TABLE (index INTEGER, value FLOAT, moving_avg FLOAT)
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
HANDLER = 'compute_moving_avg'
AS
  $$
    class compute_moving_avg:
      def __init__(self):
        self.values = []
        self.window_size = 0
      
      def process(self, values, window_size):
        self.values = values
        self.window_size = window_size
        for i in range(len(values)):
          start = max(0, i - window_size + 1)
          window_vals = values[start:i+1]
          moving_avg = sum(window_vals) / len(window_vals)
          yield (i, values[i], moving_avg)
  $$;

Stored UDFs for Caching

CREATE OR REPLACE FUNCTION expensive_calculation(input VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
CALLED ON NULL INPUT
STABLE
AS
  $$
    SELECT MD5(input || CURRENT_DATE())
  $$;

-- Stored UDF with caching
CREATE OR REPLACE FUNCTION cached_lookup(key VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
STABLE
AS
  $$
    SELECT value FROM lookup_table WHERE lookup_key = key
  $$;

Use IMMUTABLE for functions with no side effects and consistent outputs. Use STABLE for functions that depend on session state. Use VOLATILE for functions with random or time-dependent outputs.

UDF Best Practices

AspectBest PracticeImpact
DeterminismMark as IMMUTABLE/STABLEEnables caching
NULL handlingUse COALESCE or CALLED ON NULLPrevents errors
Language choiceMatch to complexityPerformance
Batch sizeProcess in batchesMemory efficiency
Error handlingValidate inputsReliability
  • UDFs return single values, UDTFs return result sets
  • SQL, JavaScript, Python, Java, and Scala are supported
  • IMMUTABLE/STABLE/VOLATILE affect caching behavior
  • UDTFs are essential for row-generating transformations
  • Stored UDFs cache results for repeated calls

Advertisement

Need Expert Snowflake Help?

Get personalized warehouse optimization, data modeling, or Snowflake platform consulting.

Advertisement