Time Travel, Fail-Safe & Zero-Copy Cloning
Architecture Diagram 1: Time Travel Data Flow
TIME TRAVEL ARCHITECTURE

CURRENT STATE                    TIME TRAVEL WINDOW
─────────────                    ──────────────────

┌─────────────────┐              ┌───────────────────────────────┐
│   LIVE TABLE    │              │      HISTORICAL VERSIONS      │
│   sales_data    │              │     Time Travel (7 days)      │
│ ┌─────────────┐ │   CURRENT    │  ┌─────────────────────────┐  │
│ │ Row 1: v5   │ │ ───────────▶ │  │ Version at T-1d         │  │
│ │ Row 2: v3   │ │              │  │   Row 1: v4             │  │
│ │ Row 3: v2   │ │              │  │   Row 2: v3             │  │
│ │ Row 4: v1   │ │              │  │   Row 3: v1             │  │
│ └─────────────┘ │              │  └─────────────────────────┘  │
└────────┬────────┘              │  ┌─────────────────────────┐  │
         │                       │  │ Version at T-3d         │  │
         │                       │  │   Row 1: v2             │  │
         │                       │  │   Row 2: v2             │  │
         │                       │  │   Row 3: v1             │  │
         │                       │  └─────────────────────────┘  │
         │                       │  ┌─────────────────────────┐  │
         │                       │  │ Version at T-7d         │  │
         │                       │  │   Row 1: v1             │  │
         │                       │  │   Row 2: v1             │  │
         │                       │  │   Row 3: v0             │  │
         │                       │  └─────────────────────────┘  │
         ▼                       └───────────────────────────────┘
┌─────────────────┐
│    FAIL-SAFE    │ ◀── 7 days after Time Travel expires
│    (7 days)     │
│ ┌─────────────┐ │   • Immutable, read-only
│ │ Historical  │ │   • Cannot be queried by users
│ │ Data        │ │   • Available for Snowflake support only
│ │ (Protected) │ │   • Automatic data recovery
│ └─────────────┘ │   • No user access to modify or delete
└─────────────────┘

RETENTION TIMELINE:

| Current (Live) | Time Travel (0-7 days) | Fail-Safe (7-14 days) |
|---|---|---|
| User access | User query | Support access only |
| Full CRUD | Read-only | Read-only |
| Real-time | Point-in-time | Disaster recovery |
Architecture Diagram 2: Zero-Copy Cloning Process
ZERO-COPY CLONING ARCHITECTURE

BEFORE CLONE:

Source Table: sales_2024

Metadata:                          Storage:
┌────────────────────┐             ┌────────────────────────────┐
│ Table ID: T12345   │             │ Cloud Blob Storage         │
│ Rows: 10,000,000   │             │ ┌────────────────────────┐ │
│ Size: 50 GB        │ ──────────▶ │ │ Micro-Partition 1      │ │
│ Micro-Parts: 250   │             │ │ Micro-Partition 2      │ │
│ Clustering: date   │             │ │ Micro-Partition 3      │ │
└────────────────────┘             │ │ ...                    │ │
                                   │ │ Micro-Partition 250    │ │
                                   │ └────────────────────────┘ │
                                   │ Storage Used: 50 GB        │
                                   └────────────────────────────┘
          │
          │  CLONE OPERATION (near-instantaneous)
          ▼

AFTER CLONE:

Source: sales_2024                 Clone: sales_2024_clone
┌────────────────────┐             ┌──────────────────────┐
│ Table ID: T12345   │             │ Table ID: T12346     │
│ Rows: 10,000,000   │             │ Rows: 10,000,000     │
│ Size: 50 GB        │             │ Size: 50 GB (virt)   │
│ Micro-Parts: 250   │             │ Micro-Parts: 250     │
└─────────┬──────────┘             └──────────┬───────────┘
          │   Both reference the SAME         │
          │       micro-partitions            │
          ▼                                   ▼
┌────────────────────────────────────────────────────────────┐
│ SHARED MICRO-PARTITIONS                                    │
│   MP_001 (500 MB) ── Shared by T12345 & T12346             │
│   MP_002 (450 MB) ── Shared by T12345 & T12346             │
│   MP_003 (520 MB) ── Shared by T12345 & T12346             │
│   MP_004 (480 MB) ── Shared by T12345 & T12346             │
│   ...                                                      │
│                                                            │
│ Total Storage: 50 GB (NOT 100 GB!)                         │
└────────────────────────────────────────────────────────────┘

DIVERGENCE OVER TIME:

Time 0: Clone created
  Source: [MP_001][MP_002][MP_003]
  Clone:  [MP_001][MP_002][MP_003]
  All micro-partitions shared

Time 1: Source modified (INSERT)
  Source: [MP_001][MP_002][MP_003][MP_004]   (new MP created)
  Clone:  [MP_001][MP_002][MP_003]           (still referencing original)

Time 2: Clone modified (UPDATE on MP_002)
  Source: [MP_001][MP_002][MP_003][MP_004]   (original MP_002)
  Clone:  [MP_001][MP_002*][MP_003]          (new MP_002* created)

RESULT: Clone only stores modified micro-partitions (copy-on-write)
Architecture Diagram 3: Data Recovery Workflow
DATA RECOVERY WORKFLOW

SCENARIO: Accidental data deletion at 10:15 AM, discovered at 10:30 AM

TIMELINE OF EVENTS

09:00 AM ─── Table sales_data has 10M rows
    │
    ▼
10:00 AM ─── ETL job adds 1M rows (total: 11M)
    │
    ▼
10:15 AM ─── Accidental DELETE removes 9M rows
    │        (Only 2M rows remain)
    ▼
10:30 AM ─── ISSUE DISCOVERED!
    │
    ▼
RECOVERY OPTIONS:

Option 1: Time Travel Query (Preferred)
  • Query historical data before deletion
  • Create new table or INSERT back
  • Duration: Minutes
  • Cost: Compute only

Option 2: Clone Point-in-Time
  • Clone table as of specific timestamp
  • Zero-copy operation
  • Duration: Seconds
  • Cost: Minimal (metadata only)

Option 3: UNDROP Table
  • Restore dropped table (if dropped, not deleted)
  • Available within Time Travel window
  • Duration: Seconds
  • Cost: None

Option 4: Fail-Safe Recovery
  • Last resort after Time Travel expires
  • Requires Snowflake Support ticket
  • Duration: Hours to days
  • Cost: Support fees may apply

RECOVERY PROCESS FLOW:

┌───────────┐     ┌───────────┐     ┌───────────┐     ┌───────────┐
│  Detect   │────▶│  Assess   │────▶│  Choose   │────▶│  Execute  │
│  Issue    │     │  Impact   │     │  Method   │     │  Recovery │
└─────┬─────┘     └─────┬─────┘     └─────┬─────┘     └─────┬─────┘
      │                 │                 │                 │
      ▼                 ▼                 ▼                 ▼
┌───────────┐     ┌───────────┐     ┌───────────┐     ┌───────────┐
│ • Alerts  │     │ • Count   │     │ • Time    │     │ • Run     │
│ • Queries │     │   missing │     │   Travel  │     │   query   │
│ • Reports │     │ • Identify│     │ • Clone   │     │ • Verify  │
│           │     │   cause   │     │ • UNDROP  │     │ • Test    │
└───────────┘     └───────────┘     └───────────┘     └───────────┘
Time Travel enables querying historical data at any point within a configurable retention period (0–90 days on Enterprise Edition and higher; 0–1 day on Standard Edition). Snowflake retains superseded micro-partitions, allowing point-in-time queries via the AT and BEFORE clauses (with TIMESTAMP, OFFSET, or STATEMENT parameters) without restoring from backups.
Zero-Copy Cloning creates a complete copy of a table, schema, or database by referencing the same micro-partitions; no data is physically duplicated. A copy-on-write mechanism diverges only modified micro-partitions, consuming storage proportional to actual changes.
Fail-Safe is an automatic 7-day immutable retention period after Time Travel expires. Data in Fail-Safe is read-only, accessible only through Snowflake Support for disaster recovery. It cannot be queried or modified by users.
Theorem: A zero-copy clone is logically equivalent to a full copy at the moment of creation. Proof sketch: A full copy contains exactly the rows the source held at creation time. The clone's metadata points to the very micro-partitions that hold those rows, so any query returns the same result against the clone as it would against a full copy. Copy-on-write ensures divergence only occurs when either source or clone is modified, creating new micro-partitions for changed data while keeping unmodified data shared.
Consistency guarantee: A Time Travel query at timestamp T returns the exact state of the table as it existed at T, including all committed transactions before T. Uncommitted transactions at T are invisible. This follows snapshot isolation semantics: each Time Travel query sees a consistent snapshot.
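The equivalence claim can be spot-checked empirically. The following is a minimal sketch, assuming the `sales_2024` table from the cloning diagram; the clone name is illustrative. It compares an order-independent hash of both tables immediately after cloning.

```sql
-- Create a zero-copy clone (a metadata-only operation).
CREATE TABLE sales_2024_clone CLONE sales_2024;

-- HASH_AGG(*) computes an order-independent hash over all rows.
-- Equal hashes strongly suggest that both tables hold the same rows.
SELECT
    (SELECT HASH_AGG(*) FROM sales_2024) =
    (SELECT HASH_AGG(*) FROM sales_2024_clone) AS is_identical;
```

If the source receives writes between the clone and the check, the hashes are expected to differ; that reflects divergence after creation, not a faulty clone.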
Clone before DDL changes for instant rollback. Use Time Travel for data auditing and regulatory compliance. Combine cloning + Time Travel for development environments with production data fidelity.
- Time Travel provides point-in-time queries (up to 90 days retention on Enterprise Edition and higher, 1 day on Standard Edition)
- Zero-Copy Cloning uses copy-on-write: storage cost only for diverged data
- Fail-Safe adds 7 days immutable retention after Time Travel expires
- Clone operations are O(1) metadata: near-instantaneous regardless of table size
- Recovery options: UNDROP → Time Travel → Clone → Fail-Safe (in order of preference)
Detailed Explanation
Time Travel: Point-in-Time Data Access
Snowflake's Time Travel feature provides the ability to query historical data that has been changed or deleted. Unlike traditional backup systems that require separate storage and manual restoration, Time Travel leverages Snowflake's micro-partition architecture to maintain historical versions of data automatically. When data is modified or deleted, Snowflake doesn't immediately overwrite the original micro-partitions. Instead, it creates new versions and maintains metadata pointers to both the current and historical versions.
The Time Travel window is configurable per table, ranging from 0 to 90 days for Enterprise Edition and above (Standard Edition supports up to 1 day). This retention period determines how far back you can query historical data. The underlying mechanism relies on the immutability of micro-partitions: a change never edits a micro-partition in place but writes new ones, and table metadata records which set of micro-partitions made up the table at each point in time.
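Retention is controlled through the `DATA_RETENTION_TIME_IN_DAYS` parameter. The statements below are a short sketch of inspecting and changing it; the object names `staging_scratch` and `analytics` are placeholders, not objects defined elsewhere in this article.

```sql
-- Inspect the retention currently in effect for a table.
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE sales_data;

-- Extend retention to 7 days (values above 1 need Enterprise Edition or higher).
ALTER TABLE sales_data SET DATA_RETENTION_TIME_IN_DAYS = 7;

-- Disable Time Travel on a scratch table to avoid storing historical data.
ALTER TABLE staging_scratch SET DATA_RETENTION_TIME_IN_DAYS = 0;

-- Set a schema-level default; tables without an explicit setting inherit it.
ALTER SCHEMA analytics SET DATA_RETENTION_TIME_IN_DAYS = 7;
```

Setting the value at the schema or database level keeps retention consistent across tables, while per-table overrides handle exceptions.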
When you execute a Time Travel query using the AT or BEFORE clauses with timestamps, offsets, or query IDs, Snowflake's optimizer reconstructs the table's state at the specified point in time. This reconstruction is highly efficient because it only materializes the specific micro-partitions needed for the query, not the entire historical table. The optimizer uses micro-partition metadata to determine which versions of which partitions were valid at the requested time, then assembles them on-the-fly.
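When the exact time of a bad change is unknown, the statement's query ID is often easier to pin down than a timestamp. This is a hedged sketch of that approach: it assumes the offending DELETE ran in the current session, and the UUID is a placeholder rather than a real query ID.

```sql
-- List recent DELETE statements against the table in this session.
SELECT query_id, query_text, start_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY_BY_SESSION(RESULT_LIMIT => 100))
WHERE query_text ILIKE 'DELETE FROM sales_data%'
ORDER BY start_time DESC;

-- Store the offending statement's ID in a session variable (placeholder value).
SET bad_query_id = '01234567-89ab-cdef-0123-456789abcdef';

-- Count rows as they existed immediately before that statement ran.
SELECT COUNT(*) AS rows_before_delete
FROM sales_data BEFORE (STATEMENT => $bad_query_id);
```

BEFORE with a statement ID avoids guessing a safe timestamp, since it resolves to the state just prior to that specific change.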
Fail-Safe: Automatic Disaster Recovery
Fail-Safe extends data protection beyond the Time Travel window, providing an additional 7 days of automatic data retention. Unlike Time Travel, Fail-Safe data is not accessible through SQL queries. It exists as an immutable, read-only archive in Snowflake's internal storage, protected from any user modification or deletion.
The Fail-Safe mechanism is designed for disaster recovery scenarios where all other recovery options have been exhausted. If you discover data loss after the Time Travel window has expired, you can open a support ticket with Snowflake to initiate Fail-Safe recovery. This process can take from several hours to several days, as Snowflake engineers must manually extract and reconstruct the historical data from Fail-Safe storage.
Fail-Safe operates automatically without any user configuration. Once data in a permanent table ages beyond the Time Travel window, it transitions to Fail-Safe status. The data remains in Fail-Safe for 7 days, after which it is permanently purged from Snowflake's systems. Transient and temporary tables have no Fail-Safe period. This automatic lifecycle management ensures that historical data is available for recovery while maintaining predictable storage costs.
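Although Fail-Safe data cannot be queried, the storage it occupies can be observed. The sketch below reads the per-table byte counts from the `TABLE_STORAGE_METRICS` view in the current database's `INFORMATION_SCHEMA`, and shows a transient table as one way to opt out of Fail-Safe for reproducible data. The `staging_events` table and its columns are hypothetical.

```sql
-- Bytes held in each lifecycle stage for one table.
SELECT table_name,
       active_bytes,
       time_travel_bytes,
       failsafe_bytes,
       retained_for_clone_bytes
FROM INFORMATION_SCHEMA.TABLE_STORAGE_METRICS
WHERE table_name = 'SALES_DATA';

-- Transient tables skip Fail-Safe entirely, which suits data that can be reloaded.
CREATE TRANSIENT TABLE staging_events (
    id      INTEGER,
    payload VARIANT
);
```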
Zero-Copy Cloning: Efficient Data Duplication
Zero-copy cloning is one of Snowflake's most powerful features, enabling you to create complete copies of tables, schemas, or databases without duplicating the underlying data. When you create a clone, Snowflake creates new metadata objects that reference the same micro-partitions as the source, without copying any actual data. This means cloning a 1TB table takes seconds and uses virtually no additional storage.
The cloning process implements a copy-on-write mechanism. Initially, both the source and clone reference identical micro-partitions. When either the source or clone is modified, Snowflake creates new micro-partitions for the modified data while keeping the original micro-partitions shared. This means divergence only consumes storage proportional to the actual changes, not the entire dataset size.
Clones maintain full independence from their sources in terms of DML operations. You can INSERT, UPDATE, DELETE, or TRUNCATE data in a clone without affecting the source, and vice versa. Clones also inherit the Time Travel retention setting of their source. A clone's own history, however, begins at the moment it is created: Time Travel queries against the clone cannot reach back to versions of the source from before the clone existed.
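The independence of clone and source is easy to observe directly. A minimal sketch, assuming `sales_data` has the `amount`, `region`, and `transaction_date` columns used in the examples later in this article; the clone name `sales_data_dev` is illustrative.

```sql
-- Create the clone; no data is copied at this point.
CREATE TABLE sales_data_dev CLONE sales_data;

-- Modify only the clone. Affected micro-partitions are rewritten for the clone alone.
UPDATE sales_data_dev SET amount = amount * 1.1 WHERE region = 'US';
DELETE FROM sales_data_dev WHERE transaction_date < '2023-01-01';

-- The source is unchanged: compare row counts side by side.
SELECT
    (SELECT COUNT(*) FROM sales_data)     AS source_rows,
    (SELECT COUNT(*) FROM sales_data_dev) AS clone_rows;

-- Dropping the clone leaves micro-partitions still referenced by the source in place.
DROP TABLE sales_data_dev;
```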
Advanced Use Cases
Time Travel and cloning enable numerous advanced use cases beyond simple data recovery. You can implement data auditing by cloning tables before and after regulatory changes, creating point-in-time snapshots for compliance purposes. Development environments can be created by cloning production schemas, giving developers realistic data volumes without impacting production performance. Data science workflows can use Time Travel to reproduce historical model training datasets, ensuring reproducibility of machine learning experiments.
Regression testing becomes straightforward by cloning production tables, applying schema changes, and comparing results between the original and modified versions. Data quality analysis can leverage Time Travel to analyze how data quality metrics have evolved over time, identifying trends in data freshness, completeness, and accuracy.
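The regression-testing pattern might look like the sketch below. It assumes a production schema `analytics_prod` containing a `sales_data` table with `id`, `amount`, and `region` columns; the test schema name and the added `amount_usd` column are hypothetical.

```sql
-- Clone the production schema into an isolated test schema.
CREATE SCHEMA analytics_regression CLONE analytics_prod;

-- Apply the candidate schema change to the clone only.
ALTER TABLE analytics_regression.sales_data ADD COLUMN amount_usd NUMBER(18,2);
UPDATE analytics_regression.sales_data SET amount_usd = amount;

-- Rows whose original columns differ between production and the clone.
-- An empty result suggests the change left existing data intact.
SELECT id, amount, region FROM analytics_prod.sales_data
MINUS
SELECT id, amount, region FROM analytics_regression.sales_data;

-- Clean up the test schema.
DROP SCHEMA analytics_regression;
```

Because the clone shares unmodified micro-partitions with production, the test schema costs storage only for the rows the change rewrites.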
Key Concepts Table
| Feature | Time Travel | Fail-Safe | Zero-Copy Clone |
|---|---|---|---|
| Purpose | Query historical data | Disaster recovery | Create data copies |
| User Access | SQL queries | Support ticket | SQL queries |
| Retention | 0-90 days (configurable) | 7 days (fixed) | Until dropped (independent of source) |
| Storage Cost | Included in table storage | Included in table storage | Copy-on-write only |
| Performance | Same as current data | Hours to days | Near-instantaneous |

| Retention Setting | Edition Support | Max Window | Use Case |
|---|---|---|---|
| 0 days | All editions | None | Cost optimization |
| 1 day | All editions (default) | 24 hours | Basic recovery |
| 7 days | Enterprise and higher | 168 hours | Regulatory compliance |
| 90 days | Enterprise and higher | 2160 hours | Long-term audit |

| Clone Operation | Time Complexity | Space Complexity | Independence |
|---|---|---|---|
| CREATE CLONE | O(1) - metadata | O(1) initially | Full DML independence |
| INSERT to clone | O(1) + data size | O(changed data) | Source unaffected |
| UPDATE in clone | O(1) + row count | O(modified rows) | Source unaffected |
| DELETE from clone | O(1) + row count | O(deleted rows) | Source unaffected |
Code Examples
-- Example 1: Time Travel queries with different syntax
-- Query data as of specific timestamp
SELECT * FROM sales_data
AT (TIMESTAMP => '2024-01-15 10:30:00'::TIMESTAMP_TZ);
-- Query data as of offset from current time (in seconds)
SELECT * FROM sales_data
AT (OFFSET => -3600); -- 1 hour ago
-- Query data before a specific statement
SELECT * FROM sales_data
BEFORE (STATEMENT => '01234567-89ab-cdef-0123-456789abcdef');
-- Example 2: Clone table with Time Travel
CREATE TABLE sales_data_clone
CLONE sales_data
AT (TIMESTAMP => '2024-01-15 00:00:00'::TIMESTAMP_TZ);
-- Clone entire schema with specific point-in-time
CREATE SCHEMA analytics_clone
CLONE analytics_prod
AT (OFFSET => -86400); -- 24 hours ago
-- Example 3: Data recovery using Time Travel
-- Step 1: Identify which rows were deleted between two points in time
SELECT * FROM sales_data
AT (TIMESTAMP => '2024-01-15 10:00:00'::TIMESTAMP_TZ)
EXCEPT
SELECT * FROM sales_data
AT (TIMESTAMP => '2024-01-15 11:00:00'::TIMESTAMP_TZ);
-- Step 2: Recover the table as of the last known-good timestamp
-- (must be earlier than the deletion)
CREATE TABLE sales_data_recovered AS
SELECT * FROM sales_data
AT (TIMESTAMP => '2024-01-15 10:00:00'::TIMESTAMP_TZ);
-- Step 3: Merge recovered data back (if needed)
MERGE INTO sales_data t
USING sales_data_recovered s
ON t.id = s.id AND t.date = s.date
WHEN NOT MATCHED THEN
INSERT (id, date, amount, region)
VALUES (s.id, s.date, s.amount, s.region);
-- Example 4: UNDROP operations
-- Restore dropped table
UNDROP TABLE sales_data;
-- Restore dropped schema
UNDROP SCHEMA analytics;
-- Restore dropped database
UNDROP DATABASE warehouse;
-- Example 5: Advanced cloning patterns
-- Clone a large table. Cloning is a metadata operation, so it does not
-- depend on warehouse size and no dedicated warehouse is needed.
CREATE TABLE large_fact_table_clone
CLONE large_fact_table;
-- Filtered copy. CLONE cannot filter or transform rows, so use
-- CREATE TABLE ... AS SELECT; unlike a clone, this physically copies data.
CREATE TABLE sales_filtered_copy AS
SELECT * FROM sales_data
WHERE region = 'US'
AND transaction_date >= '2024-01-01';
-- Example 6: Time Travel data analysis
-- Analyze data changes over time
SELECT
'Current' as version,
COUNT(*) as row_count,
SUM(amount) as total_amount
FROM sales_data
UNION ALL
SELECT
'1 hour ago' as version,
COUNT(*) as row_count,
SUM(amount) as total_amount
FROM sales_data AT (OFFSET => -3600)
UNION ALL
SELECT
'1 day ago' as version,
COUNT(*) as row_count,
SUM(amount) as total_amount
FROM sales_data AT (OFFSET => -86400);
-- Example 7: Automated backup cloning
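CREATE OR REPLACE PROCEDURE daily_backup()
RETURNS STRING
LANGUAGE SQL
AS
$$
DECLARE
create_stmt STRING;
drop_stmt STRING;
BEGIN
-- Clone into a date-suffixed table, e.g. sales_backup_20240115
create_stmt := 'CREATE OR REPLACE TABLE sales_backup_' ||
TO_CHAR(CURRENT_DATE(), 'YYYYMMDD') || ' CLONE sales_data';
EXECUTE IMMEDIATE :create_stmt;
-- Drop the clone taken 7 days ago
drop_stmt := 'DROP TABLE IF EXISTS sales_backup_' ||
TO_CHAR(DATEADD(day, -7, CURRENT_DATE()), 'YYYYMMDD');
EXECUTE IMMEDIATE :drop_stmt;
RETURN 'Backup completed successfully';
END;
$$;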
-- Example 8: Compare table versions
-- Find rows added since specific time
SELECT * FROM sales_data
MINUS
SELECT * FROM sales_data AT (OFFSET => -3600);
-- Find rows deleted since specific time
SELECT * FROM sales_data AT (OFFSET => -3600)
MINUS
SELECT * FROM sales_data;
-- Find rows added or modified since specific time (compares id and amount only)
SELECT id, amount FROM sales_data
EXCEPT
SELECT id, amount FROM sales_data AT (OFFSET => -3600);
Performance Metrics
| Operation | Time Complexity | Storage Impact | Typical Duration |
|---|---|---|---|
| Time Travel Query | O(partitions) | None (uses existing) | 1-30 seconds |
| Zero-Copy Clone | O(1) metadata | Near-zero initially | < 1 second |
| Fail-Safe Recovery | Manual process | Full table copy | Hours to days |
| UNDROP Table | O(1) metadata | None | < 1 second |
| Point-in-Time Clone | O(1) metadata | Near-zero initially | < 1 second |

| Time Travel Window | Storage Overhead | Query Performance | Recovery Capability |
|---|---|---|---|
| 1 day | ~5-15% | Same as current | Recent changes |
| 7 days | ~20-40% | Same as current | Weekly recovery |
| 30 days | ~50-100% | Slightly slower | Monthly recovery |
| 90 days | ~100-200% | Moderately slower | Quarterly recovery |
Best Practices
- Set appropriate retention periods: Use 1 day for cost-sensitive workloads, 7 days for regulatory compliance, and 90 days only when required by specific regulations.
- Implement automated cloning schedules: Create scheduled tasks that clone critical tables daily, providing additional recovery points beyond Time Travel.
- Use Time Travel for auditing: Query historical data to track data quality trends, identify when anomalies occurred, and validate regulatory compliance.
- Clone before schema changes: Always clone tables before DDL operations, enabling instant rollback if changes cause issues.
- Monitor Time Travel storage: Track historical data growth to optimize retention settings and manage storage costs.
- Document recovery procedures: Create runbooks for common data recovery scenarios, including Time Travel queries and cloning operations.
- Test recovery processes: Regularly test Time Travel queries and cloning operations to ensure they work as expected during actual emergencies.
- Use UNDROP for accidental drops: Prefer UNDROP over Time Travel queries for recently dropped objects, as it's faster and doesn't require data copying.
- Leverage clones for development: Create development environments by cloning production schemas, ensuring developers have realistic data without impacting production.
- Monitor Fail-Safe eligibility: Fail-Safe applies only to permanent tables, and only once the Time Travel window has expired; transient and temporary tables have no Fail-Safe period. Plan recovery strategies accordingly.
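The automated cloning schedule recommended above could be wired up with a Snowflake task. This is a sketch only: it assumes the `daily_backup()` procedure from Example 7 exists, and `backup_wh` and `daily_backup_task` are hypothetical names.

```sql
-- Run the backup procedure every day at 02:00 UTC.
CREATE OR REPLACE TASK daily_backup_task
    WAREHOUSE = backup_wh
    SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
    CALL daily_backup();

-- Tasks are created in a suspended state; resume to activate the schedule.
ALTER TASK daily_backup_task RESUME;

-- Confirm the task is registered and check its state.
SHOW TASKS LIKE 'daily_backup_task';
```

A small warehouse is typically sufficient here, since each run only issues metadata operations.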
See Also
- PySpark Iceberg - Time travel with Iceberg tables
- Delta Lake on Databricks - Delta Lake time travel comparison
- Data Warehouse Concepts - Data warehouse design principles