🎉 75% of content is free forever — Unlock Premium from $10/mo →
CW
Search courses…
💼 Servicesℹ️ About✉️ ContactView Pricing Plansfrom $10

Transaction Isolation: SERIALIZABLE, SNAPSHOT, Lock-Based

Advanced SQLTransaction Isolation⭐ Premium

Advertisement

Interview Question: "Explain the difference between READ COMMITTED and REPEATABLE READ. What anomalies can occur at each isolation level? How does MVCC prevent dirty reads?" — Asked at JPMorgan, Goldman Sachs, Bloomberg for Senior Database roles

ℹ️

Difficulty: Advanced | Companies: JPMorgan, Goldman Sachs, Bloomberg, Citadel, Two Sigma | Time: 60-75 minutes

Isolation Levels Hierarchy

The SQL standard defines four isolation levels with increasing consistency:

READ UNCOMMITTEDREAD COMMITTEDREPEATABLE READSERIALIZABLE\text{READ UNCOMMITTED} \subset \text{READ COMMITTED} \subset \text{REPEATABLE READ} \subset \text{SERIALIZABLE}
LevelDirty ReadNon-Repeatable ReadPhantom ReadSerialization Anomaly
READ UNCOMMITTEDPossiblePossiblePossiblePossible
READ COMMITTEDPreventedPossiblePossiblePossible
REPEATABLE READPreventedPreventedPossible*Possible
SERIALIZABLEPreventedPreventedPreventedPrevented

*PostgreSQL prevents phantoms at REPEATABLE READ via predicate locking.

-- Create banking example
CREATE TABLE accounts (
    account_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    balance DECIMAL(15,2),
    version INT DEFAULT 1
);

INSERT INTO accounts VALUES
(1, 'Alice', 10000.00, 1),
(2, 'Bob', 15000.00, 1),
(3, 'Charlie', 8000.00, 1);

-- Transaction isolation level settings
-- PostgreSQL default is READ COMMITTED
SHOW default_transaction_isolation;

-- Set isolation level for session
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Or for next transaction only
BEGIN ISOLATION LEVEL SERIALIZABLE;

Dirty Read Demonstration

-- Session 1: READ UNCOMMITTED
BEGIN ISOLATION LEVEL READ UNCOMMITTED;
UPDATE accounts SET balance = 5000.00 WHERE account_id = 1;
-- Not committed yet

-- Session 2: READ UNCOMMITTED
BEGIN ISOLATION LEVEL READ UNCOMMITTED;
SELECT balance FROM accounts WHERE account_id = 1;
-- Returns 5000.00 (dirty read!)

-- Session 1: Rollback
ROLLBACK;

-- Session 2: Balance is now incorrect
-- Shows 5000.00 but actual is 10000.00
COMMIT;

Non-Repeatable Read Demonstration

-- Session 1: READ COMMITTED
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT balance FROM accounts WHERE account_id = 1;
-- Returns 10000.00

-- Session 2: Update and commit
BEGIN;
UPDATE accounts SET balance = 12000.00 WHERE account_id = 1;
COMMIT;

-- Session 1: Same query, different result
SELECT balance FROM accounts WHERE account_id = 1;
-- Returns 12000.00 (non-repeatable read!)
COMMIT;

Phantom Read Demonstration

-- Session 1: REPEATABLE READ
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM accounts WHERE balance > 9000;
-- Returns 2

-- Session 2: Insert new row
BEGIN;
INSERT INTO accounts VALUES (4, 'David', 11000.00, 1);
COMMIT;

-- Session 1: Same query, different count
SELECT COUNT(*) FROM accounts WHERE balance > 9000;
-- Returns 3 (phantom read!) in most databases
-- PostgreSQL returns 2 (prevents phantoms)
COMMIT;

MVCC Implementation

Multi-Version Concurrency Control maintains multiple versions:

Visible Version=min{v:v.commit_time<nowv.creator_txvisible_txs}\text{Visible Version} = \min\{v : v.\text{commit\_time} < \text{now} \wedge v.\text{creator\_tx} \in \text{visible\_txs}\}
-- Check MVCC system columns
SELECT 
    ctid,  -- Physical row location (page, offset)
    xmin,  -- Transaction ID that created this version
    xmax,  -- Transaction ID that deleted/updated this version (0 if not deleted)
    account_id,
    balance
FROM accounts
WHERE account_id = 1;

-- Update to create new version
UPDATE accounts SET balance = 11000.00 WHERE account_id = 1;

-- Check both versions
SELECT 
    ctid,
    xmin,
    xmax,
    account_id,
    balance
FROM accounts
WHERE account_id = 1;

-- Check transaction visibility
SELECT 
    txid_current() AS current_txid,
    txid_visible_in_snapshot(xmin, txid_current_snapshot()) AS created_visible,
    txid_visible_in_snapshot(xmax, txid_current_snapshot()) AS deleted_visible
FROM accounts
WHERE account_id = 1;

SERIALIZABLE with Serializable Snapshot Isolation (SSI)

PostgreSQL uses SSI for SERIALIZABLE:

SSI=Snapshot Isolation+Conflict Detection\text{SSI} = \text{Snapshot Isolation} + \text{Conflict Detection}
-- SSI prevents write skew anomaly
BEGIN ISOLATION LEVEL SERIALIZABLE;

-- Session 1: Check doctors on call
SELECT COUNT(*) FROM doctors 
WHERE on_call = TRUE AND department = 'ER';
-- Returns 2

-- Session 2: Same check
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM doctors 
WHERE on_call = TRUE AND department = 'ER';
-- Returns 2

-- Session 1: Remove one doctor (assuming another is on call)
UPDATE doctors SET on_call = FALSE 
WHERE doctor_id = 1 AND department = 'ER';
COMMIT;

-- Session 2: Also removes a doctor (write skew!)
UPDATE doctors SET on_call = FALSE 
WHERE doctor_id = 2 AND department = 'ER';
-- Serialization failure! Would violate constraint
COMMIT;

Lock-Based Concurrency Control

-- Explicit locking
BEGIN;

-- Row-level lock
SELECT * FROM accounts WHERE account_id = 1 FOR UPDATE;
-- Other transactions block on this row

-- Share lock (allows reads, blocks writes)
SELECT * FROM accounts WHERE account_id = 1 FOR SHARE;

-- No-wait lock (fail immediately if locked)
SELECT * FROM accounts WHERE account_id = 1 FOR UPDATE NOWAIT;

-- Skip locked (skip rows that are locked)
SELECT * FROM accounts WHERE account_id = 1 FOR UPDATE SKIP LOCKED;

-- Advisory locks (application-level)
SELECT pg_advisory_lock(12345);  -- Acquire
SELECT pg_try_advisory_lock(12345);  -- Try acquire
SELECT pg_advisory_unlock(12345);  -- Release
COMMIT;

Deadlock Detection and Prevention

-- Create deadlock scenario
-- Session 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
-- Wait for session 2 to release lock on account 2

-- Session 2
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 2;
-- Wait for session 1 to release lock on account 1
-- DEADLOCK! One session will be rolled back

-- Check for deadlocks
SELECT * FROM pg_stat_activity WHERE wait_event_type = 'Lock';

-- Set deadlock timeout
SET deadlock_timeout = '1s';

optimistic Concurrency Control

-- Version-based OCC
CREATE TABLE products (
    product_id INT PRIMARY KEY,
    name VARCHAR(100),
    price DECIMAL(10,2),
    version INT DEFAULT 1
);

-- Optimistic update pattern
WITH current_version AS (
    SELECT version FROM products WHERE product_id = 1
)
UPDATE products 
SET price = 29.99, version = version + 1
WHERE product_id = 1 
    AND version = (SELECT version FROM current_version)
RETURNING *;
-- If no rows updated, version changed by another transaction

Isolation Level Comparison Table

FeatureREAD COMMITTEDREPEATABLE READSERIALIZABLE
Lock DurationStatementTransactionTransaction
MVCC SnapshotPer statementPer transactionPer transaction
PhantomsAllowedPreventedPrevented
Write SkewAllowedAllowedPrevented
PerformanceBestGoodModerate
Use CaseOLTP generalReportingFinancial

ℹ️

PostgreSQL REPEATABLE READ: Uses snapshot isolation, not traditional locking. Prevents phantoms via predicate locking, unlike MySQL/Oracle.

Advanced: Gap Locking

-- Gap locks prevent inserts in range
BEGIN ISOLATION LEVEL REPEATABLE READ;

-- Lock gap between 100 and 200
SELECT * FROM accounts 
WHERE account_id BETWEEN 100 AND 200 FOR UPDATE;

-- Other transactions cannot insert account_id in (100, 200)
-- This prevents phantom reads
INSERT INTO accounts VALUES (150, 'New', 5000.00, 1);
-- Blocks until first transaction commits

COMMIT;

Mathematical Formulation

For a set of transactions T={t1,t2,...,tn}T = \{t_1, t_2, ..., t_n\}:

Conflict(ti,tj)={trueif rti,wtj on same data itemfalseotherwise\text{Conflict}(t_i, t_j) = \begin{cases} \text{true} & \text{if } \exists r \in t_i, w \in t_j \text{ on same data item} \\ \text{false} & \text{otherwise} \end{cases}
Serializable    serialization graph is acyclic\text{Serializable} \iff \text{serialization graph is acyclic}

The serialization graph G=(V,E)G = (V, E) where:

  • V=TV = T (transactions as vertices)
  • (ti,tj)E(t_i, t_j) \in E if tit_i should come before tjt_j

⚠️

Performance Impact: SERIALIZABLE can cause up to 10% throughput reduction due to aborts. Monitor with pg_stat_database conflicts.

Monitoring and Diagnostics

-- Check current locks
SELECT 
    l.pid,
    l.mode,
    l.granted,
    a.query,
    a.state
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE l.relation = 'accounts'::regclass;

-- Check transaction ID wraparound
SELECT 
    datname,
    age(datfrozenxid) AS xid_age,
    2^31 - age(datfrozenxid) AS transactions_until_wraparound
FROM pg_database;

-- Monitor serialization failures
SELECT 
    datname,
    conflicts,
    deadlocks,
    conflicts_bet_rol
FROM pg_stat_database
WHERE datname = current_database();

ℹ️

Pro Tip: Use SET LOCAL to change isolation level for current transaction only. This is useful for mixed workloads.

Best Practices

  1. Use READ COMMITTED for most OLTP workloads
  2. Use SERIALIZABLE for financial transactions
  3. Implement retry logic for serialization failures
  4. Monitor lock contention with pg_stat_activity
  5. Avoid long-running transactions to reduce MVCC overhead
-- Retry pattern for serialization failures
DO $$
DECLARE
    max_retries INT := 3;
    retry_count INT := 0;
BEGIN
    LOOP
        BEGIN
            -- Your transaction here
            PERFORM transfer_funds(1, 2, 100);
            EXIT;  -- Success
        EXCEPTION
            WHEN serialization_failure OR deadlock_detected THEN
                retry_count := retry_count + 1;
                IF retry_count >= max_retries THEN
                    RAISE;
                END IF;
                PERFORM pg_sleep(random() * 0.1);
        END;
    END LOOP;
END $$;

Advertisement