Microservices Data Patterns
Difficulty: Senior Level | Companies: AWS, Google, Microsoft, Netflix, Uber
The Data Challenge in Microservices
Each microservice should own its data. This creates challenges for cross-service queries, transactions, and data consistency.
โน๏ธ
Database-per-service is the gold standard, but it requires careful handling of joins, transactions, and data synchronization across services.
Pattern 1: Database Per Service
Each service owns its database schema completely.
โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ
โ Order Service โ โ Inventory Svc โ โ Payment Service โ
โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ
โ โ PostgreSQLโ โ โ โ DynamoDB โ โ โ โ PostgreSQLโ โ
โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ
โโโโโโโโโโฌโโโโโโโโโ โโโโโโโโโโฌโโโโโโโโโ โโโโโโโโโโฌโโโโโโโโโ
โ โ โ
โโโโโโโโโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโโโโโโโโ
โ
โโโโโโโโโโโโโดโโโโโโโโโโโโ
โ Event Bus (SNS/SQS) โ
โโโโโโโโโโโโโโโโโโโโโโโโโ
// Order Service - owns its database
// src/order/repository.ts
import { PrismaClient } from '@prisma/client';
const prisma = new PrismaClient();
export class OrderRepository {
async createOrder(input: CreateOrderInput): Promise<Order> {
// Order service only accesses its own database
return prisma.order.create({
data: {
customerId: input.customerId,
items: {
create: input.items.map(item => ({
productId: item.productId,
quantity: item.quantity,
price: item.price,
})),
},
status: 'PENDING',
total: input.items.reduce((sum, i) => sum + i.price * i.quantity, 0),
},
include: { items: true },
});
}
async getOrder(id: string): Promise<Order | null> {
return prisma.order.findUnique({
where: { id },
include: { items: true },
});
}
}
// Publishes events for other services to consume
export async function publishOrderCreated(order: Order) {
await eventBus.publish({
source: 'order-service',
detailType: 'order.created',
detail: {
orderId: order.id,
customerId: order.customerId,
items: order.items,
total: order.total,
},
});
}
Pattern 2: API Composition Pattern
Aggregate data from multiple services at the API layer.
// API Gateway / BFF composition
import { OrderService } from './services/order';
import { InventoryService } from './services/inventory';
import { UserService } from './services/user';
export class OrderQueryHandler {
constructor(
private orderService: OrderService,
private inventoryService: InventoryService,
private userService: UserService,
) {}
async getOrderWithDetails(orderId: string): Promise<OrderDetails> {
// Fetch from multiple services in parallel
const [order, inventory, user] = await Promise.all([
this.orderService.getOrder(orderId),
this.inventoryService.getStockForOrder(orderId),
this.userService.getUser(order.customerId),
]);
// Compose the response
return {
id: order.id,
customer: {
id: user.id,
name: user.name,
email: user.email,
},
items: order.items.map(item => ({
...item,
inStock: inventory[item.productId]?.available ?? false,
warehouse: inventory[item.productId]?.warehouse,
})),
status: order.status,
total: order.total,
};
}
}
โ ๏ธ
API Composition adds latency because it calls multiple services sequentially or in parallel. Use it for read-heavy queries where eventual consistency is acceptable.
Pattern 3: Event-Driven Data Synchronization
Sync data across services using events instead of direct calls.
# Event handler for data synchronization
import json
from dataclasses import dataclass
from typing import List
@dataclass
class InventoryItem:
product_id: str
warehouse_id: str
quantity: int
reserved: int
class InventoryProjection:
"""Maintains a read-optimized view of order data."""
def __init__(self, db_connection):
self.db = db_connection
async def handle_order_created(self, event):
"""Update inventory when order is created."""
order = event['detail']
for item in order['items']:
await self.db.execute("""
UPDATE inventory
SET reserved = reserved + %s
WHERE product_id = %s AND warehouse_id = (
SELECT warehouse_id FROM warehouses
WHERE region = %s ORDER BY available DESC LIMIT 1
)
""", (item['quantity'], item['productId'], order['region']))
async def handle_order_cancelled(self, event):
"""Release reserved inventory when order is cancelled."""
order = event['detail']
for item in order['items']:
await self.db.execute("""
UPDATE inventory
SET reserved = reserved - %s
WHERE product_id = %s
""", (item['quantity'], item['productId']))
async def get_product_availability(self, product_id: str):
"""Read from projection for fast queries."""
result = await self.db.fetchrow("""
SELECT
product_id,
SUM(quantity) as total,
SUM(reserved) as reserved,
SUM(quantity) - SUM(reserved) as available
FROM inventory
WHERE product_id = %s
GROUP BY product_id
""", product_id)
return dict(result) if result else None
Pattern 4: CQRS with Materialized Views
Separate read and write models for optimized queries.
-- Write model (Order Service database)
CREATE TABLE orders (
id UUID PRIMARY KEY,
customer_id UUID NOT NULL,
status VARCHAR(50) NOT NULL,
total DECIMAL(10,2) NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE order_items (
id UUID PRIMARY KEY,
order_id UUID REFERENCES orders(id),
product_id UUID NOT NULL,
quantity INTEGER NOT NULL,
price DECIMAL(10,2) NOT NULL
);
-- Read model (separate database for queries)
CREATE MATERIALIZED VIEW order_summary AS
SELECT
o.id,
o.customer_id,
c.name as customer_name,
c.email as customer_email,
COUNT(oi.id) as item_count,
SUM(oi.quantity * oi.price) as total_amount,
o.status,
o.created_at
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN order_items oi ON o.id = oi.order_id
GROUP BY o.id, c.id;
-- Refresh periodically or on-demand
-- REFRESH MATERIALIZED VIEW CONCURRENTLY order_summary;
Pattern 5: Distributed Transactions with Outbox Pattern
Ensure atomicity between database writes and event publishing.
// Transactional Outbox Pattern
import { PrismaClient } from '@prisma/client';
import { Kafka } from 'kafkajs';
const prisma = new PrismaClient();
const kafka = new Kafka({ brokers: ['kafka:9092'] });
const producer = kafka.producer();
export class OrderService {
async createOrder(input: CreateOrderInput) {
// Single transaction for DB write + outbox insert
const result = await prisma.$transaction(async (tx) => {
// 1. Create order
const order = await tx.order.create({
data: {
customerId: input.customerId,
items: { create: input.items },
status: 'PENDING',
total: this.calculateTotal(input.items),
},
});
// 2. Write event to outbox table (same transaction)
await tx.outbox.create({
data: {
aggregateType: 'Order',
aggregateId: order.id,
eventType: 'OrderCreated',
payload: JSON.stringify({
orderId: order.id,
customerId: order.customerId,
items: input.items,
total: order.total,
}),
createdAt: new Date(),
},
});
return order;
});
return result;
}
// Background process: poll outbox and publish to Kafka
async processOutbox() {
const events = await prisma.outbox.findMany({
where: { processed: false },
orderBy: { createdAt: 'asc' },
take: 100,
});
for (const event of events) {
await producer.send({
topic: `orders.${event.eventType.toLowerCase()}`,
messages: [
{
key: event.aggregateId,
value: event.payload,
headers: {
'eventType': event.eventType,
'aggregateType': event.aggregateType,
},
},
],
});
await prisma.outbox.update({
where: { id: event.id },
data: { processed: true, processedAt: new Date() },
});
}
}
}
โน๏ธ
The Outbox Pattern guarantees at-least-once delivery. Consumers must be idempotent to handle duplicate events.
Data Pattern Decision Matrix
| Pattern | Consistency | Complexity | Best For |
|---|---|---|---|
| DB per Service | Eventual | Low | Simple CRUD services |
| API Composition | Eventual | Medium | Read-heavy aggregations |
| CQRS | Eventual | High | Complex query patterns |
| Event Sourcing | Strong | High | Audit trails, financial |
Follow-Up Questions
- How do you handle schema evolution when events are shared between multiple microservices?
- What strategies would you use to query across 5+ microservices without creating a monolithic API?
- How do you implement distributed locking in a microservices architecture without introducing tight coupling?