πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Kafka Performance Tuning

🟒 Free Lesson

Advertisement

Kafka Performance Tuning

Producer Tuningbatch.sizelinger.msbuffer.memorycompression.typeBroker Tuningnum.io.threadsnum.network.threadslog.flush.intervalsocket.send.bufferConsumer Tuningfetch.min.bytesmax.poll.recordsfetch.max.waitmax.partition.fetchNetwork Tuningsocket.buffer.sizesocket.request.maxconnections.maxsend.buffer.bytesPerformance Optimization Layers

Overview

Kafka performance tuning requires balancing throughput and latency while optimizing CPU, memory, disk I/O, and network resources. This guide covers optimization strategies for producers, brokers, consumers, and networking.

Performance Goals

  • High Throughput: Maximize messages/bytes per second
  • Low Latency: Minimize end-to-end delay
  • Resource Efficiency: Optimize CPU, memory, disk usage
  • Scalability: Handle growth without degradation

Producer Tuning

Batching Configuration

# Producer batch settings
batch.size=32768          # 32KB batch size
linger.ms=5              # Wait 5ms to fill batch
buffer.memory=67108864   # 64MB buffer memory
max.in.flight.requests.per.connection=5  # Allow 5 in-flight requests

Batching Strategy

from kafka import KafkaProducer

# High throughput configuration
producer = KafkaProducer(
    bootstrap_servers=['kafka:9092'],
    batch_size=32768,          # 32KB batches
    linger_ms=5,              # Wait for batch fill
    buffer_memory=67108864,   # 64MB buffer
    max_in_flight_requests_per_connection=5,
    acks='all',
    compression_type='lz4'
)

# Low latency configuration
producer = KafkaProducer(
    bootstrap_servers=['kafka:9092'],
    batch_size=16384,          # 16KB batches
    linger_ms=0,              # Send immediately
    buffer_memory=33554432,   # 32MB buffer
    max_in_flight_requests_per_connection=1,
    acks='all',
    compression_type='none'
)

Compression Options

AlgorithmCPU UsageCompression RatioLatency
noneNone1:1Lowest
gzipHigh~3:1High
snappyLow~2:1Low
lz4Low~2.5:1Low
zstdMedium~3.5:1Medium
# Compression configuration
producer = KafkaProducer(
    bootstrap_servers=['kafka:9092'],
    compression_type='lz4',  # Best balance of speed/ratio
    batch_size=65536,        # 64KB for better compression
    linger_ms=10            # Allow larger batches
)

Producer Performance Testing

# kafka-producer-perf-test
kafka-producer-perf-test.sh \
  --topic test-topic \
  --num-records 1000000 \
  --record-size 1024 \
  --throughput -1 \
  --producer-props \
    bootstrap.servers=kafka:9092 \
    batch.size=32768 \
    linger.ms=5 \
    compression.type=lz4 \
    acks=1

Broker Tuning

Thread Configuration

# Server thread settings
num.io.threads=16           # I/O threads (disk operations)
num.network.threads=8       # Network threads (socket handling)
num.replica.fetchers=4      # Replica fetch threads
background.threads=10       # Background tasks

Log Configuration

# Log flush settings
log.flush.interval.messages=10000  # Flush every 10K messages
log.flush.interval.ms=1000         # Flush every second
log.flush.scheduler.interval.ms=500  # Check every 500ms

# Log segment settings
log.segment.bytes=1073741824   # 1GB segments
log.roll.hours=168             # Roll weekly
log.index.size.max.bytes=10485760  # 10MB index
log.index.interval.bytes=4096  # Index interval

Memory Configuration

# Memory allocation
heap.size=8G               # JVM heap size
socket.send.buffer.bytes=1048576   # 1MB send buffer
socket.receive.buffer.bytes=1048576  # 1MB receive buffer
socket.request.max.bytes=104857600   # 100MB max request

Disk I/O Optimization

# Filesystem optimization
mount -o noatime,nodiratime /dev/sda1 /kafka-logs

# I/O scheduler (for SSDs)
echo noop > /sys/block/sda/queue/scheduler

# RAID configuration (RAID 10 for performance)
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd{a,b,c,d}1

Consumer Tuning

Fetch Configuration

# Consumer fetch settings
fetch.min.bytes=1048576       # 1MB minimum fetch
fetch.max.wait.ms=500         # Wait up to 500ms
max.partition.fetch.bytes=1048576  # 1MB per partition
max.poll.records=500          # Process 500 records per poll
max.poll.interval.ms=300000   # 5 minutes max processing

Consumer Strategy

from kafka import KafkaConsumer

# High throughput consumer
consumer = KafkaConsumer(
    'orders',
    bootstrap_servers=['kafka:9092'],
    group_id='order-processor',
    fetch_min_bytes=1048576,        # 1MB minimum
    fetch_max_wait_ms=500,          # Wait for batch
    max_partition_fetch_bytes=1048576,
    max_poll_records=500,
    auto_offset_reset='earliest',
    enable_auto_commit=False
)

# Low latency consumer
consumer = KafkaConsumer(
    'orders',
    bootstrap_servers=['kafka:9092'],
    group_id='order-processor',
    fetch_min_bytes=1,              # Get anything available
    fetch_max_wait_ms=100,          # Quick response
    max_partition_fetch_bytes=65536,
    max_poll_records=50,
    auto_offset_reset='earliest',
    enable_auto_commit=False
)

Consumer Performance Testing

# kafka-consumer-perf-test
kafka-consumer-perf-test.sh \
  --topic test-topic \
  --messages 1000000 \
  --group perf-test-group \
  --bootstrap-server kafka:9092 \
  --fetch-size 1048576

Network Tuning

Socket Configuration

# Producer network
socket.send.buffer.bytes=1048576    # 1MB send buffer
socket.request.max.bytes=104857600  # 100MB max request

# Consumer network
socket.receive.buffer.bytes=1048576  # 1MB receive buffer
fetch.min.bytes=1048576              # 1MB minimum fetch

# Broker network
num.network.threads=8
socket.receive.buffer.bytes=1048576
socket.send.buffer.bytes=1048576

OS Tuning

# Increase file descriptor limits
ulimit -n 100000

# Kernel network tuning
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.core.rmem_default=16777216
sysctl -w net.core.wmem_default=16777216
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
sysctl -w net.core.netdev_max_backlog=5000

Performance Monitoring

Key Metrics

# Performance metrics to track
metrics:
  - name: kafka_server_BrokerTopicMetrics_MessagesInPerSec
    description: "Messages received per second"
    target: "> 100000"
    
  - name: kafka_server_BrokerTopicMetrics_BytesInPerSec
    description: "Bytes received per second"
    target: "> 100MB/s"
    
  - name: kafka_server_ReplicaManager_UnderReplicatedPartitions
    description: "Under-replicated partitions"
    target: "= 0"
    
  - name: kafka_consumer_group_lag
    description: "Consumer group lag"
    target: "< 1000"

Performance Dashboard

{
  "panels": [
    {
      "title": "Throughput (MB/s)",
      "type": "timeseries",
      "targets": [
        {
          "expr": "sum(rate(kafka_server_BrokerTopicMetrics_BytesInPerSec[5m]))",
          "legendFormat": "Inbound"
        },
        {
          "expr": "sum(rate(kafka_server_BrokerTopicMetrics_BytesOutPerSec[5m]))",
          "legendFormat": "Outbound"
        }
      ]
    },
    {
      "title": "Latency (p99)",
      "type": "timeseries",
      "targets": [
        {
          "expr": "histogram_quantile(0.99, kafka_network_RequestMetrics_TotalTimeMs_bucket)",
          "legendFormat": "p99 latency"
        }
      ]
    }
  ]
}

Benchmarking

End-to-End Benchmark

import time
from kafka import KafkaProducer, KafkaConsumer
import statistics

def benchmark_producer(num_messages, message_size, batch_size, linger_ms):
    producer = KafkaProducer(
        bootstrap_servers=['kafka:9092'],
        batch_size=batch_size,
        linger_ms=linger_ms,
        compression_type='lz4',
        acks='all'
    )
    
    start_time = time.time()
    for i in range(num_messages):
        message = b'x' * message_size
        producer.send('benchmark-topic', value=message)
    producer.flush()
    end_time = time.time()
    
    elapsed = end_time - start_time
    throughput = num_messages / elapsed
    mb_throughput = (num_messages * message_size) / elapsed / 1024 / 1024
    
    print(f"Throughput: {throughput:.0f} messages/sec")
    print(f"Bandwidth: {mb_throughput:.1f} MB/sec")
    print(f"Latency avg: {elapsed/num_messages*1000:.2f} ms")

# Run benchmark
benchmark_producer(
    num_messages=100000,
    message_size=1024,
    batch_size=32768,
    linger_ms=5
)

Common Issues and Solutions

High Latency

SymptomCauseSolution
High end-to-end latencyLarge batchesReduce batch.size and linger.ms
Slow consumer processingSmall fetchesIncrease fetch.min.bytes
Network congestionInsufficient buffersIncrease socket buffers

Low Throughput

SymptomCauseSolution
Low messages/secNo compressionEnable compression.type=lz4
High CPU usageCompression overheadUse snappy or lz4
Disk I/O bottleneckToo many small writesIncrease batch.size

Summary

Kafka performance tuning requires optimizing producer batching, broker threading, consumer fetching, and network buffers. Monitor key metrics and adjust configurations based on your specific throughput and latency requirements.

⭐

Premium Content

Kafka Performance Tuning

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert Kafka Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement