Python Generators — Lazy Evaluation & Memory Efficiency

Generators produce values one at a time instead of building entire lists in memory. They are essential for processing large datasets, streams, and pipelines.

Learning Objectives

  • Write generator functions with yield
  • Create generator expressions for concise lazy evaluation
  • Build generator pipelines for data processing
  • Understand send(), throw(), and close() methods

Generator Basics

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Each call to next() resumes where yield left off
fib = fibonacci()
print(next(fib))  # 0
print(next(fib))  # 1
print(next(fib))  # 1
print(next(fib))  # 2

# Use in a for loop. This generator is infinite, so it never raises
# StopIteration on its own; break out explicitly.
for i, num in enumerate(fibonacci()):
    if i >= 10:
        break
    print(num, end=" ")  # 0 1 1 2 3 5 8 13 21 34
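As an alternative to the enumerate-and-break loop above, itertools.islice from the standard library can take a fixed number of items from an infinite generator. This is a minimal sketch that repeats the fibonacci() definition so it runs on its own; the name first_ten is only illustrative.

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 10 items, so the infinite loop inside
# fibonacci() is never run past what is actually requested.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```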

Generator Expressions

# Like a list comprehension, but with parentheses
squares_list = [x**2 for x in range(1000000)]   # Builds all 1,000,000 values up front (tens of MB in CPython)
squares_gen = (x**2 for x in range(1000000))     # Small fixed-size object; values are computed on demand

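# Consume lazily
total = sum(x**2 for x in range(1000))  # No intermediate list is created
# next() stops at the first match (999991) and raises StopIteration if there is none
first_big = next(x for x in range(1000000) if x > 999990)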
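The size difference can be checked roughly with sys.getsizeof. This is a sketch, not a full memory profile: getsizeof reports only the container itself (for a list, the array of references, not the integer objects it points to), and the exact numbers vary by Python version and platform. The example also shows that a generator expression can be consumed only once.

```python
import sys

squares_list = [x**2 for x in range(100_000)]
squares_gen = (x**2 for x in range(100_000))

list_size = sys.getsizeof(squares_list)  # grows with the number of items
gen_size = sys.getsizeof(squares_gen)    # stays small regardless of the range

print(list_size, gen_size)

# A generator is single-use: the second pass finds nothing left.
first_pass = sum(squares_gen)
second_pass = sum(squares_gen)
print(first_pass == sum(squares_list), second_pass)  # True 0
```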

Generator Pipelines

def read_large_file(path):
    with open(path, 'r') as f:
        for line in f:
            yield line.strip()

def filter_comments(lines):
    for line in lines:
        if not line.startswith('#'):
            yield line

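def parse_csv(lines):
    # The first non-comment line is the header; stop cleanly on empty input
    # (a bare next() here would surface as RuntimeError inside a generator)
    header = next(lines, None)
    if header is None:
        return
    # Naive split: does not handle quoted fields containing commas
    fields = header.split(',')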
    for line in lines:
        values = line.split(',')
        yield dict(zip(fields, values))

# Pipeline: nothing is read from disk until the for loop below pulls records through
lines = read_large_file('data.csv')
non_comments = filter_comments(lines)
records = parse_csv(non_comments)

for record in records:
    if record['status'] == 'active':
        print(record)
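To try the pipeline without a data.csv on disk, the same stages can be fed from an in-memory text buffer. This is a minimal sketch under stated assumptions: the sample data, the strip_lines helper, and the name and status columns are made up for illustration, and filter_comments here also skips blank lines.

```python
import io

def strip_lines(lines):
    for line in lines:
        yield line.strip()

def filter_comments(lines):
    for line in lines:
        # Skip blank lines and comment lines
        if line and not line.startswith('#'):
            yield line

def parse_csv(lines):
    lines = iter(lines)          # accept any iterable, not just generators
    header = next(lines, None)
    if header is None:           # empty input: produce no records
        return
    fields = header.split(',')
    for line in lines:
        yield dict(zip(fields, line.split(',')))

# Hypothetical sample data standing in for a real file
sample = io.StringIO(
    "# exported users\n"
    "name,status\n"
    "alice,active\n"
    "bob,inactive\n"
    "# trailing comment\n"
    "carol,active\n"
)

records = parse_csv(filter_comments(strip_lines(sample)))
active = [r['name'] for r in records if r['status'] == 'active']
print(active)  # ['alice', 'carol']
```

Each stage holds only the current line, so the same code should scale from this six-line buffer to a file far larger than memory.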

Yield From

def flatten(nested_list):
    for item in nested_list:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

data = [1, [2, 3], [4, [5, 6]], 7]
print(list(flatten(data)))  # [1, 2, 3, 4, 5, 6, 7]

# Simplified version of itertools.chain from the standard library
def chain(*iterables):
    for iterable in iterables:
        yield from iterable

result = list(chain([1, 2], [3, 4], [5, 6]))
# [1, 2, 3, 4, 5, 6]
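Beyond re-yielding items, yield from also evaluates to the sub-generator's return value, which a plain for loop would discard. The sketch below illustrates this with two hypothetical functions, tracked_values and report, that are not part of the lesson's code.

```python
def tracked_values(values):
    total = 0
    count = 0
    for v in values:
        total += v
        count += 1
        yield v
    # The return value travels back to the delegating generator
    return total / count if count else 0.0

def report(values):
    # yield from passes every item through, then hands back the return value
    avg = yield from tracked_values(values)
    yield f"average={avg}"

output = list(report([2, 4, 6]))
print(output)  # [2, 4, 6, 'average=4.0']
```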

Generator Send

def accumulator():
    total = 0
    while True:
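        value = yield total   # send(x) resumes here with value = x
        if value is None:     # next() or send(None) ends the generator (caller sees StopIteration)
            break
        total += value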

acc = accumulator()
next(acc)            # Prime the generator: run to the first yield (returns 0)
print(acc.send(10))  # 10
print(acc.send(20))  # 30
print(acc.send(5))   # 35
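The learning objectives also list throw() and close(). As a rough sketch, the accumulator can be extended so that throw(ValueError) resets the running total and close() triggers cleanup. Using ValueError as the reset signal is an assumption made for this example, not an established convention.

```python
def accumulator():
    total = 0
    try:
        while True:
            try:
                value = yield total
            except ValueError:
                # throw(ValueError) lands here: reset, then yield the new total
                total = 0
                continue
            if value is None:
                break
            total += value
    finally:
        # Runs on close(), on normal exit, and on unhandled exceptions
        print("accumulator finished")

acc = accumulator()
next(acc)                                     # Prime: run to the first yield
after_send = acc.send(10)                     # 10
after_throw = acc.throw(ValueError("reset"))  # 0, the total was reset
after_resume = acc.send(5)                    # 5
acc.close()                                   # Raises GeneratorExit at the paused yield

print(after_send, after_throw, after_resume)  # 10 0 5
```

throw() raises the exception at the paused yield and returns the next yielded value, while close() raises GeneratorExit there so that finally blocks and with statements inside the generator can release resources.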

Key Takeaways

  1. Generators produce values lazily — one at a time
  2. Generator expressions use () not []
  3. yield from delegates to sub-generators
  4. Pipelines chain generators for memory-efficient processing
  5. Use generators for large files, streams, and infinite sequences
