Python Generators — Lazy Evaluation & Memory Efficiency
Generators produce values one at a time instead of building entire lists in memory. They are essential for processing large datasets, streams, and pipelines.
Learning Objectives
- Write generator functions with yield
- Create generator expressions for concise lazy evaluation
- Build generator pipelines for data processing
- Understand the send(), throw(), and close() methods
Generator Basics
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
# Each call to next() resumes where yield left off
fib = fibonacci()
print(next(fib)) # 0
print(next(fib)) # 1
print(next(fib)) # 1
print(next(fib)) # 2
# Use in a for loop; fibonacci() never finishes, so break out manually
for i, num in enumerate(fibonacci()):
    if i >= 10:
        break
    print(num, end=" ")  # 0 1 1 2 3 5 8 13 21 34
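The counter-and-break loop above can also be written with itertools.islice, which takes a fixed number of items from any iterator. The sketch below repeats the fibonacci definition so it runs on its own; countdown is a hypothetical helper, not part of the original, added to show what happens when a finite generator runs out.

```python
from itertools import islice


def fibonacci():
    """Yield the Fibonacci sequence indefinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def countdown(n):
    """Hypothetical finite generator: yields n, n-1, ..., 1, then finishes."""
    while n > 0:
        yield n
        n -= 1


# islice stops after 10 items, so no manual counter or break is needed
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Once a generator's body returns, next() raises StopIteration
cd = countdown(2)
print(next(cd))  # 2
print(next(cd))  # 1
try:
    next(cd)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True
```

A for loop catches that StopIteration internally, which is why looping over a finite generator ends cleanly without any explicit check.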
Generator Expressions
# Like list comprehension but with parentheses
squares_list = [x**2 for x in range(1000000)]  # ~8 MB for the list itself, plus the int objects it holds
squares_gen = (x**2 for x in range(1000000))   # Small, constant-size generator object
# Consume lazily
total = sum(x**2 for x in range(1000)) # No list created
first_big = next(x for x in range(1000000) if x > 999990)
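The memory difference can be checked roughly with sys.getsizeof. This is a sketch assuming 64-bit CPython: exact byte counts vary by version, so it compares against loose thresholds rather than printing specific sizes. It also shows a property worth knowing alongside the memory savings: a generator expression can be consumed only once.

```python
import sys

n = 1_000_000
squares_list = [x**2 for x in range(n)]
squares_gen = (x**2 for x in range(n))

# getsizeof reports only the container itself, not the int objects a list refers to
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(list_bytes > 1_000_000)  # True
print(gen_bytes < 1_000)       # True

# A generator is exhausted after one pass; a second pass yields nothing
small = (x**2 for x in range(4))
first_pass = list(small)
second_pass = list(small)
print(first_pass)   # [0, 1, 4, 9]
print(second_pass)  # []
```

If the values are needed more than once, either rebuild the generator or materialize it into a list and accept the memory cost.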
Generator Pipelines
def read_large_file(path):
    with open(path, 'r') as f:
        for line in f:
            yield line.strip()

def filter_comments(lines):
    for line in lines:
        if not line.startswith('#'):
            yield line
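def parse_csv(lines):
    lines = iter(lines)  # accept any iterable, not only iterators
    header = next(lines, None)
    if header is None:  # empty input: a bare next() here would raise RuntimeError
        return
    fields = header.split(',')
    for line in lines:
        values = line.split(',')
        yield dict(zip(fields, values))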
# Pipeline — processes data lazily
lines = read_large_file('data.csv')
non_comments = filter_comments(lines)
records = parse_csv(non_comments)
for record in records:
    if record['status'] == 'active':
        print(record)
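To try the pipeline without a file on disk, the same stages can be run over an in-memory stream. This is a sketch under stated assumptions: SAMPLE is made-up data standing in for data.csv, and read_lines is a hypothetical variant of read_large_file that accepts any iterable of lines instead of a path.

```python
import io

# Hypothetical sample data standing in for data.csv
SAMPLE = """\
# exported records
id,name,status
1,alice,active
# a comment in the middle
2,bob,inactive
3,carol,active
"""


def read_lines(stream):
    """Yield stripped lines from any iterable of text lines."""
    for line in stream:
        yield line.strip()


def filter_comments(lines):
    """Drop blank lines and lines starting with '#'."""
    for line in lines:
        if line and not line.startswith("#"):
            yield line


def parse_csv(lines):
    """Treat the first line as the header and yield one dict per row."""
    lines = iter(lines)
    header = next(lines, None)
    if header is None:  # empty input
        return
    fields = header.split(",")
    for line in lines:
        yield dict(zip(fields, line.split(",")))


# Nothing is read until the list comprehension starts pulling records
records = parse_csv(filter_comments(read_lines(io.StringIO(SAMPLE))))
active = [r["name"] for r in records if r["status"] == "active"]
print(active)  # ['alice', 'carol']
```

Splitting on commas does not handle quoted fields; for real CSV input the standard library's csv module is the safer choice, and csv.reader accepts any iterable of lines, so it fits in the same pipeline.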
Yield From
def flatten(nested_list):
    for item in nested_list:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item
data = [1, [2, 3], [4, [5, 6]], 7]
print(list(flatten(data))) # [1, 2, 3, 4, 5, 6, 7]
def chain(*iterables):
    for iterable in iterables:
        yield from iterable
result = list(chain([1, 2], [3, 4], [5, 6]))
# [1, 2, 3, 4, 5, 6]
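Beyond re-yielding items, yield from also evaluates to the sub-generator's return value, which a plain for loop over the sub-generator would discard. The sketch below illustrates this; average and report are hypothetical names chosen for the example.

```python
def average(values):
    """Sub-generator: yields each value, then returns their mean."""
    total = 0
    count = 0
    for v in values:
        total += v
        count += 1
        yield v
    return total / count if count else 0.0


def report(values):
    """Delegating generator: re-yields every item and captures the return value."""
    mean = yield from average(values)
    yield f"mean={mean}"


out = list(report([2, 4, 6]))
print(out)  # [2, 4, 6, 'mean=4.0']
```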
Generator Send
def accumulator():
    total = 0
    while True:
        value = yield total
        if value is None:
            break
        total += value
acc = accumulator()
next(acc) # Prime the generator
print(acc.send(10)) # 10
print(acc.send(20)) # 30
print(acc.send(5)) # 35
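The throw() and close() methods work at the same paused yield as send(): throw() raises an exception inside the generator at that point, and close() raises GeneratorExit there so cleanup code can run. The sketch below is a modified accumulator written for illustration; treating ValueError as a reset signal and recording cleanup in a log list are assumptions, not part of the original example.

```python
log = []  # hypothetical record of cleanup, used only to make close() observable


def accumulator():
    total = 0
    try:
        while True:
            try:
                value = yield total
            except ValueError:
                # throw(ValueError(...)) surfaces here; treat it as a reset
                total = 0
                continue
            total += value
    finally:
        # Runs when close() raises GeneratorExit at the paused yield
        log.append("closed")


acc = accumulator()
next(acc)                             # prime: run to the first yield
print(acc.send(10))                   # 10
print(acc.send(5))                    # 15
print(acc.throw(ValueError("reset"))) # 0
acc.close()
print(log)                            # ['closed']
```

Like send(), throw() returns the next value the generator yields, which is why the reset prints 0. After close(), any further next() or send() call raises StopIteration.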
Key Takeaways
- Generators produce values lazily — one at a time
- Generator expressions use () not []
- yield from delegates to sub-generators
- Pipelines chain generators for memory-efficient processing
- Use generators for large files, streams, and infinite sequences