Introduction
The pickle module implements binary protocols for serializing and deserializing Python object structures. It converts Python objects to byte streams and back.
Basic Pickling
import pickle
data = {"name": "Alice", "age": 30, "scores": [95, 87, 92]}
# Serialize to bytes
serialized = pickle.dumps(data)
# Deserialize back
deserialized = pickle.loads(serialized)
print(deserialized)
Working with Files
import pickle
data = {"key": "value"}
# Write to file
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
# Read from file
with open("data.pkl", "rb") as f:
    loaded = pickle.load(f)
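A single file can also hold several pickled objects written one after another. The sketch below is one way to read them back, relying on pickle.load raising EOFError once the stream is exhausted. The file name records.pkl and the records list are illustrative only, and a temporary directory is used so nothing is left in the working directory.

```python
import os
import pickle
import tempfile

# Illustrative data; any picklable objects would work here.
records = [{"id": 1}, {"id": 2}, {"id": 3}]

# Hypothetical file location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "records.pkl")

# Each dump() call appends one complete pickle to the stream.
with open(path, "wb") as f:
    for record in records:
        pickle.dump(record, f)

# Each load() call consumes exactly one pickle; EOFError marks the end.
loaded = []
with open(path, "rb") as f:
    while True:
        try:
            loaded.append(pickle.load(f))
        except EOFError:
            break

print(loaded)
```

Writing records individually keeps memory use bounded when the collection is large, at the cost of a slightly more involved read loop.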
Protocol Versions
import pickle
data = [1, 2, 3, 4, 5]
# Protocol 0 (human-readable, compatible with the earliest Python versions)
p0 = pickle.dumps(data, protocol=0)
# Protocol 4 (Python 3.4+, efficient)
p4 = pickle.dumps(data, protocol=4)
# Protocol 5 (Python 3.8+, out-of-band data)
p5 = pickle.dumps(data, protocol=5)
print(f"Protocol 0 size: {len(p0)}")
print(f"Protocol 4 size: {len(p4)}")
print(f"Protocol 5 size: {len(p5)}")
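Rather than hard-coding a protocol number, the module exposes two constants that describe what the running interpreter supports. A minimal sketch, assuming only the standard library:

```python
import pickle

data = [1, 2, 3, 4, 5]

# Constants describing this interpreter's pickle support.
print(f"Default protocol: {pickle.DEFAULT_PROTOCOL}")
print(f"Highest protocol: {pickle.HIGHEST_PROTOCOL}")

# Serialize with the newest protocol available.
newest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# loads() detects the protocol from the byte stream itself,
# so no protocol argument is needed when reading.
restored = pickle.loads(newest)
print(restored)
```

Choosing the highest protocol generally gives the most compact output, but the result may not be readable by older Python versions, so a fixed lower protocol can be the better choice when files are shared across interpreters.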
Custom Object Serialization
import pickle
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def __getstate__(self):
        return {"name": self.name, "age": self.age}
    def __setstate__(self, state):
        self.name = state["name"]
        self.age = state["age"]
person = Person("Bob", 25)
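# Serialize the object; __getstate__ supplies the state to store
data = pickle.dumps(person)
# Deserialize; __setstate__ restores the attributes
restored = pickle.loads(data)
print(restored.name, restored.age)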
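A common reason to define __getstate__ and __setstate__ is that an object holds something pickle cannot serialize. The sketch below uses a hypothetical Counter class that owns a threading.Lock: the lock is dropped from the saved state and recreated on load. The class and attribute names are illustrative and not part of the example above.

```python
import pickle
import threading

class Counter:
    """Hypothetical class holding a lock, which pickle cannot serialize."""

    def __init__(self, name, count=0):
        self.name = name
        self.count = count
        self._lock = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and leave out the unpicklable lock.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then build a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.Lock()

original = Counter("hits", 3)
restored = pickle.loads(pickle.dumps(original))
print(restored.name, restored.count)
```

Without the two methods, pickle.dumps(original) would fail with a TypeError because of the lock attribute.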
Security Considerations
# NEVER unpickle untrusted data! Unpickling can execute arbitrary code.
# Bad: pickle.loads(untrusted_input)
# For untrusted input, use a data-only format such as json (or msgpack, a third-party package)
import json
safe_data = json.dumps({"name": "Alice"})
safe_loaded = json.loads(safe_data)
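When pickle cannot be avoided, the set of globals an unpickler may resolve can be narrowed by subclassing pickle.Unpickler and overriding find_class. The sketch below follows that pattern; the names SAFE_BUILTINS, RestrictedUnpickler and restricted_loads are illustrative, and the allow-list is an assumption to adjust per application. This reduces the attack surface but is not a guarantee of safety, so avoiding untrusted pickles remains the better option.

```python
import builtins
import io
import pickle

# Assumed allow-list: only these builtins may be resolved while loading.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Permit a small set of harmless builtins and nothing else.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Counterpart to pickle.loads() that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers and allow-listed builtins load normally.
print(restricted_loads(pickle.dumps([1, 2, range(15)])))

# A pickle that references any other global is rejected.
rejected = False
try:
    restricted_loads(pickle.dumps(print))
except pickle.UnpicklingError as exc:
    rejected = True
    print(f"Rejected: {exc}")
```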
Practice Problems
- Serialize and deserialize a custom class
- Use different protocol versions
- Handle large objects efficiently
- Implement __reduce__ for complex objects
- Compare pickle with json serialization
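As a possible starting point for the __reduce__ exercise, the sketch below uses a hypothetical Point class. __reduce__ returns a callable plus the arguments to call it with, so unpickling rebuilds the object by calling the constructor rather than restoring the instance dictionary directly.

```python
import pickle

class Point:
    """Hypothetical class used only to illustrate __reduce__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce__(self):
        # Tell pickle to rebuild this object as Point(self.x, self.y).
        return (Point, (self.x, self.y))

original = Point(1, 2)
restored = pickle.loads(pickle.dumps(original))
print(restored.x, restored.y)
```

Because reconstruction goes through __init__, any validation or derived attributes set up there are applied again on load.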