Python Dictionaries — The Complete Guide
Learning Objectives
By the end of this tutorial, you will be able to:
- Create dictionaries using multiple approaches
- Access, modify, and delete key-value pairs
- Use all major dictionary methods
- Write dictionary comprehensions for data transformation
- Work with nested dictionaries safely
- Choose the right data structure based on performance needs
What Are Dictionaries?
A dictionary is Python's built-in mapping type that stores key-value pairs. Think of it as a real-world dictionary: you look up a word (the key) to find its definition (the value).
# A simple dictionary
student = {
    "name": "Alice",
    "age": 21,
    "major": "Computer Science"
}
print(student)
# Output: {'name': 'Alice', 'age': 21, 'major': 'Computer Science'}
Core Properties
| Property | Description | Example |
|---|---|---|
| Key-value pairs | Data stored as associations | "name": "Alice" |
| Ordered | Insertion order preserved (Python 3.7+) | Keys appear in creation order |
| Mutable | Can add, change, or remove items | d["new"] = 42 |
| Hashable keys | Keys must be hashable types | Strings, numbers, tuples of hashable items |
| Fast lookup | O(1) average time complexity | d["key"] is nearly instant |
# Dictionaries are ordered (Python 3.7+)
d = {}
d["first"] = 1
d["second"] = 2
d["third"] = 3
print(list(d.keys()))
# Output: ['first', 'second', 'third']
Creating Dictionaries
Dictionary Literal
The most common way to create a dictionary:
# Empty dictionary
empty = {}
# Dictionary with initial values
config = {
    "host": "localhost",
    "port": 8080,
    "debug": True
}
print(config)
# Output: {'host': 'localhost', 'port': 8080, 'debug': True}
dict() Constructor
Use the dict() constructor for more explicit creation:
# From keyword arguments
person = dict(name="Bob", age=30, city="New York")
print(person)
# Output: {'name': 'Bob', 'age': 30, 'city': 'New York'}
# From a list of tuples
pairs = [("a", 1), ("b", 2), ("c", 3)]
mapping = dict(pairs)
print(mapping)
# Output: {'a': 1, 'b': 2, 'c': 3}
# From two lists
keys = ["x", "y", "z"]
values = [10, 20, 30]
coord = dict(zip(keys, values))
print(coord)
# Output: {'x': 10, 'y': 20, 'z': 30}
dict.fromkeys()
Create a dictionary with all keys set to the same value:
# All keys with default value
scores = dict.fromkeys(["alice", "bob", "charlie"], 0)
print(scores)
# Output: {'alice': 0, 'bob': 0, 'charlie': 0}
# All keys with None
keys = ["name", "age", "email"]
blank = dict.fromkeys(keys)
print(blank)
# Output: {'name': None, 'age': None, 'email': None}
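One caveat is worth a short sketch: dict.fromkeys() stores the same value object under every key, so a mutable default such as a list ends up shared. The names below are illustrative only.

```python
# Pitfall: the single list object is shared by every key.
shared = dict.fromkeys(["alice", "bob"], [])
shared["alice"].append(90)
print(shared)
# Output: {'alice': [90], 'bob': [90]}

# A comprehension evaluates [] once per key, giving independent lists.
separate = {name: [] for name in ["alice", "bob"]}
separate["alice"].append(90)
print(separate)
# Output: {'alice': [90], 'bob': []}
```

For immutable defaults such as 0 or None, sharing is harmless, which is why the examples above behave as expected.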
Dictionary Comprehension
Create dictionaries with expressions:
# Squares of numbers
squares = {x: x**2 for x in range(1, 6)}
print(squares)
# Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
# From two lists
names = ["Alice", "Bob", "Charlie"]
grades = ["A", "B+", "A-"]
report = {name: grade for name, grade in zip(names, grades)}
print(report)
# Output: {'Alice': 'A', 'Bob': 'B+', 'Charlie': 'A-'}
The ** Unpacking Operator
Merge dictionaries or create new ones with unpacking:
defaults = {"color": "blue", "size": "medium", "count": 1}
overrides = {"color": "red", "count": 5}
# Merge with overrides taking precedence
result = {**defaults, **overrides}
print(result)
# Output: {'color': 'red', 'size': 'medium', 'count': 5}
Accessing Values
Direct Access with []
user = {"name": "Alice", "age": 25}
print(user["name"])
# Output: Alice
# This raises a KeyError if key doesn't exist
# user["email"] # KeyError: 'email'
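When a missing key signals a real problem rather than an expected case, the KeyError can be caught explicitly. A minimal sketch, reusing the same user dictionary; the fallback of None is an assumption made for illustration:

```python
user = {"name": "Alice", "age": 25}

try:
    email = user["email"]
except KeyError as exc:
    # str() of a KeyError is the repr of the missing key.
    print(f"Missing key: {exc}")
    email = None
# Output: Missing key: 'email'
```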
Safe Access with get()
user = {"name": "Alice", "age": 25}
# Returns None if key doesn't exist
print(user.get("email"))
# Output: None
# Returns default value if key doesn't exist
print(user.get("email", "not provided"))
# Output: not provided
# Returns value if key exists
print(user.get("name", "unknown"))
# Output: Alice
setdefault() — Get or Set
config = {"timeout": 30}
# If key exists, return its value
value = config.setdefault("timeout", 60)
print(value)
# Output: 30
print(config)
# Output: {'timeout': 30}
# If key doesn't exist, set it and return the new value
value = config.setdefault("retries", 3)
print(value)
# Output: 3
print(config)
# Output: {'timeout': 30, 'retries': 3}
Dictionary Methods
Complete Method Reference
| Method | Description | Example |
|---|---|---|
| keys() | Return view of all keys | d.keys() |
| values() | Return view of all values | d.values() |
| items() | Return view of (key, value) pairs | d.items() |
| get(key, default) | Return value or default | d.get("x", 0) |
| setdefault(key, default) | Get or set value | d.setdefault("x", 0) |
| update(other) | Merge other dict into self | d.update({"a": 1}) |
| pop(key, default) | Remove and return value | d.pop("a", None) |
| popitem() | Remove and return last item | d.popitem() |
| clear() | Remove all items | d.clear() |
| copy() | Shallow copy | d.copy() |
| fromkeys(keys, value) | Create dict from keys | dict.fromkeys(["a"], 1) |
Working with Views
user = {"name": "Alice", "age": 25, "city": "NYC"}
# Keys view
print(user.keys())
# Output: dict_keys(['name', 'age', 'city'])
# Values view
print(user.values())
# Output: dict_values(['Alice', 25, 'NYC'])
# Items view
print(user.items())
# Output: dict_items([('name', 'Alice'), ('age', 25), ('city', 'NYC')])
# Check membership efficiently
print("name" in user.keys())  # True
print("email" in user)  # False; testing the dict itself is idiomatic and skips creating a view
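Views are live rather than snapshots: they reflect later changes to the dictionary, and keys views also support set operations. A small sketch, with the required set being a hypothetical example:

```python
user = {"name": "Alice", "age": 25}
keys = user.keys()

# The view tracks the dictionary; no need to call keys() again.
user["city"] = "NYC"
print(list(keys))
# Output: ['name', 'age', 'city']

# Keys views behave like sets for intersection and difference.
required = {"name", "email"}
print(sorted(user.keys() & required))
# Output: ['name']
print(sorted(user.keys() - required))
# Output: ['age', 'city']
```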
update() and pop()
config = {"host": "localhost", "port": 8080}
# Update with another dictionary
config.update({"debug": True, "port": 9090})
print(config)
# Output: {'host': 'localhost', 'port': 9090, 'debug': True}
# Update with keyword arguments
config.update(timeout=30, retries=3)
print(config)
# Output: {'host': 'localhost', 'port': 9090, 'debug': True, 'timeout': 30, 'retries': 3}
# Remove a key and get its value
port = config.pop("port")
print(port)
# Output: 9090
# Remove with default to avoid KeyError
timeout = config.pop("timeout", None)
print(timeout)
# Output: 30
missing = config.pop("nonexistent", "default")
print(missing)
# Output: default
# Remove and return last inserted item (Python 3.7+)
last = config.popitem()
print(last)
# Output: ('retries', 3)
Dictionary Comprehensions
Basic Syntax
# {key_expression: value_expression for item in iterable}
# Create a mapping of numbers to their squares
squares = {n: n**2 for n in range(1, 11)}
print(squares)
# Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
Filtering in Comprehensions
# Only include even numbers
even_squares = {n: n**2 for n in range(1, 11) if n % 2 == 0}
print(even_squares)
# Output: {2: 4, 4: 16, 6: 36, 8: 64, 10: 100}
# Filter by value condition
prices = {"apple": 1.0, "banana": 0.5, "steak": 15.0, "bread": 2.5}
expensive = {k: v for k, v in prices.items() if v > 5.0}
print(expensive)
# Output: {'steak': 15.0}
Transforming Keys and Values
# Transform values
prices = {"apple": 1.0, "banana": 0.5, "orange": 1.5}
with_tax = {k: round(v * 1.08, 2) for k, v in prices.items()}
print(with_tax)
# Output: {'apple': 1.08, 'banana': 0.54, 'orange': 1.62}
# Transform keys (make uppercase)
data = {"a": 1, "b": 2, "c": 3}
upper_keys = {k.upper(): v for k, v in data.items()}
print(upper_keys)
# Output: {'A': 1, 'B': 2, 'C': 3}
Inverting a Dictionary
# Swap keys and values
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted)
# Output: {1: 'a', 2: 'b', 3: 'c'}
# Handle duplicate values by grouping
grades = {"Alice": "A", "Bob": "B", "Charlie": "A", "Diana": "B"}
by_grade = {}
for name, grade in grades.items():
    by_grade.setdefault(grade, []).append(name)
print(by_grade)
# Output: {'A': ['Alice', 'Charlie'], 'B': ['Bob', 'Diana']}
Nested Dictionaries
Creating Nested Structures
# Students with their courses and grades
students = {
    "alice": {
        "name": "Alice Johnson",
        "courses": {
            "math": 95,
            "english": 88,
            "physics": 92
        }
    },
    "bob": {
        "name": "Bob Smith",
        "courses": {
            "math": 78,
            "english": 85,
            "physics": 80
        }
    }
}
print(students["alice"]["courses"]["math"])
# Output: 95
Accessing Deep Values
config = {
    "database": {
        "host": "localhost",
        "credentials": {
            "user": "admin",
            "password": "secret"
        }
    }
}
# Direct access (raises KeyError if any level is missing)
host = config["database"]["host"]
# Safe access pattern
def deep_get(d, *keys, default=None):
    """Safely access nested dictionary values."""
    current = d
    for key in keys:
        # Descend only while the current level is a dict that has the key
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current
host = deep_get(config, "database", "host")
print(host)
# Output: localhost
missing = deep_get(config, "database", "nonexistent", default="N/A")
print(missing)
# Output: N/A
Flattening Nested Dictionaries
def flatten_dict(d, parent_key="", sep="."):
    """Flatten a nested dictionary."""
    items = []
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.extend(flatten_dict(value, new_key, sep).items())
        else:
            items.append((new_key, value))
    return dict(items)
nested = {
    "user": {
        "name": "Alice",
        "address": {
            "city": "NYC",
            "zip": "10001"
        }
    }
}
flat = flatten_dict(nested)
print(flat)
# Output: {'user.name': 'Alice', 'user.address.city': 'NYC', 'user.address.zip': '10001'}
Dictionary Merging
The | Operator (Python 3.9+)
# The cleanest way to merge
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
merged = dict1 | dict2
print(merged)
# Output: {'a': 1, 'b': 3, 'c': 4}
# In-place merge
dict1 |= {"d": 5}
print(dict1)
# Output: {'a': 1, 'b': 2, 'd': 5}
** Unpacking (Python 3.5+)
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
merged = {**dict1, **dict2}
print(merged)
# Output: {'a': 1, 'b': 3, 'c': 4}
update() Method
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
dict1.update(dict2)
print(dict1)
# Output: {'a': 1, 'b': 3, 'c': 4}
collections.ChainMap
from collections import ChainMap
defaults = {"color": "blue", "size": "medium"}
user_prefs = {"color": "red"}
runtime = {"debug": True}
# First match wins (runtime > user_prefs > defaults)
config = ChainMap(runtime, user_prefs, defaults)
print(config["color"]) # Output: red (from user_prefs)
print(config["size"]) # Output: medium (from defaults)
print(config["debug"]) # Output: True (from runtime)
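Unlike the merging approaches above, a ChainMap does not copy anything: it is a layered view over the original dictionaries. The sketch below, using two illustrative layers, shows that writes land in the first mapping only and that later changes to an underlying dict remain visible.

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "medium"}
user_prefs = {"color": "red"}
config = ChainMap(user_prefs, defaults)

# Writes go to the first mapping; the other layers are left untouched.
config["size"] = "large"
print(user_prefs)
# Output: {'color': 'red', 'size': 'large'}
print(defaults)
# Output: {'color': 'blue', 'size': 'medium'}

# The ChainMap is a view, so changes to a layer show through.
defaults["theme"] = "dark"
print(config["theme"])
# Output: dark

# dict() flattens the layers into an independent snapshot.
snapshot = dict(config)
print(snapshot == {"color": "red", "size": "large", "theme": "dark"})
# Output: True
```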
Common Patterns
Counting with defaultdict
from collections import defaultdict
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
# Without defaultdict
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1
print(counts)
# Output: {'apple': 3, 'banana': 2, 'cherry': 1}
# With defaultdict
counts = defaultdict(int)
for word in words:
    counts[word] += 1
print(dict(counts))
# Output: {'apple': 3, 'banana': 2, 'cherry': 1}
Grouping with defaultdict(list)
from collections import defaultdict
students = [
    ("Alice", "CS"),
    ("Bob", "Math"),
    ("Charlie", "CS"),
    ("Diana", "Math"),
    ("Eve", "CS")
]
groups = defaultdict(list)
for name, major in students:
    groups[major].append(name)
print(dict(groups))
# Output: {'CS': ['Alice', 'Charlie', 'Eve'], 'Math': ['Bob', 'Diana']}
Sorting Dictionaries
scores = {"Alice": 95, "Bob": 87, "Charlie": 92, "Diana": 95}
# Sort by key
by_key = dict(sorted(scores.items()))
print(by_key)
# Output: {'Alice': 95, 'Bob': 87, 'Charlie': 92, 'Diana': 95}
# Sort by value (ascending)
by_value = dict(sorted(scores.items(), key=lambda x: x[1]))
print(by_value)
# Output: {'Bob': 87, 'Charlie': 92, 'Alice': 95, 'Diana': 95}
# Sort by value (descending)
by_value_desc = dict(sorted(scores.items(), key=lambda x: x[1], reverse=True))
print(by_value_desc)
# Output: {'Alice': 95, 'Diana': 95, 'Charlie': 92, 'Bob': 87}
Removing None Values
data = {"a": 1, "b": None, "c": 3, "d": None, "e": 5}
# Dict comprehension
clean = {k: v for k, v in data.items() if v is not None}
print(clean)
# Output: {'a': 1, 'c': 3, 'e': 5}
# Using filter
clean2 = dict(filter(lambda x: x[1] is not None, data.items()))
print(clean2)
# Output: {'a': 1, 'c': 3, 'e': 5}
Inverting a Dictionary
original = {"a": 1, "b": 2, "c": 3}
# Simple inversion (values must be unique and hashable)
inverted = {v: k for k, v in original.items()}
print(inverted)
# Output: {1: 'a', 2: 'b', 3: 'c'}
Performance
How Dictionaries Work
Python dictionaries use a hash table internally:
- When you insert a key, Python computes its hash
- The hash selects a slot in the table, and the key and value are stored there
- To look up a value, Python hashes the key again, goes directly to that slot, and confirms the match with an equality check
# Hash values
print(hash("hello")) # Some integer
print(hash(42)) # Some integer
print(hash((1, 2, 3))) # Some integer
# This is why keys must be hashable
# Lists are NOT hashable:
# hash([1, 2, 3]) # TypeError: unhashable type: 'list'
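Hashability is not limited to built-in scalars and tuples. As a sketch, a frozen dataclass (the Point class here is hypothetical) and a frozenset can both serve as keys, because they are immutable and define consistent hash and equality behavior:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Hypothetical immutable 2D point, used only for illustration."""
    x: int
    y: int


# Equal points hash equally, so a fresh Point finds the stored entry.
labels = {Point(0, 0): "origin", Point(1, 0): "unit x"}
print(labels[Point(0, 0)])
# Output: origin

# frozenset is the hashable counterpart of set; element order is irrelevant.
tags = {frozenset({"red", "round"}): "apple"}
print(tags[frozenset({"round", "red"})])
# Output: apple
```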
Time Complexity
| Operation | Average Case | Worst Case |
|---|---|---|
| Access d[key] | O(1) | O(n) |
| Insert d[key] = val | O(1) | O(n) |
| Delete del d[key] | O(1) | O(n) |
| key in d | O(1) | O(n) |
| Iteration | O(n) | O(n) |
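The O(n) worst case arises only when many keys collide in the hash table, for example when a custom __hash__ returns the same value for every key. With the well-distributed hashes of Python's built-in types, this is extremely rare.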
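To make the worst case concrete, here is a deliberately bad key type (BadKey is hypothetical) whose hash is constant. Every instance collides, so a lookup has to fall back on repeated equality checks instead of a single jump:

```python
class BadKey:
    """Hypothetical key whose hash is constant, forcing every key to collide."""

    eq_calls = 0  # class-wide counter of equality comparisons

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # same hash for all instances: the pathological case

    def __eq__(self, other):
        BadKey.eq_calls += 1
        return isinstance(other, BadKey) and self.value == other.value


table = {BadKey(i): i for i in range(200)}

BadKey.eq_calls = 0  # ignore comparisons made while building the table
print(table[BadKey(199)])
# Output: 199
print(f"equality checks for one lookup: {BadKey.eq_calls}")
```

With a sensible hash, the same lookup would typically need one equality check or none at all.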
When to Use What
# Use dict when you need key-value mapping
user = {"name": "Alice", "age": 25}
# Use list when you need ordered sequence with indexing
items = ["apple", "banana", "cherry"]
# Use set when you need unique values and fast membership testing
unique = {"apple", "banana", "cherry"}
# Membership testing complexity (not measured timings):
# dict: O(1) average
# list: O(n)
# set: O(1) average
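The complexities above can be checked with a rough measurement. This is only a sketch: the container size and repetition count are arbitrary choices, and absolute timings depend on the machine, so no expected output is shown.

```python
import timeit

n = 10_000  # illustrative size; larger values widen the gap
as_list = list(range(n))
as_set = set(as_list)
as_dict = dict.fromkeys(as_list)
target = n - 1  # last element: the slowest case for a list scan

timings = {}
for name, container in [("list", as_list), ("set", as_set), ("dict", as_dict)]:
    # Bind the container as a default argument so each lambda keeps its own.
    timings[name] = timeit.timeit(lambda c=container: target in c, number=200)

for name, seconds in timings.items():
    print(f"{name:>4}: {seconds:.6f} s for 200 lookups")
```

The list should come out slower by orders of magnitude, while set and dict stay close to each other.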
Common Mistakes
1. KeyError vs get()
config = {"host": "localhost"}
# Bad - raises KeyError
# port = config["port"]
# Good - provides default
port = config.get("port", 8080)
print(port)
# Output: 8080
2. Modifying Dict During Iteration
# Bad - raises RuntimeError
data = {"a": 1, "b": 2, "c": 3}
# for key in data:
#     if data[key] == 2:
#         del data[key]  # RuntimeError: dictionary changed size during iteration
# Good - create new dict or collect keys first
data = {"a": 1, "b": 2, "c": 3}
data = {k: v for k, v in data.items() if v != 2}
print(data)
# Output: {'a': 1, 'c': 3}
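The other option mentioned above, collecting the keys first, modifies the dictionary in place. A minimal sketch: list(data) takes a snapshot of the keys, so deleting inside the loop is safe.

```python
data = {"a": 1, "b": 2, "c": 3}

# Iterate over a snapshot of the keys, not the live dictionary.
for key in list(data):
    if data[key] == 2:
        del data[key]

print(data)
# Output: {'a': 1, 'c': 3}
```

This keeps the same dict object, which matters when other code holds a reference to it.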
3. Mutable Objects as Keys
# Bad - lists are not hashable
# d = {[1, 2]: "value"} # TypeError: unhashable type: 'list'
# Good - use tuples instead
d = {(1, 2): "value"}
print(d[(1, 2)])
# Output: value
4. Shallow Copy Issues
original = {"a": [1, 2, 3]}
# Shallow copy - nested objects are shared
shallow = original.copy()
shallow["a"].append(4)
print(original)
# Output: {'a': [1, 2, 3, 4]} - original is modified!
# Deep copy - nested objects are independent
import copy
original = {"a": [1, 2, 3]}
deep = copy.deepcopy(original)
deep["a"].append(4)
print(original)
# Output: {'a': [1, 2, 3]} - original unchanged
5. Assuming Order Before Python 3.7
# Before Python 3.7, the language did not guarantee dict order
# (CPython 3.6 preserved it only as an implementation detail)
# From 3.7 on, insertion order is guaranteed and can be relied on
d = {"z": 1, "a": 2, "m": 3}
print(list(d.keys()))
# Python 3.7+: ['z', 'a', 'm'] (insertion order)
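A related subtlety: although order is preserved, it plays no part in equality. Two dictionaries with the same pairs compare equal regardless of insertion order, as this short sketch shows:

```python
first = {"z": 1, "a": 2}
second = {"a": 2, "z": 1}

# Equality compares the key-value pairs only.
print(first == second)
# Output: True

# The iteration order still differs.
print(list(first) == list(second))
# Output: False
```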
Practice Exercises
Exercise 1: Word Frequency Counter
Write a function that counts word frequencies in a sentence and returns the top 3 most common words.
def top_words(sentence):
    """Count word frequencies and return top 3."""
    words = sentence.lower().split()
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    # Sort by frequency, then alphabetically for ties
    sorted_words = sorted(counts.items(), key=lambda x: (-x[1], x[0]))
    return sorted_words[:3]
text = "the cat sat on the mat the cat ate the rat"
print(top_words(text))
# Output: [('the', 4), ('cat', 2), ('ate', 1)]
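An alternative sketch uses collections.Counter from the standard library. It is not the solution above: the function name top_words_counter and the n parameter are assumptions, and ties are broken by first appearance rather than alphabetically, so the third entry differs.

```python
from collections import Counter


def top_words_counter(sentence, n=3):
    """Return the n most common words; ties keep first-seen order."""
    counts = Counter(sentence.lower().split())
    return counts.most_common(n)


text = "the cat sat on the mat the cat ate the rat"
print(top_words_counter(text))
# Output: [('the', 4), ('cat', 2), ('sat', 1)]
```

If alphabetical tie-breaking is required, the explicit sort in the original solution remains the better fit.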
Exercise 2: Merge Configuration Files
Write a function that merges two configuration dictionaries, where the second overrides the first for any conflicting keys.
def merge_configs(default, custom):
    """Merge two config dicts, custom overrides default."""
    result = default.copy()
    for key, value in custom.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = merge_configs(result[key], value)
        else:
            result[key] = value
    return result
default_config = {
    "host": "localhost",
    "port": 8080,
    "db": {"name": "mydb", "timeout": 30}
}
custom_config = {"port": 9090, "db": {"timeout": 60}}
merged = merge_configs(default_config, custom_config)
print(merged)
# Output: {'host': 'localhost', 'port': 9090, 'db': {'name': 'mydb', 'timeout': 60}}
Exercise 3: Invert and Group
Write a function that inverts a dictionary, grouping original keys by their values.
def invert_and_group(d):
    """Invert dict, grouping keys by value."""
    result = {}
    for key, value in d.items():
        result.setdefault(value, []).append(key)
    return result
data = {"Alice": "A", "Bob": "B", "Charlie": "A", "Diana": "B", "Eve": "A"}
grouped = invert_and_group(data)
print(grouped)
# Output: {'A': ['Alice', 'Charlie', 'Eve'], 'B': ['Bob', 'Diana']}
Key Takeaways
- Dictionaries store key-value pairs with O(1) average lookup time
- Use get() instead of [] to avoid KeyError when keys might be missing
- Dictionary comprehensions are powerful for transforming and filtering data
- Nested dictionaries are common but require careful access patterns
- The | operator (Python 3.9+) is the cleanest way to merge dictionaries
- Dictionaries are ordered by insertion order since Python 3.7
- Keys must be hashable (strings, numbers, tuples; not lists or dicts)
- Use defaultdict or setdefault to handle missing keys gracefully