Python Dictionaries — The Complete Guide to Key-Value Data

Learning Objectives

By the end of this tutorial, you will be able to:

  • Create dictionaries using multiple approaches
  • Access, modify, and delete key-value pairs
  • Use all major dictionary methods
  • Write dictionary comprehensions for data transformation
  • Work with nested dictionaries safely
  • Choose the right data structure based on performance needs

What Are Dictionaries?

A dictionary is Python's built-in mapping type that stores key-value pairs. Think of it as a real-world dictionary: you look up a word (the key) to find its definition (the value).

# A simple dictionary
student = {
    "name": "Alice",
    "age": 21,
    "major": "Computer Science"
}

print(student)
# Output: {'name': 'Alice', 'age': 21, 'major': 'Computer Science'}

Core Properties

| Property        | Description                             | Example                       |
|-----------------|-----------------------------------------|-------------------------------|
| Key-value pairs | Data stored as associations             | "name": "Alice"               |
| Ordered         | Insertion order preserved (Python 3.7+) | Keys appear in creation order |
| Mutable         | Can add, change, or remove items        | d["new"] = 42                 |
| Hashable keys   | Keys must be hashable types             | Strings, numbers, tuples      |
| Fast lookup     | O(1) average time complexity            | d["key"] is nearly instant    |

# Dictionaries are ordered (Python 3.7+)
d = {}
d["first"] = 1
d["second"] = 2
d["third"] = 3
print(list(d.keys()))
# Output: ['first', 'second', 'third']
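
The ordering guarantee has a few consequences that are easy to overlook. The sketch below (variable names are illustrative only) shows that updating a key keeps its position, that deleting and re-adding a key moves it to the end, and that equality ignores order altogether.

```python
# Sketch: how insertion order interacts with updates, deletes, and equality.
d = {"first": 1, "second": 2, "third": 3}

# Updating an existing key keeps its original position.
d["first"] = 100
order_after_update = list(d)

# Deleting a key and inserting it again moves it to the end.
del d["second"]
d["second"] = 2
order_after_reinsert = list(d)

# Equality compares contents only; order is ignored.
same_contents = {"a": 1, "b": 2} == {"b": 2, "a": 1}

print(order_after_update)    # ['first', 'second', 'third']
print(order_after_reinsert)  # ['first', 'third', 'second']
print(same_contents)         # True
```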

Creating Dictionaries

Dictionary Literal

The most common way to create a dictionary:

# Empty dictionary
empty = {}

# Dictionary with initial values
config = {
    "host": "localhost",
    "port": 8080,
    "debug": True
}

print(config)
# Output: {'host': 'localhost', 'port': 8080, 'debug': True}

dict() Constructor

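The dict() constructor builds a dictionary from keyword arguments (the keys must then be valid Python identifiers) or from any iterable of key-value pairs: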

# From keyword arguments
person = dict(name="Bob", age=30, city="New York")
print(person)
# Output: {'name': 'Bob', 'age': 30, 'city': 'New York'}

# From a list of tuples
pairs = [("a", 1), ("b", 2), ("c", 3)]
mapping = dict(pairs)
print(mapping)
# Output: {'a': 1, 'b': 2, 'c': 3}

# From two lists
keys = ["x", "y", "z"]
values = [10, 20, 30]
coord = dict(zip(keys, values))
print(coord)
# Output: {'x': 10, 'y': 20, 'z': 30}

dict.fromkeys()

Create a dictionary with all keys set to the same value:

# All keys with default value
scores = dict.fromkeys(["alice", "bob", "charlie"], 0)
print(scores)
# Output: {'alice': 0, 'bob': 0, 'charlie': 0}

# All keys with None
keys = ["name", "age", "email"]
blank = dict.fromkeys(keys)
print(blank)
# Output: {'name': None, 'age': None, 'email': None}
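
One caveat is worth a quick sketch: fromkeys() assigns the same value object to every key, which matters when that value is mutable. The example below uses illustrative names and contrasts it with a comprehension that creates a fresh list per key.

```python
# Pitfall: fromkeys() reuses the SAME object as the value for every key.
shared = dict.fromkeys(["alice", "bob"], [])
shared["alice"].append(90)
print(shared)
# Output: {'alice': [90], 'bob': [90]}

# A comprehension evaluates [] once per key, so each list is independent.
separate = {name: [] for name in ["alice", "bob"]}
separate["alice"].append(90)
print(separate)
# Output: {'alice': [90], 'bob': []}
```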

Dictionary Comprehension

Build a dictionary from any iterable with a single expression:

# Squares of numbers
squares = {x: x**2 for x in range(1, 6)}
print(squares)
# Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# From two lists
names = ["Alice", "Bob", "Charlie"]
grades = ["A", "B+", "A-"]
report = {name: grade for name, grade in zip(names, grades)}
print(report)
# Output: {'Alice': 'A', 'Bob': 'B+', 'Charlie': 'A-'}

The ** Unpacking Operator

Unpacking with ** copies the items of each dictionary into a new one. When a key appears more than once, the rightmost value wins:

defaults = {"color": "blue", "size": "medium", "count": 1}
overrides = {"color": "red", "count": 5}

# Merge with overrides taking precedence
result = {**defaults, **overrides}
print(result)
# Output: {'color': 'red', 'size': 'medium', 'count': 5}

Accessing Values

Direct Access with []

user = {"name": "Alice", "age": 25}

print(user["name"])
# Output: Alice

# This raises a KeyError if key doesn't exist
# user["email"]  # KeyError: 'email'

Safe Access with get()

user = {"name": "Alice", "age": 25}

# Returns None if key doesn't exist
print(user.get("email"))
# Output: None

# Returns default value if key doesn't exist
print(user.get("email", "not provided"))
# Output: not provided

# Returns value if key exists
print(user.get("name", "unknown"))
# Output: Alice
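
Because get() returns None by default, it cannot by itself tell a missing key apart from a key whose stored value is None. The following sketch shows two common ways around that; the sentinel name _MISSING is hypothetical, chosen only for illustration.

```python
user = {"name": "Alice", "email": None}

# get() returns None both for a missing key and for a key stored as None.
print(user.get("email"))         # None (key exists, value is None)
print(user.get("phone"))         # None (key is missing)

# The default is used only when the key is absent, not when the value is None.
print(user.get("email", "n/a"))  # None

# Option 1: test membership explicitly.
has_email = "email" in user
has_phone = "phone" in user

# Option 2: pass a unique sentinel object as the default.
_MISSING = object()  # hypothetical sentinel name
phone = user.get("phone", _MISSING)
phone_is_missing = phone is _MISSING

print(has_email, has_phone, phone_is_missing)
# Output: True False True
```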

setdefault() — Get or Set

config = {"timeout": 30}

# If key exists, return its value
value = config.setdefault("timeout", 60)
print(value)
# Output: 30
print(config)
# Output: {'timeout': 30}

# If key doesn't exist, set it and return the new value
value = config.setdefault("retries", 3)
print(value)
# Output: 3
print(config)
# Output: {'timeout': 30, 'retries': 3}

Dictionary Methods

Complete Method Reference

| Method                   | Description                       | Example                 |
|--------------------------|-----------------------------------|-------------------------|
| keys()                   | Return view of all keys           | d.keys()                |
| values()                 | Return view of all values         | d.values()              |
| items()                  | Return view of (key, value) pairs | d.items()               |
| get(key, default)        | Return value or default           | d.get("x", 0)           |
| setdefault(key, default) | Get or set value                  | d.setdefault("x", 0)    |
| update(other)            | Merge other dict into self        | d.update({"a": 1})      |
| pop(key, default)        | Remove and return value           | d.pop("a", None)        |
| popitem()                | Remove and return last item       | d.popitem()             |
| clear()                  | Remove all items                  | d.clear()               |
| copy()                   | Shallow copy                      | d.copy()                |
| fromkeys(keys, value)    | Create dict from keys             | dict.fromkeys(["a"], 1) |

Working with Views

user = {"name": "Alice", "age": 25, "city": "NYC"}

# Keys view
print(user.keys())
# Output: dict_keys(['name', 'age', 'city'])

# Values view
print(user.values())
# Output: dict_values(['Alice', 25, 'NYC'])

# Items view
print(user.items())
# Output: dict_items([('name', 'Alice'), ('age', 25), ('city', 'NYC')])

# Check membership
print("name" in user.keys())  # True
print("email" in user)        # False; idiomatic form, tests the keys directly without building a view
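
Two properties of views are easy to miss: they are live, and keys views behave like sets. Here is a minimal sketch with illustrative data:

```python
user = {"name": "Alice", "age": 25}
keys = user.keys()

# Views are live: they reflect later changes to the dictionary.
user["city"] = "NYC"
print(list(keys))
# Output: ['name', 'age', 'city']

# Keys views support set operations such as intersection and difference.
other = {"age": 30, "email": "a@example.com"}
common = user.keys() & other.keys()
only_in_user = user.keys() - other.keys()
print(common)                # {'age'}
print(sorted(only_in_user))  # ['city', 'name']
```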

update() and pop()

config = {"host": "localhost", "port": 8080}

# Update with another dictionary
config.update({"debug": True, "port": 9090})
print(config)
# Output: {'host': 'localhost', 'port': 9090, 'debug': True}

# Update with keyword arguments
config.update(timeout=30, retries=3)
print(config)
# Output: {'host': 'localhost', 'port': 9090, 'debug': True, 'timeout': 30, 'retries': 3}

# Remove a key and get its value
port = config.pop("port")
print(port)
# Output: 9090

# Remove with default to avoid KeyError
timeout = config.pop("timeout", None)
print(timeout)
# Output: 30

missing = config.pop("nonexistent", "default")
print(missing)
# Output: default

# Remove and return last inserted item (Python 3.7+)
last = config.popitem()
print(last)
# Output: ('retries', 3)

Dictionary Comprehensions

Basic Syntax

# {key_expression: value_expression for item in iterable}

# Create a mapping of numbers to their squares
squares = {n: n**2 for n in range(1, 11)}
print(squares)
# Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}

Filtering in Comprehensions

# Only include even numbers
even_squares = {n: n**2 for n in range(1, 11) if n % 2 == 0}
print(even_squares)
# Output: {2: 4, 4: 16, 6: 36, 8: 64, 10: 100}

# Filter by value condition
prices = {"apple": 1.0, "banana": 0.5, "steak": 15.0, "bread": 2.5}
expensive = {k: v for k, v in prices.items() if v > 5.0}
print(expensive)
# Output: {'steak': 15.0}

Transforming Keys and Values

# Transform values
prices = {"apple": 1.0, "banana": 0.5, "orange": 1.5}
with_tax = {k: round(v * 1.08, 2) for k, v in prices.items()}
print(with_tax)
# Output: {'apple': 1.08, 'banana': 0.54, 'orange': 1.62}

# Transform keys (make uppercase)
data = {"a": 1, "b": 2, "c": 3}
upper_keys = {k.upper(): v for k, v in data.items()}
print(upper_keys)
# Output: {'A': 1, 'B': 2, 'C': 3}

Inverting a Dictionary

# Swap keys and values
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted)
# Output: {1: 'a', 2: 'b', 3: 'c'}

# Handle duplicate values by grouping
grades = {"Alice": "A", "Bob": "B", "Charlie": "A", "Diana": "B"}
by_grade = {}
for name, grade in grades.items():
    by_grade.setdefault(grade, []).append(name)
print(by_grade)
# Output: {'A': ['Alice', 'Charlie'], 'B': ['Bob', 'Diana']}

Nested Dictionaries

Creating Nested Structures

# Students with their courses and grades
students = {
    "alice": {
        "name": "Alice Johnson",
        "courses": {
            "math": 95,
            "english": 88,
            "physics": 92
        }
    },
    "bob": {
        "name": "Bob Smith",
        "courses": {
            "math": 78,
            "english": 85,
            "physics": 80
        }
    }
}

print(students["alice"]["courses"]["math"])
# Output: 95

Accessing Deep Values

config = {
    "database": {
        "host": "localhost",
        "credentials": {
            "user": "admin",
            "password": "secret"
        }
    }
}

# Direct access (raises KeyError if any level is missing)
host = config["database"]["host"]

# Safe access pattern
def deep_get(d, *keys, default=None):
    """Safely access nested dictionary values."""
    current = d
    for key in keys:
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current

host = deep_get(config, "database", "host")
print(host)
# Output: localhost

missing = deep_get(config, "database", "nonexistent", default="N/A")
print(missing)
# Output: N/A
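
Writing to a nested path raises the mirror-image problem, since intermediate levels may not exist yet. The helper below, deep_set, is a hypothetical companion to deep_get and not a built-in. It is a sketch that assumes any intermediate values already present are dictionaries.

```python
def deep_set(d, *keys, value):
    """Set a nested value, creating intermediate dicts as needed (hypothetical helper)."""
    if not keys:
        raise ValueError("at least one key is required")
    current = d
    for key in keys[:-1]:
        # setdefault returns the existing child, or inserts and returns a new empty dict
        current = current.setdefault(key, {})
    current[keys[-1]] = value


settings = {"database": {"host": "localhost"}}
deep_set(settings, "database", "credentials", "user", value="admin")
deep_set(settings, "cache", "ttl", value=300)
print(settings)
# Output: {'database': {'host': 'localhost', 'credentials': {'user': 'admin'}}, 'cache': {'ttl': 300}}
```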

Flattening Nested Dictionaries

def flatten_dict(d, parent_key="", sep="."):
    """Flatten a nested dictionary."""
    items = []
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.extend(flatten_dict(value, new_key, sep).items())
        else:
            items.append((new_key, value))
    return dict(items)

nested = {
    "user": {
        "name": "Alice",
        "address": {
            "city": "NYC",
            "zip": "10001"
        }
    }
}

flat = flatten_dict(nested)
print(flat)
# Output: {'user.name': 'Alice', 'user.address.city': 'NYC', 'user.address.zip': '10001'}
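
The reverse operation can be sketched in much the same way. The function unflatten_dict below is a hypothetical inverse, assuming every key is a string and that the separator never appears inside an individual key.

```python
def unflatten_dict(flat, sep="."):
    """Rebuild a nested dict from flattened keys (hypothetical inverse of flatten_dict)."""
    result = {}
    for compound_key, value in flat.items():
        # Split "user.address.city" into parent parts and the final key
        *parents, last = compound_key.split(sep)
        current = result
        for part in parents:
            current = current.setdefault(part, {})
        current[last] = value
    return result


flat = {"user.name": "Alice", "user.address.city": "NYC", "user.address.zip": "10001"}
print(unflatten_dict(flat))
# Output: {'user': {'name': 'Alice', 'address': {'city': 'NYC', 'zip': '10001'}}}
```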

Dictionary Merging

The | Operator (Python 3.9+)

# The cleanest way to merge
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}

merged = dict1 | dict2
print(merged)
# Output: {'a': 1, 'b': 3, 'c': 4}

# In-place merge
dict1 |= {"d": 5}
print(dict1)
# Output: {'a': 1, 'b': 2, 'd': 5}

** Unpacking (Python 3.5+)

dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}

merged = {**dict1, **dict2}
print(merged)
# Output: {'a': 1, 'b': 3, 'c': 4}

update() Method

dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}

dict1.update(dict2)
print(dict1)
# Output: {'a': 1, 'b': 3, 'c': 4}

collections.ChainMap

from collections import ChainMap

defaults = {"color": "blue", "size": "medium"}
user_prefs = {"color": "red"}
runtime = {"debug": True}

# First match wins (runtime > user_prefs > defaults)
config = ChainMap(runtime, user_prefs, defaults)
print(config["color"])    # Output: red (from user_prefs)
print(config["size"])     # Output: medium (from defaults)
print(config["debug"])    # Output: True (from runtime)
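
Unlike the other merging approaches, ChainMap does not copy anything: it keeps references to the underlying dictionaries, and writes go only to the first one. The sketch below illustrates this with a two-level chain:

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "medium"}
user_prefs = {"color": "red"}
config = ChainMap(user_prefs, defaults)

# Writes go to the first mapping only; defaults stays untouched.
config["size"] = "large"
print(user_prefs)  # {'color': 'red', 'size': 'large'}
print(defaults)    # {'color': 'blue', 'size': 'medium'}

# Later changes to an underlying dict are visible through the chain.
defaults["theme"] = "dark"
print(config["theme"])  # dark

# dict() turns the chain into an ordinary, independent merged dictionary.
snapshot = dict(config)
print(snapshot == {"color": "red", "size": "large", "theme": "dark"})  # True
```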

Common Patterns

Counting with defaultdict

from collections import defaultdict

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]

# Without defaultdict
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1
print(counts)
# Output: {'apple': 3, 'banana': 2, 'cherry': 1}

# With defaultdict
counts = defaultdict(int)
for word in words:
    counts[word] += 1
print(dict(counts))
# Output: {'apple': 3, 'banana': 2, 'cherry': 1}
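
For counting specifically, the standard library also provides collections.Counter, a dict subclass built for this job. A brief sketch using the same word list:

```python
from collections import Counter

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = Counter(words)

print(counts["apple"])        # 3
print(counts["durian"])       # 0 (a missing key counts as zero, no KeyError)
print(counts.most_common(2))  # [('apple', 3), ('banana', 2)]
```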

Grouping with defaultdict(list)

from collections import defaultdict

students = [
    ("Alice", "CS"),
    ("Bob", "Math"),
    ("Charlie", "CS"),
    ("Diana", "Math"),
    ("Eve", "CS")
]

groups = defaultdict(list)
for name, major in students:
    groups[major].append(name)

print(dict(groups))
# Output: {'CS': ['Alice', 'Charlie', 'Eve'], 'Math': ['Bob', 'Diana']}

Sorting Dictionaries

scores = {"Alice": 95, "Bob": 87, "Charlie": 92, "Diana": 95}

# Sort by key
by_key = dict(sorted(scores.items()))
print(by_key)
# Output: {'Alice': 95, 'Bob': 87, 'Charlie': 92, 'Diana': 95}

# Sort by value (ascending)
by_value = dict(sorted(scores.items(), key=lambda x: x[1]))
print(by_value)
# Output: {'Bob': 87, 'Charlie': 92, 'Alice': 95, 'Diana': 95}

# Sort by value (descending)
by_value_desc = dict(sorted(scores.items(), key=lambda x: x[1], reverse=True))
print(by_value_desc)
# Output: {'Alice': 95, 'Diana': 95, 'Charlie': 92, 'Bob': 87}

Removing None Values

data = {"a": 1, "b": None, "c": 3, "d": None, "e": 5}

# Dict comprehension
clean = {k: v for k, v in data.items() if v is not None}
print(clean)
# Output: {'a': 1, 'c': 3, 'e': 5}

# Using filter
clean2 = dict(filter(lambda x: x[1] is not None, data.items()))
print(clean2)
# Output: {'a': 1, 'c': 3, 'e': 5}

Inverting a Dictionary

original = {"a": 1, "b": 2, "c": 3}

# Simple inversion (values must be unique and hashable)
inverted = {v: k for k, v in original.items()}
print(inverted)
# Output: {1: 'a', 2: 'b', 3: 'c'}

Performance

How Dictionaries Work

Python dictionaries use a hash table internally:

  1. When you insert a key, Python computes its hash
  2. The hash determines which slot of the table holds the entry
  3. To look up a value, Python computes the hash again, goes to that slot, and compares keys to confirm the match

# Hash values
print(hash("hello"))      # Some integer
print(hash(42))           # Some integer
print(hash((1, 2, 3)))    # Some integer

# This is why keys must be hashable
# Lists are NOT hashable:
# hash([1, 2, 3])  # TypeError: unhashable type: 'list'

Time Complexity

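| Operation           | Average Case | Worst Case |
|---------------------|--------------|------------|
| Access d[key]       | O(1)         | O(n)       |
| Insert d[key] = val | O(1)         | O(n)       |
| Delete del d[key]   | O(1)         | O(n)       |
| key in d            | O(1)         | O(n)       |
| Iteration           | O(n)         | O(n)       |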

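The O(n) worst case arises when many keys collide, that is, hash to the same slot, so a lookup has to compare against each of them in turn. With well-distributed hash functions, such as those of the built-in str and int types, this is extremely rare. A single insert can also take O(n) when the table is resized, but that cost averages out to O(1) per insert.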
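
The collision worst case can be provoked on purpose. The sketch below defines a hypothetical Key class whose hash can be forced to a constant, then counts how many equality comparisons it takes to build a dictionary. Exact counts are an implementation detail, so only the relative difference is meaningful.

```python
class Key:
    """Hypothetical key type that counts equality comparisons."""
    eq_calls = 0

    def __init__(self, value, collide):
        self.value = value
        self.collide = collide

    def __hash__(self):
        # With collide=True every instance hashes to the same value.
        return 1 if self.collide else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


def comparisons_to_build(n, collide):
    """Insert n distinct keys and report how many __eq__ calls that took."""
    Key.eq_calls = 0
    table = {Key(i, collide): i for i in range(n)}
    assert len(table) == n  # all keys are distinct, so none is overwritten
    return Key.eq_calls


spread = comparisons_to_build(200, collide=False)
collided = comparisons_to_build(200, collide=True)
print(spread, collided)  # collided is far larger than spread
```

The dictionary still returns correct results when every key collides; it just does much more work per operation.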

When to Use What

# Use dict when you need key-value mapping
user = {"name": "Alice", "age": 25}

# Use list when you need an ordered sequence with indexing
items = ["apple", "banana", "cherry"]

# Use set when you need unique values and fast membership testing
unique = {"apple", "banana", "cherry"}

# Membership testing (time complexity):
# dict: O(1) average
# list: O(n)
# set: O(1) average
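
These complexity figures can be checked with a rough timing sketch using the standard timeit module. The container size and repeat count below are arbitrary choices, and absolute timings will vary by machine, so no expected output is shown.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
as_dict = dict.fromkeys(as_list)
target = n - 1  # worst case for the list: the last element

timings = {}
for label, container in [("list", as_list), ("set", as_set), ("dict", as_dict)]:
    # Time 1000 membership tests against each container
    timings[label] = timeit.timeit(lambda: target in container, number=1000)
    print(f"{label}: {timings[label]:.6f} s for 1000 lookups")
```

On a typical run the list is expected to be slower by orders of magnitude, since each test scans the whole list, while set and dict timings stay close to each other.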

Common Mistakes

1. KeyError vs get()

config = {"host": "localhost"}

# Bad - raises KeyError
# port = config["port"]

# Good - provides default
port = config.get("port", 8080)
print(port)
# Output: 8080

2. Modifying Dict During Iteration

# Bad - raises RuntimeError
data = {"a": 1, "b": 2, "c": 3}
# for key in data:
#     if data[key] == 2:
#         del data[key]  # RuntimeError!

# Good - create new dict or collect keys first
data = {"a": 1, "b": 2, "c": 3}
data = {k: v for k, v in data.items() if v != 2}
print(data)
# Output: {'a': 1, 'c': 3}
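
The "collect keys first" alternative can be sketched as follows. Unlike the comprehension, it modifies the existing dictionary in place, which matters when other parts of a program hold a reference to it.

```python
data = {"a": 1, "b": 2, "c": 3}

# list(data) takes a snapshot of the keys, so deleting inside the loop is safe.
for key in list(data):
    if data[key] == 2:
        del data[key]

print(data)
# Output: {'a': 1, 'c': 3}
```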

3. Unhashable Objects as Keys

# Bad - lists are not hashable
# d = {[1, 2]: "value"}  # TypeError: unhashable type: 'list'

# Good - use tuples instead
d = {(1, 2): "value"}
print(d[(1, 2)])
# Output: value
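
Two related details, sketched with illustrative data: a tuple is only hashable if everything inside it is hashable, and frozenset is an option when the order of a key's elements should not matter.

```python
# A tuple is hashable only if every element is hashable.
tuple_key_failed = False
try:
    bad = {(1, [2, 3]): "value"}
except TypeError as exc:
    tuple_key_failed = True
    print(type(exc).__name__)  # TypeError

# frozenset works as a key when element order should not matter.
pair_scores = {frozenset({"alice", "bob"}): 3}
print(pair_scores[frozenset({"bob", "alice"})])
# Output: 3
```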

4. Shallow Copy Issues

original = {"a": [1, 2, 3]}

# Shallow copy - nested objects are shared
shallow = original.copy()
shallow["a"].append(4)
print(original)
# Output: {'a': [1, 2, 3, 4]} - original is modified!

# Deep copy - nested objects are independent
import copy
original = {"a": [1, 2, 3]}
deep = copy.deepcopy(original)
deep["a"].append(4)
print(original)
# Output: {'a': [1, 2, 3]} - original unchanged

5. Assuming Order Before Python 3.7

# Before Python 3.7, the language did not guarantee dict order
# (CPython 3.6 preserved it, but only as an implementation detail).
# From Python 3.7 on, insertion order is guaranteed.
d = {"z": 1, "a": 2, "m": 3}
print(list(d.keys()))
# Python 3.7+: ['z', 'a', 'm'] (insertion order)

Practice Exercises

Exercise 1: Word Frequency Counter

Write a function that counts word frequencies in a sentence and returns the top 3 most common words.

def top_words(sentence):
    """Count word frequencies and return top 3."""
    words = sentence.lower().split()
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1

    # Sort by frequency, then alphabetically for ties
    sorted_words = sorted(counts.items(), key=lambda x: (-x[1], x[0]))
    return sorted_words[:3]

text = "the cat sat on the mat the cat ate the rat"
print(top_words(text))
# Output: [('the', 4), ('cat', 2), ('ate', 1)]

Exercise 2: Merge Configuration Files

Write a function that merges two configuration dictionaries, where the second overrides the first for any conflicting keys. Nested dictionaries should be merged recursively rather than replaced wholesale.

def merge_configs(default, custom):
    """Merge two config dicts, custom overrides default."""
    result = default.copy()
    for key, value in custom.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = merge_configs(result[key], value)
        else:
            result[key] = value
    return result

default_config = {
    "host": "localhost",
    "port": 8080,
    "db": {"name": "mydb", "timeout": 30}
}
custom_config = {"port": 9090, "db": {"timeout": 60}}

merged = merge_configs(default_config, custom_config)
print(merged)
# Output: {'host': 'localhost', 'port': 9090, 'db': {'name': 'mydb', 'timeout': 60}}

Exercise 3: Invert and Group

Write a function that inverts a dictionary, grouping original keys by their values.

def invert_and_group(d):
    """Invert dict, grouping keys by value."""
    result = {}
    for key, value in d.items():
        result.setdefault(value, []).append(key)
    return result

data = {"Alice": "A", "Bob": "B", "Charlie": "A", "Diana": "B", "Eve": "A"}
grouped = invert_and_group(data)
print(grouped)
# Output: {'A': ['Alice', 'Charlie', 'Eve'], 'B': ['Bob', 'Diana']}

Key Takeaways

  • Dictionaries store key-value pairs with O(1) average lookup time
  • Use get() instead of [] to avoid KeyError when keys might be missing
  • Dictionary comprehensions are powerful for transforming and filtering data
  • Nested dictionaries are common but require careful access patterns
  • The | operator (Python 3.9+) is the cleanest way to merge dictionaries
  • Dictionaries are ordered by insertion order since Python 3.7
  • Keys must be hashable (strings, numbers, tuples of hashable items; not lists or dicts)
  • Use defaultdict or setdefault to handle missing keys gracefully
