Python Comments and Docstrings — Writing Professional Documentation

Clean, well-documented code is the hallmark of a professional developer. Python provides two primary mechanisms for documenting your code: comments for inline explanations and docstrings for structured documentation of modules, classes, and functions. This guide covers everything you need to write documentation that helps both you and your team.

Learning Objectives

By the end of this tutorial, you will be able to:

  • Write clear, purposeful single-line and multi-line comments
  • Understand when to comment and when code should be self-documenting
  • Create comprehensive docstrings following PEP 257
  • Use Google, NumPy, and Sphinx docstring styles effectively
  • Document modules, classes, and functions properly
  • Access documentation using __doc__ and help()
  • Use documentation tools like Sphinx and pdoc

Comments

Comments are annotations in your source code that are ignored by the Python interpreter. They exist solely for human readers.

Single-Line Comments

Single-line comments start with the # symbol and extend to the end of the line.

# This is a single-line comment

# Calculate the monthly interest rate
monthly_rate = annual_rate / 12

total = principal * (1 + monthly_rate) ** months  # compound interest formula

Best practices:

  • Always use a space after #
  • Capitalize the first letter if it's a complete sentence
  • Separate inline comments from the code with at least two spaces before #

Multi-Line Comments

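Python doesn't have a dedicated multi-line comment syntax. The idiomatic approach is to use consecutive # lines. A bare triple-quoted string is sometimes used instead, but it is not a comment: Python evaluates it as a string expression, and it only becomes a docstring when it is the first statement in a module, class, or function.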

# This is a multi-line comment.
# It spans several lines to explain
# a complex piece of logic.
# Each line starts with # followed by a space.

result = complex_calculation(data)
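
To see how Python treats a triple-quoted string that is used as a comment, the sketch below places one in the middle of a function. The function area is a made-up example. Only the first string literal becomes the docstring; the second is evaluated as an expression and discarded.

```python
def area(width, height):
    """Return the area of a rectangle."""
    """
    This second string is not a docstring and not a comment.
    Python evaluates it as an expression statement and throws it away.
    """
    return width * height


print(area.__doc__)  # Return the area of a rectangle.
print(area(3, 4))    # 12
```

For explanatory notes inside a function body, consecutive # lines remain the safer choice.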

When to Comment

Comments should explain why, not what. Good code is self-documenting for the "what."

# BAD: This comment adds no value
x = x + 1  # increment x

# GOOD: This explains the reasoning
x = x + 1  # offset by 1 to convert from 0-indexed to 1-indexed

Use comments for:

  • Explaining the rationale behind non-obvious decisions
  • Marking TODOs and FIXMEs
  • Temporarily disabling code during debugging
  • Providing context for complex algorithms

# TODO: Replace this with a proper database connection pool
# FIXME: This calculation doesn't account for leap years

# Uncomment the line below to enable debug mode
# DEBUG = True
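
Because TODO and FIXME markers live in comments, they can be collected with the standard tokenize module, which distinguishes real comments from # characters inside strings. The following is a minimal sketch; the helper name find_markers and the sample source are assumptions made for illustration.

```python
import io
import tokenize


def find_markers(source, markers=("TODO", "FIXME")):
    """Return (line_number, comment_text) pairs for marker comments."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            text = tok.string.lstrip("#").strip()
            if text.startswith(markers):
                found.append((tok.start[0], text))
    return found


sample = (
    "# TODO: Replace this with a proper database connection pool\n"
    "rate = 0.05  # FIXME: hard-coded rate\n"
    "label = '# TODO inside a string is not a comment'\n"
)

for line_number, text in find_markers(sample):
    print(line_number, text)
# 1 TODO: Replace this with a proper database connection pool
# 2 FIXME: hard-coded rate
```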

Docstrings

Docstrings are special string literals that appear as the first statement in a module, function, class, or method. Unlike comments, docstrings are accessible at runtime through the __doc__ attribute.

What Is a Docstring?

Python stores the docstring on the object's __doc__ attribute. If the first statement is anything other than a string literal, no docstring is recorded and __doc__ is None.

def greet(name):
    """Return a greeting message for the given person."""
    return f"Hello, {name}!"
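
The difference between a docstring and a comment is easy to check at runtime. In this small sketch (both function names are invented for the example), only the function with a string literal as its first statement gets a __doc__ value.

```python
def documented():
    """Return the answer."""
    return 42


def commented_only():
    # Return the answer.
    return 42


print(documented.__doc__)      # Return the answer.
print(commented_only.__doc__)  # None
```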

Why Use Docstrings?

  • Discoverability: Tools can extract and format documentation automatically
  • IDE support: Editors show docstrings as hover tooltips and autocomplete hints
  • Runtime access: Developers can inspect objects interactively
  • Professional quality: Well-documented code is easier to maintain and share

Accessing Docstrings

You can access docstrings at runtime using the __doc__ attribute or the help() function.

def add(a, b):
    """Return the sum of two numbers."""
    return a + b

print(add.__doc__)  # Output: Return the sum of two numbers.
help(add)
# Output:
# Help on function add in module __main__:
#
# add(a, b)
#     Return the sum of two numbers.
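
For multi-line docstrings, the raw __doc__ value may still carry the indentation of the source file, depending on the Python version. The standard inspect.getdoc() function returns a cleaned-up version with common indentation and trailing blank lines removed. A short sketch, using a made-up function named scale:

```python
import inspect


def scale(value, factor):
    """Multiply a value by a factor.

    Longer descriptions are indented in the source file to line up
    with the function body.
    """
    return value * factor


raw = scale.__doc__            # may still contain source indentation
clean = inspect.getdoc(scale)  # indentation and trailing blank lines removed

print(clean.splitlines()[0])  # Multiply a value by a factor.
print(clean.splitlines()[2])  # Longer descriptions are indented in the source file to line up
```

Prefer inspect.getdoc() when you need to display or parse docstrings programmatically.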

PEP 257 — Docstring Conventions

PEP 257 is the official Python Enhancement Proposal that defines docstring conventions. Key rules include:

  1. Always use triple double quotes ("""...""") for docstrings
  2. One-line docstrings should be on a single line with no blank lines before or after
  3. Multi-line docstrings should have a summary line, a blank line, and then the detailed description
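
Some of these rules can be checked mechanically. The sketch below uses the standard ast module to flag functions and classes that have no docstring or whose summary line is not followed by a blank line. The helper name check_docstrings and the sample source are assumptions for illustration; it covers only these two checks, not all of PEP 257.

```python
import ast


def check_docstrings(source):
    """Return (name, problem) pairs for functions and classes in source."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        doc = ast.get_docstring(node)
        if doc is None:
            problems.append((node.name, "missing docstring"))
            continue
        lines = doc.splitlines()
        # In a multi-line docstring the second line should be blank.
        if len(lines) > 1 and lines[1].strip():
            problems.append((node.name, "no blank line after summary"))
    return problems


sample = '''
def good():
    """Summary line.

    More detail here.
    """

def bad():
    """Summary line
    continues without a blank line.
    """

def missing():
    pass
'''

print(check_docstrings(sample))
# [('bad', 'no blank line after summary'), ('missing', 'missing docstring')]
```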

One-Line Docstrings

def square(x):
    """Return the square of a number."""
    return x ** 2

class Dog:
    """A simple class representing a dog."""
    pass

Multi-Line Docstrings

def fetch_user_data(user_id, include_deleted=False):
    """Fetch user data from the database.

    This function queries the database for a specific user's
    information including their profile, preferences, and
    activity history.

    Args:
        user_id (int): The unique identifier for the user.
        include_deleted (bool): Whether to include soft-deleted
            users. Defaults to False.

    Returns:
        dict: A dictionary containing user data with keys
        for 'profile', 'preferences', and 'activity'.

    Raises:
        ValueError: If user_id is negative.
        UserNotFoundError: If no user with the given ID exists.
    """
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # ... implementation

Docstring Styles

Different organizations and projects use different docstring formats. The three most popular styles are Google, NumPy, and Sphinx (reStructuredText).

Google Style

Google's style guide emphasizes readability with simple formatting.

def connect(host, port, timeout=30.0, retries=3):
    """Establish a connection to the remote server.

    Creates a new connection to the specified host and port
    with configurable timeout and retry behavior.

    Args:
        host (str): The hostname or IP address to connect to.
        port (int): The port number to connect on.
        timeout (float): Connection timeout in seconds.
            Defaults to 30.0.
        retries (int): Maximum number of connection attempts.
            Defaults to 3.

    Returns:
        Connection: An active connection object ready for use.

    Raises:
        ConnectionError: If all retry attempts are exhausted.
        TimeoutError: If the connection times out.
    """
    pass

NumPy Style

NumPy's style is popular in the scientific Python community and uses explicit section markers.

def calculate_statistics(data, axis=None, weights=None):
    """Compute weighted mean and standard deviation.

    Parameters
    ----------
    data : array_like
        The input data for which to compute statistics.
    axis : int, optional
        Axis along which to compute. If None (default),
        compute over the flattened array.
    weights : array_like, optional
        Importance weights for each element. Must have the
        same shape as data. Defaults to None (uniform weights).

    Returns
    -------
    mean : float
        The weighted mean of the data.
    std : float
        The weighted standard deviation of the data.

    Notes
    -----
    This function uses the numpy library for computation.
    The standard deviation is the population standard deviation
    (ddof=0).

    Examples
    --------
    >>> data = [1, 2, 3, 4, 5]
    >>> mean, std = calculate_statistics(data)
    >>> print(f"Mean: {mean:.2f}, Std: {std:.2f}")
    Mean: 3.00, Std: 1.41
    """
    pass
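
The Examples section in a docstring like this one is only trustworthy if the function really produces that output. As a rough illustration, here is a pure-Python sketch for flat sequences. It is not the NumPy-based implementation the docstring describes: the name weighted_statistics is hypothetical, the axis parameter is omitted, and only the standard library is used.

```python
import math


def weighted_statistics(data, weights=None):
    """Return the weighted mean and population standard deviation."""
    values = list(data)
    if weights is None:
        weights = [1.0] * len(values)  # uniform weights
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total
    variance = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / total
    return mean, math.sqrt(variance)


mean, std = weighted_statistics([1, 2, 3, 4, 5])
print(f"Mean: {mean:.2f}, Std: {std:.2f}")  # Mean: 3.00, Std: 1.41
```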

Sphinx Style (reStructuredText)

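Sphinx style uses reStructuredText field lists (such as :param: and :returns:) together with directives (such as note and warning). It is the format Sphinx understands natively, without extra extensions.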

def search(query, page=1, per_page=20):
    """Search the knowledge base for matching documents.

    :param query: The search query string.
    :type query: str
    :param page: Page number for pagination (1-indexed).
    :type page: int
    :param per_page: Number of results per page.
    :type per_page: int
    :returns: A dict with 'results' and 'total_count' keys.
    :rtype: dict
    :raises ValueError: If query is empty or per_page < 1.

    .. note::
       This search supports partial matching and fuzzy logic.

    .. warning::
       Large queries with per_page > 100 may be slow.
    """
    pass

Module, Class, and Function Docstrings

Module Docstrings

Place a docstring at the very top of your Python file to document the module's purpose.

"""
Authentication utilities for the web application.

This module provides functions for user authentication,
including password hashing, token generation, and session
management. It uses bcrypt for password hashing and JWT
for token-based authentication.

Typical usage example:

    from auth import authenticate, generate_token

    user = authenticate(username, password)
    if user:
        token = generate_token(user.id)
"""

import hashlib
from datetime import datetime, timedelta

__version__ = "2.1.0"
__author__ = "Your Name"

Class Docstrings

Document the class purpose, its attributes, and usage examples.

class CacheManager:
    """A thread-safe in-memory cache with TTL support.

    This class provides a simple key-value cache with automatic
    expiration. Items are stored in memory and will be evicted
    after their TTL expires.

    Attributes:
        max_size (int): Maximum number of items in the cache.
        default_ttl (int): Default time-to-live in seconds.

    Example:
        >>> cache = CacheManager(max_size=1000, default_ttl=300)
        >>> cache.set("user:123", {"name": "Alice"})
        >>> cache.get("user:123")
        {'name': 'Alice'}
    """

    def __init__(self, max_size=500, default_ttl=60):
        """Initialize the cache manager.

        Args:
            max_size (int): Maximum cache capacity. Defaults to 500.
            default_ttl (int): Default TTL in seconds. Defaults to 60.
        """
        self.max_size = max_size
        self.default_ttl = default_ttl
        self._cache = {}

Function and Method Docstrings

Document the function's purpose, parameters, return values, and exceptions.

def parse_config(filepath, validate=True):
    """Parse a YAML configuration file.

    Reads and validates a YAML configuration file, returning
    a structured dictionary of settings.

    Args:
        filepath (str): Path to the YAML configuration file.
        validate (bool): Whether to validate the configuration
            against the schema. Defaults to True.

    Returns:
        dict: Parsed configuration dictionary with all sections.

    Raises:
        FileNotFoundError: If the configuration file doesn't exist.
        yaml.YAMLError: If the file contains invalid YAML.
        ValidationError: If validation is enabled and the config
            fails schema validation.
    """
    pass

Common Documentation Mistakes

1. Repeating the Code in the Comment

# BAD
x = x + 1  # increment x by 1

# GOOD
x = x + 1  # compensate for 0-based array indexing

2. Outdated Documentation

Docstrings must be updated whenever the code changes. Outdated documentation is worse than no documentation, because readers trust it and act on information that is no longer true.

# BAD: the function now accepts a list, but the docstring still describes a single ID
def get_user(user_id):
    """Return a single user by ID."""
    pass

# GOOD: the docstring was updated to match the new behavior
def get_user(user_id):
    """Return user(s) by ID or list of IDs.

    Args:
        user_id (int or list): Single ID or list of IDs.

    Returns:
        User or list of User objects.
    """
    pass
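
One way to catch this kind of drift is to compare a function's signature with the parameters named in its docstring. The following is a minimal sketch that assumes Google-style Args sections; the helper undocumented_params and the example function get_users are hypothetical, and the regular expressions are deliberately simple rather than a full docstring parser.

```python
import inspect
import re


def undocumented_params(func):
    """Return parameter names missing from a Google-style Args section."""
    doc = inspect.getdoc(func) or ""
    # Capture the text between "Args:" and the next blank line (or the end).
    match = re.search(r"^Args:\n(.*?)(?:\n\s*\n|\Z)", doc, re.DOTALL | re.MULTILINE)
    section = match.group(1) if match else ""
    # Entries look like "name (type): description" or "name: description".
    documented = set(re.findall(r"^\s+(\w+)\s*(?:\([^)]*\))?:", section, re.MULTILINE))
    params = [p for p in inspect.signature(func).parameters if p not in ("self", "cls")]
    return [p for p in params if p not in documented]


def get_users(user_ids, include_inactive=False):
    """Return users by ID.

    Args:
        user_ids (list[int]): IDs to look up.

    Returns:
        list: Matching users.
    """
    return []


print(undocumented_params(get_users))  # ['include_inactive']
```

A check like this could run in a test suite so that a new parameter without documentation fails the build.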

3. Over-Documenting Obvious Code

# BAD
def add(a, b):
    """Add two numbers together.

    Args:
        a (int): The first number.
        b (int): The second number.

    Returns:
        int: The sum of a and b.
    """
    return a + b

# GOOD
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

4. Missing Return Value Documentation

# BAD
def calculate_tax(income):
    """Calculate income tax."""
    return income * 0.3

# GOOD
def calculate_tax(income):
    """Calculate income tax at the flat 30% rate.

    Args:
        income (float): Gross annual income.

    Returns:
        float: The calculated tax amount.
    """
    return income * 0.3

5. Inconsistent Docstring Style

Pick one style and use it consistently throughout your project. Mixing styles creates confusion and reduces readability.


Documentation Tools

Sphinx

Sphinx is the most widely used tool for building Python documentation from docstrings and standalone reStructuredText files. The official Python documentation is built with it.

pip install sphinx
sphinx-quickstart docs
cd docs
make html

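sphinx-quickstart generates docs/conf.py for you. Edit it to enable the extensions that pull in docstrings (autodoc) and parse Google and NumPy styles (napoleon):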

# docs/conf.py
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.napoleon',
    'sphinx.ext.viewcode',
]

napoleon_google_docstring = True
napoleon_numpy_docstring = True

pdoc

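pdoc is a simpler alternative for quick API documentation. Current releases write static HTML with the -o option; the --html flag shown in older tutorials belongs to the separate pdoc3 project.

pip install pdoc
pdoc mymodule.py -o docs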

doctest

Python's built-in doctest module lets you embed executable examples in your docstrings.

def fibonacci(n):
    """Return the nth Fibonacci number.

    >>> fibonacci(0)
    0
    >>> fibonacci(1)
    1
    >>> fibonacci(10)
    55
    >>> fibonacci(-1)
    Traceback (most recent call last):
    ...
    ValueError: n must be non-negative
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b

if __name__ == "__main__":
    import doctest
    doctest.testmod()
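
Doctests can also be run for a single function from your own code, which is handy in a test suite. The sketch below uses doctest.DocTestFinder and doctest.DocTestRunner; the helper run_examples is hypothetical, and the second example in double is deliberately wrong to show how a stale docstring is reported.

```python
import doctest


def double(x):
    """Return twice the input.

    >>> double(2)
    4
    >>> double(5)
    11
    """
    return 2 * x


def run_examples(func):
    """Run the doctest examples in func's docstring; return (failures, tries)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        # Discard the failure report; only the counts are needed here.
        runner.run(test, out=lambda text: None)
    return runner.failures, runner.tries


print(run_examples(double))  # (1, 2)
```

One failure out of two attempts means the docstring example for double(5) no longer matches what the code returns.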

Practice Exercises

Exercise 1: Document a Data Processing Function

Write a complete docstring for this function using Google style:

def process_sales_data(records, min_amount=0, region=None):
    processed = []
    for record in records:
        if record['amount'] < min_amount:
            continue
        if region and record['region'] != region:
            continue
        processed.append({
            'id': record['id'],
            'amount': record['amount'],
            'category': record.get('category', 'Uncategorized')
        })
    return processed

Solution:

def process_sales_data(records, min_amount=0, region=None):
    """Filter and normalize sales records.

    Processes a list of sales records, filtering by minimum
    amount and optionally by region. Normalizes output format
    by including a default category for records missing one.

    Args:
        records (list[dict]): List of sales record dictionaries.
            Each must contain 'id', 'amount', and 'region' keys.
        min_amount (float): Minimum sale amount to include.
            Defaults to 0 (include all).
        region (str, optional): Region code to filter by.
            Defaults to None (all regions).

    Returns:
        list[dict]: Filtered and normalized records with 'id',
        'amount', and 'category' keys.

    Example:
        >>> records = [
        ...     {'id': 1, 'amount': 100, 'region': 'US'},
        ...     {'id': 2, 'amount': 50, 'region': 'EU'},
        ... ]
        >>> process_sales_data(records, min_amount=60, region='US')
        [{'id': 1, 'amount': 100, 'category': 'Uncategorized'}]
    """
    processed = []
    for record in records:
        if record['amount'] < min_amount:
            continue
        if region and record['region'] != region:
            continue
        processed.append({
            'id': record['id'],
            'amount': record['amount'],
            'category': record.get('category', 'Uncategorized')
        })
    return processed

Exercise 2: Document a Class

Add a class docstring and method docstrings to this BankAccount class using NumPy style:

class BankAccount:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance
        self.transactions = []

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("Deposit amount must be positive")
        self.balance += amount
        self.transactions.append(('deposit', amount))
        return self.balance

    def withdraw(self, amount):
        if amount <= 0:
            raise ValueError("Withdrawal amount must be positive")
        if amount > self.balance:
            raise ValueError("Insufficient funds")
        self.balance -= amount
        self.transactions.append(('withdraw', amount))
        return self.balance

Solution:

class BankAccount:
    """A simple bank account with transaction history.

    Manages deposits, withdrawals, and maintains a record
    of all transactions for auditing purposes.

    Attributes
    ----------
    owner : str
        The name of the account holder.
    balance : float
        Current account balance.
    transactions : list[tuple]
        List of (type, amount) tuples recording all activity.

    Examples
    --------
    >>> account = BankAccount("Alice", 1000)
    >>> account.deposit(500)
    1500
    >>> account.withdraw(200)
    1300
    """

    def __init__(self, owner, balance=0):
        """Initialize the bank account.

        Parameters
        ----------
        owner : str
            The account holder's name.
        balance : float, optional
            Initial deposit amount. Defaults to 0.
        """
        self.owner = owner
        self.balance = balance
        self.transactions = []

    def deposit(self, amount):
        """Deposit funds into the account.

        Parameters
        ----------
        amount : float
            The amount to deposit. Must be positive.

        Returns
        -------
        float
            The new account balance.

        Raises
        ------
        ValueError
            If amount is not positive.
        """
        if amount <= 0:
            raise ValueError("Deposit amount must be positive")
        self.balance += amount
        self.transactions.append(('deposit', amount))
        return self.balance

    def withdraw(self, amount):
        """Withdraw funds from the account.

        Parameters
        ----------
        amount : float
            The amount to withdraw. Must be positive and
            not exceed the current balance.

        Returns
        -------
        float
            The new account balance.

        Raises
        ------
        ValueError
            If amount is not positive or exceeds balance.
        """
        if amount <= 0:
            raise ValueError("Withdrawal amount must be positive")
        if amount > self.balance:
            raise ValueError("Insufficient funds")
        self.balance -= amount
        self.transactions.append(('withdraw', amount))
        return self.balance

Exercise 3: Write a Module Docstring

Create a comprehensive module docstring for a file called validators.py that contains email, URL, and phone number validation functions.

Solution:

"""
Data validation utilities.

Provides functions for validating common data formats
including email addresses, URLs, and phone numbers.

All validators return True if the input is valid and
raise ValueError with a descriptive message if not.

Typical usage example:

    from validators import validate_email, validate_phone

    try:
        validate_email(user_input)
        validate_phone(raw_input, country='US')
    except ValueError as error:
        print(f"Rejected: {error}")
    else:
        process_registration(user_input)
"""

import re
from urllib.parse import urlparse


def validate_email(email):
    """Validate an email address format."""
    # implementation
    pass


def validate_url(url):
    """Validate a URL format."""
    # implementation
    pass


def validate_phone(phone, country='US'):
    """Validate a phone number for the given country."""
    # implementation
    pass
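
To illustrate the contract stated in the module docstring (return True when valid, raise ValueError otherwise), here is one possible sketch of validate_url. The rule it applies, accepting only http and https URLs that have a host, is an assumption made for this example and not part of the exercise.

```python
from urllib.parse import urlparse


def validate_url(url):
    """Validate a URL format.

    Args:
        url (str): The URL to check.

    Returns:
        bool: True if the URL is valid.

    Raises:
        ValueError: If the scheme is not http or https, or the host is missing.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Invalid URL: {url!r}")
    return True


print(validate_url("https://example.com/docs"))  # True
```

Documenting the Raises section matters here because callers need to know to wrap the call in try/except rather than test the return value.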

Key Takeaways

  1. Comments explain why; code explains what — Use comments to clarify reasoning, not to restate what the code does
  2. Docstrings are runtime documentation — They are accessible via __doc__ and help(), making them valuable for interactive exploration
  3. Follow PEP 257 — Use triple double quotes, write one-line and multi-line docstrings properly, and include a summary line
  4. Choose one docstring style — Google, NumPy, or Sphinx — and use it consistently across your project
  5. Document module, class, and function boundaries — Each should have its own docstring describing purpose, parameters, and return values
  6. Keep documentation updated — Outdated documentation is misleading; update docstrings whenever you change code
  7. Use tools like Sphinx and pdoc — Automate documentation generation to reduce manual effort
  8. Write testable docstrings — The doctest module can verify that your examples actually work

Professional documentation isn't overhead — it's an investment that pays dividends in maintainability, onboarding, and code quality.
