← Math|87 of 100
Numerical Methods

Root Finding

Master bisection, Newton's method, secant method, and fixed-point iteration for finding zeros.

📂 Roots📖 Lesson 87 of 100🎓 Free Course

Advertisement

Root Finding

ℹ️ Why It Matters

Root finding is one of the oldest and most fundamental problems in numerical computation. Every optimization algorithm reduces to finding where the gradient equals zero. Engineering simulations solve nonlinear equations at every timestep. Chemical equilibrium, financial pricing, and machine learning all depend on finding zeros of functions that cannot be solved analytically.

💡 Practical Impact

Finding the root of f(x) = 0 is equivalent to minimizing f(x)². Newton's method for root finding generalizes to Newton's method for optimization. The secant method is the backbone of quasi-Newton optimizers like BFGS used in almost every ML framework.


Definitions

DfRoot (Zero) of a Function

A root (or zero) of a function f is a value x* such that f(x*) = 0. Finding roots means solving the equation f(x) = 0, which may be transcendental (e.g., cos(x) = x), polynomial, or any other nonlinear equation.

DfBracket

A bracket is an interval [a, b] where f(a) and f(b) have opposite signs. By the Intermediate Value Theorem, if f is continuous on [a, b] and f(a)·f(b) < 0, then at least one root exists in (a, b).

DfConvergence Order

An iterative root-finding method converges with order p if |x_{n+1} − x*| ≤ C|x_n − x*|^p for some constant C. Order 1 is linear, order 2 is quadratic, order 1.618 (golden ratio) is superlinear (secant method).

DfSimple Root vs Multiple Root

A root x* of multiplicity m satisfies f(x*) = f'(x*) = ... = f^(m-1)(x*) = 0 but f^(m)(x*) ≠ 0. For simple roots (m = 1), Newton's method converges quadratically. For multiple roots, convergence degrades to linear unless modified.

DfFixed Point

A fixed point of a function g is a value x* such that g(x*) = x*. Finding a root of f(x) = 0 is equivalent to finding a fixed point of g(x) = x − f(x), or any g(x) = x + c·f(x) for suitable c.


Formulas

Bisection Method

c=a+b2,update: ac or bcc = \frac{a + b}{2}, \quad \text{update: } a \leftarrow c \text{ or } b \leftarrow c

Here,

  • a,ba, b=Current bracket endpoints with f(a)·f(b) < 0
  • cc=Midpoint of the bracket

Bisection Error Bound

xnxba2n|x_n - x^*| \leq \frac{b - a}{2^n}

Here,

  • xnx_n=Approximation after n iterations
  • xx*=True root
  • a,ba, b=Initial bracket

Newton-Raphson Method

xn+1=xnf(xn)f(xn)x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}

Here,

  • xnx_n=Current approximation
  • f(xn)f(x_n)=Function value at x_n
  • f(xn)f'(x_n)=Derivative at x_n (must be nonzero)

Newton's Method for Square Root

xn+1=12(xn+axn)x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right)

Here,

  • aa=Number whose square root is sought
  • xnx_n=Current approximation of √a

Secant Method

xn+1=xnf(xn)xnxn1f(xn)f(xn1)x_{n+1} = x_n - f(x_n) \cdot \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}

Here,

  • xn,xn1x_n, x_{n-1}=Two most recent approximations
  • f(xn),f(xn1)f(x_n), f(x_{n-1})=Function values at those points

Fixed-Point Iteration

xn+1=g(xn)x_{n+1} = g(x_n)

Here,

  • gg=Fixed-point function (rewriting f(x) = 0 as x = g(x))
  • xnx_n=Current approximation

Fixed-Point Convergence Condition

g(x)<1|g'(x^*)| < 1

Here,

  • gg'=Derivative of the fixed-point function
  • xx*=The fixed point

Modified Newton for Multiple Roots

xn+1=xnmf(xn)f(xn)x_{n+1} = x_n - m \cdot \frac{f(x_n)}{f'(x_n)}

Here,

  • mm=Multiplicity of the root (known or estimated)

Bisection Method

ℹ️ How Bisection Works

Bisection is the simplest guaranteed-convergence method. Start with a bracket [a, b] where f changes sign. Compute the midpoint c = (a+b)/2. If f(c) has the same sign as f(a), the root is in [c, b]; otherwise it is in [a, c]. Repeat until the interval is smaller than the desired tolerance.

Properties

  • Guaranteed convergence: Always finds a root if one exists in the bracket
  • Linear convergence: Error halved each iteration (order 1)
  • Error bound: |x_n − x*| ≤ (b − a) / 2^n
  • Derivative-free: Only requires function evaluations
  • Drawback: Slow; cannot find roots without a bracket; cannot find multiple roots simultaneously

Newton-Raphson Method

ℹ️ How Newton's Method Works

Newton's method uses the tangent line at x_n to find the next approximation. The tangent line crosses the x-axis at x_{n+1} = x_n − f(x_n)/f'(x_n). This is equivalent to one step of Newton's method for finding the root of the linear approximation.

Properties

  • Quadratic convergence: Error squared each iteration near a simple root (order 2)
  • Requires derivative: f'(x) must be known and computable
  • Requires good initial guess: Can diverge if x_0 is far from the root
  • Division by zero: Fails when f'(x_n) = 0
  • Multiple roots: Convergence degrades to linear for roots with multiplicity > 1

Secant Method

ℹ️ How the Secant Method Works

The secant method approximates the derivative using two previous points: f'(x_n) ≈ (f(x_n) − f(x_{n-1})) / (x_n − x_{n-1}). Substituting into Newton's formula eliminates the need for derivatives. It requires two initial guesses but converges faster than bisection.

Properties

  • Superlinear convergence: Order φ ≈ 1.618 (golden ratio)
  • No derivative needed: Uses finite difference approximation
  • Two initial points: Requires x_0 and x_1
  • Not guaranteed to converge: Can diverge or cycle if initial guesses are poor
  • No bracket required: Unlike bisection, does not need a sign change

Fixed-Point Iteration

ℹ️ How Fixed-Point Iteration Works

Rewrite f(x) = 0 as x = g(x) for some function g. Starting from x_0, compute x_{n+1} = g(x_n). Convergence requires |g'(x*)| < 1 near the fixed point. The closer |g'(x*)| is to 0, the faster the convergence.

Choosing g(x)

For f(x) = x³ − x − 2 = 0, possible reformulations:

  1. g₁(x) = x³ − 2 → g₁'(x) = 3x², |g₁'(1.5)| = 6.75 > 1 (diverges)
  2. g₂(x) = (x + 2)^(1/3) → g₂'(x) = 1/[3(x+2)^(2/3)], |g₂'(1.5)| = 0.21 < 1 (converges)
  3. g₃(x) = x − f(x)/f'(x) → This is Newton's method

Examples

📝Finding a Root of x³ − x − 2 = 0

This polynomial has one real root near x ≈ 1.5214. We will apply all four methods and compare convergence.

💡Implementation: All Four Methods Compared

import numpy as np


def bisection(f, a, b, tol=1e-10, max_iter=100):
    """Bisection method with bracket validation."""
    if f(a) * f(b) > 0:
        raise ValueError("No sign change in [a, b]")

    history = []
    for i in range(max_iter):
        c = (a + b) / 2
        fc = f(c)
        history.append((i, c, abs(fc)))

        if abs(fc) < tol or (b - a) / 2 < tol:
            return c, history

        if f(a) * fc < 0:
            b = c
        else:
            a = c

    return (a + b) / 2, history


def newton(f, df, x0, tol=1e-10, max_iter=100):
    """Newton-Raphson method."""
    history = []
    x = x0

    for i in range(max_iter):
        fx = f(x)
        dfx = df(x)
        history.append((i, x, abs(fx)))

        if abs(fx) < tol:
            return x, history

        if abs(dfx) < 1e-15:
            raise RuntimeError("Derivative near zero; method fails")

        x = x - fx / dfx

    return x, history


def secant(f, x0, x1, tol=1e-10, max_iter=100):
    """Secant method."""
    history = []
    f0, f1 = f(x0), f(x1)

    for i in range(max_iter):
        history.append((i, x1, abs(f1)))

        if abs(f1) < tol:
            return x1, history

        if abs(f1 - f0) < 1e-15:
            raise RuntimeError("Function values too close")

        x_new = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1
        x1 = x1 - f1 * (x1 - x0) / (f1 - f0) if False else x_new
        f1 = f(x1)

    return x1, history


def fixed_point(g, x0, tol=1e-10, max_iter=100):
    """Fixed-point iteration."""
    history = []
    x = x0

    for i in range(max_iter):
        x_new = g(x)
        history.append((i, x, abs(x_new - x)))

        if abs(x_new - x) < tol:
            return x_new, history

        x = x_new

    return x, history


# Define the function and its derivative
f = lambda x: x**3 - x - 2
df = lambda x: 3*x**2 - 1

# Fixed-point function (convergent)
g = lambda x: (x + 2)**(1/3)

print("Root Finding: x^3 - x - 2 = 0")
print("=" * 50)

# Bisection
root_bis, hist_bis = bisection(f, 1, 2)
print(f"Bisection: x = {root_bis:.12f} ({len(hist_bis)} iterations)")

# Newton
root_new, hist_new = newton(f, df, 1.5)
print(f"Newton:    x = {root_new:.12f} ({len(hist_new)} iterations)")

# Secant
root_sec, hist_sec = secant(f, 1.0, 2.0)
print(f"Secant:    x = {root_sec:.12f} ({len(hist_sec)} iterations)")

# Fixed-point
root_fp, hist_fp = fixed_point(g, 1.5)
print(f"Fixed-pt:  x = {root_fp:.12f} ({len(hist_fp)} iterations)")

# Verify
print(f"\nVerification: f({root_new:.12f}) = {f(root_new):.2e}")

📝Convergence Rate Comparison

The convergence order determines how many iterations are needed for a given accuracy. Quadratic convergence (Newton) roughly doubles the number of correct digits each iteration. Linear convergence (bisection) adds a constant number of digits.

💡Convergence Rate Demonstration

import numpy as np

def convergence_analysis():
    """Compare convergence rates of different methods."""
    f = lambda x: x**2 - 2  # Root is sqrt(2)
    df = lambda x: 2*x

    exact = np.sqrt(2)

    # Newton's method (quadratic)
    x = 1.5
    print("Newton's Method (quadratic convergence):")
    for i in range(6):
        error = abs(x - exact)
        print(f"  Iter {i}: x = {x:.15f}, error = {error:.2e}")
        x = x - f(x) / df(x)

    print()

    # Secant method (superlinear, order ~1.618)
    x0, x1 = 1.0, 2.0
    print("Secant Method (superlinear convergence):")
    for i in range(6):
        error = abs(x1 - exact)
        print(f"  Iter {i}: x = {x1:.15f}, error = {error:.2e}")
        x_new = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        x0, x1 = x1, x_new

    print()

    # Bisection (linear)
    a, b = 1.0, 2.0
    print("Bisection Method (linear convergence):")
    for i in range(6):
        c = (a + b) / 2
        error = abs(c - exact)
        print(f"  Iter {i}: x = {c:.15f}, error = {error:.2e}")
        if f(a) * f(c) < 0:
            b = c
        else:
            a = c

convergence_analysis()

📝Newton's Method for Matrix Square Root

Newton's method extends to matrix equations. To find X such that X² = A (matrix square root), the iteration is X_{n+1} = 0.5(X_n + A·X_n⁻¹).

💡Matrix Square Root via Newton's Method

import numpy as np

def matrix_sqrt_newton(A, tol=1e-10, max_iter=100):
    """Newton's method for matrix square root: X² = A."""
    X = A.copy().astype(float)

    for i in range(max_iter):
        X_next = 0.5 * (X + A @ np.linalg.inv(X))

        if np.linalg.norm(X_next - X) < tol:
            return X_next, i + 1

        X = X_next

    return X, max_iter

# Example
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

X, iters = matrix_sqrt_newton(A)
print(f"Matrix square root:\n{X}")
print(f"\nVerification X²:\n{X @ X}")
print(f"\nOriginal A:\n{A}")
print(f"\nError: {np.linalg.norm(X @ X - A):.2e}")
print(f"Iterations: {iters}")

Theorems

ThIntermediate Value Theorem (Basis for Bisection)

If f is continuous on [a, b] and f(a) and f(b) have opposite signs (i.e., f(a)·f(b) < 0), then there exists at least one c ∈ (a, b) such that f(c) = 0.

ThNewton's Method Quadratic Convergence

If f is twice continuously differentiable, x* is a simple root (f'(x*) ≠ 0), and x_0 is sufficiently close to x*, then Newton's method converges quadratically: |x_{n+1} − x*| ≤ C|x_n − x*|² where C = |f''(x*)| / (2|f'(x*)|).

ThSecant Method Superlinear Convergence

The secant method converges superlinearly with order p = (1 + √5)/2 ≈ 1.618 (the golden ratio). The error satisfies: |e_{n+1}| ≤ C|e_n|^p where e_n = x_n − x*. This is faster than linear but slower than quadratic.

ThFixed-Point Convergence Theorem

If g is continuously differentiable on [a, b] and |g'(x)| ≤ L < 1 for all x ∈ [a, b], and g maps [a, b] into itself, then the fixed-point iteration x_{n+1} = g(x_n) converges to the unique fixed point x* for any x_0 ∈ [a, b].

ThContractive Mapping Theorem (Banach Fixed-Point Theorem)

If g is a contraction mapping on a complete metric space (i.e., |g(x) − g(y)| ≤ L|x − y| for some L < 1), then g has a unique fixed point x*, and the iteration converges to x* for any starting point.


Python Implementation

import numpy as np
from typing import Callable, Tuple, List, Optional


class RootFinder:
    """Comprehensive root-finding toolkit."""

    @staticmethod
    def bisection(f: Callable, a: float, b: float,
                  tol: float = 1e-10, max_iter: int = 100) -> Tuple[float, List]:
        """Bisection method with full history."""
        if f(a) * f(b) > 0:
            raise ValueError("No sign change in [a, b]")
        if f(a) * f(b) == 0:
            return a if f(a) == 0 else b, []

        history = []
        for i in range(max_iter):
            c = (a + b) / 2
            fc = f(c)
            history.append({'iter': i, 'x': c, 'f(x)': fc,
                           'interval': (a, b), 'width': b - a})

            if abs(fc) < tol or (b - a) / 2 < tol:
                return c, history

            if f(a) * fc < 0:
                b = c
            else:
                a = c

        return (a + b) / 2, history

    @staticmethod
    def newton(f: Callable, df: Callable, x0: float,
               tol: float = 1e-10, max_iter: int = 100) -> Tuple[float, List]:
        """Newton-Raphson method."""
        history = []
        x = x0

        for i in range(max_iter):
            fx = f(x)
            dfx = df(x)
            history.append({'iter': i, 'x': x, 'f(x)': fx, 'df(x)': dfx})

            if abs(fx) < tol:
                return x, history

            if abs(dfx) < 1e-15:
                raise RuntimeError(f"Derivative near zero at x={x}")

            x = x - fx / dfx

        return x, history

    @staticmethod
    def secant(f: Callable, x0: float, x1: float,
               tol: float = 1e-10, max_iter: int = 100) -> Tuple[float, List]:
        """Secant method."""
        history = []
        f0, f1 = f(x0), f(x1)

        for i in range(max_iter):
            history.append({'iter': i, 'x': x1, 'f(x)': f1})

            if abs(f1) < tol:
                return x1, history

            if abs(f1 - f0) < 1e-15:
                raise RuntimeError("Denominator too small")

            x_new = x1 - f1 * (x1 - x0) / (f1 - f0)
            x0, f0 = x1, f1
            x1 = x_new
            f1 = f(x1)

        return x1, history

    @staticmethod
    def fixed_point(g: Callable, x0: float,
                    tol: float = 1e-10, max_iter: int = 100) -> Tuple[float, List]:
        """Fixed-point iteration."""
        history = []
        x = x0

        for i in range(max_iter):
            x_new = g(x)
            history.append({'iter': i, 'x': x, 'x_new': x_new,
                           'diff': abs(x_new - x)})

            if abs(x_new - x) < tol:
                return x_new, history

            x = x_new

        return x, history

    @staticmethod
    def brentq(f: Callable, a: float, b: float,
               tol: float = 1e-10, max_iter: int = 100) -> Tuple[float, List]:
        """Brent's method — combines bisection with inverse quadratic interpolation."""
        if f(a) * f(b) > 0:
            raise ValueError("No sign change")

        fa, fb = f(a), f(b)
        history = []

        for i in range(max_iter):
            if abs(fb) < tol:
                return b, history

            # Try inverse quadratic interpolation
            c, fc = a, fa
            d = b - a
            e = d

            if abs(fa) > abs(fb):
                a, b = b, a
                fa, fb = fb, fa

            m = 0.5 * (a - b)

            if abs(m) < tol or fb == 0:
                return b, history

            if abs(e) >= tol and abs(fa) > abs(fb):
                s = fb / fa
                if a == c:
                    # Secant
                    p = 2 * m * s
                    q = 1 - s
                else:
                    # Inverse quadratic
                    q = fa / fc
                    r = fb / fc
                    p = s * (2 * m * q * (q - r) - (b - a) * (r - 1))
                    q = (q - 1) * (r - 1) * (s - 1)

                if p > 0:
                    q = -q
                else:
                    p = -p

                if 2 * p < min(3 * m * q - tol * abs(q),
                               abs(e * q)):
                    e = d
                    d = p / q
                else:
                    d = m
                    e = m
            else:
                d = m
                e = m

            a, fa = b, fb
            if abs(d) > tol:
                b = b + d
            else:
                b = b - tol if m > 0 else b + tol
            fb = f(b)
            history.append({'iter': i, 'x': b, 'f(x)': fb})

        return b, history


def demonstrate_methods():
    """Compare all methods on several test functions."""
    print("=" * 60)
    print("ROOT FINDING METHOD COMPARISON")
    print("=" * 60)

    tests = [
        ("x^3 - x - 2 = 0",
         lambda x: x**3 - x - 2,
         lambda x: 3*x**2 - 1,
         (1, 2), 1.5),
        ("cos(x) - x = 0",
         lambda x: np.cos(x) - x,
         lambda x: -np.sin(x) - 1,
         (0, 1), 0.5),
        ("e^x - 3 = 0",
         lambda x: np.exp(x) - 3,
         lambda x: np.exp(x),
         (0, 2), 1.0),
    ]

    finder = RootFinder()

    for name, f, df, bracket, x0 in tests:
        print(f"\n--- {name} ---")
        root, _ = finder.bisection(f, bracket[0], bracket[1])
        print(f"  Bisection: {root:.12f}")
        root, _ = finder.newton(f, df, x0)
        print(f"  Newton:    {root:.12f}")
        root, _ = finder.secant(f, bracket[0], bracket[1])
        print(f"  Secant:    {root:.12f}")

demonstrate_methods()

Applications in AI/ML

ℹ️ Root Finding in Machine Learning

Root finding is foundational to optimization. Every gradient-based optimizer finds where ∇f(x) = 0. Newton's method for optimization uses the Hessian: x_{n+1} = x_n − H⁻¹∇f(x_n). The secant method generalizes to quasi-Newton methods (BFGS, L-BFGS) that approximate the Hessian.

Optimization Connection

Root FindingOptimization Equivalent
Find x where f(x) = 0Find x where ∇f(x) = 0
Newton: x − f/f'Newton: x − H⁻¹∇f
Secant approximationBFGS Hessian approximation
BisectionGolden section search

Specific Applications

  • Logistic regression: Solving for weights requires finding roots of the gradient
  • Newton's method in neural networks: Second-order optimizers use Hessian inverse
  • Brent's method in scipy.optimize: Used for line search in L-BFGS
  • Fixed-point iteration: Expectation-Maximization (EM) algorithm is a fixed-point iteration
  • Eigenvalue problems: Finding eigenvalues reduces to finding roots of det(A − λI) = 0

Common Mistakes

MistakeProblemSolution
No bracket for bisectionMay miss the root entirelyVerify f(a)·f(b) < 0 before starting
Starting Newton far from rootMethod may divergeUse bisection to get close, then switch to Newton
Ignoring f'(x) = 0 in NewtonDivision by zero crashesCheck
Using fixed-point iteration blindlyMay diverge ifg'(x*)
Not checking convergenceMay return early or iterate foreverUse both
Assuming unique rootMay find wrong rootUse bracketing methods to isolate roots first
Too tight toleranceMore iterations than neededSet tolerance relative to problem scale
Ignoring floating point issuesCancellation in secant methodUse robust methods like Brent's when possible

Interview Questions

Q1: When would you use bisection vs Newton's method? A: Use bisection when you need guaranteed convergence and have a bracket, or when derivatives are unavailable. Use Newton's when you have a good initial guess and the derivative, for faster convergence. In practice, use a hybrid: bisection to bracket, Newton to refine.

Q2: Why does Newton's method fail for multiple roots? A: At a root of multiplicity m, f'(x*) = 0, so the tangent line is nearly horizontal. The iteration becomes x_{n+1} = x_n − m·f(x_n)/f'(x_n), converging linearly instead of quadratically. The modification requires knowing m, which is often unknown.

Q3: Explain the secant method's convergence order. A: The secant method has order p = (1+√5)/2 ≈ 1.618. This is proven by analyzing the error recurrence e_{n+1} ≈ C·e_n·e_{n-1}, which leads to the characteristic equation p² = p + 1, giving the golden ratio. It is faster than linear (bisection) but slower than quadratic (Newton).

Q4: How does Brent's method combine bisection and interpolation? A: Brent's method uses inverse quadratic interpolation when three points are available and the interpolation step stays within the current bracket. If the interpolation step would leave the bracket or fails to reduce the error sufficiently, it falls back to bisection. This gives the robustness of bisection with the speed of higher-order methods.

Q5: Give an example where fixed-point iteration diverges. A: For x² − 2 = 0, rewriting as g(x) = x² + 2 gives |g'(√2)| = 2√2 ≈ 2.83 > 1, so iteration diverges. Rewriting as g(x) = 2/x gives |g'(√2)| = 2/2 = 1, which is borderline. Only g(x) = (x + 2/x)/2 (Newton's method) has |g'(√2)| = 0, converging fastest.

Q6: How is root finding used in line search for optimization? A: Line search finds α that minimizes f(x + αd). This requires finding where d/dα f(x + αd) = 0, a one-dimensional root-finding problem. Brent's method is commonly used because it combines guaranteed convergence with speed. Wolfe conditions provide stopping criteria that ensure sufficient decrease.


Practice Problems

📝Problem 1: Find All Roots of a Polynomial

Find all real roots of x⁴ − 5x² + 4 = 0 using a combination of bracketing and Newton's method.

💡Solution

import numpy as np

def find_all_roots(f, df, search_range=(-10, 10), n_points=1000):
    """Find all roots by scanning for sign changes then refining with Newton."""
    roots = []
    x_vals = np.linspace(search_range[0], search_range[1], n_points)
    f_vals = [f(x) for x in x_vals]

    for i in range(len(f_vals) - 1):
        if f_vals[i] * f_vals[i+1] < 0:
            # Sign change found — use bisection to bracket, then Newton
            a, b = x_vals[i], x_vals[i+1]
            c = (a + b) / 2
            for _ in range(50):
                fc = f(c)
                if abs(fc) < 1e-12:
                    break
                if f(a) * fc < 0:
                    b = c
                else:
                    a = c
                c = (a + b) / 2

            # Refine with Newton
            x = c
            for _ in range(20):
                x = x - f(x) / df(x)
            roots.append(x)

    return roots

f = lambda x: x**4 - 5*x**2 + 4
df = lambda x: 4*x**3 - 10*x

roots = find_all_roots(f, df)
print(f"Found roots: {[f'{r:.6f}' for r in roots]}")
print(f"Values:      {[f'{f(r):.2e}' for r in roots]}")
# Expected: ±1, ±2

📝Problem 2: Newton's Method with Backtracking

Implement Newton's method with backtracking line search to ensure convergence even with poor initial guesses.

💡Solution

import numpy as np

def newton_backtracking(f, df, x0, alpha=0.5, beta=0.5,
                        tol=1e-10, max_iter=100):
    """Newton's method with backtracking line search."""
    x = x0

    for i in range(max_iter):
        fx = f(x)
        dfx = df(x)

        if abs(fx) < tol:
            return x, i

        # Newton direction
        dx = -fx / dfx

        # Backtracking line search
        t = 1.0
        while abs(f(x + t * dx)) > abs(fx) - 1e-4 * t * abs(dfx * dx):
            t *= beta
            if t < 1e-15:
                break

        x = x + t * dx

    return x, max_iter

# Test on a challenging function
f = lambda x: np.log(x) - 1  # Root at e
df = lambda x: 1/x

root, iters = newton_backtracking(f, df, x0=10.0)  # Far from root
print(f"Root: {root:.10f} (e = {np.e:.10f})")
print(f"Iterations: {iters}")

Quick Reference

📋Root Finding Quick Reference

MethodConvergenceDerivativeBracketInitial Guesses
BisectionLinear (order 1)NoRequired1 (bracket)
NewtonQuadratic (order 2)YesNo1
SecantSuperlinear (order 1.618)NoNo2
Fixed-pointLinear (order depends on g')NoNo1
Brent'sSuperlinear + robustNoRequired1 (bracket)

Error bounds:

  • Bisection after n iterations: |error| ≤ (b − a) / 2^n
  • Newton near simple root: |e_{n+1}| ≤ C|e_n|²
  • Secant: |e_{n+1}| ≤ C|e_n|^1.618

When to use each:

  • Bisection: Guaranteed convergence, no derivatives, have bracket
  • Newton: Fast convergence, derivative available, good initial guess
  • Secant: No derivative, faster than bisection, can risk divergence
  • Brent's: Production use — robust + fast

Cross-References

  • 086-Floating-Point — Rounding errors affect convergence criteria; machine epsilon determines achievable tolerance
  • 088-Integration — Quadrature methods use function evaluations subject to floating point error
  • 090-Error Analysis — Condition numbers determine sensitivity of root locations to perturbations in f
  • Optimization — Root finding of ∇f = 0 is the foundation of optimization; Newton's method generalizes to Newton's method for optimization
  • Linear Algebra — Eigenvalue computation requires finding roots of the characteristic polynomial
  • Differential Equations — Implicit ODE solvers require solving nonlinear equations at each timestep
Lesson Progress87 / 100