Interpolation
ℹ️ Why It Matters
Interpolation constructs new data points from a known set of points, enabling function approximation, data reconstruction, and efficient evaluation of expensive computations. It is essential for signal processing (reconstructing continuous signals from samples), computer graphics (smooth curves through control points), and scientific computing (evaluating special functions efficiently). Understanding interpolation methods is critical for avoiding Runge's phenomenon and choosing the right approach for your data.
💡 Practical Impact
GPS receivers interpolate satellite positions. Digital audio reconstructs analog waveforms via interpolation. Weather stations interpolate temperature measurements across a grid. In ML, kernel methods and Gaussian processes are fundamentally interpolation frameworks.
Definitions
DfInterpolation
Given n+1 data points (x₀, y₀), (x₁, y₁), ..., (xₙ, yₙ), interpolation finds a function p(x) that passes exactly through all points: p(xᵢ) = yᵢ for i = 0, 1, ..., n. The function p(x) is called the interpolant.
DfLagrange Interpolation
Lagrange interpolation constructs the interpolating polynomial as a linear combination of Lagrange basis polynomials: P(x) = Σⱼ yⱼ · Lⱼ(x), where Lⱼ(x) = Πᵢ≠ⱼ (x − xᵢ)/(xⱼ − xᵢ) Each basis polynomial Lⱼ equals 1 at xⱼ and 0 at all other nodes.
DfNewton Interpolation (Divided Differences)
DfCubic Spline
A cubic spline is a piecewise cubic polynomial S(x) that interpolates all data points with C² continuity — the function, first derivative, and second derivative are continuous at all interior knots. Between consecutive knots, S(x) is a cubic polynomial.
DfRunge's Phenomenon
When using equally-spaced nodes, high-degree polynomial interpolation (n large) can oscillate wildly near the endpoints of the interval. The classic example is f(x) = 1/(1+25x²), where the interpolating polynomial diverges from the function as n increases.
DfExtrapolation
Extrapolation estimates function values outside the range of known data points [x₀, xₙ]. Unlike interpolation, extrapolation is inherently unreliable because the function's behavior outside the data range is unconstrained.
DfNatural Cubic Spline
A natural cubic spline has boundary conditions S''(x₀) = 0 and S''(xₙ) = 0, meaning the second derivative vanishes at the endpoints. This produces the smoothest possible cubic spline interpolant.
Formulas
Lagrange Interpolating Polynomial
Here,
- =Degree of the polynomial (n+1 data points)
- =Data points
- =jth Lagrange basis polynomial
Divided Differences (Recursive)
Here,
- =Zeroth divided difference = f(x_i)
- =First divided difference = (f(x_j) − f(x_i))/(x_j − x_i)
Newton's Interpolating Polynomial
Here,
- =kth divided difference
Cubic Spline Coefficient System
Here,
- =Second derivative (moment) at knot i
- =Spacing: h_i = x_{i+1} − x_i
- =Function value at knot i
Spline Polynomial on [x_i, x_{i+1}]
Here,
- =(y_i/h_i) − m_i·h_i/6
- =(y_{i+1}/h_i) − m_{i+1}·h_i/6
Linear Extrapolation
Here,
- =Last known data point
Lagrange Interpolation
ℹ️ Properties of Lagrange Form
Lagrange interpolation is theoretically elegant but computationally expensive: adding a new data point requires recomputing all basis polynomials from scratch (O(n²) per evaluation). It is best suited for theoretical analysis and small datasets. For practical computation, Newton's form or spline interpolation is preferred.
Newton Interpolation
ℹ️ Advantages of Newton Form
Newton's divided difference form allows incremental addition of new data points without recomputing from scratch. The divided difference table is O(n²) to build but O(n) to evaluate via Horner's method. Adding a new point only requires computing one additional column of the table.
Spline Interpolation
ℹ️ Why Splines Work
Splines solve the fundamental problem of polynomial interpolation: high-degree polynomials oscillate (Runge's phenomenon). Cubic splines use piecewise low-degree polynomials, ensuring smoothness without oscillation. The cubic degree is the minimum needed for C² continuity. Splines have optimal approximation properties among piecewise polynomials.
Types of Splines
| Type | Continuity | Properties |
|---|---|---|
| Linear | C⁰ | Simple, non-smooth |
| Cubic | C² | Smooth, optimal |
| Natural cubic | C² + S''=0 at endpoints | Smoothest possible |
| Clamped cubic | C² + S' specified at endpoints | Matches derivative |
| Not-a-knot | C³ at interior knots | Smoothest possible C³ |
| Periodic | C² + periodic | For periodic functions |
Examples
📝Lagrange vs Newton Interpolation
Given the data points (0,1), (1,0), (2,1), (3,8), construct the interpolating polynomial using both Lagrange and Newton methods and verify they produce the same result.
💡Implementation: Both Methods Compared
import numpy as np
def lagrange_interpolation(x_data, y_data, x):
"""Lagrange interpolation evaluation."""
n = len(x_data)
result = 0.0
for j in range(n):
# Compute L_j(x)
L_j = 1.0
for i in range(n):
if i != j:
L_j *= (x - x_data[i]) / (x_data[j] - x_data[i])
result += y_data[j] * L_j
return result
def divided_difference_table(x_data, y_data):
"""Build the divided difference table."""
n = len(x_data)
dd = np.zeros((n, n))
dd[:, 0] = y_data
for j in range(1, n):
for i in range(n - j):
dd[i, j] = (dd[i+1, j-1] - dd[i, j-1]) / (x_data[i+j] - x_data[i])
return dd
def newton_interpolation(x_data, y_data, x):
"""Newton interpolation using divided differences."""
dd = divided_difference_table(x_data, y_data)
n = len(x_data)
# Evaluate using Horner's method
result = dd[0, n-1]
for k in range(n-2, -1, -1):
result = result * (x - x_data[k]) + dd[0, k]
return result
# Data points
x_data = np.array([0, 1, 2, 3], dtype=float)
y_data = np.array([1, 0, 1, 8], dtype=float)
# Print divided difference table
dd = divided_difference_table(x_data, y_data)
print("Divided Difference Table:")
print(f"{'x':<6} {'f[]':<10} {'f[,]':<10} {'f[,,]':<10} {'f[,,,]':<10}")
for i in range(len(x_data)):
row = f"{x_data[i]:<6.0f} "
for j in range(len(x_data) - i):
row += f"{dd[i,j]:<10.4f} "
print(row)
# Evaluate at test points
x_test = np.linspace(-0.5, 3.5, 100)
y_lagrange = [lagrange_interpolation(x_data, y_data, xi) for xi in x_test]
y_newton = [newton_interpolation(x_data, y_data, xi) for xi in x_test]
print(f"\nVerification: Lagrange == Newton at x=1.5:")
print(f" Lagrange: {lagrange_interpolation(x_data, y_data, 1.5):.10f}")
print(f" Newton: {newton_interpolation(x_data, y_data, 1.5):.10f}")
# The true function is f(x) = x^2 - 2x + 1 = (x-1)^2 for x in [0,2],
# but the last point (3,8) suggests a different function.
# Let's check what polynomial fits all 4 points
true_func = lambda x: x**3 - 3*x**2 + 2*x + 1
print(f"\nTrue function at x=1.5: {true_func(1.5):.10f}")
print(f"Interpolation at x=1.5: {newton_interpolation(x_data, y_data, 1.5):.10f}")
📝Runge's Phenomenon
Demonstrate how high-degree polynomial interpolation on equally-spaced nodes fails for the Runge function f(x) = 1/(1+25x²).
💡Runge's Phenomenon Demonstration
import numpy as np
def runge_function(x):
"""The classic Runge function."""
return 1.0 / (1.0 + 25 * x**2)
def polynomial_interpolation(x_data, y_data, x):
"""Evaluate interpolating polynomial using Newton form."""
n = len(x_data)
dd = np.zeros((n, n))
dd[:, 0] = y_data
for j in range(1, n):
for i in range(n - j):
dd[i, j] = (dd[i+1, j-1] - dd[i, j-1]) / (x_data[i+j] - x_data[i])
result = dd[0, n-1]
for k in range(n-2, -1, -1):
result = result * (x - x_data[k]) + dd[0, k]
return result
# Demonstrate Runge's phenomenon
print("Runge's Phenomenon: Polynomial Interpolation on [-1, 1]")
print(f"{'n (points)':<12} {'Max Error':<14}")
print("-" * 26)
x_fine = np.linspace(-1, 1, 1000)
y_true = runge_function(x_fine)
for n in [5, 10, 15, 20, 25]:
# Equally spaced nodes
x_nodes = np.linspace(-1, 1, n)
y_nodes = runge_function(x_nodes)
# Interpolate
y_interp = np.array([polynomial_interpolation(x_nodes, y_nodes, xi)
for xi in x_fine])
max_error = np.max(np.abs(y_interp - y_true))
print(f"{n:<12} {max_error:<14.6f}")
# Solution: use Chebyshev nodes instead
print(f"\nUsing Chebyshev nodes (avoids Runge's phenomenon):")
print(f"{'n (points)':<12} {'Max Error':<14}")
print("-" * 26)
for n in [5, 10, 15, 20, 25]:
# Chebyshev nodes
k = np.arange(n)
x_nodes = np.cos((2*k + 1) * np.pi / (2*n))
y_nodes = runge_function(x_nodes)
y_interp = np.array([polynomial_interpolation(x_nodes, y_nodes, xi)
for xi in x_fine])
max_error = np.max(np.abs(y_interp - y_true))
print(f"{n:<12} {max_error:<14.6f}")
📝Cubic Spline Construction
Construct a natural cubic spline through the data points (0,0), (1,1), (2,0), (3,1) and evaluate it.
💡Cubic Spline Implementation
import numpy as np
def cubic_spline_natural(x_data, y_data, x_eval):
"""Construct and evaluate a natural cubic spline."""
n = len(x_data) - 1 # Number of intervals
# Step 1: Compute spacings
h = np.diff(x_data)
# Step 2: Set up tridiagonal system for moments m = S''
# Natural spline boundary: m[0] = m[n] = 0
A = np.zeros((n+1, n+1))
b = np.zeros(n+1)
# Boundary conditions
A[0, 0] = 1
A[n, n] = 1
# Interior equations
for i in range(1, n):
A[i, i-1] = h[i-1]
A[i, i] = 2 * (h[i-1] + h[i])
A[i, i+1] = h[i]
b[i] = 6 * ((y_data[i+1] - y_data[i]) / h[i] -
(y_data[i] - y_data[i-1]) / h[i-1])
# Solve for moments
m = np.linalg.solve(A, b)
# Step 3: Evaluate spline
results = np.zeros_like(x_eval, dtype=float)
for j, x in enumerate(x_eval):
# Find which interval x belongs to
i = np.searchsorted(x_data, x, side='right') - 1
i = max(0, min(i, n-1))
xi = x_data[i]
xi1 = x_data[i+1]
hi = h[i]
mi = m[i]
mi1 = m[i+1]
yi = y_data[i]
yi1 = y_data[i+1]
# Spline formula
A_coeff = (mi1 - mi) / (6 * hi)
B_coeff = mi / 2
C_coeff = (yi1 - yi) / hi - (mi1 + 2*mi) * hi / 6
D_coeff = yi
dx = x - xi
results[j] = A_coeff * dx**3 + B_coeff * dx**2 + C_coeff * dx + D_coeff
return results, m
# Data
x_data = np.array([0, 1, 2, 3], dtype=float)
y_data = np.array([0, 1, 0, 1], dtype=float)
# Evaluate spline
x_fine = np.linspace(0, 3, 200)
y_spline, moments = cubic_spline_natural(x_data, y_data, x_fine)
# Compare with polynomial interpolation
def polynomial_interp(x_data, y_data, x):
n = len(x_data)
dd = np.zeros((n, n))
dd[:, 0] = y_data
for j in range(1, n):
for i in range(n - j):
dd[i, j] = (dd[i+1, j-1] - dd[i, j-1]) / (x_data[i+j] - x_data[i])
result = dd[0, n-1]
for k in range(n-2, -1, -1):
result = result * (x - x_data[k]) + dd[0, k]
return result
y_poly = np.array([polynomial_interp(x_data, y_data, xi) for xi in x_fine])
# Print spline details
print("Natural Cubic Spline")
print(f"Data points: {list(zip(x_data, y_data))}")
print(f"Moments (S''): {moments}")
print(f"\nSpline evaluation at x=0.5:")
y_05, _ = cubic_spline_natural(x_data, y_data, np.array([0.5]))
print(f" S(0.5) = {y_05[0]:.6f}")
print(f" Poly P(0.5) = {polynomial_interp(x_data, y_data, 0.5):.6f}")
print(f"\nSpline evaluation at x=1.5:")
y_15, _ = cubic_spline_natural(x_data, y_data, np.array([1.5]))
print(f" S(1.5) = {y_15[0]:.6f}")
print(f" Poly P(1.5) = {polynomial_interp(x_data, y_data, 1.5):.6f}")
📝Extrapolation Dangers
Show how extrapolation can produce wildly incorrect results, even when interpolation is accurate.
💡Extrapolation vs Interpolation
import numpy as np
# Known data from sin(x) on [0, 2π]
x_data = np.linspace(0, 2 * np.pi, 8)
y_data = np.sin(x_data)
def polynomial_extrapolate(x_data, y_data, x):
"""Evaluate Newton interpolating polynomial at x."""
n = len(x_data)
dd = np.zeros((n, n))
dd[:, 0] = y_data
for j in range(1, n):
for i in range(n - j):
dd[i, j] = (dd[i+1, j-1] - dd[i, j-1]) / (x_data[i+j] - x_data[i])
result = dd[0, n-1]
for k in range(n-2, -1, -1):
result = result * (x - x_data[k]) + dd[0, k]
return result
print("Extrapolation vs Interpolation for sin(x)")
print(f"{'x':<8} {'True':<12} {'Interp/Extrap':<16} {'Error':<12} {'Type':<12}")
print("-" * 60)
for x in [0.5, 1.0, 3.0, 6.0, 7.0, 8.0, 10.0]:
true_val = np.sin(x)
pred = polynomial_extrapolate(x_data, y_data, x)
error = abs(pred - true_val)
interp_type = "interpolation" if 0 <= x <= 2*np.pi else "EXTRAPOLATION"
print(f"{x:<8.1f} {true_val:<12.6f} {pred:<16.6f} {error:<12.6f} {interp_type}")
print("\nNote: Interpolation errors are small; extrapolation errors are huge!")
Theorems
ThUniqueness of Polynomial Interpolation
Given n+1 distinct data points, there exists exactly one polynomial of degree ≤ n that passes through all points. The Lagrange and Newton forms are simply two different representations of the same polynomial.
ThRunge's Theorem
For the function f(x) = 1/(1+25x²) on [−1, 1], the maximum error of polynomial interpolation with n+1 equally-spaced nodes grows without bound as n → ∞. Specifically, the error is O(2^n/n!) near the endpoints. This demonstrates that high-degree polynomial interpolation with equally-spaced nodes is not uniformly convergent.
ThSpline Optimality
Among all piecewise cubic polynomials with C² continuity that interpolate n data points, the natural cubic spline minimizes the integral of the square of the second derivative: ∫|S''(x)|²dx. This means natural cubic splines are the smoothest possible C² interpolant.
ThError of Polynomial Interpolation
If f is (n+1) times continuously differentiable on [a, b] and P_n is the polynomial interpolating f at n+1 distinct nodes x₀, ..., xₙ ∈ [a, b], then for any x ∈ [a, b]: f(x) − P_n(x) = f⁽ⁿ⁺¹⁾(ξ)/(n+1)! · Π(x − xᵢ) for some ξ ∈ [a, b]. The error depends on the (n+1)th derivative and the node product.
ThChebyshev Nodes Optimal
Using Chebyshev nodes xⱼ = cos((2j+1)π/(2(n+1))) minimizes the maximum of |Π(x − xᵢ)| on [−1, 1], giving the minimax polynomial interpolation. The error is O(1/2ⁿ), dramatically better than O(2ⁿ/n!) for equally-spaced nodes.
Python Implementation
import numpy as np
from typing import Callable, List, Tuple
class InterpolationToolkit:
"""Comprehensive interpolation toolkit."""
@staticmethod
def lagrange(x_data: np.ndarray, y_data: np.ndarray,
x: np.ndarray) -> np.ndarray:
"""Lagrange interpolation."""
n = len(x_data)
result = np.zeros_like(x, dtype=float)
for j in range(n):
L_j = np.ones_like(x, dtype=float)
for i in range(n):
if i != j:
L_j *= (x - x_data[i]) / (x_data[j] - x_data[i])
result += y_data[j] * L_j
return result
@staticmethod
def divided_differences(x_data: np.ndarray,
y_data: np.ndarray) -> np.ndarray:
"""Compute divided difference table."""
n = len(x_data)
dd = np.zeros((n, n))
dd[:, 0] = y_data
for j in range(1, n):
for i in range(n - j):
dd[i, j] = (dd[i+1, j-1] - dd[i, j-1]) / (x_data[i+j] - x_data[i])
return dd
@staticmethod
def newton(x_data: np.ndarray, y_data: np.ndarray,
x: np.ndarray) -> np.ndarray:
"""Newton interpolation using Horner's method."""
dd = InterpolationToolkit.divided_differences(x_data, y_data)
n = len(x_data)
result = np.full_like(x, dd[0, n-1], dtype=float)
for k in range(n-2, -1, -1):
result = result * (x - x_data[k]) + dd[0, k]
return result
@staticmethod
def natural_cubic_spline(x_data: np.ndarray, y_data: np.ndarray,
x_eval: np.ndarray) -> np.ndarray:
"""Natural cubic spline interpolation."""
n = len(x_data) - 1
h = np.diff(x_data)
# Set up tridiagonal system for moments
A = np.zeros((n+1, n+1))
b = np.zeros(n+1)
A[0, 0] = 1
A[n, n] = 1
for i in range(1, n):
A[i, i-1] = h[i-1]
A[i, i] = 2 * (h[i-1] + h[i])
A[i, i+1] = h[i]
b[i] = 6 * ((y_data[i+1] - y_data[i]) / h[i] -
(y_data[i] - y_data[i-1]) / h[i-1])
m = np.linalg.solve(A, b)
# Evaluate
results = np.zeros_like(x_eval, dtype=float)
for j, x in enumerate(x_eval):
i = np.searchsorted(x_data, x, side='right') - 1
i = max(0, min(i, n-1))
dx = x - x_data[i]
hi = h[i]
A_c = (m[i+1] - m[i]) / (6 * hi)
B_c = m[i] / 2
C_c = (y_data[i+1] - y_data[i]) / hi - (m[i+1] + 2*m[i]) * hi / 6
D_c = y_data[i]
results[j] = A_c * dx**3 + B_c * dx**2 + C_c * dx + D_c
return results
@staticmethod
def chebyshev_nodes(a: float, b: float, n: int) -> np.ndarray:
"""Generate Chebyshev nodes on [a, b]."""
k = np.arange(n)
nodes = np.cos((2*k + 1) * np.pi / (2*n))
return 0.5 * (b - a) * nodes + 0.5 * (b + a)
@staticmethod
def linear_spline(x_data: np.ndarray, y_data: np.ndarray,
x_eval: np.ndarray) -> np.ndarray:
"""Linear spline (piecewise linear interpolation)."""
return np.interp(x_eval, x_data, y_data)
def demonstrate_interpolation():
"""Comprehensive interpolation demonstration."""
print("=" * 60)
print("INTERPOLATION METHOD COMPARISON")
print("=" * 60)
toolkit = InterpolationToolkit()
# Test function
f = lambda x: np.sin(x) + 0.5 * np.cos(2 * x)
x_data = np.linspace(0, 2*np.pi, 8)
y_data = f(x_data)
x_fine = np.linspace(0, 2*np.pi, 200)
y_true = f(x_fine)
print("\n--- Method Comparison (8 data points) ---")
print(f"{'Method':<25} {'Max Error':<14}")
print("-" * 39)
y_lagrange = toolkit.lagrange(x_data, y_data, x_fine)
print(f"{'Lagrange':<25} {np.max(np.abs(y_lagrange - y_true)):<14.6e}")
y_newton = toolkit.newton(x_data, y_data, x_fine)
print(f"{'Newton':<25} {np.max(np.abs(y_newton - y_true)):<14.6e}")
y_spline = toolkit.natural_cubic_spline(x_data, y_data, x_fine)
print(f"{'Cubic Spline':<25} {np.max(np.abs(y_spline - y_true)):<14.6e}")
y_linear = toolkit.linear_spline(x_data, y_data, x_fine)
print(f"{'Linear Spline':<25} {np.max(np.abs(y_linear - y_true)):<14.6e}")
# Chebyshev nodes comparison
print("\n--- Chebyshev vs Equally-Spaced (Runge function) ---")
runge = lambda x: 1 / (1 + 25*x**2)
for n in [5, 10, 15]:
# Equally spaced
x_eq = np.linspace(-1, 1, n)
y_eq = runge(x_eq)
y_eq_interp = toolkit.lagrange(x_eq, y_eq, x_fine)
err_eq = np.max(np.abs(y_eq_interp - runge(x_fine)))
# Chebyshev
x_cheb = toolkit.chebyshev_nodes(-1, 1, n)
y_cheb = runge(x_cheb)
y_cheb_interp = toolkit.lagrange(x_cheb, y_cheb, x_fine)
err_cheb = np.max(np.abs(y_cheb_interp - runge(x_fine)))
print(f" n={n:>2}: Equally-spaced error = {err_eq:.6f}, "
f"Chebyshev error = {err_cheb:.6f}")
demonstrate_interpolation()
Applications in AI/ML
ℹ️ Interpolation in Machine Learning
Interpolation is fundamental to many ML techniques. Kernel methods (SVM, Gaussian processes) perform kernel interpolation in high-dimensional spaces. Neural networks are universal interpolators — they can memorize any finite dataset. Understanding interpolation theory helps explain generalization, overfitting, and the behavior of different model architectures.
Specific Applications
| Application | Method | Description |
|---|---|---|
| Gaussian processes | Kernel interpolation | Predict at new points using kernel-weighted average |
| RBF networks | Radial basis interpolation | Neural network with RBF activation functions |
| Data augmentation | Spline interpolation | Generate synthetic training examples |
| Feature engineering | Polynomial interpolation | Create interaction features |
| Time series | Spline smoothing | Reconstruct signals from irregular samples |
| Transfer learning | Function interpolation | Interpolate between model weights |
| Curriculum learning | Data interpolation | Interpolate between easy and hard examples |
Neural Tangent Kernel
In the infinite-width limit, neural networks perform kernel interpolation with the neural tangent kernel. The interpolating function is the minimum-norm solution in the RKHS (reproducing kernel Hilbert space) that passes through all training points.
Weight Space Interpolation
SLERP (Spherical Linear Interpolation) interpolates between model weight vectors on the unit hypersphere, used in fine-tuning and model merging.
Common Mistakes
| Mistake | Problem | Solution |
|---|---|---|
| Using high-degree polynomial | Runge's phenomenon causes oscillation | Use splines or Chebyshev nodes |
| Equally-spaced nodes for interpolation | Error grows exponentially near endpoints | Use Chebyshev or clustering nodes |
| Extrapolating far beyond data | Unreliable, often wildly wrong | Use domain knowledge; prefer interpolation |
| Not checking data monotonicity | Spline behavior may be unexpected | Verify x_data is sorted and unique |
| Using polynomial for periodic data | Polynomial can't capture periodicity | Use Fourier or periodic splines |
| Ignoring outlier data points | Polynomial/spline follows outliers | Use robust interpolation or pre-filter |
| Too few data points | Underfitting; missing features | Increase sampling density |
| Using spline for extrapolation | Splines extrapolate as cubic (often wrong) | Restrict to interpolation domain |
Interview Questions
Q1: What is Runge's phenomenon and how do you avoid it? A: Runge's phenomenon occurs when high-degree polynomial interpolation on equally-spaced nodes oscillates wildly near interval endpoints. For f(x) = 1/(1+25x²), the error grows as n increases. Solutions: (1) use cubic splines instead of high-degree polynomials, (2) use Chebyshev nodes (non-uniform spacing), (3) reduce the polynomial degree and accept piecewise approximation.
Q2: Compare Lagrange and Newton interpolation. A: Both produce the same polynomial (uniqueness theorem). Lagrange is theoretically cleaner but O(n²) to add a new point. Newton uses divided differences — O(n²) to build but O(n) to evaluate and incrementally updatable. Newton is preferred for computation; Lagrange for theory.
Q3: Why are cubic splines preferred over high-degree polynomials? A: Cubic splines use piecewise low-degree polynomials, avoiding Runge's phenomenon. They ensure C² continuity (smooth visually), have optimal approximation properties, and scale linearly with the number of data points. High-degree polynomials oscillate and are expensive to evaluate.
Q4: How does a natural cubic spline differ from a clamped cubic spline? A: Natural cubic spline: S''(x₀) = S''(xₙ) = 0 (second derivative zero at endpoints). Clamped cubic spline: S'(x₀) and S'(xₙ) are specified (usually matching the known derivative). Natural splines are smoother; clamped splines better approximate the true function when derivatives are known.
Q5: When is extrapolation appropriate? A: Extrapolation is appropriate only when: (1) you have strong physical/theoretical reasons to believe the trend continues, (2) the extrapolation distance is small relative to the data range, (3) the function is known to be monotone or have known asymptotic behavior. Never extrapolate black-box functions.
Q6: Explain the connection between interpolation and approximation theory. A: Interpolation requires the approximant to pass exactly through data points, while approximation (e.g., least squares) minimizes overall error. Interpolation is a special case of approximation. The key trade-off: interpolation has zero error at nodes but may oscillate between them; approximation distributes error evenly but doesn't match exactly.
Practice Problems
📝Problem 1: Implement Barycentric Lagrange Interpolation
Barycentric Lagrange interpolation is O(n) per evaluation after O(n²) preprocessing, and is numerically stable.
💡Solution
import numpy as np
def barycentric_lagrange(x_data, y_data, x):
"""Barycentric Lagrange interpolation — O(n) per evaluation."""
n = len(x_data)
# Precompute barycentric weights
weights = np.ones(n)
for j in range(n):
for i in range(n):
if i != j:
weights[j] /= (x_data[j] - x_data[i])
# Evaluate at each point in x
x = np.atleast_1d(x)
result = np.zeros_like(x, dtype=float)
for k, xi in enumerate(x):
# Check if xi is exactly a data point
exact_match = np.where(np.isclose(xi, x_data))[0]
if len(exact_match) > 0:
result[k] = y_data[exact_match[0]]
else:
# Barycentric formula
terms = weights / (xi - x_data)
result[k] = np.sum(terms * y_data) / np.sum(terms)
return result
# Test
x_data = np.array([0, 1, 2, 3, 4], dtype=float)
y_data = np.array([1, 3, 7, 13, 21], dtype=float) # y = x^2 + x + 1
x_test = np.array([0.5, 1.5, 2.5, 3.5])
result = barycentric_lagrange(x_data, y_data, x_test)
expected = x_test**2 + x_test + 1
print("Barycentric Lagrange Interpolation")
for xi, yi, ei in zip(x_test, result, expected):
print(f" f({xi}) = {yi:.6f} (expected {ei:.6f}, error {abs(yi-ei):.2e})")
📝Problem 2: Monotone Interpolation (PCHIP)
Implement Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) that preserves monotonicity between data points.
💡Solution
import numpy as np
def pchip_slopes(x_data, y_data):
"""Compute slopes for monotone cubic interpolation."""
n = len(x_data)
h = np.diff(x_data)
delta = np.diff(y_data) / h
d = np.zeros(n)
# Interior slopes (Fritsch-Carlson method)
for i in range(1, n-1):
if delta[i-1] * delta[i] <= 0:
d[i] = 0
else:
d[i] = (delta[i-1] + delta[i]) / 2
# End slopes
d[0] = delta[0]
d[-1] = delta[-1]
# Monotonicity correction
for i in range(n-1):
if abs(delta[i]) < 1e-12:
d[i] = d[i+1] = 0
else:
alpha = d[i] / delta[i]
beta = d[i+1] / delta[i]
s = alpha**2 + beta**2
if s > 9:
tau = 3 / np.sqrt(s)
d[i] = tau * alpha * delta[i]
d[i+1] = tau * beta * delta[i]
return d
def pchip_interpolate(x_data, y_data, x_eval):
"""PCHIP interpolation (monotone-preserving)."""
d = pchip_slopes(x_data, y_data)
n = len(x_data) - 1
results = np.zeros_like(x_eval, dtype=float)
for j, x in enumerate(x_eval):
i = np.searchsorted(x_data, x, side='right') - 1
i = max(0, min(i, n-1))
h_i = x_data[i+1] - x_data[i]
t = (x - x_data[i]) / h_i
# Hermite basis functions
h00 = 2*t**3 - 3*t**2 + 1
h10 = t**3 - 2*t**2 + t
h01 = -2*t**3 + 3*t**2
h11 = t**3 - t**2
results[j] = (h00 * y_data[i] + h10 * h_i * d[i] +
h01 * y_data[i+1] + h11 * h_i * d[i+1])
return results
# Test with monotone data
x_data = np.array([0, 1, 2, 3, 4], dtype=float)
y_data = np.array([0, 1, 4, 9, 16], dtype=float) # Monotonically increasing
x_fine = np.linspace(0, 4, 100)
y_pchip = pchip_interpolate(x_data, y_data, x_fine)
# Check monotonicity
diffs = np.diff(y_pchip)
print("PCHIP Monotonicity Check:")
print(f" All differences positive: {np.all(diffs >= -1e-10)}")
print(f" Min diff: {np.min(diffs):.6f}")
print(f" Max diff: {np.max(diffs):.6f}")
Quick Reference
📋Interpolation Quick Reference
| Method | Complexity (eval) | Degree | Continuity | Oscillation |
|---|---|---|---|---|
| Lagrange | O(n²) | n | C∞ (polynomial) | Yes (Runge) |
| Newton | O(n) after O(n²) build | n | C∞ (polynomial) | Yes (Runge) |
| Linear spline | O(log n) | 1 | C⁰ | No |
| Cubic spline | O(log n) | 3 | C² | Minimal |
| Barycentric Lagrange | O(n) after O(n²) build | n | C∞ (polynomial) | Yes (Runge) |
When to use:
- Few points (n ≤ 5): Polynomial (Lagrange or Newton)
- Many points, smooth data: Natural cubic spline
- Must preserve monotonicity: PCHIP (Hermite)
- Must avoid Runge's: Chebyshev nodes or splines
- Periodic data: Fourier interpolation or periodic spline
- Extrapolation: Avoid if possible; use linear or physical model
Python shortcuts:
np.interp(x, x_data, y_data)— linear interpolationscipy.interpolate.lagrange(x_data, y_data)— Lagrange polynomialscipy.interpolate.CubicSpline(x_data, y_data)— cubic splinescipy.interpolate.PchipInterpolator(x_data, y_data)— monotone PCHIPscipy.interpolate.interp1d(x_data, y_data)— general 1D interpolation
Cross-References
- 086-Floating-Point — Polynomial evaluation in interpolation requires careful attention to floating point error; Horner's method minimizes operations and error
- 087-Root-Finding — Divided differences in Newton interpolation are computed via recursive formulas similar to secant method differences
- 088-Integration — Quadrature rules are derived by integrating interpolating polynomials; Gaussian quadrature uses optimal node placement from approximation theory
- 090-Error Analysis — Interpolation error analysis depends on derivatives of the function and the distribution of nodes
- Approximation Theory: Interpolation is a special case of function approximation; the Weierstrass theorem guarantees polynomial approximation exists
- Signal Processing: Shannon sampling theorem relates interpolation to signal reconstruction; Nyquist rate determines minimum sampling density