Numerical Methods — Computing with Math
ℹ️ Why It Matters
Computers can't do exact math with real numbers. Numerical methods bridge the gap between mathematical theory and computer implementation.
Floating Point Arithmetic
How Computers Store Numbers
IEEE 754 Standard:
ℹ️ IEEE 754 Standard
32-bit float:
- 1 bit sign + 8 bits exponent + 23 bits mantissa
64-bit double:
- 1 bit sign + 11 bits exponent + 52 bits mantissa
Precision Limits:
⚠️ Precision Limits
- Float: ~7 decimal digits
- Double: ~15 decimal digits
Example: (in floating point!)
Common Numerical Issues
⚠️ Common Numerical Issues
Loss of significance: Subtracting nearly equal numbers
But with limited precision: might lose significant digits
Overflow: Number too large to represent Underflow: Number too small (close to zero) to represent Catastrophic cancellation: When subtraction leads to huge relative error
Root Finding
Bisection Method
ℹ️ Bisection Method
- Start with where and have opposite signs
- Compute midpoint
- If has same sign as , replace with Otherwise, replace with
- Repeat until converged
Convergence: Linear (slow but guaranteed)
Newton's Method
Newton's Method
Here,
- =Current estimate
- =Derivative at current estimate
ℹ️ Newton's Method Properties
- Requires:
- Convergence: Quadratic (very fast)
- Can fail if
Secant Method
ℹ️ Secant Method
Like Newton's but approximates the derivative:
- Doesn't need the derivative
- Superlinear convergence
Numerical Integration
Riemann Sum
Riemann Sum
Here,
- =Width of each subinterval
ℹ️ Riemann Sum
Simple but inaccurate
Trapezoidal Rule
Trapezoidal Rule
Here,
- =Width of each subinterval
ℹ️ Trapezoidal Rule
Better accuracy than Riemann sum
Simpson's Rule
Simpson's Rule
Here,
- =Width of each subinterval
ℹ️ Simpson's Rule
- Even better accuracy
- Uses parabolic approximation
Monte Carlo Integration
Monte Carlo Integration
Here,
- =Random sample points
- =Number of samples
💡 Monte Carlo Integration
Works well in high dimensions!
Used in: Ray tracing, physics simulations, Bayesian inference
Linear System Solving
Gaussian Elimination
ℹ️ Gaussian Elimination
- Write augmented matrix
- Forward elimination: Convert to upper triangular
- Back substitution: Solve from bottom up
Time:
LU Decomposition
LU Decomposition
Here,
- =Lower triangular matrix
- =Upper triangular matrix
ℹ️ LU Decomposition
- Solve (forward substitution)
- Solve (back substitution)
Time: for decomposition, for each solve Efficient when solving for multiple vectors
Iterative Methods
Jacobi Method:
Jacobi Method
Here,
- =Updated value
- =Previous values
ℹ️ Jacobi Method
Simple but slow convergence
Gauss-Seidel Method:
ℹ️ Gauss-Seidel Method
Like Jacobi but uses updated values immediately. Faster convergence than Jacobi.
Conjugate Gradient:
ℹ️ Conjugate Gradient
- For symmetric positive definite matrices
- Much faster than Gauss-Seidel
- Time: where is the condition number
Eigenvalue Algorithms
Power Method:
ℹ️ Power Method
Find the largest eigenvalue:
- Start with random vector
- Repeat:
- Eigenvalue
Simple but only finds the largest eigenvalue
QR Algorithm:
ℹ️ QR Algorithm
- Start with
- Decompose
- converges to upper triangular (eigenvalues on diagonal)
Finds all eigenvalues. Time: per iteration
Optimization Algorithms
Line Search
ℹ️ Line Search
Given direction , find step size :
Methods:
- Exact line search (expensive)
- Backtracking line search (cheap, practical)
Trust Region Methods
ℹ️ Trust Region Methods
- Build a model of near current point
- Find the best point within a "trust region"
- If model is accurate, expand trust region
- If not, shrink trust region
Conjugate Gradient for Optimization
Conjugate Gradient Optimization
Here,
- =Symmetric positive definite matrix
- =Target vector
ℹ️ Conjugate Gradient Algorithm
- Start with ,
- Repeat:
Very efficient for large sparse systems
📋Key Takeaways
-
Floating point arithmetic has precision limits. IEEE 754 doubles give ~15 decimal digits; in floating point — always use tolerances when comparing real numbers.
-
Newton's Method converges quadratically. doubles correct digits each iteration, but requires the derivative and can fail if .
-
Integration methods trade accuracy for simplicity. Riemann sums are simple but inaccurate; Simpson's rule uses parabolic approximations for better精度; Monte Carlo integration excels in high dimensions.
-
LU Decomposition efficiently solves linear systems. Factor once in , then solve and in each — ideal when solving for multiple right-hand sides.
-
The Conjugate Gradient method solves sparse systems in . For symmetric positive definite matrices, it's far more efficient than direct methods for large-scale problems common in ML.
-
The condition number predicts numerical stability. Large means small input changes cause large output changes — always check conditioning before solving linear systems in production.