Integration Fundamentals

ℹ️ Why It Matters

Integration is the reverse operation of differentiation and one of the two pillars of calculus. While derivatives measure rates of change, integrals accumulate quantities: areas under curves, volumes of solids, total probability, and expected values. In machine learning, integration is indispensable — probability density functions are normalized by integrating to 1, expected values are computed as integrals, Bayesian inference requires marginalizing over posterior distributions, and variational inference approximates intractable integrals. Every time you compute the probability that a continuous random variable falls in a range, you are performing integration. Every time you sample from a distribution or estimate an expectation, integration is happening underneath. Mastering integration means understanding the mathematical foundation behind probability, statistics, and the training of probabilistic models.

What is an Integral

DfIndefinite Integral (Antiderivative)

The indefinite integral of a function $f(x)$ is the family of all functions $F(x)$ whose derivative is $f(x)$ . It is written $\int f(x)\,dx = F(x) + C$ , where $C$ is an arbitrary constant of integration: any two antiderivatives of $f$ on an interval differ only by a constant.

DfDefinite Integral

The definite integral of $f(x)$ from $a$ to $b$ is the signed area between the curve $y = f(x)$ and the $x$ -axis over the interval $[a, b]$ . It is defined as the limit of Riemann sums:

Riemann Sum Definition

\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*) \Delta x

Here,

$\Delta x$ =Width of each subinterval: (b - a) / n
$x_i^*$ =A sample point in the i-th subinterval
$f(x_i^*)$ =Function value at the sample point
$a, b$ =The lower and upper limits of integration
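The limit in the definition can be watched numerically. The following is a minimal sketch, not a production integrator; the function name `riemann_sum` and its `rule` parameter are illustrative choices, and the test integrand $x^2$ on $[0, 1]$ is an assumption made for the demonstration.

```python
import numpy as np

def riemann_sum(f, a, b, n, rule="midpoint"):
    """Approximate the integral of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n                      # width of each subinterval
    left = a + dx * np.arange(n)          # left endpoint of each subinterval
    if rule == "left":
        samples = left
    elif rule == "right":
        samples = left + dx
    elif rule == "midpoint":
        samples = left + dx / 2
    else:
        raise ValueError(f"unknown rule: {rule}")
    return np.sum(f(samples)) * dx        # sum of f(x_i*) * dx

# The sums approach the exact value 1/3 for f(x) = x^2 on [0, 1] as n grows.
for n in (10, 100, 1000):
    approx = riemann_sum(lambda x: x**2, 0.0, 1.0, n)
    print(f"n={n:5d}  midpoint sum={approx:.8f}  error={abs(approx - 1/3):.2e}")
```

Any choice of sample point gives the same limit for a continuous integrand; the choice only affects how quickly the sums approach it.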

💡 Definite vs Indefinite

An indefinite integral produces a family of functions ( $F(x) + C$ ), while a definite integral produces a number (the signed area). The Fundamental Theorem of Calculus connects them: the definite integral equals the difference $F(b) - F(a)$ of any antiderivative evaluated at the two bounds.

⚠️ Not All Functions Have Elementary Antiderivatives

Many common functions — such as $e^{-x^2}$ , $\frac{\sin x}{x}$ , and $\sqrt{\sin x}$ — do not have closed-form antiderivatives in terms of elementary functions. For these, we rely on numerical integration or special functions (e.g., the error function $\text{erf}(x)$ ).

Fundamental Theorem of Calculus

ThFundamental Theorem of Calculus (Part 1)

If $f$ is continuous on $[a, b]$ and $F(x) = \int_a^x f(t)\,dt$ , then $F$ is differentiable on $(a, b)$ and $F'(x) = f(x)$ . In other words, differentiation undoes integration.

ThFundamental Theorem of Calculus (Part 2)

If $f$ is continuous on $[a, b]$ and $F$ is any antiderivative of $f$ (i.e., $F'(x) = f(x)$ ), then:

Fundamental Theorem of Calculus

\int_a^b f(x)\,dx = F(b) - F(a)

Here,

$f(x)$ =The integrand — the function being integrated
$F(x)$ =An antiderivative of f, where F'(x) = f(x)
$a$ =The lower limit of integration
$b$ =The upper limit of integration
$F(b) - F(a)$ =The net change of F over [a, b]

ℹ️ Intuition

Part 1 says that if you build a function by integrating $f$ from a fixed point $a$ to a variable point $x$ , the rate at which this accumulated area grows is exactly $f(x)$ . Part 2 says that to compute the accumulated area, you only need any antiderivative — evaluate it at the top bound and subtract its value at the bottom bound. This transforms the hard problem of summing infinitely many infinitesimal contributions into the easy problem of evaluating a function at two points.

📝Applying the Fundamental Theorem

Problem: Compute $\int_0^2 (3x^2 + 2x)\,dx$ .

💡Solution

Find an antiderivative: $F(x) = x^3 + x^2$ (since $F'(x) = 3x^2 + 2x$ ).

Evaluate at the bounds: $F(2) - F(0) = (8 + 4) - (0 + 0) = 12$ .

The area under $3x^2 + 2x$ from $x = 0$ to $x = 2$ is $12$ .
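As a quick sanity check, the hand computation can be compared against adaptive quadrature. This is a small sketch; the helper names `f` and `F` are chosen here for illustration.

```python
from scipy import integrate

def f(x):
    return 3 * x**2 + 2 * x      # integrand

def F(x):
    return x**3 + x**2           # antiderivative found by hand

ftc_value = F(2) - F(0)                            # Part 2 of the theorem
numeric_value, abs_err = integrate.quad(f, 0, 2)   # adaptive quadrature

print(f"F(2) - F(0) = {ftc_value}")
print(f"quad        = {numeric_value:.10f} (error estimate {abs_err:.1e})")
```

The two values should agree to within the reported error estimate.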

Properties of Definite Integrals

Property	Formula	Description
Reversing bounds	$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$	Swapping limits negates the integral
Additivity	$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$	Split at any intermediate point $c$
Linearity	$\int_a^b [\alpha f(x) + \beta g(x)]\,dx = \alpha\int_a^b f(x)\,dx + \beta\int_a^b g(x)\,dx$	Constants factor out, integrals add
Zero-width	$\int_a^a f(x)\,dx = 0$	Integrating over a point gives zero
Positivity	If $f(x) \geq 0$ on $[a,b]$ , then $\int_a^b f(x)\,dx \geq 0$	Non-negative functions have non-negative integrals
Comparison	If $f(x) \leq g(x)$ on $[a,b]$ , then $\int_a^b f(x)\,dx \leq \int_a^b g(x)\,dx$	Inequality preserved under integration
Triangle Inequality	$\left|\int_a^b f(x)\,dx\right| \leq \int_a^b |f(x)|\,dx$	The integral of the absolute value bounds the absolute value of the integral

Basic Integration Rules

The following rules allow us to integrate complex functions by breaking them into simpler parts.

💡 Power Rule Exception

The power rule $\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$ fails when $n = -1$ . In that case, $\int x^{-1}\,dx = \int \frac{1}{x}\,dx = \ln|x| + C$ . This is because $\frac{d}{dx}\ln|x| = \frac{1}{x}$ .

Integration by Substitution

DfIntegration by Substitution (u-substitution)

The substitution rule is the reverse of the chain rule for differentiation. If $u = g(x)$ is a differentiable function whose range is an interval, and $f$ is continuous on that interval, then:

Substitution Rule

\int f(g(x)) \cdot g'(x)\,dx = \int f(u)\,du \quad \text{where } u = g(x)

Here,

$u = g(x)$ =The substitution — the inner function
$du = g'(x)dx$ =The differential of u
$f(g(x)) \cdot g'(x)$ =The original integrand with chain rule factor

💡 When to Use Substitution

Use substitution when the integrand contains a function and its derivative (or a constant multiple of it). Look for an "inner function" $g(x)$ whose derivative $g'(x)$ appears as a factor. Common patterns: $\int \sin(\cos x) \sin x\,dx$ , $\int xe^{x^2}\,dx$ , $\int \frac{x}{x^2+1}\,dx$ .

📝Substitution Example 1

Problem: Compute $\int 2x \cos(x^2)\,dx$ .

💡Solution

Let $u = x^2$ , so $du = 2x\,dx$ .

$\int \cos(u)\,du = \sin(u) + C = \sin(x^2) + C$ .

📝Substitution Example 2

Problem: Compute $\int_0^{\pi/2} \sin^3(x) \cos(x)\,dx$ .

💡Solution

Let $u = \sin(x)$ , so $du = \cos(x)\,dx$ .

When $x = 0$ , $u = 0$ . When $x = \pi/2$ , $u = 1$ .

$\int_0^1 u^3\,du = \left[\frac{u^4}{4}\right]_0^1 = \frac{1}{4}$ .

⚠️ Don't Forget to Change the Bounds

When performing substitution on a definite integral, you must either change the limits of integration to match the new variable or back-substitute before evaluating. Forgetting to update the bounds is one of the most common errors.
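The effect of the bounds can be seen numerically on $\int_0^{\pi/2} \sin^3(x)\cos(x)\,dx$ . The sketch below is illustrative only; the names `G`, `changed_bounds`, `back_substituted`, and `wrong_value` are introduced here and are not part of any library.

```python
import numpy as np
from scipy import integrate

def G(u):
    return u**4 / 4                      # antiderivative of u^3

a, b = 0.0, np.pi / 2                    # original x-limits

# Option 1: convert the limits to u = sin(x) and evaluate G there.
changed_bounds = G(np.sin(b)) - G(np.sin(a))

# Option 2: back-substitute to get F(x) = sin(x)^4 / 4 and use the x-limits.
F = lambda x: G(np.sin(x))
back_substituted = F(b) - F(a)

# The common mistake: keep the x-limits but evaluate the u-antiderivative.
wrong_value = G(b) - G(a)

reference, _ = integrate.quad(lambda x: np.sin(x)**3 * np.cos(x), a, b)

print(f"changed bounds   : {changed_bounds:.6f}")
print(f"back-substituted : {back_substituted:.6f}")
print(f"old bounds in u  : {wrong_value:.6f}  (incorrect)")
print(f"quad reference   : {reference:.6f}")
```

Both correct options agree with the quadrature reference, while mixing $x$ -limits with the $u$ -antiderivative does not.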

Integration by Parts

DfIntegration by Parts

Integration by parts is the reverse of the product rule for differentiation. It is used to integrate products of functions:

Integration by Parts

\int u\,dv = uv - \int v\,du

Here,

$u$ =The part you differentiate (choose something that simplifies)
$dv$ =The part you integrate (choose something easy to integrate)
$du$ =The derivative of u
$v$ =The antiderivative of dv

Definite Integral Version

\int_a^b u\,dv = [uv]_a^b - \int_a^b v\,du

Here,

$[uv]_a^b$ =The boundary term: u(b)v(b) - u(a)v(a)
$\int_a^b v\,du$ =The remaining integral (often simpler)

💡 LIATE Rule for Choosing u

A helpful heuristic for choosing $u$ : Logarithmic → Inverse trig → Algebraic → Trigonometric → Exponential. Choose $u$ as whichever comes first in this list. The remaining factor becomes $dv$ .

📝Integration by Parts Example 1

Problem: Compute $\int x e^x\,dx$ .

💡Solution

Let $u = x$ (algebraic), $dv = e^x\,dx$ (exponential). Then $du = dx$ , $v = e^x$ .

$\int x e^x\,dx = xe^x - \int e^x\,dx = xe^x - e^x + C = e^x(x - 1) + C$ .

📝Integration by Parts Example 2

Problem: Compute $\int \ln x\,dx$ .

💡Solution

Let $u = \ln x$ , $dv = dx$ . Then $du = \frac{1}{x}\,dx$ , $v = x$ .

$\int \ln x\,dx = x\ln x - \int x \cdot \frac{1}{x}\,dx = x\ln x - \int 1\,dx = x\ln x - x + C$ .

📝Integration by Parts Example 3

Problem: Compute $\int e^x \sin x\,dx$ .

💡Solution

Apply integration by parts twice:

Let $u = \sin x$ , $dv = e^x\,dx$ . Then $du = \cos x\,dx$ , $v = e^x$ .

I = e^x \sin x - \int e^x \cos x\,dx

Now integrate $\int e^x \cos x\,dx$ by parts again:

Let $u = \cos x$ , $dv = e^x\,dx$ . Then $du = -\sin x\,dx$ , $v = e^x$ .

\int e^x \cos x\,dx = e^x \cos x + \int e^x \sin x\,dx = e^x \cos x + I

Substitute back: $I = e^x \sin x - (e^x \cos x + I) = e^x(\sin x - \cos x) - I$

2I = e^x(\sin x - \cos x)

I = \frac{e^x(\sin x - \cos x)}{2} + C

⚠️ Cyclic Integration by Parts

When integration by parts returns you to the original integral (as in the $e^x \sin x$ example), you can solve for the integral algebraically. This "cyclic" pattern occurs with products of exponentials and trigonometric functions.
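A result obtained by the cyclic trick is easy to get wrong by a sign, so it is worth differentiating it back. The following is a minimal SymPy sketch; `candidate` and `residual` are names chosen here for illustration.

```python
import sympy as sp

x = sp.Symbol("x")
integrand = sp.exp(x) * sp.sin(x)
candidate = sp.exp(x) * (sp.sin(x) - sp.cos(x)) / 2   # result derived by hand

# If candidate is an antiderivative, its derivative minus the integrand is 0.
residual = sp.simplify(sp.diff(candidate, x) - integrand)
print(f"d/dx(candidate) - integrand = {residual}")
```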

Common Integrals

Integral	Result	Notes
$\int x^n\,dx$	$\frac{x^{n+1}}{n+1} + C$	$n \neq -1$
$\int \frac{1}{x}\,dx$	$\ln|x| + C$	The $n = -1$ case
$\int e^x\,dx$	$e^x + C$	Its own antiderivative
$\int a^x\,dx$	$\frac{a^x}{\ln a} + C$	$a > 0$ , $a \neq 1$
$\int \sin x\,dx$	$-\cos x + C$
$\int \cos x\,dx$	$\sin x + C$
$\int \sec^2 x\,dx$	$\tan x + C$
$\int \csc^2 x\,dx$	$-\cot x + C$
$\int \sec x \tan x\,dx$	$\sec x + C$
$\int \csc x \cot x\,dx$	$-\csc x + C$
$\int \frac{1}{\sqrt{1-x^2}}\,dx$	$\arcsin x + C$
$\int \frac{1}{1+x^2}\,dx$	$\arctan x + C$
$\int \tan x\,dx$	$\ln|\sec x| + C$	Rewrite as $\int \frac{\sin x}{\cos x}\,dx$
$\int \sec x\,dx$	$\ln|\sec x + \tan x| + C$
$\int \frac{1}{x^2+a^2}\,dx$	$\frac{1}{a}\arctan\frac{x}{a} + C$
$\int \frac{1}{\sqrt{a^2-x^2}}\,dx$	$\arcsin\frac{x}{a} + C$
$\int \frac{1}{x^2-a^2}\,dx$	$\frac{1}{2a}\ln\left|\frac{x-a}{x+a}\right| + C$	Partial fractions
$\int \frac{1}{\sqrt{x^2 \pm a^2}}\,dx$	$\ln|x + \sqrt{x^2 \pm a^2}| + C$
$\int \sinh x\,dx$	$\cosh x + C$
$\int \cosh x\,dx$	$\sinh x + C$
$\int \text{sech}^2 x\,dx$	$\tanh x + C$
$\int \text{csch}^2 x\,dx$	$-\text{coth } x + C$

Improper Integrals

DfImproper Integral

An improper integral is an integral where either the interval of integration is infinite or the integrand has an infinite discontinuity (vertical asymptote) within the interval. We evaluate it as a limit.

Infinite Upper Limit

\int_a^\infty f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx

Here,

$a$ =The finite lower bound
$b \to \infty$ =The upper bound approaches infinity

Infinite Lower Limit

\int_{-\infty}^b f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx

Here,

$a \to -\infty$ =The lower bound approaches negative infinity
$b$ =The finite upper bound

Double Infinite

\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{c} f(x)\,dx + \int_{c}^{\infty} f(x)\,dx

Here,

$c$ =Any finite point (often 0)

DfConvergence and Divergence

An improper integral converges if the defining limit exists and is finite. It diverges if the limit does not exist or is infinite. When an integral is split into several improper pieces (as in the doubly infinite case), every piece must converge on its own for the full integral to converge.

📝Improper Integral That Converges

Problem: Evaluate $\int_1^\infty \frac{1}{x^2}\,dx$ .

💡Solution

$\int_1^\infty \frac{1}{x^2}\,dx = \lim_{b \to \infty} \int_1^b x^{-2}\,dx = \lim_{b \to \infty} \left[-\frac{1}{x}\right]_1^b = \lim_{b \to \infty}\left(-\frac{1}{b} + 1\right) = 1$ .

The integral converges to $1$ .

📝Improper Integral That Diverges

Problem: Evaluate $\int_1^\infty \frac{1}{x}\,dx$ .

💡Solution

$\int_1^\infty \frac{1}{x}\,dx = \lim_{b \to \infty} \int_1^b \frac{1}{x}\,dx = \lim_{b \to \infty} [\ln x]_1^b = \lim_{b \to \infty} \ln b = \infty$ .

The integral diverges (grows without bound).

ℹ️ p-Integral Test

$\int_1^\infty \frac{1}{x^p}\,dx$ converges if and only if $p > 1$ . This is a quick way to determine convergence for power-type integrands. For example, $\int_1^\infty \frac{1}{x^{1.01}}\,dx$ converges (just barely), while $\int_1^\infty \frac{1}{x^{0.99}}\,dx$ diverges.
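The p-test can be illustrated by evaluating the truncated integral $\int_1^b x^{-p}\,dx$ in closed form for growing $b$ . This is a sketch under the assumption $b > 1$ ; the helper name `p_integral` is hypothetical.

```python
import math

def p_integral(p, b):
    """Exact value of the integral of x**(-p) over [1, b], for b > 1."""
    if p == 1:
        return math.log(b)
    return (b ** (1 - p) - 1) / (1 - p)

# Rows: exponent p. Columns: upper limit b = 1e2, 1e6, 1e12.
for p in (0.99, 1.0, 1.01, 2.0):
    values = [p_integral(p, 10.0 ** k) for k in (2, 6, 12)]
    print(f"p={p:4.2f}: " + "  ".join(f"{v:12.4f}" for v in values))
```

For $p = 1.01$ the values keep creeping up toward the limit $1/(p - 1) = 100$ , which shows how slowly a barely convergent integral approaches its value.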

📝Improper Integral with Discontinuity

Problem: Evaluate $\int_0^1 \frac{1}{\sqrt{x}}\,dx$ .

💡Solution

The integrand has a vertical asymptote at $x = 0$ .

$\int_0^1 x^{-1/2}\,dx = \lim_{a \to 0^+} \int_a^1 x^{-1/2}\,dx = \lim_{a \to 0^+} [2\sqrt{x}]_a^1 = \lim_{a \to 0^+}(2 - 2\sqrt{a}) = 2$ .

The integral converges to $2$ .
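The limit definition can be mimicked numerically by shrinking the lower bound. The sketch below uses `scipy.integrate.quad`; the names `integrand`, `truncated`, and `direct` are illustrative, and the tolerances reported by `quad` are estimates rather than guarantees.

```python
import numpy as np
from scipy import integrate

def integrand(x):
    return 1.0 / np.sqrt(x)          # unbounded as x -> 0+

# Integrate over [a, 1] and shrink a toward 0, as in the limit definition.
truncated = {}
for a in (1e-1, 1e-2, 1e-4):
    truncated[a], _ = integrate.quad(integrand, a, 1.0)
    exact = 2 - 2 * np.sqrt(a)
    print(f"a={a:.0e}: numeric={truncated[a]:.6f}  2 - 2*sqrt(a)={exact:.6f}")

# quad samples only interior points, so it can also be given the singular
# endpoint directly.
direct, est_err = integrate.quad(integrand, 0.0, 1.0)
print(f"direct: {direct:.6f} (error estimate {est_err:.1e})")
```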

Numerical Integration

When an antiderivative cannot be found in closed form, or when the integrand is defined only by data points, we approximate the integral numerically.

Trapezoidal Rule

DfTrapezoidal Rule

Approximates the area under the curve by dividing the interval into $n$ subintervals and approximating each strip as a trapezoid. The error is proportional to $h^2$ (second-order method).

Trapezoidal Rule

\int_a^b f(x)\,dx \approx \frac{h}{2}\left[f(x_0) + 2\sum_{i=1}^{n-1}f(x_i) + f(x_n)\right]

Here,

$h = \frac{b-a}{n}$ =Width of each subinterval
$x_i = a + ih$ =The i-th grid point
$n$ =Number of subintervals
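The formula translates almost line by line into NumPy. This is a minimal sketch of the composite rule on a uniform grid; `trapezoid_rule` is a name chosen here, and $\sin x$ on $[0, \pi]$ is an assumed test case.

```python
import numpy as np

def trapezoid_rule(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    x = a + h * np.arange(n + 1)           # grid points x_0, ..., x_n
    y = f(x)
    return h / 2 * (y[0] + 2 * np.sum(y[1:-1]) + y[-1])

# Doubling n should cut the error by roughly 4 (second-order behaviour).
exact = 2.0                                 # integral of sin(x) over [0, pi]
for n in (8, 16, 32):
    err = abs(trapezoid_rule(np.sin, 0.0, np.pi, n) - exact)
    print(f"n={n:3d}  error={err:.3e}")
```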

Simpson's Rule

DfSimpson's Rule

Approximates the area using parabolic arcs instead of straight lines. Requires an even number of subintervals. The error is proportional to $h^4$ (fourth-order method), making it significantly more accurate than the trapezoidal rule for smooth functions.

Simpson's Rule

\int_a^b f(x)\,dx \approx \frac{h}{3}\left[f(x_0) + 4\sum_{i \text{ odd}} f(x_i) + 2\sum_{i \text{ even}} f(x_i) + f(x_n)\right]

Here,

$h = \frac{b-a}{n}$ =Width of each subinterval (n must be even)
$x_i = a + ih$ =The i-th grid point
$4 \sum_{i \text{ odd}}$ =Weight 4 for odd-indexed points
$2 \sum_{i \text{ even}}$ =Weight 2 for even-indexed points (except endpoints)
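The weights 1, 4, 2, 4, ..., 2, 4, 1 can be implemented with array slicing. The following is a sketch on a uniform grid; `simpson_rule` is an illustrative name, not a library function.

```python
import numpy as np

def simpson_rule(f, a, b, n):
    """Composite Simpson's rule with n equal subintervals (n must be even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    x = a + h * np.arange(n + 1)           # grid points x_0, ..., x_n
    y = f(x)
    odd_sum = np.sum(y[1:-1:2])            # x_1, x_3, ..., x_{n-1}
    even_sum = np.sum(y[2:-1:2])           # x_2, x_4, ..., x_{n-2}
    return h / 3 * (y[0] + 4 * odd_sum + 2 * even_sum + y[-1])

# Doubling n should cut the error by roughly 16 (fourth-order behaviour).
exact = 2.0                                 # integral of sin(x) over [0, pi]
for n in (8, 16, 32):
    err = abs(simpson_rule(np.sin, 0.0, np.pi, n) - exact)
    print(f"n={n:3d}  error={err:.3e}")
```

Simpson's rule is exact for cubic polynomials, which makes a convenient unit test for an implementation like this.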

Gaussian Quadrature

DfGaussian Quadrature

A higher-order numerical integration method that chooses both the nodes and weights optimally. An $n$ -point Gaussian quadrature rule integrates all polynomials of degree up to $2n - 1$ exactly. It is particularly efficient for smooth functions.
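The exactness property can be checked with Gauss-Legendre nodes from NumPy. This sketch assumes the Legendre variant of Gaussian quadrature (one of several families) and maps the reference interval $[-1, 1]$ to $[a, b]$ ; `gauss_legendre` is a hypothetical helper name.

```python
import numpy as np

def gauss_legendre(f, a, b, n):
    """n-point Gauss-Legendre quadrature of f over [a, b]."""
    nodes, weights = np.polynomial.legendre.leggauss(n)   # defined on [-1, 1]
    x = 0.5 * (b - a) * nodes + 0.5 * (b + a)             # map nodes to [a, b]
    return 0.5 * (b - a) * np.sum(weights * f(x))

# With n = 3 points, degree 2n - 1 = 5 is integrated exactly, degree 6 is not.
deg5 = gauss_legendre(lambda x: x**5, 0.0, 1.0, 3)
deg6 = gauss_legendre(lambda x: x**6, 0.0, 1.0, 3)
print(f"x^5: {deg5:.12f}  (exact 1/6 = {1/6:.12f})")
print(f"x^6: {deg6:.12f}  (exact 1/7 = {1/7:.12f})")
```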

Method	Order of Accuracy	Best For	Nodes Required
Trapezoidal	$O(h^2)$	Rough data, quick estimates	Uniform grid
Simpson's	$O(h^4)$	Smooth functions, moderate precision	Uniform grid (even $n$ )
Gaussian	$O(h^{2n})$	High-precision integration of smooth functions	Optimally placed nodes
Monte Carlo	$O(1/\sqrt{n})$	High-dimensional integrals	Random samples

💡 Monte Carlo Integration in High Dimensions

For integrals over high-dimensional spaces (common in Bayesian inference and physics simulations), grid-based methods suffer from the curse of dimensionality: the number of grid points grows exponentially with dimension. The Monte Carlo error decreases at rate $O(1/\sqrt{n})$ in the number of samples regardless of dimension (although the constant depends on the variance of the integrand), which often makes it the only practical choice for high-dimensional integrals.

Python Implementation

Basic Integration with scipy.integrate

import numpy as np
from scipy import integrate

# Define the function to integrate
def f(x):
    return x**2 * np.exp(-x)

# Compute the integral from 0 to infinity (improper integral)
result, error = integrate.quad(f, 0, np.inf)
print(f"Integral of x^2 * exp(-x) from 0 to inf:")
print(f"  Result: {result:.6f}")
print(f"  Error estimate: {error:.2e}")
print(f"  Exact (2! = 2): 2.000000")

# Definite integral of sin(x) from 0 to pi
result, error = integrate.quad(np.sin, 0, np.pi)
print(f"\nIntegral of sin(x) from 0 to pi: {result:.6f} (exact: 2)")

Numerical Methods Comparison

import numpy as np
from scipy import integrate

# Define test function: f(x) = x^2
f = lambda x: x**2
a, b = 0, 1
exact = 1/3

# Trapezoidal rule (scipy's trapezoid; np.trapz is deprecated as of NumPy 2.0)
for n in [10, 100, 1000]:
    x = np.linspace(a, b, n + 1)
    trap_result = integrate.trapezoid(f(x), x=x)
    print(f"Trapezoidal (n={n:4d}): {trap_result:.8f}  error: {abs(trap_result - exact):.2e}")

print()

# Simpson's rule
for n in [10, 100, 1000]:
    x = np.linspace(a, b, n + 1)
    simp_result = integrate.simpson(f(x), x=x)
    print(f"Simpson's   (n={n:4d}): {simp_result:.8f}  error: {abs(simp_result - exact):.2e}")

print()

# Gaussian quadrature (fixed-order Gauss-Legendre; the second return value is always None)
for n in [5, 10, 20]:
    result, _ = integrate.fixed_quad(f, a, b, n=n)
    print(f"Gauss quad  (n={n:4d}): {result:.8f}  error: {abs(result - exact):.2e}")

# scipy.integrate.quad (adaptive); it returns its own estimate of the absolute error
result, error = integrate.quad(f, a, b)
print(f"\nAdaptive quad:      {result:.8f}  error: {abs(result - exact):.2e}  (estimated: {error:.2e})")

Symbolic Integration with SymPy

import sympy as sp

x = sp.Symbol('x')

# Symbolic indefinite integral
f = x**2 * sp.exp(x)
F = sp.integrate(f, x)
print(f"Indefinite integral of x^2 * e^x: {F}")

# Definite integral
result = sp.integrate(x**2, (x, 0, 1))
print(f"Definite integral of x^2 from 0 to 1: {result}")

# Improper integral
result = sp.integrate(sp.exp(-x**2), (x, -sp.oo, sp.oo))
print(f"Integral of e^(-x^2) from -inf to inf: {result}")
print(f"  = sqrt(pi) = {sp.sqrt(sp.pi)}")

# Verify Fundamental Theorem
F = sp.integrate(sp.sin(x), x)
print(f"\nAntiderivative of sin(x): {F}")
print(f"FTC: F(pi) - F(0) = {F.subs(x, sp.pi) - F.subs(x, 0)}")

High-Dimensional Integration (Monte Carlo)

import numpy as np

def monte_carlo_integrate(f, bounds, n_samples=100000):
    """Monte Carlo integration for arbitrary dimensions."""
    dim = len(bounds)
    samples = np.random.uniform(
        low=[b[0] for b in bounds],
        high=[b[1] for b in bounds],
        size=(n_samples, dim)
    )
    values = np.array([f(s) for s in samples])
    volume = np.prod([b[1] - b[0] for b in bounds])
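    mean_val = np.mean(values)
    # Standard error of the integral estimate: the error of the mean, scaled by the volume
    std_err = volume * np.std(values, ddof=1) / np.sqrt(n_samples)
    return mean_val * volume, std_err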

# Example: integral of x^2 + y^2 over [0,1] x [0,1]
f = lambda p: p[0]**2 + p[1]**2
result, error = monte_carlo_integrate(f, [(0, 1), (0, 1)])
exact = 2/3  # integral of x^2 + y^2 over [0,1]^2
print(f"Monte Carlo estimate: {result:.6f} +/- {error:.6f}")
print(f"Exact value:          {exact:.6f}")

Applications in AI/ML

Probability Density Functions

ℹ️ Integration and Probability

The entire foundation of continuous probability rests on integration. A probability density function $f_X(x)$ must satisfy $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ (normalization), and the probability of any event is computed as an integral of the density.

Probability from PDF

P(a \leq X \leq b) = \int_a^b f_X(x)\,dx

Here,

$f_X(x)$ =The probability density function of the random variable X
$a, b$ =The interval bounds
$P(a \leq X \leq b)$ =The probability that X falls in [a, b]

Normalization Condition

\int_{-\infty}^{\infty} f_X(x)\,dx = 1

Here,

$f_X(x)$ =A valid PDF must integrate to 1 over its support
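Both conditions can be checked numerically for a concrete density. The sketch below assumes an exponential distribution with rate `lam = 2.0` (an illustrative value, not taken from the text) and compares the interval probability with its closed form $e^{-\lambda a} - e^{-\lambda b}$ .

```python
import numpy as np
from scipy import integrate

lam = 2.0                                   # illustrative rate parameter

def pdf(x):
    return lam * np.exp(-lam * x)           # exponential density on [0, inf)

total, _ = integrate.quad(pdf, 0, np.inf)   # normalization over the support
prob, _ = integrate.quad(pdf, 0.5, 1.5)     # P(0.5 <= X <= 1.5)
closed_form = np.exp(-lam * 0.5) - np.exp(-lam * 1.5)

print(f"Total probability : {total:.8f}")
print(f"P(0.5 <= X <= 1.5): {prob:.8f}  (closed form {closed_form:.8f})")
```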

Expected Values and Moments

Expected Value

E[X] = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx

Here,

$E[X]$ =The expected value (mean) of X
$x \cdot f_X(x)$ =Each value weighted by its probability density

Variance

\text{Var}(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx

Here,

$\mu = E[X]$ =The mean of X
$(x - \mu)^2$ =Squared deviation from the mean

General Moment

E[g(X)] = \int_{-\infty}^{\infty} g(x) \cdot f_X(x)\,dx

Here,

$g(X)$ =Any function of the random variable
$E[g(X)]$ =The expected value of g(X) — a weighted average
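All three formulas are instances of one integral, $\int g(x) f_X(x)\,dx$ , which suggests a single helper. The following is a sketch for a normal density with assumed parameters `mu = 1.0` and `sigma = 2.0` ; the helper name `expectation` is hypothetical.

```python
import numpy as np
from scipy import integrate

mu, sigma = 1.0, 2.0                        # illustrative parameters

def pdf(x):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def expectation(g):
    """E[g(X)] as the integral of g(x) * pdf(x) over the real line."""
    value, _ = integrate.quad(lambda x: g(x) * pdf(x), -np.inf, np.inf)
    return value

mean = expectation(lambda x: x)
variance = expectation(lambda x: (x - mean)**2)
second_moment = expectation(lambda x: x**2)

print(f"E[X]   = {mean:.6f}  (expected {mu})")
print(f"Var(X) = {variance:.6f}  (expected {sigma**2})")
print(f"E[X^2] = {second_moment:.6f}  (expected {sigma**2 + mu**2})")
```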

Bayesian Inference

ℹ️ Integrals in Bayesian Methods

Bayesian inference is built on integration. The posterior distribution requires the evidence (marginal likelihood), which is an integral: $p(\theta | \text{data}) = \frac{p(\text{data} | \theta) p(\theta)}{\int p(\text{data} | \theta) p(\theta)\,d\theta}$ . This integral is often intractable, which is why methods like MCMC, variational inference, and Laplace approximation exist — they all approximate this integral.

Evidence (Marginal Likelihood)

p(\text{data}) = \int p(\text{data} | \theta) \cdot p(\theta)\,d\theta

Here,

$p(\text{data} | \theta)$ =The likelihood of the data given parameters
$p(\theta)$ =The prior distribution over parameters
$p(\text{data})$ =The evidence — obtained by integrating over all parameter values
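In one dimension the evidence integral can be computed directly, which makes the formula concrete. The sketch below assumes a toy coin-flip model (a specific sequence with 7 heads in 10 flips and a uniform prior on $\theta$ ), chosen here for illustration; in that case the integral equals the Beta function $B(8, 4)$ .

```python
from scipy import integrate, special

heads, flips = 7, 10                        # hypothetical data

def likelihood(theta):
    return theta**heads * (1 - theta)**(flips - heads)

def prior(theta):
    return 1.0                              # uniform prior on [0, 1]

# Evidence: integrate likelihood * prior over all parameter values.
evidence, _ = integrate.quad(lambda t: likelihood(t) * prior(t), 0, 1)
closed_form = special.beta(heads + 1, flips - heads + 1)

def posterior(theta):
    return likelihood(theta) * prior(theta) / evidence

post_total, _ = integrate.quad(posterior, 0, 1)

print(f"Evidence (quad)        : {evidence:.10f}")
print(f"Evidence (Beta formula): {closed_form:.10f}")
print(f"Posterior integrates to: {post_total:.8f}")
```

Dividing by the evidence is what makes the posterior a valid density. With many parameters this integral is no longer one-dimensional and direct quadrature stops being practical.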

Common Probability Integrals

Distribution	PDF	Key Integral
Normal $\mathcal{N}(\mu, \sigma^2)$	$\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$	$\int_{-\infty}^{\infty} f(x)\,dx = 1$
Standard Normal	$\frac{1}{\sqrt{2\pi}}e^{-x^2/2}$	$\int_{-\infty}^{\infty} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\,dx = 1$
Exponential $\text{Exp}(\lambda)$	$\lambda e^{-\lambda x}$ , $x \geq 0$	$\int_0^\infty \lambda e^{-\lambda x}\,dx = 1$
Beta $\text{Beta}(\alpha, \beta)$	$\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$	$\int_0^1 f(x)\,dx = 1$
Gamma $\text{Gamma}(\alpha, \beta)$	$\frac{\beta^\alpha x^{\alpha-1}e^{-\beta x}}{\Gamma(\alpha)}$	$\int_0^\infty f(x)\,dx = 1$

Common Mistakes

Mistake	Incorrect	Correct	Explanation
Forgetting the constant of integration	$\int x^2\,dx = \frac{x^3}{3}$	$\int x^2\,dx = \frac{x^3}{3} + C$	Always include $+ C$ for indefinite integrals
Wrong sign on trig integral	$\int \sin x\,dx = \cos x + C$	$\int \sin x\,dx = -\cos x + C$	The integral of sine is negative cosine
Power rule on $1/x$	$\int \frac{1}{x}\,dx = \frac{x^0}{0} + C$	$\int \frac{1}{x}\,dx = \ln|x| + C$	Power rule fails at $n = -1$ ; use $\ln|x|$
Not changing bounds on substitution	Evaluate with old bounds after $u$ -sub	Update bounds when substituting	If $u = g(x)$ , new bounds are $g(a)$ and $g(b)$
Dropping absolute value in $\ln$	$\int \frac{1}{x}\,dx = \ln(x) + C$	$\int \frac{1}{x}\,dx = \ln|x| + C$	$\frac{1}{x}$ is defined for $x < 0$ too
Confusing integration with differentiation rules	$\int \sin x \cos x\,dx = \int \sin x\,dx \cdot \int \cos x\,dx$	$\int \sin x \cos x\,dx = \frac{\sin^2 x}{2} + C$	There is no product rule for integrals; use substitution or integration by parts
Forgetting boundary term in integration by parts	$\int u\,dv = -\int v\,du$	$\int u\,dv = uv - \int v\,du$	The $uv$ term is essential
Divergent improper integrals	Assuming $\int_1^\infty \frac{1}{x^p}\,dx$ always converges	Converges only for $p > 1$	Check convergence before evaluating
Wrong $du$ in substitution	Writing $du = dx$ (dropping $g'(x)$ )	$du = g'(x)\,dx$	The differential $du$ must include the derivative of the inner function
Ignoring an interior singularity	$\int_{-1}^{1} \frac{1}{x}\,dx = 0$ (by symmetry)	$\int_{-1}^{1} \frac{1}{x}\,dx$ diverges	The integrand has a singularity at $x = 0$ , and neither $\int_{-1}^{0} \frac{1}{x}\,dx$ nor $\int_{0}^{1} \frac{1}{x}\,dx$ converges

⚠️ Double-Check Your Work

After computing an integral, verify by differentiating your answer. If $\int f(x)\,dx = F(x) + C$ , then $F'(x)$ should equal $f(x)$ . For definite integrals, check boundary values and sign consistency.
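When a symbolic check is not at hand, the same verification can be done numerically with a central difference. This is a rough sketch; the helper `check_antiderivative`, its step size `h`, and its tolerance `tol` are assumptions chosen for smooth, well-scaled functions.

```python
import numpy as np

def check_antiderivative(F, f, xs, h=1e-5, tol=1e-6):
    """Return True if the central-difference derivative of F matches f on xs."""
    xs = np.asarray(xs, dtype=float)
    derivative = (F(xs + h) - F(xs - h)) / (2 * h)
    return bool(np.max(np.abs(derivative - f(xs))) < tol)

xs = np.linspace(0.5, 3.0, 26)

# Correct: the antiderivative of ln(x) is x*ln(x) - x.
ok = check_antiderivative(lambda x: x * np.log(x) - x, np.log, xs)

# Sign error: cos(x) is not an antiderivative of sin(x).
bad = check_antiderivative(np.cos, np.sin, xs)

print(f"x ln x - x vs ln x: {ok}")
print(f"cos x vs sin x    : {bad}")
```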

Interview Questions

Q1: State the Fundamental Theorem of Calculus and explain its significance.

💡Answer

The Fundamental Theorem of Calculus has two parts:

Part 1: If $F(x) = \int_a^x f(t)\,dt$ and $f$ is continuous, then $F'(x) = f(x)$ . This shows differentiation undoes integration.

Part 2: If $F$ is any antiderivative of $f$ , then $\int_a^b f(x)\,dx = F(b) - F(a)$ . This allows us to compute definite integrals using antiderivatives.

Significance: It connects the two branches of calculus (differential and integral), transforms the problem of computing areas into evaluating antiderivatives, and provides the theoretical foundation for the entire field.

Q2: When would you use integration by parts vs. substitution?

💡Answer

Substitution is used when the integrand contains a composition of functions where the inner function's derivative appears as a factor (reverse of the chain rule). Example: $\int 2x\cos(x^2)\,dx$ .
Integration by parts is used for products of different types of functions (reverse of the product rule). Example: $\int x e^x\,dx$ .
A good heuristic: if you see an "inner function times its derivative" pattern, use substitution. If you see a product of different function types (algebraic × exponential, algebraic × trig, etc.), use integration by parts. The LIATE rule helps choose $u$ .

Q3: Why does $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$ ? Why is this important in ML?

💡Answer

This is the Gaussian integral. It cannot be computed by finding an antiderivative (since $e^{-x^2}$ has no elementary antiderivative). The classic proof squares the integral and converts to polar coordinates:

$\left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)}\,dx\,dy = \int_0^{2\pi}\int_0^{\infty} e^{-r^2} r\,dr\,d\theta = \pi$ .

Taking the square root gives $\sqrt{\pi}$ .

Importance in ML: This integral is the normalization constant for the Gaussian distribution, the most important distribution in statistics and ML. It ensures $\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}e^{-(x-\mu)^2/(2\sigma^2)}\,dx = 1$ , which is required for any valid probability distribution.
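Both the value $\sqrt{\pi}$ and the polar-coordinate step can be confirmed numerically. A small sketch using `scipy.integrate.quad`, with variable names chosen here for illustration:

```python
import numpy as np
from scipy import integrate

# Direct numerical evaluation of the Gaussian integral.
gauss_value, _ = integrate.quad(lambda x: np.exp(-x**2), -np.inf, np.inf)

# Polar form of the squared integral: the angular part contributes 2*pi,
# the radial part is the integral of exp(-r^2) * r over [0, inf).
radial, _ = integrate.quad(lambda r: np.exp(-r**2) * r, 0, np.inf)
squared = 2 * np.pi * radial

print(f"Integral of exp(-x^2): {gauss_value:.10f}")
print(f"sqrt(pi)             : {np.sqrt(np.pi):.10f}")
print(f"Squared via polar    : {squared:.10f}  (pi = {np.pi:.10f})")
```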

Q4: Explain the difference between convergence and divergence for improper integrals. Give an example of each.

💡Answer

An improper integral converges if the limit exists and is finite; it diverges if the limit does not exist or is infinite.

Convergent example: $\int_1^\infty \frac{1}{x^2}\,dx = \lim_{b\to\infty}\left[-\frac{1}{x}\right]_1^b = 1$ . The area is finite despite the infinite interval.

Divergent example: $\int_1^\infty \frac{1}{x}\,dx = \lim_{b\to\infty}\ln b = \infty$ . The area grows without bound.

Key insight: The p-test says $\int_1^\infty \frac{1}{x^p}\,dx$ converges iff $p > 1$ . For integrals near a singularity like $\int_0^1 \frac{1}{x^p}\,dx$ , it converges iff $p < 1$ .

Q5: How is numerical integration used when analytical solutions are unavailable?

💡Answer

When the antiderivative cannot be expressed in closed form (e.g., $e^{-x^2}$ , $\frac{\sin x}{x}$ , or integrands defined only by data), we use numerical methods:

Trapezoidal rule: Approximate area with trapezoids. Simple, $O(h^2)$ convergence. Good for rough data.
Simpson's rule: Use parabolic arcs. $O(h^4)$ convergence. Better for smooth functions.
Gaussian quadrature: Optimally placed nodes. Integrates degree $2n-1$ polynomials exactly with $n$ points.
Monte Carlo integration: Random sampling. Converges at $O(1/\sqrt{n})$ but is the only practical method for high-dimensional integrals (critical in Bayesian inference).
Adaptive quadrature: Automatically refines the grid where the integrand is difficult. scipy.integrate.quad uses this.

Q6: What role does integration play in training probabilistic models?

💡Answer

Integration is essential in several aspects of probabilistic model training:

Normalization: Every PDF must integrate to 1. For complex models, this constant is often intractable.
Marginalization: To compute $p(x) = \int p(x, \theta)\,d\theta$ , we integrate out latent variables.
Evidence computation: $p(\text{data}) = \int p(\text{data}|\theta)p(\theta)\,d\theta$ is needed for model comparison and Bayesian model selection.
Expected loss: $\mathbb{E}[\mathcal{L}] = \int \mathcal{L}(\theta) p(\theta|\text{data})\,d\theta$ — computing the expected loss over the posterior.
Variational inference: Approximates intractable integrals with tractable ones by minimizing KL divergence.
MCMC: Draws samples from posteriors by constructing Markov chains whose stationary distribution is the target — a way to estimate integrals via sampling.

Q7: Prove that $\int_0^1 x^n\,dx = \frac{1}{n+1}$ using the definition of the integral.

💡Answer

Treat $n$ as a fixed non-negative integer exponent and use a uniform partition of $[0, 1]$ into $N$ subintervals of width $\Delta x = 1/N$ with right endpoints $x_i^* = i/N$ :

\int_0^1 x^n\,dx = \lim_{N\to\infty} \sum_{i=1}^{N} \left(\frac{i}{N}\right)^n \cdot \frac{1}{N} = \lim_{N\to\infty} \frac{1}{N^{n+1}} \sum_{i=1}^{N} i^n

By Faulhaber's formula, $\sum_{i=1}^{N} i^n = \frac{N^{n+1}}{n+1} + O(N^n)$ , so the lower-order terms vanish after dividing by $N^{n+1}$ :

\int_0^1 x^n\,dx = \lim_{N\to\infty} \frac{1}{N^{n+1}} \left(\frac{N^{n+1}}{n+1} + O(N^n)\right) = \frac{1}{n+1}

Alternatively, by the Fundamental Theorem: $\int_0^1 x^n\,dx = \left[\frac{x^{n+1}}{n+1}\right]_0^1 = \frac{1}{n+1}$ .

Practice Problems

📝Problem 1: Substitution

Compute $\int x\sqrt{1 + x^2}\,dx$ .

💡Solution

Let $u = 1 + x^2$ , so $du = 2x\,dx$ and $x\,dx = \frac{du}{2}$ .

$\int \sqrt{u} \cdot \frac{du}{2} = \frac{1}{2} \int u^{1/2}\,du = \frac{1}{2} \cdot \frac{u^{3/2}}{3/2} + C = \frac{(1+x^2)^{3/2}}{3} + C$ .

📝Problem 2: Integration by Parts

Compute $\int x^2 e^x\,dx$ .

💡Solution

First application: $u = x^2$ , $dv = e^x\,dx$ . Then $du = 2x\,dx$ , $v = e^x$ .

\int x^2 e^x\,dx = x^2 e^x - 2\int xe^x\,dx

Second application: $u = x$ , $dv = e^x\,dx$ . Then $du = dx$ , $v = e^x$ .

\int xe^x\,dx = xe^x - \int e^x\,dx = xe^x - e^x

Combine: $x^2 e^x - 2(xe^x - e^x) + C = e^x(x^2 - 2x + 2) + C$ .

📝Problem 3: Definite Integral with Substitution

Compute $\int_0^1 \frac{e^x}{1 + e^x}\,dx$ .

💡Solution

Let $u = 1 + e^x$ , so $du = e^x\,dx$ .

When $x = 0$ : $u = 2$ . When $x = 1$ : $u = 1 + e$ .

$\int_2^{1+e} \frac{1}{u}\,du = [\ln|u|]_2^{1+e} = \ln(1+e) - \ln 2 = \ln\frac{1+e}{2}$ .

📝Problem 4: Improper Integral

Determine whether $\int_1^\infty \frac{\ln x}{x^2}\,dx$ converges, and if so, evaluate it.

💡Solution

Use integration by parts: $u = \ln x$ , $dv = x^{-2}\,dx$ . Then $du = \frac{1}{x}\,dx$ , $v = -\frac{1}{x}$ .

\int \frac{\ln x}{x^2}\,dx = -\frac{\ln x}{x} + \int \frac{1}{x^2}\,dx = -\frac{\ln x}{x} - \frac{1}{x}

Evaluate the improper integral:

\int_1^\infty \frac{\ln x}{x^2}\,dx = \lim_{b\to\infty}\left[-\frac{\ln x}{x} - \frac{1}{x}\right]_1^b = \lim_{b\to\infty}\left(-\frac{\ln b}{b} - \frac{1}{b} + 0 + 1\right) = 0 - 0 + 1 = 1

The integral converges to $1$ .

📝Problem 5: Probability Application

Let $X$ have PDF $f_X(x) = \frac{1}{2}e^{-|x|}$ for $x \in \mathbb{R}$ (Laplace distribution). Find $E[X]$ and $\text{Var}(X)$ .

💡Solution

Expected value: $f_X(x)$ is an even function, so $x \cdot f_X(x)$ is odd, and the integral converges absolutely because of the exponential decay:

$E[X] = \int_{-\infty}^{\infty} x \cdot \frac{1}{2}e^{-|x|}\,dx = 0$ (odd integrand over symmetric interval).

Variance: $\text{Var}(X) = E[X^2] - (E[X])^2 = E[X^2]$

E[X^2] = \int_{-\infty}^{\infty} x^2 \cdot \frac{1}{2}e^{-|x|}\,dx = 2\int_0^{\infty} x^2 \cdot \frac{1}{2}e^{-x}\,dx = \int_0^{\infty} x^2 e^{-x}\,dx

Using integration by parts twice (or the gamma function $\Gamma(3) = 2! = 2$ ):

$\text{Var}(X) = 2$ .
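The result can be cross-checked by numerical integration. The sketch below splits the integral at $0$ because $e^{-|x|}$ has a kink there; `laplace_pdf` and `expect` are illustrative helper names.

```python
import numpy as np
from scipy import integrate

def laplace_pdf(x):
    return 0.5 * np.exp(-np.abs(x))

def expect(g):
    """E[g(X)] for the Laplace density, integrating each half-line separately."""
    left, _ = integrate.quad(lambda x: g(x) * laplace_pdf(x), -np.inf, 0)
    right, _ = integrate.quad(lambda x: g(x) * laplace_pdf(x), 0, np.inf)
    return left + right

mean = expect(lambda x: x)
second_moment = expect(lambda x: x**2)
variance = second_moment - mean**2

print(f"E[X]   = {mean:.8f}")
print(f"Var(X) = {variance:.8f}")
```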

Quick Reference

📋Key Takeaways

Indefinite Integral: $\int f(x)\,dx = F(x) + C$ where $F'(x) = f(x)$ — a family of antiderivatives.
Definite Integral: $\int_a^b f(x)\,dx = F(b) - F(a)$ — the signed area under the curve, computed via the Fundamental Theorem.
Fundamental Theorem: Connects differentiation and integration: $\frac{d}{dx}\int_a^x f(t)\,dt = f(x)$ and $\int_a^b f(x)\,dx = F(b) - F(a)$ .
Power Rule: $\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$ for $n \neq -1$ ; for $n = -1$ : $\int \frac{1}{x}\,dx = \ln|x| + C$ .
Substitution: $\int f(g(x))g'(x)\,dx = \int f(u)\,du$ — reverse of the chain rule.
Integration by Parts: $\int u\,dv = uv - \int v\,du$ — reverse of the product rule; use LIATE to choose $u$ .
Improper Integrals: Evaluate as limits; converges if the limit is finite, diverges otherwise.
Numerical Methods: Trapezoidal ( $O(h^2)$ ), Simpson's ( $O(h^4)$ ), Gaussian quadrature ( $O(h^{2n})$ ), Monte Carlo ( $O(1/\sqrt{n})$ ).
Probability: $P(a \leq X \leq b) = \int_a^b f_X(x)\,dx$ , $E[X] = \int x f_X(x)\,dx$ , normalization requires $\int f_X(x)\,dx = 1$ .
Bayesian Integration: Evidence $p(\text{data}) = \int p(\text{data}|\theta)p(\theta)\,d\theta$ is often intractable, motivating MCMC and variational methods.

Cross-References

Limits and Continuity: The foundation for understanding integrals as limits of Riemann sums → Limits and Continuity
Derivatives and Differentiation: Integration is the reverse operation of differentiation → Derivatives and Differentiation
Chain Rule: Integration by substitution is the reverse of the chain rule → Chain Rule and Implicit Differentiation
Multivariable Calculus: Double and triple integrals extend integration to multiple dimensions → Multivariable Calculus
Taylor Series: Polynomial approximations used in deriving integration rules → Taylor Series
Probability Foundations: Integration is the backbone of continuous probability → Probability Foundations
Probability Distributions: Common distributions and their integral properties → Probability Distributions
Expectation and Variance: Expected values are integrals of functions against PDFs → Expectation and Variance
Numerical Integration: In-depth coverage of numerical methods → Numerical Methods
Differential Equations: Many differential equations are solved by integration → Differential Equations
Optimization: Integration in the context of optimization and Lagrange multipliers → Optimization Fundamentals
Statistics (MLE): MLE involves integrals over likelihood functions → Maximum Likelihood Estimation
Bayesian Statistics: Posterior computation requires integration → Bayesian Statistics

Rule	Formula	Example
Constant	$\int c\,dx = cx + C$	$\int 5\,dx = 5x + C$
Power Rule	$\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$ ( $n \neq -1$ )	$\int x^3\,dx = \frac{x^4}{4} + C$
Reciprocal	$\int \frac{1}{x}\,dx = \ln|x| + C$	$\int \frac{1}{x}\,dx = \ln|x| + C$
Exponential	$\int e^x\,dx = e^x + C$	$\int e^x\,dx = e^x + C$
General Exponential	$\int a^x\,dx = \frac{a^x}{\ln a} + C$	$\int 3^x\,dx = \frac{3^x}{\ln 3} + C$
Sine	$\int \sin x\,dx = -\cos x + C$	$\int \sin x\,dx = -\cos x + C$
Cosine	$\int \cos x\,dx = \sin x + C$	$\int \cos x\,dx = \sin x + C$
Secant	$\int \sec^2 x\,dx = \tan x + C$	$\int \sec^2 x\,dx = \tan x + C$
Cosecant	$\int \csc^2 x\,dx = -\cot x + C$	$\int \csc^2 x\,dx = -\cot x + C$
Secant-Tangent	$\int \sec x \tan x\,dx = \sec x + C$	$\int \sec x \tan x\,dx = \sec x + C$
Cosecant-Cotangent	$\int \csc x \cot x\,dx = -\csc x + C$	$\int \csc x \cot x\,dx = -\csc x + C$
Inverse Sine	$\int \frac{1}{\sqrt{1-x^2}}\,dx = \arcsin x + C$	$\int \frac{1}{\sqrt{1-x^2}}\,dx = \arcsin x + C$
Inverse Tangent	$\int \frac{1}{1+x^2}\,dx = \arctan x + C$	$\int \frac{1}{1+x^2}\,dx = \arctan x + C$
Hyperbolic Sine	$\int \sinh x\,dx = \cosh x + C$	$\int \sinh x\,dx = \cosh x + C$
Hyperbolic Cosine	$\int \cosh x\,dx = \sinh x + C$	$\int \cosh x\,dx = \sinh x + C$
