Integral Calculus

Integral calculus accumulates quantities over intervals, turning local rates back into totals. This file covers definite and indefinite integrals, the Fundamental Theorem of Calculus, integration techniques, and applications to probability densities and expected values in ML.

  • Differentiation tells us the rate of change at a single point. Integration goes the other way: it accumulates many tiny pieces to compute a total.

  • If the derivative answers "how fast?", the integral answers "how much?"

  • The simplest way to think about integration is as the area under a curve. If you plot a function \(f(x)\) and shade the region between the curve and the x-axis from \(x = a\) to \(x = b\), the integral gives the signed area of that region.

Figure: Integration computes the area under a curve by summing thin rectangles.

  • Why "signed"? Regions above the x-axis contribute positive area, regions below contribute negative area. This makes physical sense: if \(f(x)\) represents velocity, the integral gives net displacement (forward minus backward), not total distance.
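  • The difference between net displacement and total distance can be checked numerically. The sketch below is illustrative only: it assumes a velocity of \(\sin(x)\) on \([0, 2\pi]\) and approximates both integrals with midpoint rectangles; the variable names are made up for this example.

```python
import jax.numpy as jnp

n = 10_000
dx = 2 * jnp.pi / n
# Midpoint of each of the n slices of [0, 2*pi]
x = (jnp.arange(n) + 0.5) * dx

velocity = jnp.sin(x)  # assumed velocity profile, for illustration only

# Signed area: the positive and negative lobes cancel
net_displacement = jnp.sum(velocity) * dx
# Unsigned area: every lobe counts as positive
total_distance = jnp.sum(jnp.abs(velocity)) * dx

print(f"net displacement ~ {float(net_displacement):.4f}")
print(f"total distance   ~ {float(total_distance):.4f}")
```

  • The first value should be close to 0 and the second close to 4, since each lobe of the sine curve has area 2.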

  • To compute this area, imagine slicing the region into \(n\) thin vertical rectangles, each of width \(\Delta x\). The height of each rectangle is the function value at some point in that slice. Sum them up:

\[\text{Area} \approx \sum_{i=1}^{n} f(x_i^\ast) \, \Delta x\]
  • As we make the rectangles thinner and thinner (\(n \to \infty\), \(\Delta x \to 0\)), the sum becomes exact. This limiting process defines the definite integral:
\[\int_a^b f(x)\, dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^\ast) \, \Delta x\]
  • The \(\int\) symbol is an elongated "S" for "sum." The \(dx\) reminds us that we are summing infinitesimally thin slices along the x-axis.
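  • The sample point \(x_i^\ast\) in each slice can be chosen freely; in the limit the choice does not matter. As a rough sketch, the hypothetical helper `riemann_sum` below compares left, midpoint, and right sample points for \(\int_0^2 x^3\, dx = 4\). The function and its `rule` argument are assumptions made for this example, not part of any library.

```python
import jax.numpy as jnp

def riemann_sum(f, a, b, n, rule="left"):
    """Riemann sum of f on [a, b] with n equal slices (illustrative helper)."""
    dx = (b - a) / n
    # Position of the sample point inside each slice, as a fraction of dx
    offsets = {"left": 0.0, "mid": 0.5, "right": 1.0}
    x_star = a + (jnp.arange(n) + offsets[rule]) * dx
    return jnp.sum(f(x_star)) * dx

f = lambda x: x**3

results = {
    rule: [float(riemann_sum(f, 0.0, 2.0, n, rule)) for n in (10, 100, 1000)]
    for rule in ("left", "mid", "right")
}

for rule, values in results.items():
    print(rule, [f"{v:.4f}" for v in values])
```

  • Because \(x^3\) is increasing on \([0, 2]\), the left sums approach 4 from below and the right sums from above, while the midpoint sums converge fastest.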

  • An indefinite integral (or antiderivative) is a function \(F(x)\) whose derivative is \(f(x)\). We write:

\[\int f(x)\, dx = F(x) + C\]
  • The \(+ C\) is the constant of integration. Since the derivative of any constant is zero, there are infinitely many antiderivatives that differ only by a constant. For example, \(\int 2x\, dx = x^2 + C\), because the derivative of \(x^2 + 7\) or \(x^2 - 3\) is still \(2x\).
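  • A quick way to see why the constant disappears is to differentiate two antiderivatives that differ only by a constant. This minimal sketch uses `jax.grad`; the names `F1` and `F2` are chosen for this example.

```python
import jax

F1 = lambda x: x**2 + 7.0
F2 = lambda x: x**2 - 3.0

xs = [-1.0, 0.5, 2.0]
# Each pair holds the slope of F1 and of F2 at the same point
slopes = [(float(jax.grad(F1)(x)), float(jax.grad(F2)(x))) for x in xs]

for x, (s1, s2) in zip(xs, slopes):
    print(f"x={x:5.2f}  F1'={s1:.4f}  F2'={s2:.4f}  2x={2 * x:.4f}")
```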

  • The Fundamental Theorem of Calculus is the bridge that connects differentiation and integration. It has two parts:

  • Part 1: If \(f\) is continuous on \([a, b]\) and \(F(x)\) is an antiderivative of \(f(x)\), then the definite integral equals the difference of \(F\) at the endpoints:

\[\int_a^b f(x)\, dx = F(b) - F(a)\]
  • This is remarkably practical. Instead of computing a limit of sums (which is hard), we find an antiderivative and evaluate it at two points (which is usually easy).

  • Part 2: If \(f\) is continuous and we define \(F(x) = \int_a^x f(t)\, dt\), then \(F'(x) = f(x)\). Differentiation and integration are inverse operations: each undoes the other.

  • For example, to compute \(\int_1^3 x^2\, dx\): the antiderivative of \(x^2\) is \(\frac{x^3}{3}\). So \(\int_1^3 x^2\, dx = \frac{27}{3} - \frac{1}{3} = \frac{26}{3} \approx 8.67\).
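  • Both parts of the theorem can be probed numerically without using the closed-form antiderivative. The sketch below is one possible check, not a definitive implementation: the hypothetical function `integral_from_zero` approximates \(F(x) = \int_0^x t^2\, dt\) with a midpoint sum, and `jax.grad` differentiates through that sum.

```python
import jax
import jax.numpy as jnp

def f(t):
    return t**2

def integral_from_zero(x, n=1000):
    """Midpoint-rule approximation of the integral of f from 0 to x."""
    dt = x / n
    t = (jnp.arange(n) + 0.5) * dt
    return jnp.sum(f(t)) * dt

# Part 1: the integral over [1, 3] as a difference of accumulated areas
part1 = float(integral_from_zero(3.0) - integral_from_zero(1.0))

# Part 2: differentiating the accumulated area recovers the integrand
part2 = float(jax.grad(integral_from_zero)(2.0))

print(f"integral over [1, 3] ~ {part1:.4f}  (26/3 = {26 / 3:.4f})")
print(f"F'(2) ~ {part2:.4f}  (f(2) = {f(2.0):.4f})")
```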

  • Just as differentiation has rules, integration has corresponding rules that reverse them:

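| Function | Integral | Condition |
|---|---|---|
| \(x^n\) | \(\frac{x^{n+1}}{n+1} + C\) | \(n \neq -1\) |
| \(\frac{1}{x}\) | \(\ln \lvert x \rvert + C\) | \(x \neq 0\) |
| \(e^x\) | \(e^x + C\) | |
| \(a^x\) | \(\frac{a^x}{\ln a} + C\) | \(a > 0,\ a \neq 1\) |
| \(\sin x\) | \(-\cos x + C\) | |
| \(\cos x\) | \(\sin x + C\) | |
| \(k\) (constant) | \(kx + C\) | |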
  • The sum/difference rule carries over: \(\int [f(x) \pm g(x)]\, dx = \int f(x)\, dx \pm \int g(x)\, dx\). Constants can be pulled out: \(\int k\, f(x)\, dx = k \int f(x)\, dx\).
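  • Each row of the table can be sanity-checked by differentiating the right-hand side and comparing it with the left-hand side. The following is a minimal sketch under assumed values (\(n = 3\), \(a = 2\), \(k = 5\), evaluation point \(x_0 = 1.5\)); none of these choices come from the table itself.

```python
import jax
import jax.numpy as jnp

n, a, k = 3.0, 2.0, 5.0  # assumed example values

# name -> (function f, candidate antiderivative F with C = 0)
pairs = {
    "x^n":   (lambda x: x**n,    lambda x: x**(n + 1) / (n + 1)),
    "1/x":   (lambda x: 1.0 / x, lambda x: jnp.log(jnp.abs(x))),
    "e^x":   (jnp.exp,           jnp.exp),
    "a^x":   (lambda x: a**x,    lambda x: a**x / jnp.log(a)),
    "sin x": (jnp.sin,           lambda x: -jnp.cos(x)),
    "cos x": (jnp.cos,           jnp.sin),
    "k":     (lambda x: k,       lambda x: k * x),
}

x0 = 1.5
# Absolute difference between F'(x0) and f(x0) for every row
errors = {
    name: abs(float(jax.grad(F)(x0)) - float(f(x0)))
    for name, (f, F) in pairs.items()
}

for name, err in errors.items():
    print(f"{name:6s} |F'(x0) - f(x0)| = {err:.2e}")

# The absolute value in ln|x| makes the rule hold for negative x as well
log_abs_slope = float(jax.grad(lambda x: jnp.log(jnp.abs(x)))(-1.5))
print(f"d/dx ln|x| at x=-1.5: {log_abs_slope:.4f}  (1/x = {1 / -1.5:.4f})")
```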

  • When a function is too complex to integrate directly, we have techniques to simplify it.

  • u-substitution is the reverse of the chain rule. If you spot a composite function \(f(g(x))\) multiplied by \(g'(x)\), substitute \(u = g(x)\) so that \(du = g'(x)\, dx\), and the integral simplifies.

  • For example: \(\int 2x \cos(x^2)\, dx\). Let \(u = x^2\), so \(du = 2x\, dx\). The integral becomes \(\int \cos(u)\, du = \sin(u) + C = \sin(x^2) + C\).
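  • For a definite integral, the substitution also changes the limits: with \(u = x^2\), the range \(x \in [0, 2]\) becomes \(u \in [0, 4]\). The sketch below compares both sides numerically using a hypothetical `midpoint_integral` helper written for this example.

```python
import jax.numpy as jnp

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    x = a + (jnp.arange(n) + 0.5) * dx
    return jnp.sum(f(x)) * dx

# Original integral in x
in_x = float(midpoint_integral(lambda x: 2 * x * jnp.cos(x**2), 0.0, 2.0))
# After substituting u = x^2, with limits mapped from [0, 2] to [0, 4]
in_u = float(midpoint_integral(jnp.cos, 0.0, 4.0))
# Closed form from the antiderivative sin(x^2): sin(4) - sin(0)
exact = float(jnp.sin(4.0))

print(f"in x: {in_x:.4f}  in u: {in_u:.4f}  exact: {exact:.4f}")
```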

  • Integration by parts is the reverse of the product rule. If the integrand is a product of two functions:

\[\int u\, dv = uv - \int v\, du\]
  • Choose \(u\) and \(dv\) strategically so that the remaining integral \(\int v\, du\) is simpler than the original. A common heuristic for choosing \(u\) is the mnemonic LIATE: Logarithmic, Inverse trig, Algebraic, Trigonometric, Exponential. Take as \(u\) the factor whose type appears earliest in this list, and let the rest of the integrand be \(dv\).

  • For example: \(\int x\, e^x\, dx\). Let \(u = x\) (algebraic) and \(dv = e^x\, dx\). Then \(du = dx\) and \(v = e^x\). So: \(\int x\, e^x\, dx = x\, e^x - \int e^x\, dx = x\, e^x - e^x + C = e^x(x - 1) + C\).
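  • The result can be checked in two ways: differentiating \(e^x(x - 1)\) should give back \(x\, e^x\), and evaluating it at the endpoints gives a definite integral, for instance \(\int_0^1 x\, e^x\, dx = 1\). A minimal sketch, with names chosen for this example:

```python
import jax
import jax.numpy as jnp

integrand = lambda x: x * jnp.exp(x)
antiderivative = lambda x: jnp.exp(x) * (x - 1.0)

# Check 1: the derivative of the antiderivative matches the integrand
xs = [0.0, 1.0, 2.5]
max_err = max(
    abs(float(jax.grad(antiderivative)(x)) - float(integrand(x))) for x in xs
)

# Check 2: definite integral over [0, 1] from the endpoint values
definite = float(antiderivative(1.0) - antiderivative(0.0))

print(f"max |F'(x) - x e^x| = {max_err:.2e}")
print(f"integral over [0, 1] = {definite:.4f}")
```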

  • In ML, integration appears in probability theory (computing probabilities by integrating density functions), in expected values (weighted averages over continuous distributions), and in computing the area under ROC curves. While we rarely integrate by hand in practice, understanding what integration means helps interpret these quantities.
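  • To make the probability and expected-value cases concrete, the sketch below integrates a standard normal density numerically. The standard normal is an assumed example, the range \([-8, 8]\) stands in for the whole real line, and `midpoint_integral` is a hypothetical helper written for this illustration.

```python
import jax.numpy as jnp

def midpoint_integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    x = a + (jnp.arange(n) + 0.5) * dx
    return jnp.sum(f(x)) * dx

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return jnp.exp(-0.5 * x**2) / jnp.sqrt(2.0 * jnp.pi)

# A density integrates to 1 over its whole support
total_mass = float(midpoint_integral(normal_pdf, -8.0, 8.0))
# Probability of landing within one standard deviation of the mean
prob_within_1 = float(midpoint_integral(normal_pdf, -1.0, 1.0))
# Expected values are integrals of g(x) * p(x)
mean = float(midpoint_integral(lambda x: x * normal_pdf(x), -8.0, 8.0))
second_moment = float(midpoint_integral(lambda x: x**2 * normal_pdf(x), -8.0, 8.0))

print(f"total mass      ~ {total_mass:.4f}")
print(f"P(-1 <= X <= 1) ~ {prob_within_1:.4f}")
print(f"E[X]            ~ {mean:.4f}")
print(f"E[X^2]          ~ {second_moment:.4f}")
```

  • The probability within one standard deviation should come out near 0.68, and \(E[X^2]\) near 1, the variance of the standard normal.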

Coding Tasks (use Colab or a local notebook)

  1. Numerically approximate \(\int_0^1 x^2\, dx\) using a Riemann sum with increasing numbers of rectangles. Compare with the exact answer \(\frac{1}{3}\).

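    import jax.numpy as jnp
    
    for n in [10, 100, 1000, 10000]:
        # Left endpoints of n equal-width slices of [0, 1)
        x = jnp.linspace(0, 1, n, endpoint=False)
        dx = 1.0 / n
        # Left Riemann sum: rectangle height x_i^2, width dx
        area = jnp.sum(x**2 * dx)
        print(f"n={n:5d}  approx: {area:.6f}  exact: {1/3:.6f}")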
    

  2. Verify the Fundamental Theorem of Calculus numerically. Define \(F(x) = \int_0^x t^2\, dt = \frac{x^3}{3}\) and check that its derivative (computed via jax.grad) equals \(x^2\).

    import jax
    import jax.numpy as jnp
    
    # Closed-form antiderivative F(x) = x^3 / 3
    F = lambda x: x**3 / 3
    # jax.grad returns a function that evaluates F'(x)
    dF = jax.grad(F)
    
    for x in [0.5, 1.0, 2.0, 3.0]:
        print(f"x={x:.1f}  F'(x)={dF(x):.4f}  x^2={x**2:.4f}")
    

  3. Visualise the area under \(f(x) = \sin(x)\) from \(0\) to \(\pi\). Use plt.fill_between to shade the area and compute it numerically with a Riemann sum.

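    import jax.numpy as jnp
    import matplotlib.pyplot as plt
    
    x = jnp.linspace(0, jnp.pi, 500)
    y = jnp.sin(x)
    
    # linspace includes both endpoints, so the spacing is pi / 499, not pi / 500
    dx = x[1] - x[0]
    # Riemann sum; sin(x) is zero at both endpoints, so no edge correction is needed
    area = float(jnp.sum(y) * dx)
    
    plt.plot(x, y, color="purple", linewidth=2)
    plt.fill_between(x, y, alpha=0.2, color="purple")
    plt.title(f"Area = {area:.4f}  (exact: 2.0)")
    plt.show()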