Using parameterized models to estimate unknown functions from observed data.
Function approximation is the practice of using a parameterized model to represent an unknown or intractable function based on observed input-output pairs. In machine learning this arises constantly: a regression model approximates a continuous target function, a classifier approximates a decision boundary, and a neural network approximates the mapping from raw inputs to predictions. The core assumption is that even though the true underlying function may be too complex to express analytically, a sufficiently flexible model trained on enough data can capture its essential behavior.
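A minimal sketch of this idea, using a polynomial as the parameterized model and a sine curve standing in for the unknown target (both choices are illustrative assumptions, not part of any particular method):

```python
import numpy as np

# Stand-in for the unknown target: in practice we never see this function
# directly, only noisy input-output samples of it.
def true_function(x):
    return np.sin(2 * np.pi * x)

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=50)
y_train = true_function(x_train) + rng.normal(scale=0.1, size=50)

# Parameterized model: a degree-5 polynomial, fit by least squares.
coeffs = np.polyfit(x_train, y_train, deg=5)
approx = np.poly1d(coeffs)

# The fitted model also predicts at inputs it never saw during training.
x_test = np.linspace(0.0, 1.0, 200)
error = np.max(np.abs(approx(x_test) - true_function(x_test)))
print(f"max approximation error on [0, 1]: {error:.3f}")
```

Even though the model never observes `true_function` itself, fitting its six coefficients to fifty noisy samples is enough to track the target's behavior across the whole interval.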
The mechanics of function approximation involve choosing a model family — polynomials, kernel methods, decision trees, or neural networks — and then optimizing its parameters to minimize some measure of discrepancy between the model's outputs and the true function values observed in training data. The choice of model family encodes inductive biases about the function's structure, such as smoothness or linearity, and heavily influences generalization. Regularization techniques help prevent the approximator from overfitting to noise rather than learning the true underlying function.
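The interaction between model family, optimization, and regularization can be sketched with ridge regression on polynomial features. The target function, feature degree, and regularization strength below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples of a smooth target; model family: degree-9 polynomial features.
x = rng.uniform(-1.0, 1.0, size=20)
y = np.cos(np.pi * x) + rng.normal(scale=0.2, size=20)
X = np.vander(x, N=10, increasing=True)  # columns: 1, x, x^2, ..., x^9

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: (X^T X + lam * I)^{-1} X^T y.
    # lam = 0 recovers ordinary least squares.
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

w_unregularized = ridge_fit(X, y, lam=0.0)
w_regularized = ridge_fit(X, y, lam=1e-2)

# The penalty shrinks the weights, discouraging fits that chase the noise.
print("unregularized weight norm:", np.linalg.norm(w_unregularized))
print("regularized weight norm:  ", np.linalg.norm(w_regularized))
```

The degree-9 family can represent far wigglier functions than the cosine target, so the unregularized fit spends that capacity on noise; the penalty term trades a little training error for much smaller, smoother coefficients.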
Function approximation is especially critical in reinforcement learning, where agents must estimate value functions or policies over enormous or continuous state spaces that cannot be enumerated explicitly. Tabular methods break down in these settings, and function approximators — particularly neural networks — allow agents to generalize across similar states. This combination, known as deep reinforcement learning, has enabled landmark achievements such as superhuman game-playing agents and robotic control systems.
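A minimal sketch of value-function approximation is semi-gradient TD(0) with a linear approximator. The environment here is an assumed toy five-state random walk; the one-hot features make it equivalent to a tabular method, but the same weight update works unchanged with richer features that share weight across states:

```python
import numpy as np

# Toy task: random walk over states 0..4, terminating off either end.
# Exiting right yields reward 1, exiting left yields 0; all other rewards are 0.
rng = np.random.default_rng(2)
n_states = 5
features = np.eye(n_states)   # phi(s): one-hot features (illustrative choice)
w = np.zeros(n_states)        # linear value estimate: v(s) = w . phi(s)
alpha, gamma = 0.1, 1.0

for episode in range(2000):
    s = 2                                  # start in the middle state
    while True:
        s_next = s + rng.choice([-1, 1])   # uniform random walk
        if s_next < 0:
            target = 0.0                   # terminated left: reward 0
        elif s_next >= n_states:
            target = 1.0                   # terminated right: reward 1
        else:
            target = gamma * (w @ features[s_next])
        # Semi-gradient TD(0): move w toward the bootstrapped target.
        w += alpha * (target - w @ features[s]) * features[s]
        if s_next < 0 or s_next >= n_states:
            break
        s = s_next

# True values for this chain are [1/6, 2/6, 3/6, 4/6, 5/6].
print(np.round(w, 2))
```

The learned weights approach the true state values, increasing from left to right; swapping the one-hot `features` for a neural network (and the update for its gradient) gives the deep variant described above.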
The theoretical backbone of neural function approximation is the universal approximation theorem, which establishes that feedforward networks with sufficient capacity can approximate any well-behaved function to arbitrary precision. While this result guarantees expressive power in principle, it says nothing about how easily that approximation can be learned from finite data or how well it will generalize. Understanding the gap between approximation capacity and practical learnability remains an active area of research, touching on generalization theory, optimization landscapes, and the geometry of high-dimensional function spaces.
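The theorem's content can be illustrated (not proved) with a one-hidden-layer tanh network trained by full-batch gradient descent to fit a smooth one-dimensional function. The target, width, learning rate, and step count below are all illustrative assumptions:

```python
import numpy as np

# One-hidden-layer network with tanh units, fit to sin(x) on [-pi, pi].
rng = np.random.default_rng(3)
x = np.linspace(-np.pi, np.pi, 100).reshape(-1, 1)
y = np.sin(x)

hidden = 20
W1 = rng.normal(scale=1.0, size=(1, hidden))
b1 = rng.normal(size=hidden)
W2 = rng.normal(scale=0.1, size=(hidden, 1))
b2 = np.zeros(1)
lr = 0.1

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(x)
initial_loss = np.mean((pred0 - y) ** 2)

for step in range(5000):
    h, pred = forward(x)
    grad_pred = 2 * (pred - y) / len(x)        # d(MSE)/d(pred)
    gW2 = h.T @ grad_pred
    gb2 = grad_pred.sum(axis=0)
    grad_h = (grad_pred @ W2.T) * (1 - h ** 2)  # backprop through tanh
    gW1 = x.T @ grad_h
    gb1 = grad_h.sum(axis=0)
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2

_, pred = forward(x)
final_loss = np.mean((pred - y) ** 2)
print(f"MSE: {initial_loss:.3f} -> {final_loss:.4f}")
```

Note the division of labor the theorem leaves open: the network's capacity to represent `sin` is guaranteed in principle, but whether gradient descent actually finds good weights from this random initialization, and how many samples and steps that takes, is exactly the learnability question the paragraph above describes.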