When a model is too simple to capture meaningful patterns in data.
Underfitting occurs when a machine learning model is insufficiently complex to capture the underlying structure of the data it is trained on. The result is a model with high bias — one whose assumptions are too rigid or simplistic to represent the true relationships in the data. Unlike overfitting, where a model memorizes noise and spurious patterns, an underfit model fails to learn even the genuine signal, producing poor performance on both the training set and any new data it encounters.
The root causes of underfitting typically include using a model with too few parameters, training for too few iterations, or applying excessive regularization that penalizes complexity so heavily that the model cannot adapt to the data. For example, fitting a straight line to data with a clearly nonlinear relationship will almost always underfit, regardless of how much training data is available. The model's capacity — its ability to represent a wide range of functions — is simply too limited for the task.
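As a minimal sketch of this failure mode, the snippet below fits a straight line to synthetic data with a quadratic relationship (scikit-learn; the dataset, seed, and polynomial degree are all illustrative). The line's training error stays high no matter how many samples are available, while adding quadratic features gives the model enough capacity to fit the signal.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Illustrative synthetic data with a clearly nonlinear (quadratic) relationship.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=500)

# A straight line underfits: its hypothesis class cannot represent a parabola,
# so training error remains high regardless of sample size.
linear = LinearRegression().fit(X, y)
print("linear train MSE:   ", mean_squared_error(y, linear.predict(X)))

# Expanding the features to include x^2 raises capacity enough to fit the signal.
quadratic = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
print("quadratic train MSE:", mean_squared_error(y, quadratic.predict(X)))
```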
Underfitting is best understood through the lens of the bias-variance tradeoff, a foundational concept in statistical learning theory. High bias corresponds to underfitting: the model makes strong, often incorrect assumptions that cause it to systematically miss the true pattern. (High variance, by contrast, corresponds to overfitting.) Reducing underfitting generally requires increasing model capacity: switching to a more expressive architecture, engineering more informative features, reducing regularization strength, or training longer. The goal is to find the sweet spot where the model is complex enough to learn real patterns but not so complex that it overfits.
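To make the regularization lever concrete, the hedged sketch below (again scikit-learn on synthetic data; the alpha values are arbitrary) fits a Ridge regression at three penalty strengths. An excessive penalty shrinks every coefficient toward zero and drives up error on the training set itself, the hallmark of underfitting, while weakening it restores the fit.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Illustrative synthetic data with a genuinely learnable linear signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=400)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=1)

# A huge alpha penalizes complexity so heavily the model cannot adapt (underfit);
# a small alpha leaves enough capacity to recover the true coefficients.
for alpha in (1e4, 1.0, 1e-3):
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    print(f"alpha={alpha:>8}: train MSE={train_mse:.3f}, val MSE={val_mse:.3f}")
```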
Recognizing underfitting in practice is relatively straightforward: if training error remains high after a reasonable optimization effort, the model is likely underfit. This distinguishes it from overfitting, where training error is low but validation or test error is high. Monitoring both training and validation performance throughout the learning process is the standard diagnostic approach, and learning curves, which plot training and validation error against training-set size, make it easy to tell whether a model suffers from excessive bias or excessive variance.
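As one way to read such a curve, the sketch below uses scikit-learn's learning_curve on the same kind of synthetic quadratic data as above (sizes, seed, and fold count are illustrative) with a straight-line model. Both training and validation scores plateau at poor values close to each other, the classic signature of high bias; a wide gap between a good training score and a poor validation score would instead signal high variance.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# Illustrative quadratic data on which a straight line is an underfit model.
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(500, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=500)

# learning_curve reports train/validation scores at increasing training-set sizes.
sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5
)
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    # High bias: both scores stay low and track each other as data grows.
    print(f"n={n:>3}: train R^2={tr:.2f}, val R^2={va:.2f}")
```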