The tendency of ML models to favor simpler patterns or hypotheses over complex ones.
Simplicity bias refers to the empirically observed tendency of machine learning models, particularly neural networks, to preferentially learn simpler functions or representations when multiple hypotheses are consistent with the training data. Rather than treating all solutions that fit the data equally, models trained with standard gradient-based methods tend to converge toward lower-complexity solutions first, even when more complex ones would achieve similar or better training performance. The phenomenon has been studied most extensively in neural networks, where it manifests as a preference for low-frequency functions over high-frequency ones (often called spectral bias) and for sparse, generalizable representations over memorized noise.
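The low-frequency preference can be seen in a toy experiment. Below is a minimal PyTorch sketch (sizes, frequencies, and step counts are illustrative, not drawn from any particular paper): an MLP is fit to a target containing a slow and a fast sinusoid, while a diagnostic tracks how far the model's output is from the slow component alone. Early in training that distance typically shrinks first, while the total loss plateaus near the energy of the fast component, suggesting the simple part of the target is learned before the complex part.

```python
# Sketch of the low-frequency preference ("spectral bias"): an MLP fit
# to a two-frequency target tends to capture the slow component long
# before the fast one. All sizes and frequencies are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(0, 1, 256).unsqueeze(1)
low = torch.sin(2 * torch.pi * x)           # slow component
high = 0.5 * torch.sin(20 * torch.pi * x)   # fast component
y = low + high

model = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(5001):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            # Distance from the model's output to the slow component alone;
            # this typically drops toward zero before the full loss does.
            err_low = ((model(x) - low) ** 2).mean()
        print(f"step {step:5d}  loss {loss.item():.4f}  dist-to-low-only {err_low.item():.4f}")
```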
The mechanism behind simplicity bias is closely tied to the geometry of the loss landscape and the implicit regularization effects of optimization algorithms like stochastic gradient descent. Research has shown that SGD, by virtue of its update dynamics and the structure of overparameterized models, tends to find solutions with low effective complexity—solutions that generalize well despite the model having far more parameters than training examples. This helps explain the so-called "generalization puzzle" of deep learning: why massively overparameterized networks avoid overfitting in practice even without explicit regularization.
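One concrete, well-understood instance is overparameterized linear regression: gradient descent initialized at zero keeps its iterates in the row space of the data matrix, so among the infinitely many interpolating solutions it converges to the one with minimum L2 norm. A minimal NumPy sketch (dimensions and step counts are arbitrary):

```python
# Implicit regularization in an underdetermined least-squares problem:
# gradient descent started at zero converges to the minimum-L2-norm
# interpolating solution, with no explicit penalty anywhere.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                       # far more parameters than examples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)                      # zero initialization matters here
lr = 0.01
for _ in range(20000):
    w -= lr * X.T @ (X @ w - y) / n  # plain gradient descent on MSE

w_min_norm = np.linalg.pinv(X) @ y   # minimum-norm interpolant
print("fit residual:              ", np.linalg.norm(X @ w - y))
print("distance to min-norm soln: ", np.linalg.norm(w - w_min_norm))
```

Among all solutions that fit the data exactly, the minimum-norm one is in a precise sense the simplest, which is one concrete way implicit regularization realizes simplicity bias.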
Simplicity bias connects directly to classical statistical learning principles, including Occam's Razor and the bias-variance tradeoff. Regularization techniques such as L1 and L2 penalties, dropout, and early stopping are deliberate interventions that reinforce simplicity bias by penalizing or constraining model complexity. The bias is not always beneficial, however: when the true underlying relationship in the data is genuinely complex, an overly strong preference for simple solutions leads to underfitting and poor predictive performance on real-world tasks.
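As a small illustration of the explicit side, an L2 (ridge) penalty shrinks the coefficients of an overcomplete polynomial fit, trading a little training error for a much simpler function. A NumPy sketch with arbitrary degree, noise level, and penalty strength:

```python
# An L2 penalty as a deliberate push toward simpler solutions: a
# degree-15 polynomial fit to 20 noisy points produces huge unregularized
# coefficients, while ridge regression shrinks them sharply.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 20)
y = x**2 + 0.1 * rng.standard_normal(20)    # quadratic truth plus noise
V = np.vander(x, 16)                        # degree-15 polynomial features

w_ols = np.linalg.lstsq(V, y, rcond=None)[0]           # unregularized fit
lam = 1e-2
w_ridge = np.linalg.solve(V.T @ V + lam * np.eye(16), V.T @ y)  # ridge fit

print("OLS coefficient norm:  ", np.linalg.norm(w_ols))
print("ridge coefficient norm:", np.linalg.norm(w_ridge))
```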
Understanding simplicity bias has become increasingly important as researchers seek to explain why deep learning models generalize, how they behave on out-of-distribution data, and where they fail. Models with strong simplicity bias may latch onto spurious correlations that happen to be statistically simple, making them brittle in deployment. This has motivated work on inductive biases, data augmentation, and architectural design aimed at aligning a model's simplicity preferences with the actual structure of the target problem.
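The failure mode can be sketched with a synthetic dataset in the spirit of Shah et al. (2020), though the exact outcome depends on architecture and hyperparameters. One input feature is linearly separable (simple); two others predict the label only through their product (complex). After training, scrambling the simple feature at test time often drops accuracy toward chance, suggesting the model never learned the complex features. Everything in the sketch, including sizes and step counts, is illustrative:

```python
# Simplicity bias latching onto a simple feature: when an easy linear
# feature and a harder XOR-like feature are both fully predictive, a
# small MLP often relies on the easy one alone, and fails when it is
# scrambled at test time.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_data(n, scramble_simple=False):
    y = torch.randint(0, 2, (n,))
    sign = (2 * y - 1).float()
    simple = sign + 0.1 * torch.randn(n)       # linearly separable feature
    a = (torch.rand(n) < 0.5).float() * 2 - 1  # individually uninformative
    b = sign * a                               # label = sign(a * b), XOR-like
    if scramble_simple:
        simple = torch.randn(n)                # destroy the simple feature only
    return torch.stack([simple, a, b], dim=1), y

X_train, y_train = make_data(2000)
model = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
for _ in range(500):
    opt.zero_grad()
    loss_fn(model(X_train), y_train).backward()
    opt.step()

for name, scramble in [("in-distribution", False), ("simple feature scrambled", True)]:
    X, y = make_data(2000, scramble_simple=scramble)
    acc = (model(X).argmax(dim=1) == y).float().mean().item()
    print(f"{name}: accuracy {acc:.2f}")
```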