
Envisioning is an emerging technology research institute and advisory.


2011 — 2026


Overfitting

When a model memorizes training data noise instead of learning generalizable patterns.

Year: 1990 · Generality: 875

Overfitting occurs when a machine learning model learns the training data too precisely — capturing not just the true underlying signal but also its random noise and idiosyncratic quirks. The result is a model that performs impressively on training examples but fails to generalize to new, unseen data. Intuitively, the model has memorized rather than learned: it has tuned itself so tightly to one dataset that its "knowledge" doesn't transfer. This is one of the most fundamental failure modes in machine learning and affects virtually every class of model, from linear regression to deep neural networks.
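This failure mode can be sketched in a few lines of NumPy on synthetic data (the task, sample sizes, and polynomial degrees below are illustrative choices, not from the source): a high-degree polynomial fitted to a handful of noisy points from a simple signal threads the noise itself, beating a simpler model on the training set while faring worse on fresh data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic task: recover y = sin(x) from 20 noisy training points.
x_train = rng.uniform(0, 3, 20)
y_train = np.sin(x_train) + rng.normal(0, 0.2, 20)
x_test = rng.uniform(0, 3, 200)
y_test = np.sin(x_test) + rng.normal(0, 0.2, 200)

def mse(coeffs, x, y):
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

simple = np.polyfit(x_train, y_train, 3)     # low capacity
flexible = np.polyfit(x_train, y_train, 15)  # enough capacity to chase noise

# The flexible fit drives training error far below the simple fit's,
# but its error on unseen data tells a different story.
for name, c in (("degree 3", simple), ("degree 15", flexible)):
    print(f"{name}: train={mse(c, x_train, y_train):.4f}, "
          f"test={mse(c, x_test, y_test):.4f}")
```

The degree-15 fit "knows" the training points almost perfectly, yet that knowledge does not transfer: its test error sits well above its training error.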

Overfitting typically emerges when a model has too much capacity relative to the amount of training data available. A neural network with millions of parameters trained on a few thousand examples has ample room to memorize every sample rather than extract meaningful structure. It can also arise from training for too many epochs, using overly expressive feature sets, or failing to apply any form of regularization. The telltale sign is a growing gap between training loss and validation loss as training progresses: the model keeps improving on data it has seen while degrading on data it hasn't.
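Both ingredients, excess capacity and prolonged training, show up in a minimal sketch (all sizes, seeds, and the learning rate are illustrative assumptions): gradient descent on a linear model with 100 weights but only 30 samples drives training loss toward zero while validation loss stalls near the noise floor, so the gap between the two curves widens.

```python
import numpy as np

rng = np.random.default_rng(1)

# Far more parameters (100 weights) than training samples (30).
n, d = 30, 100
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0                     # only 5 features carry real signal
y = X @ w_true + rng.normal(0, 0.5, n)
X_val = rng.normal(size=(200, d))
y_val = X_val @ w_true + rng.normal(0, 0.5, 200)

w = np.zeros(d)
train_hist, val_hist = [], []
for epoch in range(500):
    w -= 0.01 * X.T @ (X @ w - y) / n        # plain gradient descent step
    train_hist.append(np.mean((X @ w - y) ** 2))
    val_hist.append(np.mean((X_val @ w - y_val) ** 2))

# Training loss keeps shrinking; validation loss cannot drop below the
# irreducible noise, so the train/validation gap grows as training runs on.
print(f"final train loss: {train_hist[-1]:.4f}, "
      f"final val loss: {val_hist[-1]:.4f}")
```

Monitoring exactly this gap is how practitioners usually catch overfitting in time to intervene.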

Practitioners have developed a rich toolkit for combating overfitting. Regularization techniques like L1 (lasso) and L2 (ridge) penalize large parameter weights, discouraging the model from fitting noise. Dropout randomly deactivates neurons during training, forcing the network to learn redundant representations. Early stopping halts training when validation performance begins to decline. Data augmentation artificially expands the training set, and cross-validation provides more reliable estimates of true generalization performance. Choosing a simpler model architecture — one with fewer parameters — is often the most direct remedy.
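One of those tools, L2 (ridge) regularization, has a convenient closed form and can be sketched directly (the data and penalty strengths here are illustrative assumptions): the penalty shrinks the weight vector, trading a worse fit on the training set for less noise-chasing.

```python
import numpy as np

rng = np.random.default_rng(2)

# 30 samples, 100 features: the unregularized fit can interpolate noise.
n, d = 30, 100
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0                    # only the first 5 features carry signal
y = X @ w_true + rng.normal(0, 0.5, n)

def ridge(X, y, lam):
    """Closed-form L2-regularized least squares: (X'X + lam*I)^-1 X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w_loose = ridge(X, y, 1e-6)         # essentially unregularized
w_tight = ridge(X, y, 10.0)         # strong L2 penalty

# The penalty shrinks the weights and raises training error by design:
# the model is no longer allowed to contort itself around every sample.
for name, w in (("lam=1e-6", w_loose), ("lam=10", w_tight)):
    train_mse = np.mean((X @ w - y) ** 2)
    print(f"{name}: ||w|| = {np.linalg.norm(w):.2f}, "
          f"train MSE = {train_mse:.4f}")
```

The penalty strength is a hyperparameter in practice, typically chosen by cross-validation rather than fixed in advance as it is in this sketch.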

Understanding overfitting is inseparable from the broader bias-variance tradeoff: reducing model complexity decreases variance (overfitting) but risks increasing bias (underfitting). Finding the right balance is central to building models that are useful in production. As deep learning pushed model sizes into the billions of parameters, overfitting research intensified, yielding techniques like batch normalization, weight decay schedules, and massive dataset curation as standard practice.
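The bias-variance tradeoff can be made concrete with a small Monte Carlo sketch (the signal, degrees, and sample counts are illustrative assumptions): refitting a simple and a complex model on many freshly drawn training sets and probing their predictions at one point shows the simple model missing systematically (high bias) while the complex model scatters widely from sample to sample (high variance).

```python
import numpy as np

rng = np.random.default_rng(3)

def true_f(x):
    return np.sin(2 * x)

x0 = 1.0                        # point at which we probe predictions
preds = {1: [], 9: []}          # polynomial degree -> predictions at x0

# Refit both models on 300 freshly drawn noisy training sets.
for _ in range(300):
    x = rng.uniform(0, 3, 25)
    y = true_f(x) + rng.normal(0, 0.3, 25)
    for deg in preds:
        preds[deg].append(np.polyval(np.polyfit(x, y, deg), x0))

for deg, p in preds.items():
    p = np.array(p)
    bias2 = (p.mean() - true_f(x0)) ** 2   # systematic miss (bias^2)
    var = p.var()                          # spread across training sets
    print(f"degree {deg}: bias^2 = {bias2:.3f}, variance = {var:.3f}")
```

The straight line is stable but systematically wrong; the degree-9 fit is right on average but swings with each new sample, which is the tradeoff in miniature.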

Related

  • Underfitting: When a model is too simple to capture meaningful patterns in data. (Generality: 720)
  • Overparameterized: A model with more parameters than available training data points. (Generality: 590)
  • Overparameterization Regime: When a model has more parameters than training samples, yet still generalizes well. (Generality: 520)
  • Regularization: A technique that penalizes model complexity to prevent overfitting and improve generalization. (Generality: 876)
  • Bias-Variance Trade-off: The fundamental tension between model complexity and generalization that governs prediction error. (Generality: 875)
  • Generalization: A model's ability to perform accurately on new, previously unseen data. (Generality: 913)