
Dropout

A regularization technique that randomly deactivates neurons during training to prevent overfitting.

Year: 2012
Generality: 796

Dropout is a regularization technique for neural networks that works by randomly setting a fraction of neuron activations to zero during each forward pass of training. Rather than always propagating signals through every unit, the network temporarily "drops" a randomly selected subset of neurons (typically with a probability between 0.2 and 0.5), forcing the remaining units to compensate. At inference time, all neurons are active, and their outputs are scaled by the keep probability so that expected activation magnitudes match those seen during training.
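As a concrete illustration, here is a minimal NumPy sketch of that behavior; the function and parameter names are illustrative rather than taken from any particular library. During training each activation is zeroed with probability p_drop, and at inference every activation passes through scaled by the keep probability.

```python
import numpy as np

def dropout_forward(x, p_drop=0.5, training=True, rng=None):
    """Sketch of standard dropout: zero a random subset of activations
    during training; at inference, keep all units and scale outputs by
    the keep probability so expected magnitudes stay consistent."""
    rng = rng or np.random.default_rng()
    if training:
        # Each unit is kept with probability (1 - p_drop).
        mask = rng.random(x.shape) >= p_drop
        return x * mask
    # Inference: all units active, outputs scaled by the keep probability.
    return x * (1.0 - p_drop)

# Example: a batch of 4 activation vectors of width 8, half dropped in training.
acts = np.ones((4, 8))
train_out = dropout_forward(acts, p_drop=0.5, training=True)
eval_out = dropout_forward(acts, p_drop=0.5, training=False)
```

Many frameworks instead use "inverted" dropout, which scales the surviving activations up by 1 / (1 - p_drop) during training so that no scaling is needed at inference; the expected activations are the same either way.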

The core intuition behind dropout is that it prevents neurons from co-adapting too closely to one another. When any given neuron can be absent at any training step, the network cannot rely on specific combinations of neurons to encode a pattern. Instead, it must learn more distributed, redundant representations. This is loosely analogous to training an ensemble of exponentially many different network architectures simultaneously and averaging their predictions — a perspective that helps explain why dropout so reliably improves generalization.

Dropout proved especially impactful in the deep learning era, where large networks with millions of parameters are highly susceptible to memorizing training data. Its introduction coincided with the rise of convolutional and recurrent architectures, and it became a standard component in models achieving state-of-the-art results across image recognition, speech recognition, and natural language processing. Variants such as spatial dropout (dropping entire feature maps in CNNs) and variational dropout (connecting dropout to Bayesian inference) have since extended the original idea to more specialized settings.
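To make the spatial dropout variant concrete, the sketch below (names are again illustrative) drops entire channels of a convolutional feature map at once rather than individual activations, using the same test-time scaling convention as above.

```python
import numpy as np

def spatial_dropout_forward(x, p_drop=0.2, training=True, rng=None):
    """Sketch of spatial dropout for feature maps shaped
    (batch, channels, height, width): each channel is kept or dropped
    as a whole, which suits convolutional features whose neighboring
    activations are strongly correlated."""
    rng = rng or np.random.default_rng()
    if not training:
        # Inference: all channels active, scaled by the keep probability.
        return x * (1.0 - p_drop)
    n, c = x.shape[0], x.shape[1]
    # One keep/drop decision per (sample, channel), broadcast over H and W.
    mask = rng.random((n, c, 1, 1)) >= p_drop
    return x * mask
```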

While newer architectures — particularly transformers — often rely more heavily on other regularization strategies like weight decay and layer normalization, dropout remains widely used and is still a default tool in practitioners' toolkits. Its simplicity, low computational overhead, and consistent empirical benefits have cemented it as one of the most influential ideas in modern deep learning.

Related

Regularization

A technique that penalizes model complexity to prevent overfitting and improve generalization.

Generality: 876
Weight Decay

A regularization method that penalizes large weights to prevent overfitting.

Generality: 750
Vanishing Gradient

A training failure where gradients shrink exponentially, preventing early network layers from learning.

Generality: 720
Early Stopping

A regularization technique that halts model training when validation performance begins degrading.

Generality: 794
Batch Normalization

A technique that normalizes layer inputs to accelerate and stabilize neural network training.

Generality: 794
Overfitting

When a model memorizes training data noise instead of learning generalizable patterns.

Generality: 875