Envisioning is an emerging technology research institute and advisory.

DRL (Deep Residual Learning)

A neural network design using skip connections so layers learn residual mappings, enabling much deeper models.

Year: 2015
Generality: 752

Deep residual learning is an architectural design principle in which each block of layers learns a residual function F(x) = H(x) - x rather than attempting to directly approximate the desired underlying mapping H(x). The block's output is computed as F(x) + x, where x is passed through an identity shortcut connection that bypasses the learned layers entirely. When input and output dimensions differ, a linear projection of x replaces the identity so that the two terms can be added. This reformulation means the layers only need to learn what to add to the input, not reconstruct the full target representation from scratch; if the identity is already close to optimal, the block only has to push F(x) toward zero.
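The block described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the original ResNet implementation: it assumes F is two small dense layers with a ReLU in between (real residual blocks typically use convolutions and batch normalization), and the names `residual_block`, `linear`, `relu`, and `zeros` are hypothetical helpers introduced only for this example.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def linear(weights, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def zeros(rows, cols):
    """Build a rows x cols matrix of zeros."""
    return [[0.0] * cols for _ in range(rows)]


def residual_block(x, w1, w2, w_proj=None):
    """Return F(x) + shortcut(x), where F(x) = w2 * relu(w1 * x).

    The shortcut is the identity when w_proj is None, and a linear
    projection w_proj * x when input and output dimensions differ.
    """
    fx = linear(w2, relu(linear(w1, x)))  # residual function F(x)
    shortcut = x if w_proj is None else linear(w_proj, x)
    if len(fx) != len(shortcut):
        raise ValueError("F(x) and the shortcut must have the same dimension")
    return [f + s for f, s in zip(fx, shortcut)]


x = [1.0, -2.0, 3.0]
w1 = [[0.5, -0.1, 0.2],
      [0.3, 0.8, -0.5],
      [-0.2, 0.1, 0.4],
      [0.7, -0.6, 0.1]]  # arbitrary illustrative weights, 4 x 3

# With the last layer zeroed, F(x) = 0 and the block reduces to the identity.
identity_out = residual_block(x, w1, zeros(3, 4))

# With a 2-dimensional output, a projection shortcut replaces the identity.
w_proj = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
projected_out = residual_block(x, w1, zeros(2, 4), w_proj=w_proj)
```

Zeroing the final layer makes the point of the reformulation concrete: the block passes its input through unchanged until training moves F(x) away from zero.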

The practical motivation stems from the degradation problem: as plain networks grow deeper, training accuracy paradoxically worsens, not because of overfitting but because deeper plain networks are harder to optimize. Residual connections address this in two ways: they make the identity mapping trivial to represent, so added layers need not hurt, and they give gradients a direct path backward through the network, which mitigates vanishing gradients. In practice, residual blocks are typically built from convolutions, batch normalization, and ReLU activations, often arranged in bottleneck configurations (a 1x1 convolution that reduces the channel count, a 3x3 convolution, and a 1x1 convolution that restores it) that reduce computational cost while preserving representational capacity. These design choices allow stable training of networks with hundreds or even more than a thousand layers.
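The gradient-path argument can be made concrete with a toy calculation. The sketch below is a scalar simplification, not an analysis from the ResNet paper: it assumes every layer has the same small local slope F'(x) = 0.01 (an arbitrary illustrative value), whereas real networks involve Jacobian matrices. The function name `chain_gradient` is hypothetical.

```python
def chain_gradient(local_slopes, residual):
    """Derivative of a layer stack's output with respect to its input.

    Each layer is modelled as a scalar map with local slope s = F'(x).
    By the chain rule, a plain layer y = F(x) contributes a factor s,
    while a residual layer y = x + F(x) contributes a factor 1 + s.
    """
    grad = 1.0
    for s in local_slopes:
        grad *= (1.0 + s) if residual else s
    return grad


depth = 50
slopes = [0.01] * depth  # assumed small local slopes, purely illustrative

plain_grad = chain_gradient(slopes, residual=False)
residual_grad = chain_gradient(slopes, residual=True)

print(f"plain stack:    {plain_grad:.3e}")
print(f"residual stack: {residual_grad:.3e}")
```

Under these assumptions the plain stack's gradient is 0.01 to the 50th power, which is vanishingly small, while the residual stack's gradient stays of order one because each factor contains the constant 1 contributed by the shortcut.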

The concept was introduced by Kaiming He and colleagues at Microsoft Research in their 2015 paper "Deep Residual Learning for Image Recognition". An ensemble of their residual networks (ResNets) won the classification task of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) that year with a top-5 test error of 3.57%. The result was immediately influential: ResNet and its variants became the default backbone for computer vision tasks including image classification, object detection, and semantic segmentation. Residual-style connections subsequently appeared in speech recognition, natural language processing, and generative models, and they are a foundational structural element in many modern architectures, including the Transformer.

Beyond empirical success, residual networks carry theoretical significance. They can be interpreted through the lens of dynamical systems, where each block approximates a small update step in an iterative refinement process. This analogy connects deep networks to numerical ODE solvers, since the update x + F(x) has the same form as a forward Euler step, and it motivates continuous-depth models such as Neural ODEs. The inductive bias toward incremental representation refinement, combined with improved gradient flow, makes residual connections one of the most broadly adopted and theoretically grounded ideas in contemporary machine learning.
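The ODE analogy can be illustrated numerically. In the sketch below, each "block" applies the update x + h * f(x), which is exactly one forward Euler step for dx/dt = f(x). The dynamics f(x) = -x, the step size h = 1 / num_blocks, and the function names `residual_stack` and `decay` are assumptions chosen for illustration; a trained network learns a different f in each block, and Neural ODEs use more elaborate solvers.

```python
import math


def residual_stack(x0, f, num_blocks, total_time=1.0):
    """Apply num_blocks residual updates x <- x + h * f(x).

    With h = total_time / num_blocks, this is forward Euler integration
    of dx/dt = f(x) from t = 0 to t = total_time.
    """
    h = total_time / num_blocks
    x = x0
    for _ in range(num_blocks):
        x = x + h * f(x)  # one residual block acts as one Euler step
    return x


def decay(x):
    """Assumed toy dynamics dx/dt = -x, with exact solution x0 * exp(-t)."""
    return -x


exact = math.exp(-1.0)
coarse = residual_stack(1.0, decay, num_blocks=4)
fine = residual_stack(1.0, decay, num_blocks=256)

print(f"exact solution:  {exact:.6f}")
print(f"4 blocks:        {coarse:.6f}")
print(f"256 blocks:      {fine:.6f}")
```

As the number of blocks grows and each update shrinks, the stacked residual updates approach the continuous solution, which is the intuition behind treating depth as a continuous variable.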

Related

ResNet (Residual Network)

A CNN architecture using skip connections to enable training of very deep networks.

Generality: 795

Residual Connections

Shortcut connections in deep networks that enable training of much deeper architectures.

Generality: 834

DL (Deep Learning)

A machine learning approach using multi-layered neural networks to model complex data patterns.

Generality: 928

DRL (Deep Reinforcement Learning)

Neural networks combined with reinforcement learning to master complex sequential decision-making tasks.

Generality: 796

Vanishing Gradient

A training failure where gradients shrink exponentially, preventing early network layers from learning.

Generality: 720

DNN (Deep Neural Network)

Neural networks with many layers that learn hierarchical representations from raw data.

Generality: 871