
DRL
Deep Residual Learning
A neural-network design that inserts identity-based shortcut (skip) connections so layers learn residual mappings, which stabilizes optimization and enables training of much deeper models.
Deep residual learning reframes what each stack of layers must learn: instead of directly approximating a desired underlying mapping H(x), the stack learns a residual function F(x) = H(x) - x, so the block outputs F(x) + x (or F(x) plus a projected x when dimensions differ). Because identity shortcuts pass inputs, and therefore gradients, through unchanged, optimization signals propagate more directly, mitigating vanishing/exploding gradients and the degradation problem, in which deeper plain networks train to higher error than their shallower counterparts. Practically, residual blocks (typically built from convolution, batch normalization, ReLU, and optional bottleneck structures) permit stable training of networks with hundreds of layers, improve convergence, and act as an inductive bias toward iterative refinement of representations. The approach has broad implications: it underpins the ResNet family and its pre-activation and bottleneck variants, is a standard building block in modern vision models for classification, detection, and segmentation, appears in speech and some NLP architectures, and connects conceptually to dynamical-systems and iterative-estimation interpretations of deep networks. Design choices such as identity vs. projection shortcuts, depth/width trade-offs, and the placement of normalization and activation affect optimization and generalization, and residual-style connections remain a central structural element in contemporary ML architectures.
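To make the F(x) + x structure concrete, below is a minimal sketch of a basic residual block written in PyTorch. The framework choice, the class name ResidualBlock, and the specific layer sizes are illustrative assumptions, not the original implementation; the block follows the common post-activation ordering (conv, batch norm, ReLU) with an identity shortcut that switches to a 1x1 projection when the input and output shapes differ.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Illustrative basic residual block: output = ReLU(F(x) + shortcut(x)).
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # F(x): two 3x3 convolutions, each followed by batch normalization.
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # Identity shortcut when shapes match; 1x1 projection otherwise.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        residual = self.relu(self.bn1(self.conv1(x)))  # first half of F(x)
        residual = self.bn2(self.conv2(residual))      # second half of F(x)
        return self.relu(residual + self.shortcut(x))  # F(x) + x, then ReLU

# Example usage: a 64 -> 128 block that halves spatial resolution,
# so the shortcut becomes a strided 1x1 projection.
block = ResidualBlock(64, 128, stride=2)
y = block(torch.randn(1, 64, 56, 56))  # y has shape (1, 128, 28, 28)

Keeping the shortcut as a pure identity whenever shapes allow is the cheaper and more common choice; the 1x1 projection is used only where the channel count or spatial resolution changes.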
First used in 2015 (He et al., ResNet); it gained widespread popularity immediately after ResNet won ILSVRC 2015 and became a standard architectural pattern from 2015–2016 onward.

