Neural networks with many layers that learn hierarchical representations from raw data.
A deep neural network (DNN) is a machine learning model composed of multiple successive layers of interconnected nodes, or neurons, that transform input data through a series of learned nonlinear operations. Unlike shallow networks with only one or two hidden layers, DNNs stack many such layers — sometimes dozens or hundreds — allowing each layer to build increasingly abstract representations of the input. The first layers might detect simple features like edges or phonemes, while deeper layers combine these into complex concepts like faces or words. This hierarchical feature learning is what gives DNNs their remarkable expressive power across diverse data types.
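To make the layered structure concrete, here is a minimal sketch of a small feed-forward DNN in PyTorch. The layer sizes (784 inputs, two hidden layers, 10 outputs) are illustrative assumptions, not details from the text; the point is how stacked linear layers interleaved with nonlinearities transform raw input into progressively more abstract representations.

```python
import torch
import torch.nn as nn

# Illustrative sketch: a small deep feed-forward network.
# Each Linear layer applies a learned affine transform; each ReLU adds
# the nonlinearity that lets successive layers build more abstract
# features of the input. All sizes here are assumptions for demonstration.
model = nn.Sequential(
    nn.Linear(784, 256),  # first hidden layer: raw input -> simple features
    nn.ReLU(),
    nn.Linear(256, 128),  # deeper layer: combines features into abstractions
    nn.ReLU(),
    nn.Linear(128, 10),   # output layer: scores for 10 classes
)

x = torch.randn(32, 784)  # placeholder batch of 32 flattened 28x28 inputs
logits = model(x)         # forward pass: output shape (32, 10)
print(logits.shape)
```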
Training a DNN involves feeding labeled examples forward through the network to produce predictions, computing a loss that measures prediction error, and then propagating gradients backward through all layers via backpropagation to update weights. This process requires careful choices of activation functions, optimizers, and regularization strategies. Key innovations that made deep training practical include ReLU activations (which mitigate vanishing gradients), dropout (which reduces overfitting), batch normalization (which stabilizes training dynamics), and the availability of GPU hardware capable of parallelizing the enormous number of floating-point operations involved.
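The sketch below illustrates one training step of this loop in PyTorch, combining the pieces named above: ReLU activations, dropout, batch normalization, a forward pass, a loss, backpropagation, and a weight update. The network shape, hyperparameters, and random placeholder data are assumptions chosen for demonstration, not a definitive recipe.

```python
import torch
import torch.nn as nn

# Illustrative network using the training aids discussed above.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),   # stabilizes training dynamics
    nn.ReLU(),             # mitigates vanishing gradients vs. saturating activations
    nn.Dropout(p=0.5),     # regularization: randomly zeroes activations
    nn.Linear(256, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed optimizer/lr
loss_fn = nn.CrossEntropyLoss()

# Placeholder labeled batch (random data, for illustration only).
inputs = torch.randn(32, 784)
labels = torch.randint(0, 10, (32,))

optimizer.zero_grad()                # clear gradients from the previous step
predictions = model(inputs)          # forward pass: produce predictions
loss = loss_fn(predictions, labels)  # loss measures prediction error
loss.backward()                      # backpropagation: gradients for all layers
optimizer.step()                     # update weights from the gradients
```

In practice this step runs repeatedly over mini-batches of the dataset, with the model switched between training and evaluation modes so that dropout and batch normalization behave correctly in each phase.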
DNNs became the dominant paradigm in machine learning after 2012, when a deep convolutional network (AlexNet) dramatically outperformed all competitors on the ImageNet visual recognition benchmark. Since then, deep architectures have achieved state-of-the-art results in speech recognition, natural language processing, protein structure prediction, game playing, and generative modeling. The success of DNNs is closely tied to the availability of large labeled datasets and scalable compute, both of which have grown substantially over the past decade.
Despite their power, DNNs present real challenges: they require significant data and compute, can be opaque in their decision-making, and are sensitive to adversarial inputs. Active research areas include interpretability, sample efficiency, robustness, and reducing the carbon footprint of training large models. Nevertheless, DNNs remain the foundational building block of virtually all modern AI systems, from voice assistants to medical imaging tools.