Envisioning is an emerging technology research institute and advisory.


AlexNet

Landmark deep convolutional network that ignited the modern deep learning revolution in 2012.

Year: 2012
Generality: 703

AlexNet is a deep convolutional neural network architecture that achieved breakthrough performance on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012, reducing the top-5 error rate to 15.3%, nearly 11 percentage points better than the runner-up's 26.2%. Developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto, the network demonstrated conclusively that deep learning could dramatically outperform hand-engineered feature extraction methods on large-scale visual recognition tasks, catalyzing what is widely regarded as the modern deep learning era.

The architecture consists of five convolutional layers followed by three fully connected layers, processing 224×224 RGB images into 1,000 class probability scores. Several design choices were novel and influential at the time: the use of Rectified Linear Unit (ReLU) activations instead of sigmoid or tanh functions accelerated training significantly; overlapping max-pooling reduced spatial dimensions while preserving salient features; and local response normalization provided a form of lateral inhibition inspired by neuroscience. Critically, the entire network was trained on two NVIDIA GTX 580 GPUs in parallel—an early demonstration that commodity graphics hardware could make large-scale deep learning tractable.
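Two of these ingredients, ReLU activation and overlapping max-pooling, can be illustrated with a minimal pure-Python sketch. This is a 1-D toy for intuition only, not the actual GPU implementation; the `window=3, stride=2` setting mirrors the overlapping pooling configuration described in the paper, where the pooling window is larger than the stride so adjacent regions overlap:

```python
def relu(x):
    # Rectified Linear Unit: max(0, x). Unlike sigmoid or tanh, it does
    # not saturate for positive inputs, which sped up AlexNet's training.
    return max(0.0, x)

def max_pool_1d(values, window=3, stride=2):
    # Overlapping max-pooling: because window > stride, consecutive
    # pooling regions share elements while still downsampling.
    out = []
    for start in range(0, len(values) - window + 1, stride):
        out.append(max(values[start:start + window]))
    return out

# Toy feature map -> ReLU -> overlapping max-pool
activations = [relu(x) for x in [-1.5, 0.5, 2.0, -0.3, 1.0, 3.0, -2.0]]
pooled = max_pool_1d(activations, window=3, stride=2)  # [2.0, 2.0, 3.0]
```

In the real network the pooling operates over 2-D feature maps, but the overlap principle is the same: salient peaks survive downsampling while spatial resolution shrinks.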

AlexNet also popularized dropout as a regularization technique, randomly deactivating neurons during training to prevent co-adaptation and reduce overfitting on the relatively small (by modern standards) 1.2-million-image dataset. Data augmentation through random cropping, flipping, and color jittering further improved generalization. Together, these techniques formed a practical recipe that subsequent architectures—VGGNet, GoogLeNet, ResNet—would refine and build upon.
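Dropout itself is simple to sketch. The version below is the now-common "inverted" form, which scales surviving activations during training so that inference needs no adjustment; the original AlexNet paper instead halved activations at test time. The function name and toy inputs are illustrative:

```python
import random

def dropout(activations, p=0.5, rng=None, training=True):
    # Inverted dropout: zero each unit with probability p during training
    # and scale survivors by 1/(1-p) so the expected activation is
    # unchanged; at inference the layer is the identity.
    if not training:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded for reproducibility
out = dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=rng)
```

Because each forward pass samples a different mask, no neuron can rely on the presence of any particular other neuron, which is the co-adaptation the technique is designed to break.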

The broader significance of AlexNet extends well beyond its benchmark results. It shifted the research community's attention toward end-to-end learned representations, spurred massive investment in GPU computing infrastructure, and established ImageNet competition performance as the de facto benchmark for computer vision progress for nearly a decade. The 2012 paper, "ImageNet Classification with Deep Convolutional Neural Networks," remains one of the most cited works in the history of machine learning.

Related

CNN (Convolutional Neural Network)

A deep learning architecture that learns spatial hierarchies of features from visual data.

Generality: 875

ResNet (Residual Network)

A CNN architecture using skip connections to enable training of very deep networks.

Generality: 795

VGG (Visual Geometry Group)

Oxford's 2014 deep CNN architecture using small filters that became a foundational vision backbone.

Generality: 550

FCN (Fully Convolutional Network)

A neural network architecture that produces pixel-wise predictions for image segmentation.

Generality: 694

DNN (Deep Neural Network)

Neural networks with many layers that learn hierarchical representations from raw data.

Generality: 871

Image Recognition

AI systems that identify and categorize objects, scenes, and content within images.

Generality: 871