Massively parallel processor that accelerates deep learning by handling thousands of simultaneous computations.
A Graphics Processing Unit (GPU) is a specialized processor originally designed to accelerate the rendering of images and video for display output. Unlike a CPU, which contains a small number of powerful cores optimized for sequential tasks, a GPU contains thousands of smaller, simpler cores optimized for throughput, executing many operations simultaneously. This massively parallel architecture makes GPUs exceptionally well-suited for the kinds of mathematical operations that dominate machine learning workloads — particularly matrix multiplications and convolutions — where the same computation must be applied to enormous arrays of numbers at once.
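To see why this workload parallelizes so well, consider a matrix multiplication: every output element is an independent dot product, so thousands of them can be computed at the same time. The sketch below uses NumPy on the CPU purely for illustration; GPU array libraries such as CuPy and PyTorch expose a nearly identical API that dispatches the same operation across GPU cores (the matrix sizes here are arbitrary).

```python
import numpy as np

# Illustrative matrix sizes; in deep learning these dimensions are often
# in the thousands, which is where GPU parallelism pays off.
rng = np.random.default_rng(0)
A = rng.standard_normal((512, 256))
B = rng.standard_normal((256, 128))

# One matrix multiply: 512 * 128 output elements, each an independent
# dot product that a GPU could assign to a separate thread.
C = A @ B

# Any single output element depends only on one row of A and one
# column of B, so no element's computation waits on another's.
i, j = 3, 7
assert np.allclose(C[i, j], np.dot(A[i, :], B[:, j]))
```

The independence of the output elements is the key property: a CPU computes them a few at a time, while a GPU computes thousands of them concurrently.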
In deep learning, training a neural network requires iterating over millions or billions of numerical operations across large datasets, often repeatedly for many epochs. GPUs dramatically accelerate this process by distributing these operations across their thousands of cores, reducing training times from weeks on a CPU to hours or even minutes. The same parallel advantage applies during inference, where GPUs can process large batches of inputs simultaneously, enabling real-time applications in computer vision, natural language processing, and generative AI.
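The batching advantage described above can be sketched in a few lines. A single weight matrix applied to a whole batch of inputs in one operation is mathematically identical to looping over the inputs one at a time, but the batched form exposes all the work at once for parallel execution. This is a minimal illustration using NumPy with hypothetical layer sizes; the same pattern holds for real GPU frameworks.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((784, 10))       # weights of one linear layer (illustrative sizes)
batch = rng.standard_normal((64, 784))   # a batch of 64 inputs

# Batched form: all 64 outputs produced by a single matrix multiply,
# the shape of work a GPU spreads across its cores.
out_batched = batch @ W

# Sequential form: same math, one input at a time, which serializes
# the work and leaves parallel hardware idle.
out_loop = np.stack([x @ W for x in batch])

assert np.allclose(out_batched, out_loop)
```

Both forms give identical results; the difference is purely in how much parallelism the hardware can exploit, which is why both training and inference pipelines are built around batched operations.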
The pivotal moment for GPU adoption in AI came with Nvidia's release of CUDA (Compute Unified Device Architecture) in 2007, which gave researchers a practical programming interface for running general-purpose computations on GPU hardware. The landmark 2012 ImageNet competition — where AlexNet, trained on GPUs, dramatically outperformed CPU-based competitors — demonstrated the transformative potential of GPU-accelerated deep learning and triggered an industry-wide shift. Since then, GPU clusters have become the standard infrastructure for AI research and production systems alike.
Today, GPUs remain central to the AI hardware ecosystem, with Nvidia's data center GPUs such as the A100 and H100 serving as the workhorses of large-scale model training. Competitors including AMD and Intel have developed their own GPU and accelerator offerings, while cloud providers offer GPU instances on demand. The insatiable compute demands of modern foundation models and generative AI systems have made GPU availability a strategic resource, shaping research timelines, business models, and even geopolitical policy around semiconductor supply chains.