
Batch

A fixed subset of training data processed together in one forward-backward pass.

Year: 1986 · Generality: 796

In machine learning, a batch is a fixed-size subset of the training dataset that is processed together during a single iteration of model training. Rather than updating model parameters after every individual example (stochastic gradient descent) or after the entire dataset (full-batch gradient descent), mini-batch training strikes a practical balance: the model performs a forward pass on all samples in the batch, computes a combined loss, and then runs backpropagation to update weights once per batch. This approach has become the dominant training paradigm for neural networks.
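As a sketch of the loop described above, the following PyTorch snippet trains a toy linear model with mini-batches. The model, data, and hyperparameters are placeholder assumptions for illustration, not part of the entry itself.

```python
import torch
import torch.nn as nn

# Placeholder setup: a tiny regression model and random data stand in
# for a real architecture and dataset.
model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.randn(640, 10)  # 640 samples, 10 features each
y = torch.randn(640, 1)
batch_size = 64

for start in range(0, len(X), batch_size):
    xb = X[start:start + batch_size]  # one fixed-size batch of inputs
    yb = y[start:start + batch_size]
    pred = model(xb)                  # forward pass over the whole batch
    loss = loss_fn(pred, yb)          # one combined loss for the batch
    optimizer.zero_grad()
    loss.backward()                   # backpropagation once per batch
    optimizer.step()                  # a single weight update per batch
```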

The mechanics of batching are tightly coupled to how modern hardware operates. GPUs and TPUs are designed to execute large matrix multiplications in parallel, and batching naturally expresses training as a sequence of matrix operations across multiple samples simultaneously. A batch of inputs becomes a matrix where each row is one sample, and operations like linear transformations and activation functions apply across the entire matrix at once. This parallelism means that processing 64 samples in a single batch is far faster in wall-clock time than processing 64 samples sequentially, even though the total computation is similar.
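A minimal illustration of that matrix view, assuming PyTorch and arbitrary dimensions: the 64 samples pass through a linear layer as a single matrix operation rather than 64 separate ones.

```python
import torch

layer = torch.nn.Linear(in_features=10, out_features=5)

batch = torch.randn(64, 10)  # 64 samples stacked as rows of one matrix
out = layer(batch)           # one matmul plus bias covers all 64 samples
print(out.shape)             # torch.Size([64, 5])

# The sequential alternative: 64 separate vector transforms.
rows = [layer(x) for x in batch]  # same math, far worse hardware utilization
```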

Batch size is a critical hyperparameter with meaningful effects on both training dynamics and final model quality. Smaller batches introduce more noise into gradient estimates, which can act as a regularizer and help models escape sharp local minima, but they also make optimization less stable and each epoch slower in wall-clock time. Larger batches produce smoother, more accurate gradient estimates and complete each epoch faster, but often generalize worse and require careful learning rate scaling to compensate. Research has shown that very large batches can lead models to converge to sharp minima that perform poorly on held-out data, a phenomenon sometimes called the "generalization gap."
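One common heuristic for that compensation is the linear scaling rule: grow the learning rate in proportion to the batch size. The sketch below uses illustrative numbers; it is a rule of thumb rather than a guarantee, and large-batch recipes typically need further adjustments such as warmup.

```python
def scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear scaling heuristic: increase the learning rate in
    proportion to batch size. A rule of thumb, not a guarantee."""
    return base_lr * (new_batch / base_batch)

# A recipe tuned at batch size 256 with lr 0.1, moved to batch size 1024:
print(scaled_lr(0.1, 256, 1024))  # 0.4
```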

Batching matters beyond raw efficiency: it shapes the entire training workflow, from memory constraints that determine the maximum feasible batch size on a given GPU, to learning rate schedules that must account for how frequently weights are updated. Frameworks like PyTorch and TensorFlow have built batch processing into their core data-loading abstractions, making it a foundational concept that every practitioner encounters immediately when training any neural network.
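In PyTorch, for instance, the batch size is a single argument to the DataLoader abstraction. The in-memory tensors below are placeholders for a real dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder in-memory dataset; real projects typically subclass Dataset.
dataset = TensorDataset(torch.randn(1000, 10), torch.randn(1000, 1))

# The loader yields one batch per iteration, reshuffled each epoch.
loader = DataLoader(dataset, batch_size=64, shuffle=True)

xb, yb = next(iter(loader))
print(xb.shape, yb.shape)  # torch.Size([64, 10]) torch.Size([64, 1])
```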

Related

Batch Size

The number of training examples processed together before updating model parameters.

Generality: 796
Continuous Batching

A technique that dynamically groups incoming requests into batches for efficient ML inference.

Generality: 339
Batch Inference

Running a trained model on many inputs simultaneously to generate predictions efficiently.

Generality: 694
Batch Normalization

A technique that normalizes layer inputs to accelerate and stabilize neural network training.

Generality: 794
Training

The iterative process of optimizing a model's parameters using data.

Generality: 950
Step

A single parameter update iteration within a model training optimization algorithm.

Generality: 720