Skip to main content

Envisioning is an emerging technology research institute and advisory.

LinkedInInstagramGitHub

2011 — 2026

research
  • Reports
  • Newsletter
  • Methodology
  • Origins
  • Vocab
services
  • Research Sessions
  • Signals Workspace
  • Bespoke Projects
  • Use Cases
  • Signal Scanfree
  • Readinessfree
impact
  • ANBIMAFuture of Brazilian Capital Markets
  • IEEECharting the Energy Transition
  • Horizon 2045Future of Human and Planetary Security
  • WKOTechnology Scanning for Austria
audiences
  • Innovation
  • Strategy
  • Consultants
  • Foresight
  • Associations
  • Governments
resources
  • Pricing
  • Partners
  • How We Work
  • Data Visualization
  • Multi-Model Method
  • FAQ
  • Security & Privacy
about
  • Manifesto
  • Community
  • Events
  • Support
  • Contact
  • Login
ResearchServicesPricingPartnersAbout
ResearchServicesPricingPartnersAbout
  1. Home
  2. Vocab
  3. Stride Length

Stride Length

The step size by which a convolutional filter moves across an input during convolution.

Year: 2012Generality: 550
Back to Vocab

In convolutional neural networks (CNNs), stride length is a hyperparameter that controls how many positions a convolutional filter shifts at each step as it slides across an input image or feature map. A stride of 1 means the filter moves one pixel at a time, producing maximum overlap between adjacent receptive fields. A stride of 2 means the filter jumps two pixels at a time, skipping intermediate positions. This seemingly simple parameter has significant downstream effects on the shape and content of the resulting feature maps.

The relationship between stride and output dimensions is straightforward: larger strides produce smaller output feature maps. For a given input of size W, a filter of size F, padding P, and stride S, the output dimension is ⌊(W − F + 2P) / S⌋ + 1. This means stride acts as a form of spatial downsampling, reducing the resolution of feature maps as information flows deeper into the network. In this sense, strided convolutions can serve a similar function to pooling layers, and many modern architectures use strided convolutions in place of explicit pooling to reduce spatial dimensions while learning the downsampling operation rather than applying a fixed one.

Stride length directly governs the trade-off between spatial resolution and computational efficiency. A stride of 1 preserves fine-grained spatial detail but requires more computation and produces larger intermediate representations. Larger strides reduce memory and compute requirements but may discard spatially precise information. In practice, strides of 1 and 2 are most common — stride 1 for layers where preserving resolution matters, and stride 2 for intentional downsampling. The choice of stride interacts closely with filter size, padding strategy, and network depth to determine the overall receptive field and representational capacity of the model.

Stride became a central design consideration with the rise of deep CNN architectures in the early 2010s. AlexNet (2012) notably used a stride of 4 in its first convolutional layer to aggressively reduce spatial dimensions from large input images, a design choice that influenced subsequent architectures. Later work, including the VGG and ResNet families, refined stride usage as a principled tool for controlling information flow through deep networks.

Related

Related

Step Size
Step Size

A hyperparameter controlling how large each parameter update is during optimization.

Generality: 720
Convolution
Convolution

A sliding filter operation that extracts spatial patterns from input data.

Generality: 871
Step
Step

A single parameter update iteration within a model training optimization algorithm.

Generality: 720
Local Weight Sharing
Local Weight Sharing

Reusing the same weights across spatial positions to detect patterns regardless of location.

Generality: 694
Max Pooling
Max Pooling

A downsampling operation that retains the maximum value within each local region.

Generality: 694
CNN (Convolutional Neural Network)
CNN (Convolutional Neural Network)

A deep learning architecture that learns spatial hierarchies of features from visual data.

Generality: 875