Envisioning is an emerging technology research institute and advisory.

2011 — 2026


Attention Network

A neural network that dynamically weights input elements to capture relevant context.

Year: 2015 · Generality: 796

An attention network is a type of neural network architecture that learns to selectively focus on different parts of its input when producing each element of its output. Rather than compressing all input information into a single fixed-length representation, attention mechanisms compute a weighted distribution over input positions, allowing the model to dynamically retrieve the most relevant information at each step. These weights, called attention scores, are typically derived by measuring compatibility between a query vector and a set of key vectors, then using the resulting distribution to form a weighted sum of value vectors — a formulation that generalizes across text, images, audio, and multimodal data.
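The query–key–value formulation above can be sketched directly. The following is a minimal illustration of scaled dot-product attention (the variant used in Transformers) in NumPy; the array shapes and the scaling by the square root of the key dimension follow the standard formulation, while the specific function and variable names are our own:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k) queries, K: (n_kv, d_k) keys, V: (n_kv, d_v) values."""
    d_k = Q.shape[-1]
    # Compatibility between each query and each key, scaled to keep
    # gradients stable as d_k grows.
    scores = Q @ K.T / np.sqrt(d_k)
    # Attention scores: a weighted distribution over input positions.
    weights = softmax(scores, axis=-1)
    # Output: a weighted sum of value vectors for each query.
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries of dimension 4
K = rng.normal(size=(5, 4))   # 5 input positions
V = rng.normal(size=(5, 3))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each row of `w` sums to 1, so every output vector is a convex combination of the value vectors, with the mix chosen dynamically per query.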

The most influential variant is self-attention, in which a sequence attends to itself to capture internal dependencies regardless of positional distance. This is the core operation in the Transformer architecture, where multiple attention heads run in parallel, each learning to track different types of relationships within the data. Unlike recurrent networks, which process sequences step-by-step and struggle with long-range dependencies, self-attention operates over the entire sequence simultaneously, making it both more expressive and more parallelizable on modern hardware.
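A rough sketch of multi-head self-attention, assuming learned projection matrices (here random placeholders) and omitting masking, biases, and layer normalization for brevity: each head runs the same attention operation over the full sequence in parallel on its own slice of the channels.

```python
import numpy as np

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """X: (n, d_model). Projections Wq/Wk/Wv/Wo: (d_model, d_model)."""
    n, d_model = X.shape
    d_head = d_model // n_heads
    # The sequence attends to itself: queries, keys, and values all come from X.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Split channels into heads: (n_heads, n, d_head).
    split = lambda M: M.reshape(n, n_heads, d_head).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    # All pairwise scores at once, per head: (n_heads, n, n).
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)
    scores -= scores.max(axis=-1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    heads = w @ Vh                                          # (n_heads, n, d_head)
    # Concatenate heads back to (n, d_model) and project.
    concat = heads.transpose(1, 0, 2).reshape(n, d_model)
    return concat @ Wo

rng = np.random.default_rng(1)
n, d_model, n_heads = 5, 8, 2
X = rng.normal(size=(n, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
Y = multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads)
```

Note that the whole (n, n) score matrix is computed in one batched operation rather than step-by-step, which is what makes the computation parallelizable on modern hardware.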

Attention networks became central to machine learning following Bahdanau et al.'s 2014 work on neural machine translation, which showed that allowing a decoder to attend over encoder states dramatically improved translation quality on long sentences. The 2017 Transformer paper by Vaswani et al. then demonstrated that attention alone — without any recurrence or convolution — could achieve state-of-the-art results, triggering a paradigm shift across NLP and beyond. Models like BERT, GPT, and Vision Transformers (ViT) are all built on this foundation.

The practical impact of attention networks is difficult to overstate. They have enabled breakthroughs in machine translation, text generation, question answering, protein structure prediction, and image recognition. Their ability to model arbitrary pairwise relationships within and across sequences makes them a highly general tool, and ongoing research continues to address their main limitations — particularly the quadratic computational cost of full attention over long sequences — through sparse, linear, and hierarchical attention variants.
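The quadratic cost mentioned above arises because full attention scores every position against every other. One of the simplest sparse alternatives restricts each position to a local window; this sketch (using the input itself as queries, keys, and values, with an illustrative window size) shows how that reduces the work to O(n · window):

```python
import numpy as np

def local_self_attention(X, window=2):
    """Each position attends only to neighbors within `window` steps,
    so cost is O(n * window) instead of O(n^2)."""
    n, d = X.shape
    out = np.zeros_like(X)
    for i in range(n):
        # Restrict attention to a local neighborhood of position i.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = X[lo:hi] @ X[i] / np.sqrt(d)
        scores -= scores.max()
        w = np.exp(scores)
        w /= w.sum()
        out[i] = w @ X[lo:hi]
    return out

X = np.random.default_rng(2).normal(size=(6, 4))
Y = local_self_attention(X)
```

Sparse, linear, and hierarchical variants in the literature differ in which positions they keep or how they approximate the full score matrix, but they share this goal of avoiding the dense n-by-n computation.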

Related

Attention
A mechanism enabling neural networks to dynamically focus on relevant parts of input.
Generality: 875

Attention Mechanisms
Neural network components that dynamically weight input elements by their contextual relevance.
Generality: 865

Attention Mechanism
A neural network technique that dynamically weights input elements by their relevance to the task.
Generality: 875

Self-Attention
A mechanism that lets neural networks weigh relationships between all parts of an input simultaneously.
Generality: 794

Attention Block
A neural network module that selectively weighs input elements by their contextual relevance.
Generality: 752

Attention Pattern
A mechanism that lets neural networks selectively focus on relevant parts of input.
Generality: 752