
Envisioning is an emerging technology research institute and advisory.


2011 — 2026


Encoder-Decoder Models

Deep learning architectures that compress input into a representation and generate output.

Year: 2014 · Generality: 792

Encoder-decoder models are a class of neural network architectures built around two cooperating components: an encoder that transforms raw input into a compact internal representation, and a decoder that uses that representation to produce a desired output. The encoder progressively distills the input—whether text, images, or audio—into a latent vector or sequence of vectors that captures its essential structure. The decoder then conditions on this representation to generate output step by step, making the architecture naturally suited to tasks where input and output differ in form, length, or modality.
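The split described above can be made concrete with a minimal sketch. This is an illustrative toy, not a production model: the dimensions, pooling encoder, and simple feedback decoder are all assumptions chosen to show the shape of the idea, namely that a variable-length input is compressed into one latent vector, which then conditions step-by-step generation of an output whose length need not match the input's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
D_IN, D_LATENT, D_OUT = 8, 4, 6

# Encoder: distill a variable-length input sequence into one latent vector.
W_enc = rng.standard_normal((D_IN, D_LATENT)) * 0.1

def encode(x_seq):
    """Project each input vector, then mean-pool into a single latent."""
    h = np.tanh(x_seq @ W_enc)          # (T, D_LATENT)
    return h.mean(axis=0)               # (D_LATENT,) compact representation

# Decoder: condition on the latent to emit outputs one step at a time.
W_dec = rng.standard_normal((D_LATENT, D_OUT)) * 0.1
W_rec = rng.standard_normal((D_OUT, D_LATENT)) * 0.1

def decode(z, steps):
    outputs, state = [], z
    for _ in range(steps):
        y = np.tanh(state @ W_dec)      # (D_OUT,) one output step
        outputs.append(y)
        state = z + np.tanh(y @ W_rec)  # feed output back, stay conditioned on z
    return np.stack(outputs)            # (steps, D_OUT)

x = rng.standard_normal((5, D_IN))      # input sequence of length 5
y = decode(encode(x), steps=3)          # output sequence of length 3
```

Note that the input has five steps and the output three: because the decoder only sees the latent, input and output are free to differ in form and length.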

The architecture gained widespread attention in 2014 with the introduction of sequence-to-sequence learning, which applied recurrent neural networks to tasks like machine translation. In that setting, an RNN encoder reads a source-language sentence token by token, producing a context vector, while an RNN decoder generates the target-language translation autoregressively. A critical refinement came with the addition of attention mechanisms, which allowed the decoder to selectively focus on different parts of the encoder's output at each generation step rather than relying on a single fixed-length vector—dramatically improving performance on long sequences.
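The attention refinement can be sketched in a few lines. Assuming simple dot-product scoring (one of several scoring functions used in practice), each decoder step scores every encoder state against the current decoder state, normalizes the scores with a softmax, and takes the weighted sum as a per-step context vector instead of a single fixed-length summary.

```python
import numpy as np

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the current
    decoder state, softmax the scores, and return the weighted sum."""
    scores = encoder_states @ decoder_state           # (T,) one score per position
    scores -= scores.max()                            # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over T positions
    context = weights @ encoder_states                # (D,) weighted sum of states
    return context, weights

rng = np.random.default_rng(1)
enc = rng.standard_normal((7, 16))   # 7 encoder positions, hypothetical dim 16
state = rng.standard_normal(16)      # current decoder hidden state
context, weights = attention_context(state, enc)
```

The weights form a distribution over input positions, so the decoder can "look back" at different parts of the source at every generation step, which is what relieves the fixed-length-vector bottleneck on long sequences.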

The encoder-decoder paradigm extends well beyond text. In image segmentation, convolutional encoders downsample spatial features while decoders upsample them back to pixel-level predictions. In speech synthesis and recognition, the architecture maps between acoustic signals and linguistic representations. The 2017 Transformer model, built entirely on attention, recast the encoder-decoder framework in a highly parallelizable form and became the backbone of modern large language models, translation systems, and multimodal AI.
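The segmentation-style variant can be illustrated with the spatial round trip alone. This sketch stands in for learned convolutions with fixed average pooling and nearest-neighbour upsampling (both assumptions for brevity): the encoder halves resolution twice down to a small bottleneck, and the decoder expands back to the original pixel grid.

```python
import numpy as np

def downsample(x):
    """Encoder step: 2x2 average pooling halves each spatial dimension."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Decoder step: nearest-neighbour upsampling doubles each dimension."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

img = np.arange(64, dtype=float).reshape(8, 8)   # toy 8x8 "image"
latent = downsample(downsample(img))             # 2x2 spatial bottleneck
recon = upsample(upsample(latent))               # back to 8x8 pixel-level output
```

Real segmentation networks learn these operators (and often add skip connections so the decoder can recover detail lost at the bottleneck), but the shape discipline is the same: compress, then expand back to a prediction per pixel.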

Encoder-decoder models matter because they provide a principled way to handle the fundamental challenge of cross-domain mapping: taking structured information in one space and producing coherent, contextually appropriate output in another. Their modularity also enables transfer learning—pretrained encoders can be paired with task-specific decoders—making them central to contemporary approaches in NLP, computer vision, and generative AI.

Related

Encoder-Decoder Transformer
A transformer architecture that encodes input sequences and decodes them into outputs.
Generality: 722

Autoencoder
A neural network that compresses data into a compact representation, then reconstructs it.
Generality: 795

Seq2Seq (Sequence-to-Sequence)
A neural architecture that maps variable-length input sequences to variable-length output sequences.
Generality: 794

Sequential Models
AI models that process ordered data by capturing dependencies across time or position.
Generality: 795

Transformer
A neural network architecture using self-attention to process sequential data in parallel.
Generality: 900

Sequence Model
A model that learns patterns and dependencies within ordered data sequences.
Generality: 840