
Envisioning is an emerging technology research institute and advisory.


Adapter Layer

Small trainable modules inserted into pre-trained models to enable efficient task adaptation.

Year: 2019
Generality: 384

Adapter layers are lightweight, trainable components inserted between the existing layers of a pre-trained neural network, enabling the model to be fine-tuned for new tasks without modifying its original weights. Rather than retraining an entire model — which can involve billions of parameters and enormous computational cost — adapter layers introduce a small bottleneck structure that learns task-specific transformations. The pre-trained weights remain frozen, while only the adapter parameters are updated during training. This design preserves the rich representations learned during pre-training while allowing meaningful specialization to downstream tasks.

The typical adapter architecture consists of a down-projection matrix that compresses the hidden representation into a lower-dimensional bottleneck, a nonlinear activation function, and an up-projection matrix that restores the original dimensionality. A residual connection around this bottleneck lets the adapter be initialized as a near-identity function, which keeps early training stable. Because adapters add only a small fraction of the original model's parameters (often less than 1%), multiple task-specific adapters can be trained and swapped in and out of a single shared backbone, making deployment across many tasks highly practical.
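
The bottleneck structure described above can be sketched in a few lines. The hidden size (768), bottleneck size (64), and zero initialization of the up-projection are illustrative assumptions, not taken from any specific implementation; zero-initializing the up-projection is one common way to make the block an exact identity map at the start of training.

```python
import numpy as np

# Illustrative sizes: hidden dimension d and bottleneck dimension r.
d, r = 768, 64
rng = np.random.default_rng(0)

# Down-projection, nonlinearity, up-projection.
W_down = rng.normal(0, 0.02, size=(d, r))
b_down = np.zeros(r)
W_up = np.zeros((r, d))   # zero-init: the adapter starts as an identity map
b_up = np.zeros(d)

def adapter(h):
    """Bottleneck transform with a residual connection around it."""
    z = np.maximum(h @ W_down + b_down, 0.0)  # ReLU activation
    return h + z @ W_up + b_up                # residual: output = input + update

h = rng.normal(size=(4, d))          # a batch of hidden states
out = adapter(h)
print(np.allclose(out, h))           # True: identity at initialization

# Parameter cost of one adapter: two projections plus biases.
n_adapter = W_down.size + b_down.size + W_up.size + b_up.size
print(n_adapter)                     # 99136
```

Against a backbone on the order of 100M parameters, one such adapter adds roughly 0.1% per insertion point, which is where the "often less than 1%" figure comes from.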

Adapter layers became prominent in NLP following the 2019 paper by Houlsby et al., which demonstrated that adapters could match full fine-tuning performance on the GLUE benchmark while updating far fewer parameters. This was particularly impactful as large language models like BERT and GPT were becoming standard foundations for applied NLP, creating strong demand for parameter-efficient adaptation strategies. The approach has since expanded beyond NLP into computer vision and multimodal models, where similar efficiency pressures apply.

The broader significance of adapter layers lies in democratizing access to large pre-trained models. Organizations without the resources to fine-tune massive models end-to-end can instead train compact adapters on modest hardware. Adapters also reduce catastrophic forgetting, since the backbone remains unchanged, and they enable modular, composable model behavior. They are a foundational technique within the growing field of parameter-efficient fine-tuning (PEFT), which includes related methods such as LoRA, prefix tuning, and prompt tuning.
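
The modular, swappable behavior described above can be sketched as follows: one frozen backbone layer is shared across tasks, and selecting a task simply selects which compact adapter runs on top of it. The task names, shapes, and random weights here are invented for illustration.

```python
import numpy as np

# Illustrative sizes: hidden dimension d and bottleneck dimension r.
d, r = 16, 4
rng = np.random.default_rng(1)

W_frozen = rng.normal(0, 0.1, size=(d, d))   # pretrained weight, never updated

def make_adapter(seed):
    """A small bottleneck adapter; only these weights would be trained."""
    g = np.random.default_rng(seed)
    return {"down": g.normal(0, 0.1, (d, r)), "up": g.normal(0, 0.1, (r, d))}

# One compact adapter per downstream task, all sharing the same backbone.
adapters = {"sentiment": make_adapter(2), "topic": make_adapter(3)}

def forward(x, task):
    h = np.tanh(x @ W_frozen)                  # frozen backbone computation
    a = adapters[task]
    z = np.maximum(h @ a["down"], 0.0)
    return h + z @ a["up"]                     # task-specific residual update

x = rng.normal(size=(2, d))
out_sent = forward(x, "sentiment")
out_topic = forward(x, "topic")
print(np.allclose(out_sent, out_topic))        # False: behavior depends on adapter
```

Because switching tasks only swaps a small weight dictionary, serving many tasks requires storing one backbone plus a set of adapters, rather than a full model copy per task.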

Related

Adapter

Small trainable modules added to frozen pre-trained models for efficient task-specific fine-tuning.

Generality: 520
Self-Adaptive LLMs (Large Language Models)

LLMs that autonomously adjust their behavior at runtime without full retraining.

Generality: 511
LoRA (Low-Rank Adaptation)

A parameter-efficient method for fine-tuning large pre-trained models using low-rank matrices.

Generality: 398
Fine-Tuning

Adapting a pre-trained model to a specific task by continuing training on new data.

Generality: 796
Pretrained Model

A model trained on large data, reused or fine-tuned for new tasks.

Generality: 838
Base Model

A pre-trained model used as a starting point for task-specific adaptation.

Generality: 794