Envisioning is an emerging technology research institute and advisory.

2011 — 2026

Adapter

Small trainable modules added to frozen pre-trained models for efficient task-specific fine-tuning.

Year: 2019
Generality: 520

Adapters are lightweight, modular components inserted into a pre-trained neural network to enable task-specific fine-tuning without modifying the original model's parameters. Rather than updating all weights during fine-tuning — a process that is computationally expensive and requires storing a separate full model copy per task — adapters introduce small bottleneck layers or modules at strategic points within the network architecture. Only these adapter parameters are trained on the downstream task, while the backbone model remains frozen. This design dramatically reduces the number of trainable parameters, often to less than 1% of the original model's total weights.
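The "less than 1%" figure is easy to verify with back-of-envelope arithmetic. The sketch below uses hypothetical but typical numbers (a BERT-base-like backbone of roughly 110M parameters, hidden size 768, 12 layers, two adapters per layer, bottleneck dimension 8); a bottleneck adapter contributes two projection matrices plus biases.

```python
# Back-of-envelope parameter count for bottleneck adapters.
# All dimensions here are illustrative, assuming a BERT-base-like backbone.
d_model = 768                  # hidden size
bottleneck = 8                 # adapter bottleneck dimension
layers = 12                    # transformer layers
adapters_per_layer = 2         # e.g., one after attention, one after the feed-forward block
backbone_params = 110_000_000  # rough BERT-base parameter count

# Down-projection + up-projection matrices, plus their bias vectors.
per_adapter = 2 * d_model * bottleneck + d_model + bottleneck
total_adapter = per_adapter * adapters_per_layer * layers

print(total_adapter)                               # 313536
print(f"{total_adapter / backbone_params:.2%}")    # 0.29%
```

With a bottleneck of 8, the trainable adapter weights come to roughly 0.3% of the backbone, comfortably under the 1% figure; larger bottlenecks trade parameter count for capacity.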

In practice, adapters are typically inserted between the layers of a transformer-based model. A common design uses a down-projection matrix to compress the hidden representation into a lower-dimensional space, applies a nonlinearity, then uses an up-projection to restore the original dimensionality — with a residual connection bypassing the module. This bottleneck structure keeps the adapter small while still giving the model enough expressive capacity to adapt to new tasks. The original model weights are shared across all tasks, and only the task-specific adapter weights need to be swapped out at inference time.
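The bottleneck design described above can be sketched in a few lines of NumPy. The dimensions and ReLU nonlinearity below are illustrative choices, not a prescribed configuration; initializing the up-projection near zero is a common trick that makes the adapter start out as (approximately) the identity function, so training begins from the unmodified backbone.

```python
import numpy as np

def adapter(h, W_down, b_down, W_up, b_up):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    z = np.maximum(0.0, h @ W_down + b_down)  # ReLU is one common choice of nonlinearity
    return h + (z @ W_up + b_up)              # residual connection bypassing the module

rng = np.random.default_rng(0)
d_model, d_bottleneck = 768, 64               # hypothetical dimensions

W_down = rng.normal(0.0, 0.02, (d_model, d_bottleneck))
b_down = np.zeros(d_bottleneck)
W_up = np.zeros((d_bottleneck, d_model))      # zero init: adapter starts as the identity
b_up = np.zeros(d_model)

h = rng.normal(size=(4, d_model))             # a batch of hidden states
out = adapter(h, W_down, b_down, W_up, b_up)

print(out.shape)              # (4, 768) -- dimensionality is preserved
print(np.allclose(out, h))    # True -- zero up-projection leaves the input unchanged
```

Only `W_down`, `b_down`, `W_up`, and `b_up` would be updated during fine-tuning; the backbone layers around the adapter stay frozen.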

Adapters matter because they make large pre-trained models practical to deploy across many tasks simultaneously. In a traditional fine-tuning paradigm, serving ten tasks would require ten full copies of a large model. With adapters, a single shared backbone can serve all tasks by loading only the relevant adapter weights — a significant reduction in storage and memory overhead. This modularity also enables continual learning scenarios where new tasks can be added without risking catastrophic forgetting of previously learned ones.
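The multi-task serving pattern can be sketched as a single frozen backbone plus a dictionary of per-task adapter weights. The task names, dimensions, and random weights below are placeholders for illustration; the point is that selecting a task swaps only the small adapter matrices, never the shared backbone.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 16, 4  # toy hidden size and bottleneck dimension

# Shared backbone weight, frozen across all tasks (a single matrix stands in
# for the full pre-trained network).
backbone_weight = rng.normal(size=(d, d))

# Only these small per-task matrices differ between tasks.
adapters = {
    task: {"down": rng.normal(0.0, 0.02, (d, r)),
           "up": rng.normal(0.0, 0.02, (r, d))}
    for task in ("sentiment", "ner", "qa")  # hypothetical task names
}

def forward(x, task):
    h = x @ backbone_weight                 # frozen backbone computation
    a = adapters[task]                      # swap in the task-specific adapter
    return h + np.maximum(0.0, h @ a["down"]) @ a["up"]

x = rng.normal(size=(1, d))
outs = {task: forward(x, task) for task in adapters}
# Same input and same backbone, but each task's tiny adapter steers the output.
```

Storage scales accordingly: one backbone plus N small adapter sets, rather than N full model copies.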

The approach gained traction in NLP following the 2019 paper "Parameter-Efficient Transfer Learning for NLP" by Houlsby et al., which demonstrated competitive performance on GLUE benchmarks using adapters with BERT while training only a small fraction of total parameters. Since then, adapter-based methods have expanded into computer vision, multimodal models, and speech, and have inspired a broader family of parameter-efficient fine-tuning (PEFT) techniques including LoRA, prefix tuning, and prompt tuning.

Related

Adapter Layer

Small trainable modules inserted into pre-trained models to enable efficient task adaptation.

Generality: 384
Fine-Tuning

Adapting a pre-trained model to a specific task by continuing training on new data.

Generality: 796
LoRA (Low-Rank Adaptation)

A parameter-efficient method for fine-tuning large pre-trained models using low-rank matrices.

Generality: 398
Self-Adaptive LLMs (Large Language Models)

LLMs that autonomously adjust their behavior at runtime without full retraining.

Generality: 511
Pretrained Model

A model trained on large data, reused or fine-tuned for new tasks.

Generality: 838
NeuMeta (Neural Metamorphosis)

A framework enabling neural networks to structurally and functionally transform across tasks without retraining.

Generality: 102