Envisioning is an emerging technology research institute and advisory.

Iterated Amplification

A recursive AI training technique combining task decomposition and human oversight to safely scale capability.

Year: 2018
Generality: 339

Iterated amplification is an AI alignment technique designed to train increasingly capable systems while keeping their behavior aligned with human values. The core challenge it addresses is that as AI systems become more powerful, it becomes harder for humans to directly supervise and evaluate their outputs. Iterated amplification aims to get around this problem by breaking complex tasks into simpler sub-tasks that humans can reliably assess, then using those assessments to train a stronger model, which in turn becomes the baseline for the next round of amplification.

The process works iteratively: a human operator, assisted by a current version of the AI, decomposes a difficult problem into manageable pieces. Each piece is evaluated or solved using the existing model, and the combined result is used as a training signal for an improved version. Over successive rounds, the model's effective capability grows, but each individual training step remains grounded in human-verifiable judgments. This recursive bootstrapping allows the system to eventually handle tasks far beyond what a human could evaluate directly, while theoretically preserving alignment throughout.
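The amplification step described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a published implementation: the task (summing a tuple of integers), the base model `weak_model`, and the helpers `decompose`, `combine`, and `amplify` are all hypothetical names chosen for the example. The "human" only ever splits a task in half and adds two numbers, which stands in for a judgment that is easy to verify.

```python
from typing import Callable

Task = tuple[int, ...]
Model = Callable[[Task], int]


def weak_model(task: Task) -> int:
    """Hypothetical base model: only reliable on tasks with at most one element."""
    if len(task) <= 1:
        return sum(task)
    raise ValueError("task too hard for the base model")


def decompose(task: Task) -> list[Task]:
    """Human-style decomposition: split the task into two halves."""
    mid = len(task) // 2
    return [task[:mid], task[mid:]]


def combine(left_answer: int, right_answer: int) -> int:
    """Human-verifiable step: add two sub-answers."""
    return left_answer + right_answer


def amplify(model: Model) -> Model:
    """Return a stronger system: a human who decomposes and consults `model`."""
    def amplified(task: Task) -> int:
        if len(task) <= 1:
            return model(task)
        left, right = decompose(task)
        return combine(model(left), model(right))
    return amplified


once = amplify(weak_model)   # handles tasks of length up to 2
twice = amplify(once)        # handles tasks of length up to 4
print(once((3, 4)))          # 7
print(twice((1, 2, 3, 4)))   # 10
```

Each call to `amplify` roughly doubles the task size the system can handle, yet every individual step is still a split or a two-number addition. In this sketch nothing is trained; the stronger system is just the human plus the existing model.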

Iterated amplification is closely related to debate and other scalable oversight proposals, all of which grapple with the same fundamental question: how do you supervise an AI that is smarter than you? The technique is usually paired with distillation, in which the behavior of the amplified system (a human working with copies of the current model) is compressed back into a single, faster model. The result is a training loop that alternates between expanding capability and consolidating it. This pairing, often called iterated distillation and amplification (IDA), is central to how the approach is meant to scale in practice.

The significance of iterated amplification lies in its attempt to provide a principled, constructive path toward superintelligent AI that remains under meaningful human control. Rather than relying on post-hoc interpretability or hard-coded constraints, it embeds human judgment directly into the training process at every stage. While empirical validation at scale remains an open research challenge, iterated amplification has become a foundational concept in the AI safety literature and continues to influence how researchers think about aligning powerful future systems.

Related

Recursive Self-Improvement

An AI system that autonomously and iteratively enhances its own intelligence and capabilities.

Generality: 703
Super Alignment

Ensuring superintelligent AI systems reliably align with human values at scale.

Generality: 550
Abliteration

Removes alignment restrictions from language models by targeting refusal directions in activations.

Generality: 79
Debate

An AI alignment technique where competing agents argue opposing positions to surface truth.

Generality: 293
RLAIF (Reinforcement Learning with AI Feedback)

Training AI agents using feedback generated by other AI models instead of humans.

Generality: 487
Scaffolding

A training strategy that incrementally increases task complexity to build AI capability.

Generality: 485