Envisioning is an emerging technology research institute and advisory.

Amortized Variational Inference

A training technique that uses an inference network to approximate the posterior over latent reasoning trajectories.

Year: 2024
Generality: 750

Amortized variational inference is a training technique that uses a learned inference network to approximate the posterior distribution over latent variables, avoiding the need for expensive per-example optimization. In the context of recursive reasoning models like GRAM, it enables training by providing a tractable approximation to the true posterior over reasoning trajectories. The inference network takes observations and produces parameters of an approximate posterior distribution, which is then used to compute evidence lower bound (ELBO) gradients for training the generative model.
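The pipeline described above (observation in, posterior parameters out, ELBO computed from samples) can be sketched on a toy model. This is an illustration under stated assumptions, not GRAM's implementation: the model z ~ N(0, 1), x | z ~ N(z, SIGMA2), the one-weight `inference_network`, and every name and constant below are hypothetical choices made so that the exact log evidence is available for comparison.

```python
import numpy as np

SIGMA2 = 0.5  # assumed likelihood variance of the toy model


def inference_network(x, w=0.4, log_var=0.0):
    """Hypothetical 'network': maps x to the mean and variance of q(z | x)."""
    return w * x, np.exp(log_var) * np.ones_like(x)


def log_normal(value, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (value - mean) ** 2 / var)


def elbo_estimate(x, num_samples=2000, seed=0):
    """Monte Carlo ELBO: E_q[log p(x | z) + log p(z) - log q(z | x)]."""
    rng = np.random.default_rng(seed)
    mean, var = inference_network(x)
    eps = rng.standard_normal((num_samples,) + x.shape)
    z = mean + np.sqrt(var) * eps  # reparameterized sample from q(z | x)
    terms = (
        log_normal(x, z, SIGMA2)      # likelihood of the observation
        + log_normal(z, 0.0, 1.0)     # prior over the latent
        - log_normal(z, mean, var)    # approximate posterior
    )
    return terms.mean(axis=0)


x = np.array([-1.0, 0.3, 2.0])
elbo = elbo_estimate(x)
log_evidence = log_normal(x, 0.0, 1.0 + SIGMA2)  # exact log p(x) for this model
```

Because the network's weights here are deliberately untrained, `elbo` sits clearly below `log_evidence` for each observation. Training would adjust `w` and `log_var` to close that difference.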

The term "amortized" refers to the key efficiency gain. Instead of running an iterative optimization for each training example to find the best approximate posterior for that example, the inference network learns to predict it directly from the input. Once trained, inference is a single forward pass, so the cost of learning to infer is amortized across all examples. For GRAM, this means the latent reasoning trajectory is produced by a learned inference network rather than optimized from scratch at test time.
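The contrast between per-example optimization and a single forward pass can be made concrete with a small sketch. It assumes a toy model (z ~ N(0, 1), x | z ~ N(z, SIGMA2)) and a hypothetical one-weight inference network m = w * x. Only the posterior mean is inferred, since the optimal variance in this model is the same for every example. None of this is taken from GRAM.

```python
import numpy as np

SIGMA2 = 0.5  # assumed noise variance: z ~ N(0, 1), x | z ~ N(z, SIGMA2)


def elbo_grad_mean(m, x):
    """Gradient of the closed-form ELBO with respect to the posterior mean m."""
    return (x - m) / SIGMA2 - m


def per_example_inference(x, steps=200, lr=0.1):
    """Non-amortized: run gradient ascent separately for every example."""
    m = np.zeros_like(x)
    for _ in range(steps):
        m = m + lr * elbo_grad_mean(m, x)
    return m


def train_inference_network(x_train, steps=500, lr=0.05):
    """Amortized: fit one shared weight w so that m = w * x serves all examples."""
    w = 0.0
    for _ in range(steps):
        # Chain rule: d ELBO / d w = (d ELBO / d m) * x, averaged over the data.
        grad_w = np.mean(elbo_grad_mean(w * x_train, x_train) * x_train)
        w += lr * grad_w
    return w


rng = np.random.default_rng(0)
x_train = rng.normal(0.0, np.sqrt(1.0 + SIGMA2), size=1000)
w = train_inference_network(x_train)

x_new = np.array([-2.0, 0.5, 1.5])
amortized = w * x_new                      # one multiplication per new example
iterative = per_example_inference(x_new)   # 200 gradient steps per new example
```

In this linear-Gaussian setting the two routes agree, because the exact posterior mean x / (1 + SIGMA2) is linear in x and the one-weight network can represent it. The training cost of `train_inference_network` is paid once. After that, each new example costs a forward pass instead of an optimization loop.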

In GRAM specifically, amortized variational inference enables training by providing a differentiable lower bound objective that replaces per-trajectory optimization with a forward pass through the inference network. The variational posterior approximates the distribution over reasoning trajectories given inputs and outputs, while the generative model defines the prior over latent trajectories and the likelihood of outputs given trajectories. The ELBO objective balances reconstruction quality with proximity of the posterior to the prior, encouraging diverse and valid reasoning paths.
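Written out, the objective described above takes the standard conditional ELBO form. The notation is an assumption made for illustration and may differ from GRAM's own: x is the input, y the output, z the latent reasoning trajectory, q_φ the inference network, and p_θ the generative model, with the prior taken to be conditioned on the input.

```latex
\log p_\theta(y \mid x)
  \;\ge\;
  \underbrace{\mathbb{E}_{q_\phi(z \mid x, y)}\big[\log p_\theta(y \mid x, z)\big]}_{\text{reconstruction}}
  \;-\;
  \underbrace{\mathrm{KL}\big(q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x)\big)}_{\text{proximity to the prior}}
```

The difference between the two sides is KL(q_φ(z | x, y) ‖ p_θ(z | x, y)), so the bound is tight exactly when the variational posterior matches the true posterior over trajectories.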

The main limitation is the amortization gap: because one shared inference network must produce the approximate posterior for every example, its output for a given example is generally worse than what per-example optimization within the same distribution family would find. This adds to the approximation gap that arises when the chosen parametric family cannot match the true posterior at all. Together, these errors can lead to underestimated uncertainty and, when the true posterior is multimodal, collapse onto a single mode. More sophisticated inference networks can reduce the gap but increase training complexity. Whether the amortization gap meaningfully degrades performance on reasoning tasks with sparse latent structure is not yet well understood.
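A toy calculation can show how much ELBO is lost when a shared inference network replaces per-example optimization. The sketch assumes a hypothetical setup that is not drawn from GRAM: each example has its own noise variance `s2`, the exact posterior is Gaussian (so per-example optimization within the Gaussian family reaches log p(x)), and the shared network sees only x and is restricted to a mean a * x with one shared variance. All names are illustrative.

```python
import numpy as np


def log_normal(value, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (value - mean) ** 2 / var)


def elbo(m, v, x, s2):
    """Closed-form ELBO for q(z) = N(m, v), prior N(0, 1), likelihood N(x; z, s2)."""
    expected_log_lik = -0.5 * (np.log(2 * np.pi * s2) + ((x - m) ** 2 + v) / s2)
    expected_log_prior = -0.5 * (np.log(2 * np.pi) + m ** 2 + v)
    entropy = 0.5 * np.log(2 * np.pi * np.e * v)
    return expected_log_lik + expected_log_prior + entropy


rng = np.random.default_rng(0)
n = 5000
s2 = rng.uniform(0.1, 2.0, size=n)   # per-example noise, hidden from the network
z = rng.standard_normal(n)
x = z + np.sqrt(s2) * rng.standard_normal(n)

# Per-example optimum: the exact posterior, so the ELBO equals log p(x).
m_star, v_star = x / (1.0 + s2), s2 / (1.0 + s2)
elbo_star = elbo(m_star, v_star, x, s2)

# Best shared network of the form m = a * x with a single shared variance v,
# obtained by setting the derivative of the average ELBO to zero.
precision = 1.0 + 1.0 / s2
a = np.sum(x ** 2 / s2) / np.sum(x ** 2 * precision)
v = 1.0 / np.mean(precision)
elbo_amortized = elbo(a * x, v, x, s2)

amortization_gap = np.mean(elbo_star - elbo_amortized)
print(f"average ELBO lost to amortization: {amortization_gap:.4f} nats")
```

Since the Gaussian family contains the true posterior here, the approximation gap is zero and the whole shortfall is due to amortization. Giving the network access to `s2`, or more capacity, would shrink it, which mirrors the trade-off between gap and training complexity noted above.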