
CALM (Continuous Autoregressive Language Models)

Language models that generate continuous-valued embeddings instead of discrete tokens.

Year: 2023 · Generality: 187

CALM, or Continuous Autoregressive Language Models, refers to a class of generative language models that operate in a continuous embedding space rather than producing discrete token predictions at each step. Traditional autoregressive language models generate text one token at a time, selecting each next token from a probability distribution over a fixed vocabulary of discrete symbols. CALM breaks from this paradigm: the model outputs a continuous vector at each step, which is fed back as input for the next generation step, bypassing the need to project back into a discrete vocabulary during intermediate computations.
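
To make the loop concrete, the sketch below shows a toy continuous autoregressive step in PyTorch. The GRU backbone, layer sizes, and class name are illustrative assumptions rather than the architecture of any published CALM system; the point is only that each step emits a vector that is fed straight back in, with no softmax over a vocabulary until a separate decoding stage.

```python
import torch
import torch.nn as nn

class ContinuousARModel(nn.Module):
    """Toy continuous autoregressive generator (illustrative only)."""

    def __init__(self, d_model: int = 512):
        super().__init__()
        # Any sequence backbone works; a GRU keeps the sketch small.
        self.backbone = nn.GRU(d_model, d_model, batch_first=True)
        # Regression head predicts the next embedding, not vocab logits.
        self.head = nn.Linear(d_model, d_model)

    @torch.no_grad()
    def generate(self, prefix: torch.Tensor, steps: int = 16) -> torch.Tensor:
        # prefix: (batch, seq_len, d_model) continuous prompt embeddings.
        h = None
        x = prefix
        outputs = []
        for _ in range(steps):
            y, h = self.backbone(x, h)
            nxt = self.head(y[:, -1:, :])  # continuous next-step vector
            outputs.append(nxt)
            x = nxt  # feed the vector straight back: no softmax, no tokens
        return torch.cat(outputs, dim=1)   # (batch, steps, d_model)

model = ContinuousARModel()
prompt = torch.randn(2, 5, 512)            # stand-in for encoded prompt text
vectors = model.generate(prompt, steps=8)  # decoded to text only afterwards
```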

The core mechanism involves training a model to autoregressively predict continuous representations, typically embeddings or latent vectors, rather than softmax distributions over tokens. This allows the model to propagate richer, higher-dimensional information between steps without the information bottleneck imposed by discretization. During inference, the final continuous output can be decoded into human-readable text through a separate decoding head or a diffusion-based process. This architecture draws inspiration from continuous diffusion models applied to language and connects to broader research on latent-space generation, where the generative process unfolds on a smooth, differentiable manifold.
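
A minimal sketch of what such a continuous training signal and final readout could look like, assuming the simplest possible choices: a mean-squared-error regression onto target embeddings, and a nearest-neighbour lookup against a frozen vocabulary table for decoding. Both choices are simplifying assumptions; as noted above, practical systems may instead use a learned decoding head or a diffusion-based process.

```python
import torch
import torch.nn.functional as F

def continuous_ar_loss(pred: torch.Tensor, target_emb: torch.Tensor) -> torch.Tensor:
    # Regress each predicted vector onto the next-step target embedding.
    # Plain MSE is the simplest stand-in for a continuous objective.
    return F.mse_loss(pred, target_emb)

def decode_nearest(pred: torch.Tensor, vocab_emb: torch.Tensor) -> torch.Tensor:
    # pred: (batch, seq, d), vocab_emb: (V, d).
    # Assumed decoding rule: snap each continuous output to its nearest
    # vocabulary embedding, but only at the final readout stage.
    sims = torch.einsum("bsd,vd->bsv", pred, vocab_emb)
    return sims.argmax(dim=-1)  # (batch, seq) token ids

pred = torch.randn(2, 8, 512)
targets = torch.randn(2, 8, 512)
loss = continuous_ar_loss(pred, targets)                  # scalar loss
tokens = decode_nearest(pred, torch.randn(50_000, 512))   # (2, 8) ids
```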

CALM-style approaches offer several potential advantages over discrete token models. Because the intermediate representations are continuous and differentiable, gradients can flow more freely through the generation process, potentially enabling more expressive and coherent long-range dependencies. They also sidestep some limitations of tokenization, such as sensitivity to subword segmentation and the inability to represent fine-grained semantic nuance within a single token slot. This makes them particularly appealing for tasks requiring nuanced semantic generation or for integration with other continuous modalities like audio and vision.

The relevance of CALM to modern machine learning grew significantly as researchers sought alternatives to the discrete bottleneck in large language models, especially in the context of scaling and multimodal generation. While discrete autoregressive models like GPT-style architectures remain dominant, continuous autoregressive approaches represent an active research frontier exploring whether language generation can be made more fluid, efficient, and expressive by embracing the geometry of continuous latent spaces rather than the combinatorics of token vocabularies.

Related

Large Language Diffusion Models
Generative architectures applying diffusion-based denoising processes to large-scale natural language generation.
Generality: 337

DLMs (Deep Language Models)
Deep neural networks trained to understand, generate, and translate human language.
Generality: 796

LLM (Large Language Model)
Massive neural networks trained on text to understand and generate human language.
Generality: 905

LCMs (Large Concept Models)
Large-scale models that represent and reason over abstract, compositional concepts rather than raw tokens.
Generality: 381

VLM (Visual Language Model)
AI models that jointly understand and generate both visual and textual information.
Generality: 720

Long-Context Modeling
Architectures and techniques enabling AI models to process and reason over very long sequences.
Generality: 694