Envisioning is an emerging technology research institute and advisory.

2011 — 2026


Autoregressive Prediction

A modeling approach that predicts each sequence element from its preceding values.

Year: 1990 · Generality: 792

Autoregressive prediction is a fundamental technique in machine learning and statistical modeling in which each value in a sequence is predicted as a function of its previous values. The term "autoregressive" reflects the self-referential nature of the approach: the model regresses a variable on its own past observations rather than on independent external inputs. This structure makes autoregressive models naturally suited to any domain where order and temporal context matter, including language, audio, time series, and video.

In practice, autoregressive models generate sequences one step at a time. At each step, the model takes previously generated or observed tokens as input and produces a probability distribution over the next possible value, from which a prediction or sample is drawn. This output then becomes part of the conditioning context for the next step. Classic statistical formulations such as AR, ARMA, and ARIMA models capture this idea with linear equations and explicit lag terms. Modern deep learning approaches replace these linear assumptions with neural networks—recurrent architectures like LSTMs, and more recently, Transformer-based models—that can learn complex, long-range dependencies across sequences.
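The classical linear formulation can be sketched in a few lines. The following is a minimal illustration, not a library API: `fit_ar` estimates AR(p) coefficients by ordinary least squares on lagged values, and `predict_next` applies them to the most recent observations, exactly the "regress a variable on its own past" idea described above.

```python
import numpy as np

def fit_ar(series, p):
    """Fit AR(p) coefficients by ordinary least squares.

    Models x_t as c + a_1 * x_{t-1} + ... + a_p * x_{t-p}.
    Illustrative helper, not a library API.
    """
    # Each row holds the p values preceding series[t], most recent first.
    X = np.array([series[t - p:t][::-1] for t in range(p, len(series))])
    X = np.column_stack([np.ones(len(X)), X])   # intercept column
    y = series[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs                               # [c, a_1, ..., a_p]

def predict_next(series, coeffs):
    p = len(coeffs) - 1
    lags = np.asarray(series[-p:])[::-1]        # x_{t-1}, ..., x_{t-p}
    return coeffs[0] + coeffs[1:] @ lags

# Noise-free AR(1) recursion x_t = 0.8 * x_{t-1}: OLS recovers the weight.
x = [1.0]
for _ in range(50):
    x.append(0.8 * x[-1])
x = np.array(x)
coeffs = fit_ar(x, p=1)
next_value = predict_next(x, coeffs)
```

ARMA and ARIMA extend this same linear-lag structure with moving-average and differencing terms, while neural approaches replace the weighted sum with a learned nonlinear function of the context.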

Autoregressive prediction became central to large-scale language modeling with the rise of models like GPT, which are trained to predict the next token in a sequence across massive text corpora. This simple objective, applied at scale, yields models capable of coherent text generation, code synthesis, translation, and reasoning. Similar autoregressive frameworks underpin WaveNet for audio synthesis and PixelCNN for image generation, demonstrating the approach's versatility across modalities.
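The next-token objective can be made concrete with a deliberately tiny stand-in for a language model: smoothed character-bigram counts play the role of a neural network's softmax output, and the sampling loop shows the defining autoregressive step, each draw conditioning on the token just emitted. The training text and variable names are made up for illustration.

```python
import numpy as np

# Toy "language model": conditional next-character distributions from
# smoothed bigram counts (a stand-in for a neural LM's softmax output).
text = "the cat sat on the mat the cat sat on the mat "
chars = sorted(set(text))
idx = {c: i for i, c in enumerate(chars)}

counts = np.ones((len(chars), len(chars)))          # add-one smoothing
for a, b in zip(text, text[1:]):
    counts[idx[a], idx[b]] += 1
probs = counts / counts.sum(axis=1, keepdims=True)  # rows: p(next | prev)

# Autoregressive sampling: each draw conditions on the last emitted token.
rng = np.random.default_rng(0)
out = "t"
for _ in range(30):
    dist = probs[idx[out[-1]]]                      # distribution over next char
    out += chars[rng.choice(len(chars), p=dist)]
```

A model like GPT follows the same loop, except the conditional distribution is produced by a Transformer conditioned on the entire preceding context rather than only the last token.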

The appeal of autoregressive models lies in their tractability: because each prediction conditions only on observed history, the joint probability of a sequence can be decomposed into a product of conditional probabilities, making training via maximum likelihood straightforward. The key limitation is sequential generation—each step depends on the last, making parallelization during inference difficult. Despite this, autoregressive prediction remains one of the most powerful and widely deployed paradigms in modern generative AI.
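The chain-rule decomposition behind that tractability can be shown with a toy first-order model over a binary alphabet (the probabilities below are invented purely for illustration): the joint log-probability of a sequence is just the sum of conditional log-probabilities, which is what maximum-likelihood training maximizes.

```python
import math

# Chain rule for a first-order toy model over {0, 1}:
# p(x_1..x_T) = p(x_1) * prod_t p(x_t | x_{t-1}).
# These probabilities are made up for illustration.
p_first_one = 0.5
p_one_given = {0: 0.2, 1: 0.7}   # p(next = 1 | previous symbol)

def sequence_log_prob(seq):
    """Joint log-probability as a sum of conditional log-probabilities.

    Maximum-likelihood training minimizes the negative of this,
    averaged over a corpus.
    """
    lp = math.log(p_first_one if seq[0] == 1 else 1 - p_first_one)
    for prev, cur in zip(seq, seq[1:]):
        p = p_one_given[prev]
        lp += math.log(p if cur == 1 else 1 - p)
    return lp
```

Because each factor conditions only on what came before, the decomposition is exact, but it is also why generation is sequential: sampling x_t requires x_{t-1} to exist first.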

Related

  • Autoregressive: A model that predicts future sequence values from weighted combinations of past values. (Generality: 794)
  • Autoregressive Generation: Generating sequences by predicting each element conditioned on all previous outputs. (Generality: 794)
  • Autoregressive Sequence Generator: A model that predicts each next output using its own previous outputs as inputs. (Generality: 752)
  • Sequence Prediction: Forecasting the next item(s) in a sequence by learning patterns from prior observations. (Generality: 794)
  • Next Word Prediction: A training objective where models learn to predict the next token in a sequence. (Generality: 794)
  • Multi-Token Prediction: A generation strategy where language models predict multiple output tokens simultaneously. (Generality: 380)