Envisioning is an emerging technology research institute and advisory.

2011 — 2026


Surprisal

A measure of how unexpected an event is, based on its probability.

Year: 2003 | Generality: 620

Surprisal is an information-theoretic quantity that captures how unexpected a particular outcome is, defined mathematically as the negative logarithm of that outcome's probability: −log₂(p). When an event is highly probable, its surprisal is low — it carries little new information. Conversely, a rare event carries high surprisal, signaling that something informative and unexpected has occurred. The choice of logarithm base determines the unit: base 2 yields bits, while natural log yields nats. This measure is closely related to Shannon entropy, which can be understood as the expected surprisal across all possible outcomes of a distribution.
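The definition above can be sketched in a few lines of Python: surprisal as the negative log-probability of an outcome, and entropy as the expected surprisal over a whole distribution.

```python
import math

def surprisal(p, base=2):
    """Surprisal of an outcome with probability p: -log_base(p).
    Base 2 gives bits; use base=math.e for nats."""
    return -math.log(p, base)

def entropy(dist, base=2):
    """Shannon entropy: the expected surprisal across a distribution."""
    return sum(p * surprisal(p, base) for p in dist if p > 0)

# A certain event carries no information; a rare one carries a lot.
surprisal(1.0)    # 0.0 bits
surprisal(0.5)    # 1.0 bit
surprisal(1 / 8)  # 3.0 bits

# A fair coin flip has an expected surprisal of exactly 1 bit.
entropy([0.5, 0.5])  # 1.0
```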

In machine learning, surprisal appears most prominently in language modeling and natural language processing. The cross-entropy loss used to train language models is essentially the average surprisal assigned by the model to observed tokens — minimizing this loss pushes the model to assign higher probability to the actual next word. Perplexity, a standard evaluation metric for language models, is the exponentiated average surprisal per token, making it an intuitive measure of how "confused" a model is by a given text. Lower perplexity indicates the model finds the text more predictable and has learned a better representation of the underlying language distribution.
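The link between average surprisal and perplexity can be made concrete. Assuming we have the probabilities a model assigned to each observed token, perplexity is the exponential of their mean surprisal in nats:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average surprisal (in nats) per token.
    token_probs: probabilities the model assigned to each observed token."""
    avg_surprisal = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_surprisal)

# A model that assigns probability 0.25 to every token is, on average,
# as "confused" as a uniform guess among 4 options.
perplexity([0.25, 0.25, 0.25])  # 4.0
```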

Beyond language modeling, surprisal plays a role in reinforcement learning and curiosity-driven exploration. Agents can use surprisal as an intrinsic reward signal, actively seeking out states or transitions that their current world model finds unexpected. This encourages exploration of novel regions of the environment without requiring dense external rewards, and has proven effective in sparse-reward settings. Similarly, in active learning and Bayesian inference, high-surprisal data points are often the most informative for updating beliefs, making surprisal a natural criterion for selecting which examples to label or query next.
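A minimal sketch of the curiosity idea, assuming the agent's world model exposes the probability it assigned to an observed transition (the model itself is hypothetical here):

```python
import math

def intrinsic_reward(transition_prob):
    """Curiosity bonus: the surprisal of an observed transition under the
    agent's current (hypothetical) world model, in nats. Transitions the
    model finds unlikely earn a large bonus, steering the agent toward
    novel states even when external rewards are sparse."""
    return -math.log(transition_prob)

# An expected transition earns almost nothing...
familiar = intrinsic_reward(0.9)
# ...while a surprising one earns a much larger bonus.
novel = intrinsic_reward(0.01)
```

In practice this bonus is typically added to (or substituted for) the environment's reward, and it shrinks naturally as the world model improves and stops being surprised.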

The concept originates in Claude Shannon's 1948 foundational paper on information theory, though the specific term "surprisal" was popularized by Myron Tribus in the 1960s. Its adoption in machine learning accelerated alongside the rise of probabilistic and neural language models in the 2000s and 2010s, cementing it as a core interpretive and training tool across many modern AI systems.

Related

Surprise

A measure of how unexpected or novel an outcome is given a model's predictions.

Generality: 620
Interestingness

A measure of how novel, surprising, or valuable information is to a learner or system.

Generality: 520
Perplexity

A metric quantifying how well a language model predicts a text sequence.

Generality: 713
Semantic Entropy

A measure of uncertainty in the meaning of language model outputs.

Generality: 380
Unigram Entropy

A measure of word-level unpredictability in text, assuming each word occurs independently.

Generality: 450
Artificial Curiosity

An intrinsic motivation mechanism that drives AI agents to explore novel environments autonomously.

Generality: 592