
Envisioning is an emerging technology research institute and advisory.


2011 — 2026


Semantic Entropy

A measure of uncertainty in the meaning of language model outputs.

Year: 2023 · Generality: 380

Semantic entropy is a framework for quantifying uncertainty in the meanings produced by language models, rather than in their raw token-level predictions. While classical information-theoretic entropy measures the unpredictability of discrete symbols, semantic entropy operates at the level of meaning: two different generated strings that express the same proposition are treated as equivalent, and uncertainty is computed over these equivalence classes of meaning rather than over surface-level text. This distinction matters because a model might generate many paraphrases of the same idea with high token-level diversity but low semantic uncertainty, or conversely produce outputs that are superficially similar yet semantically contradictory.

In practice, semantic entropy is estimated by sampling multiple outputs from a language model for a given prompt, clustering those outputs by semantic equivalence (often using natural language inference models or embedding similarity to judge whether two responses mean the same thing), and then computing entropy over the resulting clusters. A high semantic entropy score signals that the model is genuinely uncertain about the correct answer — it is generating responses with meaningfully different content — while low semantic entropy suggests the model is consistently expressing the same underlying claim, even if the wording varies.
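The sample-cluster-score procedure above can be sketched in a few lines. This is an illustrative toy, not a production implementation: the `equivalent` predicate stands in for the bidirectional-entailment or embedding-similarity check a real system would use, and the naive string-normalization version below is only a placeholder for the demo.

```python
import math

def semantic_entropy(samples, equivalent):
    """Greedily cluster sampled outputs into meaning-equivalence
    classes, then compute Shannon entropy over the cluster sizes."""
    clusters = []  # each cluster is a list of mutually equivalent samples
    for s in samples:
        for c in clusters:
            if equivalent(s, c[0]):  # compare against cluster representative
                c.append(s)
                break
        else:
            clusters.append([s])  # no match: start a new meaning cluster
    n = len(samples)
    return -sum((len(c) / n) * math.log2(len(c) / n) for c in clusters)

def naive_equivalent(a, b):
    """Placeholder for an NLI model: equivalent iff the normalized
    strings match. A real system would test mutual entailment."""
    return a.strip().lower().rstrip(".") == b.strip().lower().rstrip(".")

# Five sampled answers: three express "Paris", two express "Lyon".
samples = ["Paris", "paris.", "Paris", "Lyon", "lyon"]
h = semantic_entropy(samples, naive_equivalent)
# clusters of size 3 and 2 → H = -(0.6)log2(0.6) - (0.4)log2(0.4) ≈ 0.971 bits
```

Note that token-level entropy over these five strings would be higher, since the surface forms differ; clustering by meaning first is exactly what separates semantic entropy from its classical counterpart.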

The concept gained traction in the context of hallucination detection in large language models. Because LLMs can produce fluent, confident-sounding text even when they are factually wrong, identifying when a model is uncertain about meaning — as opposed to merely uncertain about phrasing — provides a more reliable signal for flagging unreliable outputs. Semantic entropy has been shown to correlate with factual accuracy across question-answering benchmarks, making it a practical tool for selective prediction: systems can abstain or escalate to human review when semantic entropy exceeds a threshold.
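The abstain-or-escalate pattern reduces to a simple threshold rule once the answer clusters are in hand. The sketch below assumes cluster sizes have already been computed (as in the estimation step described earlier); the threshold value of 0.8 bits is an arbitrary illustration, not a recommended setting.

```python
import math

def entropy_from_counts(counts):
    """Shannon entropy (bits) over semantic clusters given their sizes."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts)

def decide(cluster_counts, threshold=0.8):
    """Selective prediction: answer when the model's meaning-level
    uncertainty is low, otherwise defer to human review."""
    h = entropy_from_counts(cluster_counts)
    return "answer" if h <= threshold else "abstain"

# One dominant meaning cluster → low semantic entropy → answer.
decide([9, 1])     # H ≈ 0.469 bits
# Samples split across three distinct meanings → abstain.
decide([4, 3, 3])  # H ≈ 1.571 bits
```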

More broadly, semantic entropy connects to longstanding challenges in natural language processing around ambiguity, polysemy, and context-dependence. Its value lies in grounding uncertainty estimation in the semantics of language rather than its statistics, offering a more interpretable and task-relevant measure of model confidence for high-stakes applications such as medical question answering, legal document analysis, and automated fact-checking.

Related

Unigram Entropy

A measure of word-level unpredictability in text, assuming each word occurs independently.

Generality: 450
Flexible Semantics

A system's ability to interpret meaning dynamically based on context and linguistic nuance.

Generality: 521
Surprisal

A measure of how unexpected an event is, based on its probability.

Generality: 620
Semantic Indexing

Organizing data by meaning rather than keywords to enable intelligent search and retrieval.

Generality: 695
Uncertainty Estimation

Quantifying how confident a model is in its own predictions.

Generality: 720
Incidental Polysemanticity

When a single neuron encodes multiple unrelated concepts due to representational compression.

Generality: 166