A measure of uncertainty in the meaning of language model outputs.
Semantic entropy is a framework for quantifying uncertainty in the meanings produced by language models, rather than in their raw token-level predictions. While classical information-theoretic entropy measures the unpredictability of discrete symbols, semantic entropy operates at the level of meaning: two different generated strings that express the same proposition are treated as equivalent, and uncertainty is computed over these equivalence classes of meaning rather than over surface-level text. This distinction matters because a model might generate many paraphrases of the same idea with high token-level diversity but low semantic uncertainty, or conversely produce outputs that are superficially similar yet semantically contradictory.
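Written out, this amounts to taking Shannon entropy over meaning classes rather than over individual sequences. The block below is a sketch of the usual formulation, where x is the prompt, s a sampled generation, and C(x) the set of semantic equivalence classes; the notation is illustrative rather than tied to any single paper's definition:

```latex
% Probability mass of a meaning class c: sum over all generations s
% that express that meaning.
p(c \mid x) = \sum_{s \in c} p(s \mid x)

% Semantic entropy: Shannon entropy over meaning classes,
% not over individual token sequences.
\mathrm{SE}(x) = -\sum_{c \in \mathcal{C}(x)} p(c \mid x) \log p(c \mid x)
```

Summing within each class first is what makes paraphrases harmless: many distinct strings with the same meaning all contribute to a single term of the sum.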
In practice, semantic entropy is estimated by sampling multiple outputs from a language model for a given prompt, clustering those outputs by semantic equivalence (often using natural language inference models or embedding similarity to judge whether two responses mean the same thing), and then computing entropy over the resulting clusters, with each cluster's probability taken either from the model's sequence likelihoods or simply from the fraction of samples that fall into it. A high semantic entropy score signals that the model is genuinely uncertain about the correct answer and is generating responses with meaningfully different content, while low semantic entropy suggests the model is consistently expressing the same underlying claim, even if the wording varies.
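A minimal sketch of that pipeline, assuming the caller supplies sampled responses and a `means_same(a, b)` equivalence judgment (in practice, something like bidirectional NLI entailment); cluster probabilities here are simple sample frequencies, a common Monte Carlo approximation:

```python
import math

def cluster_by_meaning(responses, means_same):
    """Greedily group responses into semantic equivalence classes.

    `means_same(a, b)` is any symmetric judgment of semantic
    equivalence, e.g. bidirectional entailment from an NLI model.
    """
    clusters = []  # each cluster is a list of responses
    for r in responses:
        for cluster in clusters:
            if means_same(r, cluster[0]):
                cluster.append(r)
                break
        else:
            clusters.append([r])  # no match: start a new meaning class
    return clusters

def semantic_entropy(responses, means_same):
    """Entropy over meaning clusters, with cluster frequencies
    standing in for cluster probabilities."""
    clusters = cluster_by_meaning(responses, means_same)
    n = len(responses)
    probs = [len(c) / n for c in clusters]
    return -sum(p * math.log(p) for p in probs)

# Toy example: three paraphrases of one answer, one contradiction.
samples = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "France's capital city is Paris.",
    "Lyon is the capital of France.",
]
# Hypothetical stand-in for an NLI-based equivalence check.
same = lambda a, b: ("Paris" in a) == ("Paris" in b)
print(semantic_entropy(samples, same))  # ~0.56 nats: mostly one meaning
```

Note that the token-level diversity of the three paraphrases contributes nothing to the score; only the Paris-versus-Lyon disagreement does.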
The concept gained traction in the context of hallucination detection in large language models. Because LLMs can produce fluent, confident-sounding text even when they are factually wrong, identifying when a model is uncertain about meaning — as opposed to merely uncertain about phrasing — provides a more reliable signal for flagging unreliable outputs. Semantic entropy has been shown to correlate with factual accuracy across question-answering benchmarks, making it a practical tool for selective prediction: systems can abstain or escalate to human review when semantic entropy exceeds a threshold.
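The thresholding pattern is straightforward to sketch, reusing `semantic_entropy` and `cluster_by_meaning` from above; `sample_fn` and the threshold value are illustrative placeholders, and a real threshold would be calibrated on held-out data:

```python
def answer_or_abstain(prompt, sample_fn, means_same, threshold=0.5, k=10):
    """Return a model answer, or abstain when semantic entropy is high.

    `sample_fn(prompt)` draws one response from the model; `threshold`
    is illustrative and would be tuned per task on validation data.
    """
    responses = [sample_fn(prompt) for _ in range(k)]
    if semantic_entropy(responses, means_same) > threshold:
        return None  # abstain or escalate to human review
    # Otherwise return a representative of the most common meaning.
    clusters = cluster_by_meaning(responses, means_same)
    return max(clusters, key=len)[0]
```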
More broadly, semantic entropy connects to longstanding challenges in natural language processing around ambiguity, polysemy, and context-dependence. Its value lies in grounding uncertainty estimation in the semantics of language rather than its statistics, offering a more interpretable and task-relevant measure of model confidence for high-stakes applications such as medical question answering, legal document analysis, and automated fact-checking.