A measure of how unexpected or novel an outcome is given a model's predictions.
In machine learning and information theory, surprise (also called self-information) quantifies how unexpected an event is relative to a model's probability distribution. Formally, the surprise of an event with probability p is defined as −log(p), so rare events carry high surprise and near-certain events carry almost none; using log base 2 gives surprise in bits, while the natural log gives nats. This measure is foundational to entropy, cross-entropy loss, and KL divergence, making it a quietly central quantity in how models are trained and evaluated. When a model assigns low probability to the true outcome, it experiences high surprise, and minimizing average surprise over a dataset (the negative log-likelihood) is precisely what maximum likelihood training accomplishes.
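As a concrete illustration, the following Python sketch computes self-information for individual events and shows that averaging it over observed outcomes yields the negative log-likelihood; the function names and example probabilities are illustrative, not drawn from any particular library.

```python
import math

def surprise(p: float, base: float = 2.0) -> float:
    """Self-information of an event with probability p (base 2 gives bits)."""
    return -math.log(p, base)

# Rare events carry high surprise; near-certain events carry almost none.
print(surprise(1 / 1024))  # 10.0 bits
print(surprise(0.99))      # ~0.015 bits

def average_surprise(model_probs: list[float]) -> float:
    """Mean -log p over the probabilities a model assigned to the true
    outcomes: the negative log-likelihood, in nats (natural log)."""
    return sum(-math.log(p) for p in model_probs) / len(model_probs)

# Hypothetical probabilities a classifier assigned to the correct labels;
# maximum likelihood training drives this average down.
print(average_surprise([0.9, 0.7, 0.2, 0.95]))  # ~0.53 nats
```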
In reinforcement learning, surprise has taken on a more active role as an intrinsic motivation signal. Curiosity-driven and exploration-based agents use surprise, often approximated by prediction error or model uncertainty, to seek out novel states that their world model cannot yet predict well. This encourages exploration beyond what sparse external rewards alone would drive, and has proven effective in environments where rewards are rare or delayed. Methods like the Intrinsic Curiosity Module (ICM) and count-based exploration bonuses operationalize surprise to guide agent behavior.
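The sketch below shows the prediction-error recipe in miniature: a toy linear forward model stands in for the learned dynamics model, and its squared prediction error doubles as the intrinsic reward. This is in the spirit of ICM rather than a faithful reproduction of it (ICM predicts in a learned feature space); `ForwardModel`, `beta`, and all dimensions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class ForwardModel:
    """Toy linear forward model: predicts the next state from (state, action)."""
    def __init__(self, state_dim: int, action_dim: int, lr: float = 0.01):
        self.W = rng.normal(scale=0.1, size=(state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state: np.ndarray, action: np.ndarray) -> np.ndarray:
        return self.W @ np.concatenate([state, action])

    def update(self, state, action, next_state) -> float:
        """One gradient step on squared prediction error; returns the error,
        which serves as the surprise-based intrinsic reward."""
        x = np.concatenate([state, action])
        err = self.predict(state, action) - next_state
        self.W -= self.lr * np.outer(err, x)  # gradient of 0.5 * ||err||^2
        return float(err @ err)

model = ForwardModel(state_dim=4, action_dim=2)
state, action = rng.normal(size=4), rng.normal(size=2)
next_state = rng.normal(size=4)  # would come from the environment
intrinsic_reward = model.update(state, action, next_state)
# total_reward = extrinsic_reward + beta * intrinsic_reward
# (beta weights curiosity against external reward)
```

States the model already predicts well yield a small bonus, while novel transitions yield a large one, which is exactly the gradient pushing the agent toward unexplored regions.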
Beyond reinforcement learning, surprise is central to anomaly detection, where high-surprise inputs signal potential outliers, fraud, or system faults. It also appears in Bayesian inference, where Bayesian surprise (often formalized as the KL divergence between posterior and prior) measures how much new evidence shifts a model's beliefs, and in neuroscience-inspired frameworks like predictive coding, where the brain is theorized to minimize prediction error (essentially surprise) at every level of perception. As AI systems are increasingly deployed in open-world settings where distribution shift is common, surprise-based metrics offer a principled way to detect when a model is operating outside its reliable range.
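A minimal sketch of surprise-based anomaly detection follows, assuming a multivariate Gaussian stands in for whatever density model is available and that the flagging threshold is calibrated as an in-distribution percentile; the 99th percentile is an arbitrary illustrative choice.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)

# Fit a simple density model (here a Gaussian) to in-distribution data.
train = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))
density = multivariate_normal(mean=train.mean(axis=0),
                              cov=np.cov(train, rowvar=False))

def surprise(x: np.ndarray) -> np.ndarray:
    """Negative log-density of each input under the fitted model (in nats)."""
    return -density.logpdf(x)

# Calibrate a threshold on in-distribution surprise values.
threshold = np.percentile(surprise(train), 99)

inliers = rng.normal(size=(5, 2))
outliers = rng.normal(loc=6.0, size=(5, 2))  # distribution-shifted inputs
print(surprise(inliers) > threshold)   # mostly False
print(surprise(outliers) > threshold)  # mostly True: flagged as anomalous
```

The same pattern scales to learned density models: whatever defines a log-probability over inputs can score them by surprise and flag the ones the model finds implausible.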