A framework modeling the brain as a hierarchy that minimizes prediction errors about sensory input.
Predictive processing is a theoretical framework in cognitive science and computational neuroscience proposing that the brain operates as a hierarchical prediction machine. Rather than passively receiving and interpreting sensory data, the brain continuously generates top-down predictions about incoming signals and compares them against actual sensory input. The discrepancy between prediction and reality — called prediction error — is propagated upward through the hierarchy, triggering updates to the internal model. This cycle of prediction, comparison, and correction allows the system to build increasingly accurate representations of the world while minimizing surprise.
The framework draws heavily on Bayesian inference, treating perception as a form of probabilistic inference where prior beliefs are combined with sensory evidence to form posterior estimates. Karl Friston formalized this intuition through the free energy principle, which frames the brain's goal as minimizing a quantity called free energy — a bound on the surprise or unexpectedness of sensory observations. This mathematical grounding gave predictive processing a rigorous foundation and connected it to broader theories of self-organizing biological systems. Andy Clark's philosophical work helped popularize the framework and extend it to action, attention, and consciousness.
In machine learning, predictive processing has influenced the design of generative models, particularly variational autoencoders (VAEs) and hierarchical latent variable models, which share the same predict-and-correct architecture. The framework also resonates with self-supervised learning approaches, where models learn by predicting masked or future inputs rather than relying on labeled data. Predictive coding networks — neural architectures that explicitly implement error-propagation between hierarchical layers — have been explored as biologically plausible alternatives to standard backpropagation.
The significance of predictive processing for AI lies in its potential to produce systems that are more sample-efficient, robust to noise, and capable of operating under uncertainty. By treating perception and learning as active inference rather than passive pattern matching, the framework suggests architectures that can generalize from limited data and adapt fluidly to novel environments — properties that remain challenging for conventional deep learning systems.