When neural networks lose prior knowledge after learning new tasks sequentially.
Catastrophic forgetting is a fundamental failure mode in neural networks in which training on new data causes a rapid, severe drop in performance on previously learned tasks. It occurs because gradient-based learning adjusts a network's weights to minimize loss on the current training objective, with no inherent mechanism to protect representations that were useful for earlier tasks. The result is that new learning effectively overwrites old knowledge, a problem that becomes acute in any setting where a model must adapt continuously rather than being trained once on a fixed dataset.
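The effect is easy to reproduce in miniature. The following is a minimal sketch, assuming PyTorch and two synthetic, deliberately conflicting binary classification tasks; the model size, data, and hyperparameters are illustrative choices, not a benchmark setup. Training on task B with no protection for task A's weights visibly degrades accuracy on task A.

```python
# Minimal sketch of catastrophic forgetting: a small network is trained on
# task A, then on task B, and its accuracy on task A is measured after each
# phase. Tasks, model size, and hyperparameters are illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(weight):
    """Synthetic binary task: the label is the sign of a fixed projection."""
    x = torch.randn(2000, 20)
    y = (x @ weight > 0).long()
    return x, y

# Two tasks defined by roughly opposing decision rules, so learning B
# conflicts with what was learned for A.
w_a = torch.randn(20)
w_b = -w_a + 0.3 * torch.randn(20)
(x_a, y_a), (x_b, y_b) = make_task(w_a), make_task(w_b)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def train(x, y, steps=300):
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

train(x_a, y_a)
print(f"after task A: acc_A = {accuracy(x_a, y_a):.2f}")

train(x_b, y_b)  # no mechanism protects weights important to task A
print(f"after task B: acc_A = {accuracy(x_a, y_a):.2f}, acc_B = {accuracy(x_b, y_b):.2f}")
```

In a run like this, accuracy on task A typically collapses toward chance after the second training phase, which is the behavior the term describes.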
The phenomenon is rooted in the distributed nature of neural representations. Unlike modular systems where knowledge about different tasks might be stored in separate components, a neural network encodes information diffusely across shared weights. When those weights shift to accommodate a new task, the delicate configurations that encoded prior knowledge are disrupted. The severity scales with how different the new task is from previous ones and how aggressively the network is updated.
Addressing catastrophic forgetting is central to the goal of continual learning — building systems that accumulate knowledge over time the way biological brains do. Proposed solutions fall into several broad categories: regularization-based methods like Elastic Weight Consolidation (EWC), which penalize changes to weights deemed important for prior tasks; replay-based methods, which periodically re-expose the model to stored or generated examples from old tasks; and architectural approaches like Progressive Neural Networks, which add new capacity for each task while freezing earlier representations. Each strategy involves trade-offs between plasticity (the ability to learn new things) and stability (the ability to retain old ones).
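As one concrete illustration of the regularization family, the sketch below adds an EWC-style quadratic penalty that discourages moving weights estimated to be important for the previous task. The single-batch diagonal Fisher estimate, the function names, and the penalty strength are simplifying assumptions for illustration, not the method's full formulation.

```python
# Sketch of an EWC-style penalty (regularization family). After task A,
# a diagonal Fisher estimate marks which weights mattered; training on
# task B then pays a quadratic cost for moving those weights away from
# their task-A values. Names and hyperparameters are illustrative.
import torch
import torch.nn as nn

def fisher_diagonal(model, x, y, loss_fn):
    """Estimate per-parameter importance as squared gradients on old-task data."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return {n: p.grad.detach() ** 2 for n, p in model.named_parameters()}

def ewc_penalty(model, fisher, old_params, lam=100.0):
    """Quadratic penalty anchoring important weights near their old-task values."""
    penalty = 0.0
    for n, p in model.named_parameters():
        penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return 0.5 * lam * penalty

# Usage sketch, continuing the earlier two-task example
# (x_a, y_a from the old task; x_b, y_b from the new one):
#
#   fisher = fisher_diagonal(model, x_a, y_a, loss_fn)
#   old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
#   for _ in range(steps):
#       opt.zero_grad()
#       loss = loss_fn(model(x_b), y_b) + ewc_penalty(model, fisher, old_params)
#       loss.backward()
#       opt.step()
```

The penalty weight (here `lam`) is the knob that trades plasticity for stability: a small value behaves like plain fine-tuning, while a large value preserves task A at the cost of learning task B slowly or poorly.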
Catastrophic forgetting remains an active and unsolved research problem, particularly as large language models and foundation models are increasingly fine-tuned on specialized datasets. Even at scale, these models exhibit forgetting when adapted to new domains, making mitigation strategies practically important beyond academic benchmarks. The challenge sits at the intersection of optimization theory, neuroscience-inspired learning, and systems design, and solving it robustly would be a major step toward truly adaptive AI.