A large pre-trained model adaptable to many tasks without retraining from scratch.
A foundation model is a large-scale AI system trained on broad, diverse data that can be adapted to a wide range of downstream tasks. Rather than building specialized models from scratch for each application, practitioners fine-tune or prompt a single pre-trained base model, dramatically reducing the cost and data requirements of deploying AI in new domains. The term was formally introduced by Stanford's Center for Research on Foundation Models in 2021, though the underlying paradigm had been building for years through models like BERT and GPT-3.
These models work by learning rich, general-purpose representations during a computationally intensive pre-training phase, typically using self-supervised objectives on massive text, image, or multimodal corpora. The resulting model encodes broad world knowledge and transferable patterns that can be unlocked for specific tasks through fine-tuning on labeled data, retrieval augmentation, or prompt engineering. Scale is central to the paradigm: as model size and training data grow, emergent capabilities appear that were not explicitly trained for, such as in-context learning, chain-of-thought reasoning, and cross-modal understanding.
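To make the adaptation step concrete, here is a minimal sketch of the fine-tuning route, assuming the Hugging Face transformers library. The distilbert-base-uncased checkpoint and the sentiment-classification task are illustrative choices, not details from the text above.

```python
# Minimal sketch of adapting a pre-trained foundation model: load a
# pre-trained base, attach a fresh task head, and fine-tune on a small
# labeled dataset. Checkpoint and task are illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "distilbert-base-uncased"  # illustrative pre-trained base
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=2,  # new, randomly initialized head for the downstream task
)

# Freeze the pre-trained backbone so only the new head is trained --
# the cheapest form of adaptation in compute and labeled data.
for param in model.base_model.parameters():
    param.requires_grad = False

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

# One illustrative gradient step on a toy labeled example.
batch = tokenizer(["a delightful, well-paced film"], return_tensors="pt")
labels = torch.tensor([1])  # 1 = positive sentiment
loss = model(**batch, labels=labels).loss
loss.backward()  # gradients flow only into the task head
optimizer.step()
```

Prompting takes the opposite route: the same base model is steered entirely through its input text, with no gradient updates at all, while full fine-tuning updates every weight and typically demands more data and compute than the head-only variant sketched here.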
Foundation models matter because they fundamentally shift the economics and accessibility of AI development. Organizations that lack the resources to train billion-parameter models from scratch can still build capable applications by adapting publicly available or API-accessible foundation models. This has accelerated progress across fields including medicine, law, software engineering, and scientific research, where domain-specific labeled data is scarce but general language or vision understanding is highly valuable.
The paradigm also raises important concerns. Because a single foundation model may underlie thousands of downstream applications, any biases, factual errors, or safety failures baked into pre-training can propagate at scale, a phenomenon sometimes called homogenization risk. Researchers are actively studying how to audit, align, and robustly adapt foundation models so that their broad deployment remains safe and beneficial across the diverse contexts in which they are used.