Powerful general-purpose AI systems adaptable for both beneficial and harmful applications.
A dual-use foundational model is a large-scale AI system trained on broad data, from which it acquires general-purpose capabilities transferable across a wide range of tasks and domains. Unlike narrow AI systems built for a single application, these models—such as large language models, multimodal systems, and code-generation tools—can be fine-tuned or prompted to perform everything from medical summarization and scientific discovery to content generation and software development. Their versatility is precisely what makes them foundational: they serve as a base layer upon which many downstream applications are built.
The "dual-use" dimension arises because the same capabilities that enable beneficial applications can be redirected toward harmful ones with relatively little additional effort. A model capable of synthesizing scientific literature could assist in drug discovery or help bad actors identify vulnerabilities in biological systems. A model fluent in persuasive writing can support education or generate large-scale disinformation. This asymmetry—where the barrier to misuse is far lower than the barrier to building the capability in the first place—distinguishes dual-use foundational models from earlier, narrower AI tools and makes governance substantially more difficult.
The challenge for researchers, developers, and policymakers is that restricting access to prevent misuse can simultaneously suppress legitimate innovation and scientific progress. Proposed mitigation strategies include staged access programs, usage monitoring, red-teaming evaluations before deployment, and model cards that document known risks. Regulatory frameworks such as the EU AI Act have begun to address high-capability general-purpose models explicitly, requiring transparency and risk assessments from developers whose models exceed certain computational thresholds.
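The mitigation strategies listed above (staged access, usage monitoring, pre-deployment red-teaming, model cards, and compute-based regulatory triggers) can be sketched as a simple pre-deployment gating check. This is a minimal illustrative sketch only: the function name, the model-card fields, and the 1e25 FLOP cutoff are assumptions for the example, not an actual regulatory or industry implementation.

```python
# Illustrative sketch of a pre-deployment governance gate.
# All names, fields, and the compute threshold below are hypothetical.

COMPUTE_THRESHOLD_FLOP = 1e25  # illustrative "high-capability" cutoff

REQUIRED_MODEL_CARD_FIELDS = {"intended_use", "known_risks", "evaluation_results"}


def deployment_checks(model_card: dict, training_compute_flop: float) -> list:
    """Return a list of unmet requirements; an empty list means cleared."""
    unmet = []

    # Model cards must document intended use and known risks.
    missing = REQUIRED_MODEL_CARD_FIELDS - model_card.keys()
    if missing:
        unmet.append("model card missing fields: %s" % sorted(missing))

    # Above the compute threshold, stricter obligations kick in.
    if training_compute_flop >= COMPUTE_THRESHOLD_FLOP:
        if not model_card.get("red_team_report"):
            unmet.append("red-team evaluation required above compute threshold")
        if not model_card.get("usage_monitoring_plan"):
            unmet.append("usage monitoring plan required above compute threshold")

    return unmet


card = {
    "intended_use": "code assistance",
    "known_risks": ["misuse for vulnerability discovery"],
    "evaluation_results": {"benchmark_score": 0.91},
}
print(deployment_checks(card, training_compute_flop=3e25))
```

The example mirrors the tiered logic described in the paragraph: baseline transparency obligations apply to every model, while models above a computational threshold face additional evaluation and monitoring requirements before deployment.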
The concept gained particular urgency around 2020 and 2021, when models such as GPT-3, Codex, and DALL-E demonstrated that a single pretrained system could be adapted to an unexpectedly broad range of sensitive tasks with minimal effort. This shifted the policy conversation from application-level regulation toward upstream governance of the foundational models themselves—raising unresolved questions about liability, access control, and the international coordination needed to manage risks that cross jurisdictional boundaries.