Generative models producing outputs constrained or guided by specified input conditions.
Conditional generation is a paradigm in machine learning where a generative model produces outputs that are explicitly guided by some conditioning signal rather than sampling freely from a learned distribution. The conditioning information can take many forms — a class label, a text description, an image, a style attribute, or even a partial output — and the model learns to produce samples that are both realistic and consistent with that input. Formally, the model learns a conditional distribution p(x | y) over outputs x given a condition y. This stands in contrast to unconditional generation, where the model simply learns the marginal data distribution p(x) and offers no external control over what gets produced.
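The distinction can be illustrated with a toy sketch (a hypothetical example, not from the source): treat each class label as selecting one Gaussian component, so conditional sampling draws from p(x | y) while unconditional sampling draws from the marginal p(x).

```python
import random

# Toy model: each label selects one Gaussian component.
# label -> (mean, std); these numbers are arbitrary illustrative choices.
COMPONENTS = {0: (-3.0, 0.5), 1: (0.0, 0.5), 2: (3.0, 0.5)}

def sample_conditional(label, rng=random):
    """Sample from p(x | y): the component chosen by the label."""
    mean, std = COMPONENTS[label]
    return rng.gauss(mean, std)

def sample_unconditional(rng=random):
    """Sample from the marginal p(x): pick a label uniformly, then sample."""
    label = rng.choice(list(COMPONENTS))
    return sample_conditional(label, rng)

rng = random.Random(0)
x_cond = sample_conditional(2, rng)   # stays near mean 3.0
x_uncond = sample_unconditional(rng)  # could land near any component
```

Conditioning simply restricts which part of the learned distribution is sampled; real models replace the lookup table with a network that maps the condition to a distribution over outputs.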
The mechanics vary by architecture. In conditional GANs (cGANs), both the generator and discriminator receive the conditioning signal, forcing the generator to produce outputs that match the condition while the discriminator learns to reject mismatched pairs. In transformer-based language and vision models, conditioning is typically achieved through cross-attention mechanisms or by prepending condition tokens to the input sequence. Diffusion models incorporate conditioning through classifier guidance, where gradients from an external classifier steer the denoising process, or classifier-free guidance, where conditional and unconditional noise predictions from the same model are blended at inference time to push samples toward the condition. Each approach trades off control fidelity, sample diversity, and computational cost differently.
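The classifier-free guidance step can be sketched in a few lines. This is a minimal illustration, assuming a denoising function `denoise(x_t, t, cond)` that stands in for a trained diffusion model (the names here are hypothetical, not a specific library's API): the model is run twice per step, once with the condition and once with a null condition, and the two noise predictions are extrapolated.

```python
import numpy as np

def cfg_noise(denoise, x_t, t, cond, null_cond, guidance_scale=7.5):
    """Blend conditional and unconditional noise predictions.

    guidance_scale = 1 recovers the plain conditional prediction;
    larger values push the sample more strongly toward the condition.
    """
    eps_cond = denoise(x_t, t, cond)         # prediction given the condition
    eps_uncond = denoise(x_t, t, null_cond)  # prediction with a null condition
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy stand-in "model": predicts noise pulling x_t toward the condition vector.
def toy_denoise(x_t, t, cond):
    return x_t - cond

x = np.zeros(4)
c = np.ones(4)
null = np.zeros(4)
guided = cfg_noise(toy_denoise, x, t=0, cond=c, null_cond=null,
                   guidance_scale=2.0)
```

The extrapolation is the whole trick: no separate classifier is trained, but the gap between the conditional and unconditional predictions acts as an implicit gradient toward the condition.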
Conditional generation is foundational to a wide range of practical applications: text-to-image synthesis, machine translation, image captioning, speech synthesis from text, drug molecule design given target properties, and instruction-following language models. The ability to specify what kind of output is desired transforms generative models from curiosities into useful tools. As conditioning mechanisms have grown more expressive — moving from simple class labels to rich natural language prompts — the flexibility and commercial relevance of conditional generation have expanded dramatically, making it one of the central ideas driving modern generative AI.