A generative framework that learns to sample compositional objects proportional to a reward.
A Generative Flow Network (GFlowNet) is a probabilistic generative framework designed to learn policies that sample compositional objects—such as molecular graphs, sequences, or causal structures—with probability proportional to a given reward function. Unlike standard generative models that maximize likelihood or variational objectives, GFlowNets treat generation as a sequential decision-making process: an agent constructs an object step by step by taking actions in a directed acyclic graph of states, and training encourages the resulting flow of probability mass to satisfy consistency conditions known as the flow-matching or detailed balance constraints. This makes GFlowNets closely related to reinforcement learning, but with a fundamentally different goal—diversity of high-reward samples rather than maximization of a single reward.
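The sampling-proportional-to-reward property can be made concrete on a toy DAG. The sketch below (an illustration, not code from the original work) builds bit-strings of a fixed length one symbol at a time; the environment, length, and reward function are all hypothetical choices. Because each state here has a single parent, the exact flow into each state can be computed by backward induction (terminal flow equals reward, interior flow equals the sum over children), and a policy that picks each child with probability proportional to its flow generates terminals with probability exactly R(x)/Z:

```python
# Minimal sketch of exact flow matching on a tree-shaped DAG.
# States are bit-string prefixes; terminals are strings of length L.
from itertools import product

L = 3  # terminal length (illustrative choice)

def reward(x: str) -> float:
    # Hypothetical reward: favour strings containing more 1s.
    return 1.0 + 2.0 * x.count("1")

# Backward induction: F(terminal) = R(terminal); F(s) = sum of children's flows.
flow = {}
for n in range(L, -1, -1):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        flow[s] = reward(s) if n == L else flow[s + "0"] + flow[s + "1"]

def terminal_prob(x: str) -> float:
    # Multiply the flow-matching policy probabilities along the trajectory:
    # P(s -> s') = F(s') / F(s), which telescopes to F(x) / F(root) = R(x) / Z.
    p = 1.0
    for i in range(L):
        p *= flow[x[: i + 1]] / flow[x[:i]]
    return p

Z = flow[""]  # the root flow is the partition function
```

In a real GFlowNet the state space is far too large to enumerate, so the forward policy is a neural network trained to approximate these flow ratios rather than computed exactly.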
The core training objective ensures that the total flow into any intermediate state equals the total flow out, analogous to conservation laws in physical flow networks. By satisfying these constraints across all states, the learned policy generates terminal objects with frequencies proportional to their rewards. This property is especially valuable when the reward landscape is multimodal: where a greedy or maximum-likelihood approach would collapse onto a single high-reward mode, a GFlowNet naturally explores and represents the full distribution of good solutions. Training can be performed using variants such as trajectory balance, which provides more stable and efficient gradient estimates than earlier flow-matching formulations.
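The trajectory balance objective mentioned above constrains an entire trajectory at once: the estimated partition function times the forward path probability should equal the reward times the backward path probability. A minimal sketch of the squared log-space residual is below; in practice log Z and the per-step log-probabilities come from learned networks, whereas here they are plain numbers chosen for illustration:

```python
# Sketch of the trajectory balance loss for a single trajectory:
# (log Z + sum_t log P_F(s_{t+1}|s_t) - log R(x) - sum_t log P_B(s_t|s_{t+1}))^2
import math

def trajectory_balance_loss(log_Z, log_pf_steps, log_pb_steps, reward):
    residual = log_Z + sum(log_pf_steps) - math.log(reward) - sum(log_pb_steps)
    return residual ** 2

# At the optimum the residual vanishes. Example: a two-step trajectory whose
# forward path probability equals R(x)/Z, with a deterministic backward
# policy (log P_B = 0 at every step). Values are illustrative.
log_Z = math.log(32.0)
reward_x = 7.0
log_pf = [math.log(7 / 32), math.log(1.0)]
log_pb = [0.0, 0.0]
loss = trajectory_balance_loss(log_Z, log_pf, log_pb, reward_x)
```

Minimizing this loss jointly over log Z and the policy parameters propagates the reward signal along the whole trajectory at once, which is the source of the stability advantage over per-state flow-matching losses.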
GFlowNets are particularly well-suited to scientific discovery tasks where diversity matters as much as quality. In drug discovery, for example, a model that proposes many structurally distinct high-affinity molecules is far more useful than one that repeatedly suggests the same compound. They have also been applied to Bayesian structure learning, combinatorial optimization, and active learning, where the ability to maintain uncertainty and explore broadly is critical. Their connection to amortized variational inference and energy-based models gives them a principled probabilistic interpretation, allowing them to serve as flexible approximate samplers for intractable posteriors.
Introduced by Yoshua Bengio and collaborators in 2021, GFlowNets have rapidly attracted research interest as a unifying framework bridging reinforcement learning, probabilistic inference, and deep generative modeling. Their ability to turn reward signals into calibrated generative distributions positions them as a promising tool wherever exploration and diversity are essential.