Generative models that learn complex distributions by composing invertible transformations, yielding exact likelihoods.
Normalizing flows are a family of generative models that transform a simple, tractable base distribution—typically a standard Gaussian—into a complex target distribution by composing a sequence of invertible, differentiable mappings. Because each transformation is bijective, the model can compute exact log-likelihoods using the change-of-variables formula: the log-probability of a data point equals the log-probability of its latent encoding plus the sum of log absolute Jacobian determinants across all transformations. This stands in contrast to variational autoencoders, which optimize a lower bound on likelihood, and GANs, which offer no likelihood estimate at all.
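To make the change-of-variables computation concrete, here is a minimal numpy sketch in which a single invertible elementwise affine map stands in for a full flow; the names (`scale`, `shift`, `log_prob`) are illustrative rather than drawn from any library, and a real flow would compose many such transformations and sum their log-determinants.

```python
import numpy as np

# Minimal sketch: exact likelihood via change of variables for one
# invertible elementwise affine map x = scale * z + shift, z ~ N(0, I).
# (Illustrative setup, not a library API.)
rng = np.random.default_rng(0)
d = 3
scale = np.array([0.5, 2.0, 1.5])   # elementwise, so the Jacobian is diagonal
shift = np.array([1.0, -1.0, 0.0])

def log_prob(x):
    # log p_X(x) = log p_Z(z) + log|det dz/dx|; here dz/dx = diag(1/scale),
    # so the log-determinant term is simply -sum(log|scale|).
    z = (x - shift) / scale
    log_pz = -0.5 * np.sum(z ** 2) - 0.5 * d * np.log(2 * np.pi)
    return log_pz - np.sum(np.log(np.abs(scale)))

x = scale * rng.standard_normal(d) + shift   # draw one sample from the model
print(log_prob(x))                           # exact log-likelihood, no bound
```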
The central engineering challenge in normalizing flows is designing transformations that are simultaneously expressive, efficiently invertible, and cheap to differentiate. Coupling-layer architectures such as NICE and RealNVP achieve this by splitting dimensions and applying conditionally affine maps, keeping Jacobians triangular and thus O(d) to compute (a minimal coupling-layer sketch follows below). Autoregressive flows like MAF and IAF exploit autoregressive structure for the same triangular-Jacobian benefit, but the direction of conditioning dictates a trade-off: MAF evaluates densities in a single pass yet samples sequentially, while IAF samples in a single pass yet evaluates densities sequentially. Glow introduced invertible 1×1 convolutions as learned generalizations of the fixed channel permutations between coupling layers, extending these ideas to image generation with competitive visual quality. Continuous normalizing flows (FFJORD) replace discrete compositions with neural ODEs, allowing architecturally unconstrained transformations at the cost of numerical ODE integration during both training and inference.
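As a concrete illustration of the coupling idea, the following numpy sketch implements a RealNVP-style affine coupling layer; the linear maps `W_s` and `W_t` are hypothetical stand-ins for the small neural networks used in practice. Because the first half of the input passes through unchanged, the Jacobian is triangular and its log-determinant reduces to a sum over the predicted log-scales.

```python
import numpy as np

# Sketch of a RealNVP-style affine coupling layer (simplified, assumed
# setup: W_s and W_t stand in for the scale/shift networks used in practice).
rng = np.random.default_rng(0)
d = 4
W_s = 0.1 * rng.standard_normal((d // 2, d // 2))
W_t = 0.1 * rng.standard_normal((d // 2, d // 2))

def forward(x):
    x1, x2 = x[: d // 2], x[d // 2 :]
    s, t = np.tanh(x1 @ W_s), x1 @ W_t   # scale/shift conditioned on x1 only
    y2 = x2 * np.exp(s) + t              # elementwise affine map of x2
    log_det = np.sum(s)                  # triangular Jacobian: O(d) log-det
    return np.concatenate([x1, y2]), log_det

def inverse(y):
    y1, y2 = y[: d // 2], y[d // 2 :]
    s, t = np.tanh(y1 @ W_s), y1 @ W_t   # s, t recomputable from the fixed half
    return np.concatenate([y1, (y2 - t) * np.exp(-s)])

x = rng.standard_normal(d)
y, log_det = forward(x)
assert np.allclose(inverse(y), x)        # inversion is exact and cheap
```

Stacking such layers with alternating splits (or, as in Glow, learned 1×1 convolutions between them) lets every dimension eventually be transformed while keeping both directions equally fast.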
Normalizing flows matter because exact likelihood is a powerful property: it enables principled model comparison, anomaly detection, and use as expressive variational posteriors in Bayesian inference, where a flexible flow posterior tightens the ELBO by shrinking the gap to the true posterior. They have been applied to image synthesis, speech generation, molecular conformation modeling, and density estimation in scientific domains where calibrated uncertainty is critical. Their theoretical cleanliness also makes them a natural testbed for studying generative-model expressivity and the geometry of learned representations.
Despite their elegance, normalizing flows face practical limitations: the architectural constraints required for tractable Jacobians can limit expressivity relative to diffusion models or GANs, and scaling to very high-dimensional data remains computationally demanding. More flexible flow families and hybrid architectures remain an active research area, with flows increasingly used as components within larger probabilistic pipelines rather than as standalone generative models.