A framework that scores variable configurations with a scalar energy instead of an explicit probability.
Energy-based models (EBMs) are a broad class of probabilistic models that assign a scalar energy value to every possible configuration of variables, where lower energy indicates a more compatible or preferred configuration. Rather than directly specifying a normalized probability distribution, EBMs define an implicit distribution through the Boltzmann relationship: p(x) ∝ exp(−E(x; θ)), where E(x; θ) is a learned energy function parameterized by θ. This formulation is powerful because it sidesteps the need to explicitly enumerate or normalize over all possible configurations during model design—a requirement that becomes prohibitive in high-dimensional spaces.
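As a sketch of the idea, the snippet below parameterizes E(x; θ) with a small PyTorch MLP (the network, its dimensions, and the name EnergyNet are illustrative choices, not a standard architecture) and evaluates the unnormalized density exp(−E(x; θ)) without ever touching the normalizing constant.

```python
import torch
import torch.nn as nn

# Illustrative MLP energy function E(x; θ): maps a configuration x to one scalar energy.
class EnergyNet(nn.Module):
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)   # one energy value per configuration

energy = EnergyNet()
x = torch.randn(5, 2)                    # five candidate configurations
unnormalized = torch.exp(-energy(x))     # p(x) ∝ exp(−E(x; θ)); Z(θ) is never computed
```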
The central challenge in working with EBMs is the partition function Z(θ) = ∫ exp(−E(x; θ)) dx, which normalizes the distribution but is almost always intractable to compute exactly. This intractability drives the development of specialized training and inference techniques. Contrastive divergence, persistent contrastive divergence, noise-contrastive estimation, and score matching are all methods designed to train EBMs without computing Z directly. Sampling from EBMs typically relies on Markov chain Monte Carlo (MCMC) methods, most commonly Langevin dynamics, which iteratively refines samples by descending the energy gradient while injecting noise. These approximations introduce their own trade-offs in computational cost, bias, and stability.
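The sketch below, assuming PyTorch and a deliberately tiny placeholder energy network, shows the bare Langevin update and a contrastive-divergence-style loss that lowers the energy of data while raising the energy of model samples; all names and hyperparameters are illustrative, and practical systems add step-size schedules, gradient clipping, replay buffers, or Metropolis corrections.

```python
import torch
import torch.nn as nn

def langevin_sample(energy_fn, x, n_steps=100, step_size=0.01):
    # Unadjusted Langevin dynamics: x ← x − (ε/2)·∇E(x) + √ε·noise.
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy_fn(x).sum(), x)[0]
        x = x - 0.5 * step_size * grad + (step_size ** 0.5) * torch.randn_like(x)
    return x.detach()

# Tiny learnable energy for illustration (placeholder, not a standard architecture).
energy = nn.Sequential(nn.Linear(2, 32), nn.SiLU(), nn.Linear(32, 1))
energy_fn = lambda x: energy(x).squeeze(-1)

# Contrastive-divergence-style update: push down the energy of data,
# push up the energy of model samples, never computing Z(θ) directly.
x_data = torch.randn(64, 2)                          # stand-in for a data batch
x_model = langevin_sample(energy_fn, torch.randn(64, 2))
loss = energy_fn(x_data).mean() - energy_fn(x_model).mean()
loss.backward()
```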
EBMs are particularly attractive for tasks involving structured, multimodal, or constraint-rich distributions. Because summing energy functions corresponds to multiplying the unnormalized densities, multiple EBMs can be composed naturally into a single model (a product-of-experts construction), enabling modular system design. They have found applications in image generation, structured prediction, anomaly detection, and as learned priors in hybrid generative systems. The framework also unifies many classical models: Hopfield networks, Boltzmann machines, and conditional random fields are all special cases of EBMs.
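A minimal illustration of composition by energy summation, using two toy quadratic energies chosen only for readability: adding the energies multiplies the corresponding unnormalized densities, so the combined model prefers configurations that are acceptable to both components.

```python
import torch

# Two toy component energies with different preferences.
energy_a = lambda x: 0.5 * ((x - 1.0) ** 2).sum(dim=-1)   # prefers x near +1
energy_b = lambda x: 0.5 * ((x + 1.0) ** 2).sum(dim=-1)   # prefers x near -1

# Summing energies composes the models: p(x) ∝ exp(−E_a(x)) · exp(−E_b(x)).
combined = lambda x: energy_a(x) + energy_b(x)

x = torch.zeros(1, 2)
print(combined(x))   # the compromise configuration x = 0 has the lowest combined energy
```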
Interest in EBMs surged in the late 2010s and early 2020s as researchers demonstrated that deep neural networks could parameterize expressive energy functions and that scalable MCMC samplers made training feasible. The close relationship between EBMs and score-based generative models—which learn the gradient of the log-density rather than the density itself—further cemented their relevance, connecting them directly to the diffusion model revolution in generative AI. Today, EBMs remain an active research frontier bridging probabilistic modeling, generative modeling, and representation learning.
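For concreteness, the identity linking the two views is ∇_x log p(x) = −∇_x E(x; θ), since the partition function does not depend on x. The toy snippet below, assuming PyTorch and a placeholder Gaussian energy, reads this score off an EBM with autograd.

```python
import torch

# For p(x) ∝ exp(−E(x; θ)), the score is ∇_x log p(x) = −∇_x E(x; θ); Z(θ) drops out
# because it does not depend on x. A score-based model learns this gradient field
# directly, whereas an EBM exposes it through differentiation of the energy.
energy_fn = lambda x: 0.5 * (x ** 2).sum(dim=-1)           # toy energy of a standard Gaussian

x = torch.randn(4, 2, requires_grad=True)
score = -torch.autograd.grad(energy_fn(x).sum(), x)[0]     # equals −x here, the Gaussian score
```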