A model class that assigns lower energy scores to more probable data configurations.
Energy-Based Models (EBMs) are a broad family of machine learning models that frame learning as the process of shaping a scalar energy function over the space of inputs and outputs. Rather than directly modeling probabilities, an EBM assigns a real-valued energy score to each configuration of variables, with lower energies corresponding to more compatible or likely configurations. Inference then involves finding the variable assignments that minimize this energy, and learning involves adjusting the energy function so that correct or observed configurations receive lower scores than incorrect or unobserved ones.
The energy function itself is typically a parameterized neural network, giving EBMs the expressive power to capture complex, high-dimensional dependencies. Training requires contrasting low-energy regions (where real data lives) against high-energy regions (where data does not). This is non-trivial because computing the partition function — the normalizing constant needed to convert energies into probabilities — is generally intractable for high-dimensional data. As a result, EBM training often relies on techniques such as contrastive divergence, noise-contrastive estimation, or Markov chain Monte Carlo (MCMC) sampling to approximate the necessary gradients.
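The contrastive training loop can be sketched on a toy problem (an illustrative assumption, not from the source): for E_mu(x) = (x - mu)^2 / 2, the log-likelihood gradient is -dE/dmu at the data plus the model expectation of dE/dmu, and the intractable model expectation is approximated with "negative" samples drawn by Langevin-dynamics MCMC.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_E_x(x, mu):
    # dE/dx for E(x) = (x - mu)^2 / 2; drives the Langevin sampler.
    return x - mu

def langevin_sample(mu, n, steps=200, step=0.1):
    # Approximate samples from p(x) ∝ exp(-E(x)) via Langevin dynamics:
    # gradient descent on the energy plus calibrated Gaussian noise.
    x = rng.normal(0.0, 3.0, n)
    for _ in range(steps):
        x = x - step * grad_E_x(x, mu) + np.sqrt(2 * step) * rng.normal(size=n)
    return x

data = rng.normal(2.0, 1.0, 500)   # observed data centered near 2.0
mu, lr = 0.0, 0.1
for _ in range(200):
    neg = langevin_sample(mu, 100)  # negative samples from the current model
    # Positive phase lowers the energy of data; negative phase raises the
    # energy of model samples. With dE/dmu = mu - x the two phases
    # combine into (data mean - sample mean).
    mu += lr * (data.mean() - neg.mean())

print(round(mu, 1))  # mu approaches the data mean (~2.0)
```

Contrastive divergence uses the same two-phase gradient but initializes the MCMC chains at the data and runs only a few steps, trading bias for speed.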
EBMs gained renewed attention in the deep learning era as researchers recognized their flexibility: unlike generative models such as VAEs or GANs, EBMs impose no constraints on the architecture of the energy function and can naturally handle structured outputs, missing data, and multimodal distributions. They have been applied to image generation, anomaly detection, structured prediction, and reinforcement learning, among other domains. Notable instantiations include Restricted Boltzmann Machines, Deep Boltzmann Machines, and more recent score-based and diffusion models, which can be interpreted through an energy-based lens.
The appeal of EBMs lies in their conceptual unification: virtually any model that scores configurations can be viewed as an EBM, making the framework a powerful lens for understanding and designing learning systems. Their main practical challenge remains efficient training in high dimensions, an active area of research that continues to yield new algorithms and architectural innovations.