A generative model that learns a structured latent space via probabilistic encoding and decoding.
A Variational Autoencoder (VAE) is a class of generative models that combines deep neural networks with principles from Bayesian inference to learn compact, structured representations of data. Unlike standard autoencoders, which map inputs to fixed points in a latent space, VAEs map inputs to probability distributions — typically Gaussians — over that space. During training, the encoder network outputs the parameters of these distributions (mean and variance), a sample is drawn from them, and the decoder network attempts to reconstruct the original input from that sample. Because sampling is not itself differentiable, the draw is usually expressed via the reparameterization trick (mean plus standard deviation times standard-normal noise), which lets gradients flow back through the encoder. This stochastic bottleneck forces the model to learn smooth, continuous latent representations rather than memorizing individual data points.
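To make the encode–sample–decode flow concrete, here is a minimal sketch in PyTorch. The layer sizes, the 784-dimensional input (e.g. flattened MNIST images), and the fully connected architecture are illustrative assumptions, not a canonical implementation.

```python
# Minimal VAE sketch in PyTorch; dimensions and architecture are assumptions.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
        super().__init__()
        # Encoder: maps the input to the parameters of a diagonal Gaussian q(z|x).
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)
        # Decoder: maps a latent sample back to input space.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps the sampling step differentiable.
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar
```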
The training objective of a VAE is the Evidence Lower Bound (ELBO), which balances two competing terms. The first is a reconstruction loss that penalizes the model when its output diverges from the input. The second is a Kullback-Leibler (KL) divergence term that regularizes the learned distributions to remain close to a standard normal prior. This regularization is what gives the latent space its generative utility: because the space is structured and continuous, interpolating between points or sampling randomly tends to produce coherent, meaningful outputs rather than noise or artifacts.
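As an illustration of how these two terms combine, the following sketch computes the negative ELBO for the model above, assuming inputs scaled to [0, 1] and a Bernoulli-style reconstruction term (binary cross-entropy); other data types would use a different reconstruction loss, such as mean squared error.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar):
    # Reconstruction term: how well the decoder output matches the input
    # (binary cross-entropy is a common choice for data in [0, 1]).
    recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # KL divergence between the encoder's Gaussian q(z|x) and the standard
    # normal prior p(z), in closed form:
    #   KL = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Minimizing this sum is equivalent to maximizing the ELBO.
    return recon + kl
```

The closed-form KL term is what pulls each learned Gaussian toward the standard normal prior, producing the smooth, continuous latent space described above.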
VAEs matter because they offer a principled, mathematically grounded framework for generative modeling and representation learning. They enable tasks such as image synthesis, data augmentation, anomaly detection, and disentangled representation learning — where individual latent dimensions correspond to interpretable factors of variation in the data. Compared to Generative Adversarial Networks (GANs), VAEs are generally more stable to train and provide an explicit likelihood estimate, though they can produce blurrier outputs due to the averaging effect of the reconstruction loss.
Introduced by Diederik Kingma and Max Welling in their 2013 paper "Auto-Encoding Variational Bayes," VAEs quickly became a foundational tool in deep generative modeling. They have since been extended in numerous directions — including conditional VAEs, hierarchical VAEs, and vector-quantized variants — and remain central to research in representation learning, multimodal generation, and latent diffusion models.