
EM (Expectation-Maximization)

An iterative algorithm that estimates model parameters when latent variables are present.

Year: 1977
Generality: 795

Expectation-Maximization (EM) is an iterative optimization algorithm used to find maximum likelihood estimates of parameters in probabilistic models that involve unobserved, or latent, variables. Because directly maximizing the likelihood function is often intractable when hidden variables are present, EM sidesteps this difficulty by alternating between two steps: inferring the likely values of the latent variables given current parameter estimates, and then updating the parameters to better explain the data under those inferences. This cycle continues until the estimates converge to a stable solution.
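The control flow is the same whatever the model. Below is a minimal Python sketch of that loop, assuming hypothetical e_step and m_step callables that encapsulate the model-specific logic (neither name comes from any particular library):

import math

def run_em(data, params, e_step, m_step, max_iter=100, tol=1e-6):
    """Generic EM loop; e_step and m_step are model-specific callables."""
    prev_ll = -math.inf
    for _ in range(max_iter):
        # E-step: posterior over the latent variables given current params,
        # along with the current observed-data log-likelihood.
        posteriors, ll = e_step(data, params)
        # M-step: parameters maximizing the expected complete-data
        # log-likelihood under those posteriors.
        params = m_step(data, posteriors)
        # EM never decreases the likelihood, so a tiny gain signals convergence.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return params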

The algorithm proceeds in two distinct phases per iteration. In the E-step (Expectation), the algorithm computes the expected complete-data log-likelihood under the conditional distribution of the latent variables given the current parameters, effectively filling in soft assignments or probabilities for the hidden structure. In the M-step (Maximization), the parameters are updated to maximize this expected log-likelihood, treating the soft assignments as fixed. A key theoretical guarantee is that the observed-data likelihood never decreases from one full iteration to the next, so the algorithm cannot make the fit worse, though it may converge to a local rather than a global maximum.
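In standard textbook notation, with observed data X, latent variables Z, and parameters θ at iteration t, the two steps and the monotonicity guarantee read (in LaTeX):

% E-step: expected complete-data log-likelihood under the current posterior
Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \sim p(Z \mid X, \theta^{(t)})}\left[\log p(X, Z \mid \theta)\right]

% M-step: re-estimate the parameters by maximizing that expectation
\theta^{(t+1)} = \arg\max_{\theta} \, Q(\theta \mid \theta^{(t)})

% Guarantee: the observed-data likelihood never decreases
\log p(X \mid \theta^{(t+1)}) \ge \log p(X \mid \theta^{(t)})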

EM finds wide application across machine learning and statistics. Its most prominent use is fitting Gaussian Mixture Models (GMMs), where the latent variable indicates which Gaussian component generated each data point. The E-step assigns probabilistic cluster memberships, and the M-step re-estimates each component's mean, covariance, and mixing weight. Beyond GMMs, EM underpins training for Hidden Markov Models, probabilistic latent semantic analysis, and many Bayesian network inference procedures. It also connects deeply to variational inference, where the E-step is approximated when exact inference is intractable.
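To make the GMM case concrete, here is a minimal, self-contained NumPy sketch of EM for a two-component, one-dimensional mixture; the synthetic data, initialization, and variable names are all illustrative rather than taken from any library:

import numpy as np

rng = np.random.default_rng(0)
# Synthetic 1-D data drawn from two Gaussian clusters.
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 700)])

# Initial guesses for means, standard deviations, and mixing weights.
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

prev_ll = -np.inf
for _ in range(200):
    # E-step: responsibilities, i.e. soft cluster memberships per point.
    dens = weights * normal_pdf(x[:, None], mu, sigma)      # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate each component's weight, mean, and spread.
    nk = resp.sum(axis=0)
    weights = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

    # The log-likelihood is guaranteed not to decrease; stop when it plateaus.
    ll = np.log(dens.sum(axis=1)).sum()
    if ll - prev_ll < 1e-8:
        break
    prev_ll = ll

print("weights:", weights.round(3), "means:", mu.round(3), "stds:", sigma.round(3))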

EM's appeal lies in its conceptual simplicity, guaranteed likelihood improvement per iteration, and broad applicability to models with missing or hidden data. Its main limitations are sensitivity to initialization, potential convergence to poor local optima, and slow convergence in high-dimensional settings. Despite these drawbacks, EM remains one of the most widely used algorithms in statistical machine learning, serving as both a practical workhorse and a theoretical foundation for more advanced inference methods.
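In practice, the initialization sensitivity is usually mitigated by running EM from several random starting points and keeping the highest-likelihood fit. A brief sketch using scikit-learn's GaussianMixture, whose n_init parameter does exactly that (the toy data here is arbitrary):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))  # placeholder data, shape (n_samples, n_features)

# n_init=10 runs EM from ten random initializations and keeps the run
# with the best likelihood, reducing the risk of a poor local optimum.
gmm = GaussianMixture(n_components=3, n_init=10, random_state=0).fit(X)
print(gmm.means_)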

Related

GMM (Gaussian Mixture Models)

Probabilistic models representing data as a weighted mixture of Gaussian distributions.

Generality: 731
MLE (Maximum Likelihood Estimation)

A parameter estimation method that finds values making observed data most probable.

Generality: 875
Mixture Model

A probabilistic model representing data as drawn from multiple component distributions.

Generality: 796
Energy-Based Models

A framework that scores variable configurations with a scalar energy instead of an explicit probability.

Generality: 694
EBM (Energy-Based Model)

A model class that assigns lower energy scores to more probable data configurations.

Generality: 694
Empirical Risk Minimization

A core ML principle that minimizes average training loss to learn model parameters.

Generality: 838