Envisioning is an emerging technology research institute and advisory.

2011 — 2026


Log Likelihood

The logarithm of a likelihood function, simplifying probabilistic model optimization and parameter estimation.

Year: 1950 · Generality: 838

Log likelihood is the natural logarithm of the likelihood function — a measure of how probable observed data is under a given set of model parameters. In statistical modeling and machine learning, the likelihood function is typically expressed as a product of probabilities across many data points, which can become numerically unwieldy for large datasets. Taking the logarithm converts this product into a sum, dramatically improving numerical stability and making the function far easier to work with analytically and computationally. Because the logarithm is a monotonically increasing function, maximizing the log likelihood is mathematically equivalent to maximizing the likelihood itself, preserving the same optimal parameter values.
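The product-to-sum point above can be seen directly in code. This is a minimal sketch using i.i.d. Bernoulli data (the function names and data are illustrative, not from any particular library): the raw likelihood product underflows to zero for large datasets, while the log-likelihood sum remains well-behaved.

```python
import math

def likelihood(data, p):
    """Likelihood of i.i.d. Bernoulli data under success probability p (a product)."""
    result = 1.0
    for x in data:
        result *= p if x == 1 else (1 - p)
    return result

def log_likelihood(data, p):
    """Log likelihood: the same quantity, but the product becomes a stable sum."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

data = [1] * 5000
print(likelihood(data, 0.5))      # 0.5**5000 underflows to 0.0
print(log_likelihood(data, 0.5))  # 5000 * ln(0.5) ≈ -3465.7
```

Because the logarithm is monotonic, both functions are maximized by the same value of p, which is why working in log space loses nothing.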

Log likelihood sits at the heart of maximum likelihood estimation (MLE), the dominant framework for fitting probabilistic models to data. In practice, many optimization algorithms minimize a loss function rather than maximize an objective, so practitioners often work with the negative log likelihood (NLL) as a loss. Gradient-based methods like stochastic gradient descent can then efficiently minimize NLL by computing its derivatives with respect to model parameters. This connection makes log likelihood directly relevant to training a wide range of models, from logistic regression and Gaussian mixture models to hidden Markov models and deep neural networks with probabilistic output layers such as softmax classifiers.
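As a concrete sketch of this MLE-via-NLL workflow, the toy example below (all names and data are illustrative) fits the mean of a Gaussian with fixed unit variance by gradient descent on the negative log likelihood; the estimate converges to the sample mean, which is the known closed-form MLE for this model.

```python
import math

def nll_gaussian(data, mu, sigma=1.0):
    """Negative log likelihood of data under a Gaussian with mean mu and fixed sigma."""
    return sum(0.5 * math.log(2 * math.pi * sigma**2)
               + (x - mu)**2 / (2 * sigma**2) for x in data)

def fit_mu(data, lr=0.01, steps=2000):
    """Minimize the NLL by gradient descent; for sigma=1, dNLL/dmu = sum(mu - x)."""
    mu = 0.0
    for _ in range(steps):
        grad = sum(mu - x for x in data)
        mu -= lr * grad / len(data)  # scale by n for a stable step size
    return mu

data = [1.2, 0.8, 1.5, 1.0, 0.9]
print(fit_mu(data))  # converges to the sample mean, 1.08
```

The same pattern, computing gradients of a negative log likelihood and descending on them, underlies training for the richer models mentioned above, just with far more parameters and stochastic mini-batch gradients.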

Beyond parameter estimation, log likelihood serves as a principled tool for model comparison and evaluation. Metrics like the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) are built on log likelihood values, penalizing model complexity to guard against overfitting. In deep learning, cross-entropy loss — ubiquitous in classification tasks — is mathematically equivalent to minimizing the negative log likelihood under a categorical distribution, illustrating how foundational this concept is across modern machine learning. Its combination of mathematical elegance, computational tractability, and theoretical grounding makes log likelihood one of the most pervasive ideas in both classical statistics and contemporary AI.
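The model-comparison criteria and the cross-entropy equivalence both reduce to short formulas. A minimal sketch (function names are illustrative): AIC and BIC are simple penalized transforms of the log likelihood, and per-example cross-entropy is literally the negative log likelihood of the true label under the predicted categorical distribution.

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 * log likelihood (k = num. parameters)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k * ln(n) - 2 * log likelihood (n = num. data points)."""
    return k * math.log(n) - 2 * log_lik

def cross_entropy(probs, label):
    """Cross-entropy for one example = negative log likelihood of the true label."""
    return -math.log(probs[label])

# Lower AIC/BIC is better; both penalize extra parameters k.
print(aic(log_lik=-120.0, k=3))         # 246.0
print(bic(log_lik=-120.0, k=3, n=100))  # ≈ 253.8

probs = [0.7, 0.2, 0.1]  # softmax output for a 3-class example
print(cross_entropy(probs, label=0))    # -ln(0.7) ≈ 0.357
```

Note that BIC's ln(n) penalty grows with dataset size, so it favors simpler models than AIC on large datasets.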

Related

Log Odds
The logarithm of the odds ratio, linking probabilities to linear model outputs.
Generality: 694

MLE (Maximum Likelihood Estimation)
A parameter estimation method that finds values making observed data most probable.
Generality: 875

Logits
Raw, unnormalized scores output by a neural network before probability conversion.
Generality: 700

Cross-Entropy Loss
A loss function measuring divergence between predicted probability distributions and true labels.
Generality: 838

Logistic Regression
A classification algorithm that models the probability of a binary outcome.
Generality: 838

Loss Function
A mathematical measure of error that guides model training toward better predictions.
Generality: 909