Optimizes expensive functions by building a probabilistic surrogate model to guide evaluation.
Bayesian Optimization is a sequential strategy for finding the optimum of an objective function that is expensive to evaluate, noisy, or lacking an analytical form. Rather than evaluating the function exhaustively, it builds a probabilistic surrogate model — most commonly a Gaussian Process — that approximates the true function while quantifying uncertainty across the input space. This surrogate is far cheaper to query than the real objective, enabling intelligent decisions about where to sample next.
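As a concrete illustration, here is a minimal sketch of the surrogate step using scikit-learn's GaussianProcessRegressor; the toy objective f and the three initial points are assumptions chosen only so the example runs end to end.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy stand-in for an expensive black-box objective (an illustrative assumption).
def f(x):
    return np.sin(3 * x) + 0.5 * x

# A handful of initial evaluations of the true objective.
X_obs = np.array([[0.2], [1.0], [2.5]])
y_obs = f(X_obs).ravel()

# Fit the GP surrogate: it passes near the observations and quantifies
# uncertainty everywhere else in the input space.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(X_obs, y_obs)

# Querying the surrogate is cheap: a mean prediction plus a standard
# deviation that grows in regions far from any observed point.
X_query = np.linspace(0.0, 3.0, 100).reshape(-1, 1)
mu, sigma = gp.predict(X_query, return_std=True)
```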
At each iteration, an acquisition function uses the surrogate's predictions and uncertainty estimates to select the next evaluation point. Common acquisition functions include Expected Improvement (EI), Upper Confidence Bound (UCB), and Probability of Improvement (PI). These functions formalize the exploration-exploitation tradeoff: exploration targets regions of high uncertainty where the model knows little, while exploitation focuses on regions already believed to be near the optimum. After each new evaluation, the surrogate is updated with the fresh data, progressively refining its approximation of the true function.
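Expected Improvement has a convenient closed form under a Gaussian posterior. The sketch below implements it for minimization, reusing the gp surrogate, X_query grid, and y_obs observations from the previous snippet; the small margin xi, which nudges the tradeoff toward exploration, is a common convention rather than a fixed part of the method.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(X_cand, gp, y_best, xi=0.01):
    """EI for minimization: expected amount by which each candidate
    improves on the best observed value y_best, under the surrogate."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)   # guard against division by zero
    imp = y_best - mu - xi            # predicted improvement over the incumbent
    z = imp / sigma
    # Exploitation term (imp * CDF) plus exploration term (sigma * PDF):
    # points with a low predicted mean OR high uncertainty both score well.
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

# The next evaluation point is the candidate that maximizes the acquisition.
ei = expected_improvement(X_query, gp, y_obs.min())
x_next = X_query[np.argmax(ei)]
```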
In machine learning, Bayesian Optimization became the gold standard for automated hyperparameter tuning — the process of selecting learning rates, regularization strengths, network architectures, and other configuration choices that dramatically affect model performance. Because training a deep neural network can take hours or days, evaluating hundreds of random configurations is impractical. Bayesian Optimization typically finds strong hyperparameter settings in far fewer evaluations than grid search or random search, making it both faster and more cost-effective in practice.
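Tying the pieces together, a full tuning loop can look like the schematic below, which searches over a single hyperparameter (log10 of the learning rate) using the surrogate and acquisition sketches above. The function train_and_validate is a hypothetical placeholder for a real training run; it fakes a validation loss so the sketch executes end to end.

```python
# Hypothetical stand-in for an expensive training run returning a
# validation loss; this fake loss is minimized near lr = 1e-3.
def train_and_validate(log_lr):
    return (log_lr + 3.0) ** 2 + np.random.normal(scale=0.05)

candidates = np.linspace(-6, -1, 200).reshape(-1, 1)    # log10(lr) in [1e-6, 1e-1]
X_trials = np.random.uniform(-6, -1, size=(3, 1))       # small random initial design
y_trials = np.array([train_and_validate(x[0]) for x in X_trials])

for _ in range(20):                                     # fixed evaluation budget
    # Refit the surrogate on everything observed so far (alpha models noise).
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-3, normalize_y=True)
    gp.fit(X_trials, y_trials)
    # Choose the candidate maximizing Expected Improvement, evaluate it once,
    # and fold the result back into the observed data.
    ei = expected_improvement(candidates, gp, y_trials.min())
    x_next = candidates[np.argmax(ei)].reshape(1, -1)
    y_next = train_and_validate(x_next[0, 0])
    X_trials = np.vstack([X_trials, x_next])
    y_trials = np.append(y_trials, y_next)

best_log_lr = X_trials[np.argmin(y_trials), 0]
print(f"Best learning rate found: {10 ** best_log_lr:.2e}")
```

In this toy setting the loop spends about twenty evaluations, while a grid at the same resolution would need all 200 candidates, which is the gap that matters when each evaluation is a full training run.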
Beyond hyperparameter tuning, Bayesian Optimization is widely applied in drug discovery, materials science, robotics, and experimental design — anywhere that each function evaluation carries significant cost. Its adoption in the ML community accelerated in the early 2010s following work by Snoek, Larochelle, and Adams, who demonstrated its practical superiority over manual tuning for deep learning models. Tools like Spearmint, Hyperopt, and Optuna have since made Bayesian Optimization accessible to practitioners, cementing its role as a core component of modern AutoML pipelines.
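To give a sense of what these tools look like in practice, here is a minimal Optuna study; the objective is again a stand-in for a real train-and-evaluate run, and the hyperparameter names and ranges are illustrative assumptions. Note that Optuna's default TPE sampler is a tree-structured Parzen estimator, a sequential model-based relative of GP-based Bayesian Optimization rather than a Gaussian Process itself.

```python
import optuna

def objective(trial):
    # Illustrative hyperparameters; in practice, train a model with these
    # values and return its validation loss.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    weight_decay = trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True)
    return (lr - 1e-3) ** 2 + weight_decay  # fake loss so the sketch runs

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```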