Generates target-distribution samples by accepting or rejecting candidates from a simpler proposal distribution.
Rejection sampling is a Monte Carlo technique for drawing samples from a probability distribution that is difficult to sample from directly. The method works by first selecting a proposal distribution that is easier to sample from and that, when scaled by a constant c, forms an envelope over the target distribution — meaning c·g(x) ≥ f(x) for all x. A candidate point is drawn from the proposal, and then a uniform random number is drawn to decide whether to accept it. The candidate is accepted with probability f(x) / (c·g(x)); otherwise it is discarded and the process repeats. Accepted samples are guaranteed to follow the target distribution exactly, making the method statistically sound rather than approximate.
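As a concrete illustration, the sketch below implements this accept/reject loop in Python. The target (an unnormalized Beta(2, 5)-shaped density on [0, 1]) and the uniform proposal are illustrative choices, not part of the method itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Unnormalized Beta(2, 5)-shaped target on [0, 1] (illustrative choice)
    return x * (1 - x) ** 4

c = 0.09  # envelope constant: c * g(x) >= f(x) with g = Uniform(0, 1); max f(x) is about 0.082

def rejection_sample(n_samples):
    samples = []
    while len(samples) < n_samples:
        x = rng.uniform(0, 1)        # candidate drawn from the proposal g
        u = rng.uniform(0, 1)        # uniform draw for the accept/reject test
        if u <= f(x) / c:            # accept with probability f(x) / (c * g(x))
            samples.append(x)
    return np.array(samples)

draws = rejection_sample(10_000)
print(draws.mean())  # close to the Beta(2, 5) mean of 2/7 (about 0.286)
```

Note that only the unnormalized form of f is needed here; the normalizing constant cancels as long as c bounds the unnormalized density.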
The efficiency of rejection sampling depends critically on how tightly the scaled proposal envelope fits the target distribution. When c is large — meaning the proposal is a poor match for the target — most candidates are rejected, since the overall acceptance probability is exactly 1/c for a normalized target, and the algorithm wastes significant computation. In high-dimensional spaces this problem becomes severe, as the acceptance rate tends to collapse exponentially with dimension. Practitioners therefore invest considerable effort in designing proposal distributions that closely mimic the shape of the target, or they turn to alternative methods such as importance sampling or Markov Chain Monte Carlo when rejection sampling becomes impractical.
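The collapse with dimension can be seen in a small experiment. The sketch below assumes a standard normal target in d dimensions and a slightly wider normal proposal (sigma = 1.5, chosen only for illustration); for that pair the tightest envelope constant is c = sigma^d, so the acceptance rate 1/c shrinks exponentially as d grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: target N(0, I_d), proposal N(0, sigma^2 * I_d) with sigma = 1.5.
# The tightest envelope constant is c = sigma^d, so acceptance decays like sigma^(-d).
sigma = 1.5
for d in (1, 5, 10, 20):
    x = rng.normal(scale=sigma, size=(200_000, d))   # candidates from the proposal
    sq = np.sum(x ** 2, axis=1)
    # f(x) / (c * g(x)) simplifies to exp(-||x||^2 * (1 - 1/sigma^2) / 2)
    accept_prob = np.exp(-0.5 * sq * (1.0 - 1.0 / sigma ** 2))
    u = rng.uniform(size=len(x))
    print(f"d={d:2d}  theoretical 1/c={sigma ** -d:.4f}  empirical={np.mean(u <= accept_prob):.4f}")
```

Even with a proposal this close to the target, the acceptance rate drops from roughly 67% in one dimension to well under 1% by twenty dimensions.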
In machine learning and probabilistic AI, rejection sampling appears in Bayesian inference pipelines, generative modeling, and data augmentation workflows. It is used to draw exact posterior samples when the posterior has a known unnormalized form but no closed-form sampler. More recently, rejection sampling has found a role in working with large language models: candidate outputs are generated and then filtered according to a reward or quality criterion, effectively sampling from a distribution skewed toward higher-quality responses. Applied at inference time this is known as best-of-n sampling and requires no parameter updates; when the accepted outputs are instead reused as supervised training data, the procedure is usually called rejection sampling fine-tuning. Both have become practical tools for aligning model behavior.
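A minimal sketch of the best-of-n idea follows, with generate() and reward() as hypothetical stand-ins for a model's sampler and a reward or quality scorer; neither is a real API:

```python
import random

def generate(prompt: str) -> str:
    # Placeholder for sampling one candidate response from a language model.
    return f"response-{random.randint(0, 9999)} to {prompt!r}"

def reward(prompt: str, response: str) -> float:
    # Placeholder for a reward model or quality heuristic.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # Draw n candidates and keep the highest-scoring one; the kept responses
    # are effectively drawn from a distribution skewed toward high reward.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda r: reward(prompt, r))

print(best_of_n("Explain rejection sampling in one sentence.", n=4))
```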
Despite its simplicity, rejection sampling remains a foundational building block in probabilistic computing. Its guarantees of exactness — as opposed to the asymptotic correctness of MCMC — make it attractive whenever acceptance rates are manageable, and its conceptual clarity makes it an essential reference point for understanding more sophisticated sampling strategies.