A neural network that represents uncertainty by placing probability distributions over its weights.
A Bayesian Neural Network (BNN) is a neural network architecture in which the model's weights and biases are treated as random variables with probability distributions rather than as fixed point estimates. Instead of learning a single set of parameters, a BNN learns a posterior distribution over parameters given the training data, following Bayes' theorem. This probabilistic treatment allows the network to express not just a prediction, but a measure of confidence in that prediction — something a standard deterministic network, which commits to a single weight setting, cannot express directly.
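The posterior that Bayes' theorem prescribes can be computed exactly only in trivial cases. As a hedged illustration (the model, data, and noise level here are invented, not taken from any real system), the sketch below does exact grid-based inference over a single weight w in the one-parameter "network" y = w·x + Gaussian noise:

```python
import math

# Toy illustration: exact Bayesian inference over a single weight w in the
# model y = w * x + Gaussian noise, using a discrete grid over w. For real
# networks the weight space is far too high-dimensional for this to work.

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (x, y) pairs, roughly y = 2x
sigma = 0.5                                   # assumed observation noise

def log_prior(w):
    # Standard-normal prior over the weight: w ~ N(0, 1)
    return -0.5 * w * w

def log_likelihood(w):
    # Gaussian log-likelihood of the observed data given weight w
    return sum(-0.5 * ((y - w * x) / sigma) ** 2 for x, y in data)

# Posterior ∝ likelihood × prior, evaluated on a grid and normalized
grid = [i / 100 for i in range(-500, 501)]
unnorm = [math.exp(log_likelihood(w) + log_prior(w)) for w in grid]
Z = sum(unnorm)
posterior = [p / Z for p in unnorm]

# Posterior mean concentrates near the data-generating weight (about 2),
# pulled slightly toward the prior mean of 0
w_mean = sum(w * p for w, p in zip(grid, posterior))
print(round(w_mean, 2))
```

The grid trick works because there is one parameter; with millions of weights the normalizing sum becomes intractable, which is what motivates the approximation methods discussed next.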
The mechanics of BNNs center on computing the posterior distribution over weights, which is generally intractable for large networks, so practitioners rely on approximation methods. Markov Chain Monte Carlo (MCMC) sampling can draw from the true posterior but is computationally expensive at scale. Variational inference offers a more practical alternative: it approximates the posterior with a simpler, parameterized distribution and optimizes the fit by maximizing the evidence lower bound (ELBO). More recent approaches such as Monte Carlo Dropout interpret a dropout-trained network as an approximate Bayesian model: keeping dropout active at inference time and averaging over stochastic forward passes yields uncertainty estimates with minimal architectural changes.
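The Monte Carlo Dropout idea can be sketched in a few lines. This is a toy, not a library implementation: the weights, network size, and dropout rate below are invented for illustration, and a real model would have been trained with dropout rather than hand-set:

```python
import random
import statistics

# Minimal Monte Carlo Dropout sketch: keep dropout ACTIVE at inference time
# and run T stochastic forward passes. The spread of the resulting predictions
# serves as an uncertainty estimate; their mean is the point prediction.

random.seed(0)

# Hand-set "trained" weights for a tiny 1-4-1 network (illustrative only)
W1 = [0.5, -0.3, 0.8, 0.1]   # input -> hidden
W2 = [0.6, 0.4, -0.2, 0.9]   # hidden -> output
P_DROP = 0.5                 # dropout probability

def forward(x):
    # One stochastic pass: each hidden unit is dropped with probability P_DROP,
    # and survivors are rescaled by 1 / (1 - P_DROP) (inverted dropout).
    h = [max(0.0, w * x) for w in W1]  # ReLU hidden layer
    mask = [0.0 if random.random() < P_DROP else 1.0 / (1 - P_DROP) for _ in h]
    return sum(w2 * m * hi for w2, m, hi in zip(W2, mask, h))

# T stochastic passes at the same input give a predictive distribution
samples = [forward(2.0) for _ in range(1000)]
mean = statistics.mean(samples)
std = statistics.stdev(samples)
print(f"predictive mean ~ {mean:.2f}, std ~ {std:.2f}")
```

A deterministic network would run `forward` once with dropout disabled and report only the point value; here the standard deviation across passes is the extra information the Bayesian interpretation buys.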
The value of BNNs lies primarily in their ability to quantify two distinct sources of uncertainty: aleatoric uncertainty, which reflects irreducible noise in the data itself, and epistemic uncertainty, which reflects gaps in the model's knowledge that could in principle be reduced with more data. This distinction is critical in high-stakes domains such as medical diagnosis, autonomous driving, and scientific modeling, where knowing how confident a model is can be as important as the prediction itself. BNNs also exhibit natural resistance to overfitting, since the prior distribution over weights acts as a regularizer.
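The two uncertainty types can be separated with the standard variance decomposition: averaged over posterior weight samples, total predictive variance ≈ E[σ²(x)] (aleatoric, the noise each sampled model predicts) + Var[μ(x)] (epistemic, the disagreement between sampled models). A minimal sketch, with invented numbers standing in for the outputs of a heteroscedastic model whose sampled heads each emit a (mean, variance) pair:

```python
import statistics

# Variance decomposition across posterior samples:
#   total ≈ E_w[sigma^2(x)]  (aleatoric: noise predicted by each sample)
#         + Var_w[mu(x)]     (epistemic: disagreement between samples)

# Hypothetical predictions (mu, sigma^2) from 5 posterior weight samples at x
predictions = [(2.0, 0.25), (2.2, 0.30), (1.9, 0.20), (2.1, 0.25), (1.8, 0.25)]

mus = [mu for mu, _ in predictions]
aleatoric = statistics.mean(var for _, var in predictions)  # E_w[sigma^2]
epistemic = statistics.pvariance(mus)                       # Var_w[mu]
total = aleatoric + epistemic

print(f"aleatoric={aleatoric:.3f} epistemic={epistemic:.3f} total={total:.3f}")
```

Only the epistemic term shrinks as training data grows (the sampled models converge), which is why the decomposition tells you whether collecting more data would help.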
Despite their theoretical appeal, BNNs remain challenging to scale to the very large architectures that dominate modern deep learning. Computational cost, sensitivity to prior choice, and the difficulty of validating calibration in practice are active research problems. Nonetheless, BNNs represent a principled foundation for uncertainty-aware machine learning, and ongoing work in scalable approximate inference continues to close the gap between Bayesian ideals and practical deployment.