A probabilistic classifier that assumes all input features are conditionally independent of one another given the class label.
The Naive Bayes classifier is a probabilistic machine learning model built on Bayes' theorem, which provides a way to compute the posterior probability of a class given observed feature values. The core formula relates this posterior to the product of the prior class probability and the likelihood of each feature given the class. The "naive" assumption is that all input features are conditionally independent of one another given the class label — a simplification that rarely holds in practice but dramatically reduces the number of parameters the model must estimate, making training fast and data-efficient.
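Written out (a standard statement of the model, with class label $y$ and feature values $x_1, \dots, x_n$):

$$P(y \mid x_1, \dots, x_n) \;=\; \frac{P(y)\,\prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)}$$

Because the denominator is the same for every class, prediction reduces to $\hat{y} = \arg\max_y \, P(y) \prod_{i=1}^{n} P(x_i \mid y)$.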
Despite its strong independence assumption, Naive Bayes often performs surprisingly well in real-world tasks. Because the model only needs to estimate per-feature likelihoods rather than joint feature distributions, it scales gracefully to high-dimensional data. Different variants handle different data types: Gaussian Naive Bayes models continuous features as normally distributed, Multinomial Naive Bayes suits count-based data like word frequencies in text, and Bernoulli Naive Bayes works with binary feature vectors. The classifier outputs a probability score for each class and assigns the label with the highest posterior probability.
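As a minimal usage sketch of the three variants (assuming scikit-learn; the toy arrays here are illustrative only, not drawn from any dataset):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

# Toy continuous features (e.g. two measurements) with two classes.
X_cont = np.array([[1.0, 2.1], [0.9, 1.8], [3.2, 4.0], [3.0, 4.2]])
y = np.array([0, 0, 1, 1])

# Gaussian NB: models each feature as normally distributed per class.
gnb = GaussianNB().fit(X_cont, y)
print(gnb.predict_proba([[1.1, 2.0]]))  # posterior probability per class
print(gnb.predict([[1.1, 2.0]]))        # label with the highest posterior

# Multinomial NB: suits count data such as word frequencies.
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 2], [0, 3, 3]])
mnb = MultinomialNB().fit(X_counts, y)

# Bernoulli NB: suits binary presence/absence features.
X_bin = (X_counts > 0).astype(int)
bnb = BernoulliNB().fit(X_bin, y)
```

Each variant exposes the same fit/predict/predict_proba interface; only the per-feature likelihood model changes.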
Naive Bayes became especially prominent in the 1990s through its success in text classification and spam filtering. Researchers demonstrated that even when the independence assumption was violated, the classifier produced probability estimates whose ranking was good enough for accurate classification, despite the estimates themselves often being poorly calibrated. Its low computational cost made it a practical baseline long before deep learning dominated the field, and it remains a standard first-pass model for natural language processing tasks such as sentiment analysis, topic labeling, and email filtering.
Beyond its practical utility, Naive Bayes occupies an important conceptual role in machine learning as a clear illustration of generative probabilistic modeling — the model explicitly learns a distribution over inputs for each class, then applies Bayes' rule at inference time. This contrasts with discriminative models, such as logistic regression, that learn decision boundaries directly. Understanding Naive Bayes provides intuition for more complex probabilistic models, including Bayesian networks and latent variable models, making it a foundational concept in the probabilistic ML toolkit.
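To make the generative view concrete, here is a minimal from-scratch sketch (Gaussian likelihoods chosen purely for illustration; the helper names fit_gaussian_nb and predict are hypothetical): training estimates a prior and a per-feature input distribution for each class, and inference applies Bayes' rule in log space.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Learn per-class priors and per-feature Gaussian likelihood parameters."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = {
            "prior": len(Xc) / len(X),        # P(y = c)
            "mean": Xc.mean(axis=0),          # per-feature mean
            "var": Xc.var(axis=0) + 1e-9,     # per-feature variance (smoothed)
        }
    return params

def predict(params, x):
    """Apply Bayes' rule in log space and return the most probable class."""
    scores = {}
    for c, p in params.items():
        # log P(y=c) + sum_i log N(x_i; mean_i, var_i)
        log_likelihood = -0.5 * np.sum(
            np.log(2 * np.pi * p["var"]) + (x - p["mean"]) ** 2 / p["var"]
        )
        scores[c] = np.log(p["prior"]) + log_likelihood
    return max(scores, key=scores.get)

X = np.array([[1.0, 2.1], [0.9, 1.8], [3.2, 4.0], [3.0, 4.2]])
y = np.array([0, 0, 1, 1])
print(predict(fit_gaussian_nb(X, y), np.array([3.1, 4.1])))  # -> 1
```

Working in log space avoids numerical underflow when multiplying many small per-feature likelihoods, which is the standard trick in practical implementations.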