A machine learning model that assigns input data to predefined categories.
A classifier is a machine learning model trained to assign categorical labels to input instances based on their features. Operating within the supervised learning paradigm, a classifier learns patterns from a labeled training dataset and then applies that learned mapping to predict the class of previously unseen examples. Classification tasks are broadly divided into binary classification, where the model distinguishes between two possible outcomes, and multi-class classification, where the model selects among three or more categories. A further variant, multi-label classification, allows a single instance to belong to multiple classes simultaneously.
The mechanics of classification vary widely across algorithm families. Logistic regression models the probability of class membership using a sigmoid function applied to a linear combination of features. Decision trees partition the feature space through a series of hierarchical if-then rules. Support vector machines find the maximum-margin hyperplane separating classes in high-dimensional space, often using kernel functions to handle nonlinear boundaries. Neural network classifiers, particularly deep convolutional networks, learn hierarchical feature representations directly from raw data, enabling state-of-the-art performance on tasks like image recognition and natural language classification.
Evaluating a classifier requires more than simple accuracy, especially when class distributions are imbalanced. Practitioners rely on metrics such as precision (the fraction of positive predictions that are correct), recall (the fraction of actual positives correctly identified), F1 score (the harmonic mean of precision and recall), and the area under the ROC curve (AUC-ROC), which measures discriminative ability across classification thresholds. Confusion matrices provide a granular breakdown of correct and incorrect predictions across all classes.
Classifiers are among the most widely deployed machine learning components in real-world systems. Applications span spam detection, medical diagnosis, sentiment analysis, fraud detection, object recognition, and genomic analysis. The rise of deep learning in the 2010s dramatically expanded what classifiers could achieve, enabling near-human or superhuman performance on benchmark tasks that were previously considered intractable. Choosing the right classifier for a given problem depends on dataset size, feature dimensionality, interpretability requirements, and computational constraints.