The distance between a decision boundary and the nearest data points of each class.
In machine learning, margin refers to the gap between a model's decision boundary and the closest training examples from each class, known as support vectors. This concept is most prominently associated with Support Vector Machines (SVMs), where the learning objective is explicitly formulated to maximize this geometric separation. A wider margin indicates that the classifier has found a more confident and stable boundary, one that is less likely to misclassify points that differ slightly from the training data. The decision boundary that achieves the largest possible margin is called the maximum-margin hyperplane.
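As a sketch of the standard hard-margin formulation (the symbols w, b, x_i, and y_i are conventional notation, not defined in the text above): for training points x_i with labels y_i ∈ {−1, +1}, the maximum-margin hyperplane w·x + b = 0 solves

$$
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b)\ \ge\ 1,\qquad i = 1,\dots,n.
$$

Under this scaling, the geometric margin equals 2/‖w‖, so minimizing ‖w‖ is equivalent to maximizing the separation between the classes.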
Maximizing the margin is not merely a geometric nicety; it has deep theoretical justification rooted in statistical learning theory. A larger margin yields a smaller upper bound on the effective VC dimension of the class of separating hyperplanes, which in turn implies tighter generalization guarantees: the model is less likely to overfit the training data and more likely to perform well on unseen examples. This insight transformed how researchers thought about classifier design, shifting the focus from simply minimizing training error to explicitly controlling model complexity through geometric separation.
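One standard statement of this connection, given here as background the text alludes to rather than something it states explicitly: if all data lie in a ball of radius R and are separated with margin γ, then the VC dimension h of such margin-respecting hyperplanes in R^d is bounded by

$$
h \ \le\ \min\!\left(\left\lceil \frac{R^{2}}{\gamma^{2}} \right\rceil,\ d\right) + 1.
$$

Notably, the first term is independent of the ambient dimension d, which is why large-margin classifiers can generalize even in very high-dimensional feature spaces.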
In practice, real-world data is rarely linearly separable, so the hard-margin formulation is extended to a soft-margin variant that tolerates margin violations, penalizing each one through a slack variable in the objective. The trade-off between margin width and training error is governed by a regularization parameter, commonly denoted C. Additionally, the kernel trick allows SVMs to implicitly map data into high-dimensional feature spaces, enabling maximum-margin classification of datasets that are not linearly separable in the original input space, without explicitly computing the transformation.
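In symbols, the soft-margin objective introduces slack variables ξ_i measuring each violation:

$$
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b)\ \ge\ 1 - \xi_i,\qquad \xi_i \ge 0.
$$

A minimal runnable sketch of both ideas, assuming scikit-learn is available (the dataset and parameter values are illustrative, not taken from the text):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.1, random_state=0)

# Soft-margin SVM with an RBF kernel: C trades margin width against
# training error, and the kernel implicitly maps the points into a
# feature space where a maximum-margin linear separator exists.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)

print("training accuracy:", clf.score(X, y))
print("support vectors per class:", clf.n_support_)
```

Lowering C widens the margin at the cost of more violations; raising it approaches hard-margin behavior.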
The concept of margin extends well beyond SVMs and has influenced the broader field of machine learning. Boosting algorithms like AdaBoost can be analyzed through a margin framework, and margin-based loss functions appear throughout modern deep learning in the form of hinge loss, contrastive loss, and triplet loss. Understanding margin provides a unifying lens for thinking about generalization, robustness, and the geometry of learned decision boundaries across many model families.
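As a short sketch of two of the margin-based losses mentioned above (the formulas are the standard definitions; the input values are illustrative):

```python
import numpy as np

def hinge_loss(scores, labels, margin=1.0):
    """Zero once an example is on the correct side of the boundary by
    at least `margin`; grows linearly with the size of the violation."""
    return np.maximum(0.0, margin - labels * scores).mean()

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pushes each anchor-positive distance to be smaller than the
    corresponding anchor-negative distance by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

# Labels in {-1, +1} paired with raw classifier scores.
scores = np.array([2.3, -0.4, 0.8])
labels = np.array([1, -1, 1])
print("hinge loss:", hinge_loss(scores, labels))

# Batches of embeddings for anchors, positives, and negatives.
rng = np.random.default_rng(0)
a, p, n = (rng.normal(size=(4, 8)) for _ in range(3))
print("triplet loss:", triplet_loss(a, p, n))
```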