Training models to measure meaningful, task-relevant similarity between data points for comparison tasks such as retrieval and verification.
Similarity learning is a machine learning paradigm in which models are trained to produce representations of data such that distances or scores between those representations reflect meaningful, task-relevant similarity. Rather than predicting a fixed label or reconstructing input data, the goal is to learn a function — often an embedding — that maps inputs into a space where similar items cluster together and dissimilar items are pushed apart. This approach sits at the intersection of supervised and unsupervised learning, drawing on labeled pairs or triplets of examples to guide the learning process without requiring exhaustive class-level annotations.
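As a minimal illustration of the core idea, the Python sketch below scores two items by the cosine similarity of their embedding vectors; the encoder `f` mentioned in the comment is a hypothetical trained model, not a specific library's API.

```python
import numpy as np

def cosine_similarity(z1: np.ndarray, z2: np.ndarray) -> float:
    """Similarity score between two embedding vectors: near 1 for
    similar items, lower for dissimilar ones, once training has
    shaped the embedding space."""
    return float(z1 @ z2 / (np.linalg.norm(z1) * np.linalg.norm(z2)))

# With a trained embedding model `f` (hypothetical), comparison is just:
#   score = cosine_similarity(f(item_a), f(item_b))
```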
The mechanics of similarity learning typically rely on specialized loss functions and network architectures. Siamese networks, which process two inputs through shared weights and compare their outputs, are a canonical architecture for pairwise similarity tasks. Triplet networks extend this idea by considering an anchor, a positive example, and a negative example simultaneously, optimizing a margin-based loss that requires the anchor to lie closer to the positive than to the negative by at least a fixed margin. Contrastive loss and triplet loss are among the most widely used objectives, though more recent objectives such as the normalized temperature-scaled cross-entropy (NT-Xent) loss used in contrastive self-supervised learning have broadened the toolkit considerably. The learned embedding spaces enable efficient nearest-neighbor search, making similarity learning highly practical at scale.
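The sketch below (PyTorch, with a toy encoder whose dimensions and margins are illustrative assumptions rather than a canonical recipe) shows how these pieces fit together: one shared-weight encoder, plus contrastive and triplet objectives written out in their standard forms.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy shared-weight encoder; in a Siamese or triplet network,
    every branch runs this same module, so weights are shared by
    construction."""
    def __init__(self, in_dim: int = 128, emb_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, emb_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)  # unit-length embeddings

def contrastive_loss(z1, z2, y, margin=1.0):
    """Pairwise contrastive loss: y=1 pulls a pair together; y=0 pushes
    it apart until the distance exceeds the margin."""
    d = F.pairwise_distance(z1, z2)
    return (y * d.pow(2) + (1 - y) * F.relu(margin - d).pow(2)).mean()

def triplet_loss(za, zp, zn, margin=0.2):
    """Triplet loss: the anchor must sit closer to the positive than to
    the negative by at least `margin` (squared Euclidean distance)."""
    d_pos = (za - zp).pow(2).sum(-1)
    d_neg = (za - zn).pow(2).sum(-1)
    return F.relu(d_pos - d_neg + margin).mean()

enc = Encoder()
a, p, n = (torch.randn(16, 128) for _ in range(3))  # anchor/positive/negative batches
triplet_loss(enc(a), enc(p), enc(n)).backward()
```

Normalizing embeddings to unit length, as in the sketch, is a common design choice: it bounds pairwise distances and makes Euclidean distance a monotonic function of cosine similarity.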
The applications of similarity learning are broad and consequential. In computer vision, it underpins face verification systems, image retrieval, and few-shot recognition, where a model must identify novel categories from only a handful of examples. In natural language processing, sentence embedding models trained with similarity objectives power semantic search and duplicate detection. Recommendation systems use learned item and user embeddings to surface relevant content. The technique is especially valuable in open-world settings where the set of classes is not fixed at training time, since a well-trained embedding generalizes to new categories without retraining.
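At inference time, assuming an already-trained encoder, both retrieval and verification reduce to simple operations over embeddings; in the sketch below the index matrix, query, and decision threshold are synthetic placeholders.

```python
import numpy as np

def nearest_neighbors(query: np.ndarray, index: np.ndarray, k: int = 5):
    """Return the k catalog items most similar to the query. With
    unit-normalized embeddings, cosine similarity is a dot product."""
    scores = index @ query
    top = np.argsort(-scores)[:k]
    return top, scores[top]

def verify(z1: np.ndarray, z2: np.ndarray, threshold: float = 0.7) -> bool:
    """Verification-style decision (e.g., same identity or not): accept
    iff the similarity clears a tuned threshold; 0.7 is illustrative."""
    return float(z1 @ z2) >= threshold

# Synthetic stand-in for a catalog of N items embedded by the same model.
rng = np.random.default_rng(0)
index = rng.standard_normal((1000, 32))
index /= np.linalg.norm(index, axis=1, keepdims=True)
query = index[42]                      # a known item retrieves itself first
ids, scores = nearest_neighbors(query, index, k=3)
```

Because new items or categories can be embedded and added to the index without touching the model, this pipeline is what lets a trained embedding handle open-world settings without retraining.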
Similarity learning gained significant momentum in the deep learning era, particularly after the widespread adoption of convolutional neural networks enabled rich visual representations. Its influence has only grown with the rise of contrastive self-supervised methods like SimCLR and MoCo, which demonstrated that powerful general-purpose embeddings could be learned without any labels at all — establishing similarity-based objectives as a cornerstone of modern representation learning.