Envisioning is an emerging technology research institute and advisory.


Similarity Learning

Training models to measure meaningful similarity between data points for comparison tasks.

Year: 2005 · Generality: 694

Similarity learning is a machine learning paradigm in which models are trained to produce representations of data such that distances or scores between those representations reflect meaningful, task-relevant similarity. Rather than predicting a fixed label or reconstructing input data, the goal is to learn a function — often an embedding — that maps inputs into a space where similar items cluster together and dissimilar items are pushed apart. This approach sits at the intersection of supervised and unsupervised learning, drawing on labeled pairs or triplets of examples to guide the learning process without requiring exhaustive class-level annotations.
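The core idea above can be illustrated with a toy sketch (the embeddings below are hypothetical, hard-coded to mimic what a trained model might output): similar items map to nearby points, so task-relevant similarity becomes geometric distance.

```python
import numpy as np

# Hypothetical learned embeddings: two cat photos land near each other,
# a car photo lands elsewhere in the embedding space.
emb = {
    "cat photo A": np.array([0.90, 0.10]),
    "cat photo B": np.array([0.85, 0.15]),
    "car photo":   np.array([0.10, 0.95]),
}

def distance(a, b):
    """Euclidean distance between two items' embeddings."""
    return float(np.linalg.norm(emb[a] - emb[b]))

# Similar items end up closer than dissimilar ones.
print(distance("cat photo A", "cat photo B") < distance("cat photo A", "car photo"))
```

A real system would learn `emb` from labeled pairs or triplets rather than hard-coding it; the point is only that comparison reduces to distance in the learned space.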

The mechanics of similarity learning typically rely on specialized loss functions and network architectures. Siamese networks, which process two inputs through shared weights and compare their outputs, are a canonical architecture for pairwise similarity tasks. Triplet networks extend this idea by simultaneously considering an anchor, a positive example, and a negative example, optimizing a margin-based loss that enforces relative ordering in embedding space. Contrastive loss and triplet loss are among the most widely used objectives, though more recent approaches like NT-Xent (used in contrastive self-supervised learning) have broadened the toolkit considerably. The learned embedding spaces enable efficient nearest-neighbor search, making similarity learning highly practical at scale.
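The margin-based triplet objective described above can be written in a few lines. This is a minimal numpy sketch with made-up embeddings, not a training loop: the loss is zero when the negative is already at least `margin` farther from the anchor than the positive, and positive otherwise.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss: L = max(0, d(a,p) - d(a,n) + margin).
    Pulls the positive toward the anchor and pushes the negative
    at least `margin` farther away."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return float(max(0.0, d_pos - d_neg + margin))

# Hypothetical embeddings: positive near the anchor, negative far from it.
anchor   = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
negative = np.array([0.0, 1.0])

print(triplet_loss(anchor, positive, negative))  # margin satisfied -> 0.0
print(triplet_loss(anchor, negative, positive))  # violated -> positive loss
```

During training this loss is minimized over many sampled triplets, which is what enforces the relative ordering in embedding space; a Siamese/triplet network simply computes the three embeddings with shared weights before this comparison.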

The applications of similarity learning are broad and consequential. In computer vision, it underpins face verification systems, image retrieval, and few-shot recognition, where a model must identify novel categories from only a handful of examples. In natural language processing, sentence embedding models trained with similarity objectives power semantic search and duplicate detection. Recommendation systems use learned item and user embeddings to surface relevant content. The technique is especially valuable in open-world settings where the set of classes is not fixed at training time, since a well-trained embedding generalizes to new categories without retraining.
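The retrieval pattern behind semantic search and recommendation can be sketched as nearest-neighbor ranking over precomputed embeddings. The catalog vectors and the `search` helper below are illustrative assumptions, standing in for the output of a similarity-trained encoder and a production vector index.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical item embeddings from a similarity-trained encoder.
catalog = {
    "red running shoe":  np.array([0.90, 0.20, 0.10]),
    "blue running shoe": np.array([0.85, 0.25, 0.12]),
    "coffee mug":        np.array([0.05, 0.10, 0.95]),
}

def search(query_vec, k=2):
    """Return the k catalog items most similar to the query embedding."""
    ranked = sorted(catalog,
                    key=lambda name: cosine_sim(query_vec, catalog[name]),
                    reverse=True)
    return ranked[:k]

query = np.array([0.88, 0.22, 0.11])  # e.g. the embedding of "sneaker"
print(search(query))
```

At scale, the exhaustive `sorted` pass is replaced by approximate nearest-neighbor indexes, but the interface is the same: embed once, then compare by distance, which is also why new categories can be retrieved without retraining.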

Similarity learning gained significant momentum in the deep learning era, particularly after the widespread adoption of convolutional neural networks enabled rich visual representations. Its influence has only grown with the rise of contrastive self-supervised methods like SimCLR and MoCo, which demonstrated that powerful general-purpose embeddings could be learned without any labels at all — establishing similarity-based objectives as a cornerstone of modern representation learning.

Related

Similarity Computation

Quantifying how alike two data objects are to support learning algorithms.

Generality: 709
Similarity Search

Finding the most similar items to a query within a large dataset.

Generality: 794
Contrastive Learning

A self-supervised technique that learns representations by comparing similar and dissimilar data pairs.

Generality: 694
Siamese Network

A twin neural network architecture that learns similarity by comparing two inputs.

Generality: 595
Non-Contrastive Learning

Self-supervised representation learning that requires no negative example pairs.

Generality: 575
Similarity Masking

Suppressing redundant or overly similar features to sharpen model focus on distinct information.

Generality: 293