A twin neural network architecture that learns similarity by comparing two inputs.
A Siamese network is a neural network architecture consisting of two or more identical subnetworks that share the same weights. Rather than classifying a single input, the network processes two inputs in parallel and produces embeddings that can be compared directly. Because both branches are structurally identical and weight-tied, any transformation applied to one input is applied in exactly the same way to the other, so the resulting representations live in a common feature space where distance comparisons are meaningful.
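To make the weight tying concrete, here is a minimal sketch of the architecture in PyTorch. The encoder layers, input size, and embedding dimension are illustrative assumptions, not details from any particular implementation:

```python
import torch
import torch.nn as nn

class SiameseNetwork(nn.Module):
    """Two weight-tied branches realized as a single shared encoder."""

    def __init__(self, embedding_dim: int = 128):
        super().__init__()
        # Weight sharing falls out naturally: both inputs pass through
        # the same module instance, so the branches can never diverge.
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 256),  # assumes 28x28 single-channel inputs
            nn.ReLU(),
            nn.Linear(256, embedding_dim),
        )

    def forward(self, x1: torch.Tensor, x2: torch.Tensor):
        # Identical transformations on both inputs yield embeddings in
        # a common feature space, ready for distance comparison.
        return self.encoder(x1), self.encoder(x2)
```

Implementing both branches as one shared module is the usual idiom: it guarantees the weights stay tied without any explicit synchronization.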
The comparison between outputs is typically performed with a distance metric such as Euclidean distance or cosine similarity, and the network is trained with a loss function that shapes the embedding space accordingly. Contrastive loss is a classic choice: it penalizes large distances between embeddings of similar pairs and small distances between embeddings of dissimilar pairs. A later refinement, triplet loss, trains on an anchor example, a positive match, and a negative mismatch simultaneously, requiring the anchor to sit closer to the positive than to the negative by a margin; this often yields better-structured embedding spaces.
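For concreteness, the contrastive loss might be sketched as below. The margin value of 1.0 and the 0/1 labeling convention are common but arbitrary assumptions:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                     label: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    # label == 1 marks a similar pair, label == 0 a dissimilar pair.
    d = F.pairwise_distance(z1, z2)                # Euclidean distance per pair
    pos = label * d.pow(2)                         # pull similar pairs together
    neg = (1 - label) * F.relu(margin - d).pow(2)  # push dissimilar pairs beyond the margin
    return (pos + neg).mean()
```

For the triplet variant, PyTorch ships a ready-made `nn.TripletMarginLoss`, which encourages the anchor-positive distance to be smaller than the anchor-negative distance by at least the margin.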
Siamese networks are particularly well-suited to few-shot learning scenarios, where labeled data is scarce. Instead of learning to classify fixed categories, the network learns a general notion of similarity that can generalize to entirely new classes at inference time. This makes the architecture valuable in domains like face verification, signature authentication, medical image comparison, and one-shot object recognition, where collecting large labeled datasets for every possible class is impractical.
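To illustrate the inference-time behavior, the sketch below performs one-shot classification with a trained encoder: each unseen class is represented by a single labeled support example, and a query is assigned to the class of its nearest support embedding. The `encoder` and `support_set` names are hypothetical:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def one_shot_predict(encoder, query, support_set):
    # support_set: list of (class_name, example_tensor) pairs, one
    # labeled example per class the network never saw during training.
    q = encoder(query.unsqueeze(0))
    best_class, best_dist = None, float("inf")
    for cls, example in support_set:
        s = encoder(example.unsqueeze(0))
        d = F.pairwise_distance(q, s).item()
        if d < best_dist:
            best_class, best_dist = cls, d
    return best_class
```

Note that adding a new class requires no retraining: the learned similarity metric does all the work, and the support set simply grows by one example.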
The architecture was introduced in 1993 for handwritten signature verification and gained renewed prominence in the deep learning era as researchers applied it to face recognition and image retrieval with convolutional backbones. Today, Siamese-style weight sharing and contrastive objectives underpin many self-supervised learning methods, including SimCLR and MoCo, making the core idea foundational to modern representation learning well beyond its original verification use case.