Quantifies vector similarity by summing the products of corresponding elements.
Dot product similarity is a fundamental operation in machine learning that measures the alignment between two vectors by computing the sum of the products of their corresponding elements. Given two vectors a and b, the dot product is defined as a · b = Σᵢ aᵢbᵢ over all dimensions i. The resulting scalar reflects both the magnitude of each vector and the angle between them, making it a natural measure of how much two representations "agree" in direction and scale. A large positive value indicates strong alignment, a value near zero suggests orthogonality (no similarity), and a negative value implies opposition.
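A minimal NumPy sketch of the definition (the vectors here are arbitrary illustrative values):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -1.0, 2.0])

# Sum of elementwise products: 1*4 + 2*(-1) + 3*2 = 8.0
dot = np.sum(a * b)

# Equivalent built-ins produce the same scalar
assert dot == np.dot(a, b) == a @ b
print(dot)  # 8.0
```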
In practice, dot product similarity is central to many core ML operations. In neural networks, every linear layer computes dot products between input vectors and weight vectors. In attention mechanisms, particularly the scaled dot-product attention used in Transformers, query and key vectors are compared via dot products to determine how much each token should attend to every other. The scores are divided by the square root of the key dimension; without this scaling, their magnitude grows with dimensionality and pushes softmax outputs into regions with vanishing gradients, a detail that illustrates how carefully this operation must be managed at scale.
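A toy sketch of that computation (single head, no masking; the function name, shapes, and example sizes are illustrative, not any particular library's API):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy scaled dot-product attention: single head, no masking.

    Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v).
    """
    d_k = Q.shape[-1]
    # Pairwise query-key dot products, divided by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)              # (n_queries, n_keys)
    # Numerically stable row-wise softmax -> attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                           # (n_queries, d_v)

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```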
When vectors are L2-normalized to unit length, the dot product becomes equivalent to cosine similarity, ranging from -1 to 1 and measuring only angular difference regardless of magnitude. This property is heavily exploited in embedding-based retrieval systems—such as dense passage retrieval for question answering or semantic search—where documents and queries are encoded as vectors and ranked by their dot product scores. The choice between raw dot product and cosine similarity often depends on whether magnitude carries meaningful information: in recommendation systems, for instance, a higher-magnitude user vector might legitimately indicate stronger preferences.
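A quick numerical check of that equivalence, using random vectors for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(16)
b = rng.standard_normal(16)

# Cosine similarity: dot product divided by the product of the magnitudes
cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# After L2 normalization, the raw dot product yields the same value
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
assert np.isclose(a_unit @ b_unit, cosine)
```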
Dot product similarity also scales efficiently on modern hardware. Matrix multiplication, which computes the dot product between every row of one matrix and every column of another in a single operation, is the dominant workload in GPU-accelerated deep learning. Approximate nearest-neighbor libraries like FAISS exploit this structure to perform billion-scale similarity search in milliseconds, making dot product similarity not just theoretically elegant but practically indispensable across retrieval, ranking, and representation learning tasks.
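A brute-force sketch of the retrieval pattern (random embeddings and hypothetical corpus sizes): a single matmul scores every query against every document, which is essentially what FAISS's exact inner-product index (IndexFlatIP) does before approximate indexes are layered on for larger corpora.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.standard_normal((10_000, 64)).astype(np.float32)  # corpus embeddings
queries = rng.standard_normal((8, 64)).astype(np.float32)    # query embeddings

# scores[i, j] is the dot product of queries[i] with docs[j]
scores = queries @ docs.T                                    # shape (8, 10000)

# Indices of the top-5 documents per query (argpartition avoids a full
# sort; the 5 hits per row are not themselves ordered by score)
k = 5
top_k = np.argpartition(-scores, k, axis=1)[:, :k]
print(top_k.shape)  # (8, 5)
```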