Envisioning is an emerging technology research institute and advisory.

2011 — 2026

Cosine Similarity

A measure of angular similarity between two vectors, regardless of their magnitude.

Year: 1975 · Generality: 796

Cosine similarity is a metric that quantifies the similarity between two vectors by computing the cosine of the angle between them. Given two vectors A and B, the cosine similarity is calculated as their dot product divided by the product of their magnitudes: cos(θ) = (A·B) / (‖A‖‖B‖). The result ranges from -1 to 1, where 1 indicates identical orientation, 0 indicates orthogonality (no similarity), and -1 indicates opposite directions. Crucially, the metric is insensitive to vector magnitude — only direction matters — making it well-suited for comparing objects whose scale varies independently of their content.
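The formula above can be sketched directly in code. This is a minimal pure-Python implementation of cos(θ) = (A·B) / (‖A‖‖B‖); the function name and test vectors are illustrative, not from the original.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b: (A.B) / (||A|| ||B||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Same direction, different magnitude -> 1.0
print(cosine_similarity([1, 2, 3], [2, 4, 6]))   # 1.0
# Orthogonal vectors -> 0.0
print(cosine_similarity([1, 0], [0, 1]))         # 0.0
# Opposite directions -> -1.0
print(cosine_similarity([1, 1], [-1, -1]))       # -1.0
```

Note that [1, 2, 3] and [2, 4, 6] score a perfect 1.0 even though the second vector is twice as long, illustrating the magnitude insensitivity described above.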

In machine learning and natural language processing, cosine similarity is most commonly applied to high-dimensional embeddings. In the vector space model of text, documents or words are represented as vectors where each dimension corresponds to a term or feature. Two documents with similar vocabulary distributions will point in similar directions through this space, even if one is much longer than the other. This property makes cosine similarity preferable to Euclidean distance for sparse, high-dimensional data, where raw distance measures are distorted by differences in vector length. Modern dense embedding models — such as word2vec, GloVe, and sentence transformers — also rely heavily on cosine similarity to surface semantic relationships between words, sentences, and documents.
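The document-length property can be demonstrated with simple term-count vectors. In this sketch (the documents and helper names are invented for illustration), a text and a version repeated ten times produce identical cosine similarity, while their Euclidean distance grows with the length difference.

```python
import math
from collections import Counter

def term_vector(text, vocab):
    """Count-based vector-space-model representation of a document."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

doc = "cats chase mice"
long_doc = " ".join([doc] * 10)        # same content, ten times longer
vocab = sorted(set(doc.split()))

v_short = term_vector(doc, vocab)      # [1, 1, 1]
v_long = term_vector(long_doc, vocab)  # [10, 10, 10]

print(cosine(v_short, v_long))     # 1.0 -- direction unchanged by length
print(euclidean(v_short, v_long))  # ~15.6 -- distorted by vector length
```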

Beyond NLP, cosine similarity is widely used in recommendation systems, image retrieval, and anomaly detection, wherever learned feature vectors need to be compared efficiently. It serves as the backbone of nearest-neighbor search in embedding spaces, enabling applications like semantic search, duplicate detection, and clustering. Its computational simplicity and geometric interpretability have made it one of the most broadly applied similarity measures across machine learning, and it remains a default choice when working with any kind of dense or sparse vector representation.
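Nearest-neighbor search over an embedding space reduces to ranking candidates by cosine similarity against a query vector. The sketch below uses made-up 3-dimensional "embeddings" purely for illustration; real embeddings from models like sentence transformers have hundreds of dimensions, and production systems use approximate-nearest-neighbor indexes rather than a full scan.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy corpus of labeled embedding vectors (values invented for illustration).
corpus = {
    "feline pet":   [0.9, 0.1, 0.0],
    "dog walking":  [0.1, 0.9, 0.1],
    "stock market": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # hypothetical embedding of a cat-related query

# Rank all items by similarity to the query; the top hit is the nearest neighbor.
ranked = sorted(corpus, key=lambda k: cosine(query, corpus[k]), reverse=True)
print(ranked[0])  # "feline pet"
```

This brute-force ranking is the conceptual core of semantic search; duplicate detection and clustering apply the same pairwise comparison.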

Related

Dot Product Similarity

Quantifies vector similarity by summing the products of corresponding elements.

Generality: 694
Similarity Computation

Quantifying how alike two data objects are to support learning algorithms.

Generality: 709
Similarity Search

Finding the most similar items to a query within a large dataset.

Generality: 794
Similarity Learning

Training models to measure meaningful similarity between data points for comparison tasks.

Generality: 694
Similarity Masking

Suppressing redundant or overly similar features to sharpen model focus on distinct information.

Generality: 293
Word Vector

Dense numerical representations of words encoding semantic meaning and linguistic relationships.

Generality: 720