Envisioning is an emerging technology research institute and advisory.


Dimension

The number of independent axes defining a vector space used to represent data.

Year: 1990 · Generality: 0.90

In machine learning, dimension refers to the number of independent axes—or features—in the vector space used to represent data points. A single image patch described by pixel intensities, a word represented as a dense embedding, or a tabular record with dozens of measured attributes each occupies a space whose size is determined by how many coordinates are needed to uniquely locate any point within it. Choosing the right dimensionality is one of the most consequential decisions in model design, directly affecting what patterns a representation can capture and how efficiently it can be learned.
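Concretely, the dimension of a dataset's representation is just the number of coordinates per point. A minimal sketch in NumPy, using a made-up tabular dataset (the attribute names are illustrative, not from any real source):

```python
import numpy as np

# Hypothetical tabular records: 5 data points, each described by
# 3 measured attributes (say height, weight, age).
X = np.array([
    [1.70, 68.0, 34.0],
    [1.62, 55.5, 28.0],
    [1.81, 90.2, 45.0],
    [1.75, 72.3, 31.0],
    [1.58, 49.9, 22.0],
])

# The second axis of the array is the dimension of the vector space:
# every point needs exactly this many coordinates to be located.
n_points, dimension = X.shape
print(dimension)  # -> 3
```

The same shape-based reading applies to embeddings: a Word2Vec-style table of shape `(vocab_size, 300)` represents words in a 300-dimensional space.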

Higher-dimensional spaces allow richer, more expressive representations. Word embeddings, for instance, typically use 100 to 1,000 dimensions so that geometric relationships between vectors can encode semantic similarity, analogy, and syntactic structure simultaneously. Transformer-based language models work in embedding spaces of 768 to several thousand dimensions, enabling them to disentangle subtle contextual distinctions. The trade-off is computational cost: operations on high-dimensional vectors are expensive, and storing millions of such vectors demands significant memory.
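The storage side of that trade-off is easy to quantify: memory grows linearly with dimension. A back-of-the-envelope sketch, assuming float32 storage and an illustrative vocabulary of one million vectors (both assumptions, not figures from the text):

```python
# Approximate memory footprint of a dense embedding table.
# float32 = 4 bytes per coordinate (assumed storage format).
def embedding_table_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    return num_vectors * dim * bytes_per_value

# Dimensionalities mentioned in the text: small word embeddings (100),
# a common transformer hidden size (768), and a multi-thousand case.
for dim in (100, 768, 4096):
    gb = embedding_table_bytes(1_000_000, dim) / 1e9
    print(f"{dim:>5} dims -> {gb:.2f} GB")
```

Going from 100 to 4,096 dimensions multiplies storage (and per-vector arithmetic) roughly forty-fold, which is why embedding dimension is tuned rather than simply maximized.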

A well-known hazard of high dimensionality is the curse of dimensionality—as the number of dimensions grows, the volume of the space expands exponentially, causing data points to become increasingly sparse and distances between them to lose discriminative power. This makes density estimation, nearest-neighbor search, and many learning algorithms progressively harder to apply reliably. Practitioners counter this through dimensionality reduction techniques such as PCA, t-SNE, and UMAP, which project data into lower-dimensional spaces while preserving the most informative structure.
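The loss of distance contrast can be observed directly: as dimension grows, the nearest and farthest neighbors of a point end up at nearly the same distance. A small simulation sketch with random points in the unit hypercube (sample sizes and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def distance_contrast(dim: int, n: int = 200) -> float:
    # Sample n random points in the dim-dimensional unit hypercube,
    # then compare the farthest and nearest distances from the first
    # point to all the others. High contrast means distances are
    # discriminative; near-zero contrast means they all look alike.
    points = rng.random((n, dim))
    dists = np.linalg.norm(points[1:] - points[0], axis=1)
    return (dists.max() - dists.min()) / dists.min()

for dim in (2, 10, 100, 1000):
    print(dim, round(distance_contrast(dim), 2))
```

The contrast ratio shrinks sharply as dimension increases, which is precisely why nearest-neighbor search and density estimation degrade, and why projecting to a lower-dimensional space with PCA or similar methods often restores useful structure.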

The practical importance of dimensionality became especially clear with the rise of dense vector representations in natural language processing during the 2000s and 2010s. Methods like Latent Semantic Analysis, Word2Vec, and GloVe demonstrated that a carefully chosen number of dimensions could compress vast co-occurrence statistics into compact, generalizable representations. Today, selecting embedding dimension remains a core hyperparameter tuning decision across domains ranging from recommendation systems and graph neural networks to protein structure prediction and multimodal learning.

Related

Dimensionality Reduction
Transforming high-dimensional data into fewer dimensions while preserving essential structure.
Generality: 0.84

Dimension Returns
The output shape of a tensor or matrix after a computational operation.
Generality: 0.38

Curse of Dimensionality
As feature count grows, data becomes exponentially sparse and algorithms degrade.
Generality: 0.84

Embedding Space
A learned vector space where similar data points cluster geometrically close together.
Generality: 0.79

Word Vector
Dense numerical representations of words encoding semantic meaning and linguistic relationships.
Generality: 0.72

Embedding
A dense vector representation that encodes semantic relationships between discrete items.
Generality: 0.88