
Envisioning is an emerging technology research institute and advisory.



Covariance

A measure of how two random variables vary together in direction and magnitude.

Year: 1990 · Generality: 0.88

Covariance is a statistical measure that quantifies the degree to which two random variables change together. When two variables tend to increase and decrease simultaneously, their covariance is positive; when one tends to increase as the other decreases, the covariance is negative; and when the variables are statistically independent, their covariance is zero (though the converse does not hold: zero covariance does not imply independence). Mathematically, the covariance of variables X and Y is defined as the expected value of the product of their deviations from their respective means: Cov(X, Y) = E[(X − μₓ)(Y − μᵧ)]. This signed quantity captures both the direction and a sense of the strength of the linear relationship between the two variables.
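The definition above translates directly into a sample estimate: average the products of deviations from the means. A minimal sketch with hypothetical data, checked against NumPy's built-in estimator:

```python
import numpy as np

# Hypothetical data: two variables that tend to move together.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Covariance as the mean product of deviations from the means
# (population form, dividing by n).
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# np.cov uses the unbiased estimator (dividing by n - 1) by default;
# bias=True switches it to the population form computed above.
assert np.isclose(cov_xy, np.cov(x, y, bias=True)[0, 1])
```

Here the covariance comes out positive, matching the intuition that x and y rise together; for real analyses the unbiased (n − 1) form is the usual default.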

In machine learning, covariance plays a central role in understanding data structure and feature relationships. The covariance matrix — a square matrix containing the pairwise covariances of all features in a dataset — is foundational to techniques like Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Gaussian mixture models. PCA, for instance, diagonalizes the covariance matrix to find orthogonal directions of maximum variance, enabling dimensionality reduction while preserving as much information as possible. Covariance also underpins Gaussian processes, where the covariance (or kernel) function encodes prior assumptions about the smoothness and structure of functions being modeled.
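The PCA procedure described above — diagonalizing the covariance matrix to find orthogonal directions of maximum variance — can be sketched in a few lines. The data here is synthetic and the setup illustrative, not a production implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2-D data stretched strongly along one axis.
data = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

# Center the data and form the feature-by-feature covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# Diagonalize: eigenvectors are orthogonal directions of variance,
# eigenvalues are the variance captured along each (eigh suits
# symmetric matrices like a covariance matrix).
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project onto the top principal component: 2-D -> 1-D reduction
# that keeps the direction of maximum variance.
projected = centered @ eigvecs[:, :1]
```

Sorting the eigenvalues in descending order is what lets PCA keep the top-k components and discard low-variance directions.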

One important limitation of covariance is that its magnitude depends on the scale of the variables, making direct comparisons across different datasets or feature pairs difficult. Normalizing covariance by the product of the standard deviations of the two variables yields the Pearson correlation coefficient, which is bounded between −1 and 1 and is scale-invariant. Despite this limitation, raw covariance remains essential in many algorithms where the absolute scale of variation matters, such as in Kalman filters and multivariate normal distributions.
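The scale dependence and its fix can be demonstrated directly: rescaling one variable changes the covariance proportionally, while the normalized Pearson coefficient is unaffected. A small sketch with hypothetical data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

cov_xy = np.cov(x, y)[0, 1]
# Normalizing by the product of standard deviations yields Pearson's r,
# which is bounded in [-1, 1] and scale-invariant.
r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Multiplying y by 10 scales the covariance by 10 ...
cov_scaled = np.cov(x, 10 * y)[0, 1]
# ... but leaves the correlation coefficient unchanged.
r_scaled = cov_scaled / (np.std(x, ddof=1) * np.std(10 * y, ddof=1))
```

This is why covariances are hard to compare across feature pairs with different units, while correlations are directly comparable.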

Covariance estimation from finite samples is itself a significant challenge in high-dimensional machine learning settings. When the number of features exceeds the number of observations, the sample covariance matrix becomes singular and unreliable. Techniques such as shrinkage estimation (e.g., the Ledoit-Wolf estimator) and sparse covariance estimation have been developed to produce well-conditioned covariance matrices in these regimes, making robust covariance estimation an active area of research in modern machine learning.
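The singularity problem and the shrinkage remedy can both be seen in a toy high-dimensional setting. The sketch below uses a fixed shrinkage weight toward a scaled-identity target purely for illustration; the Ledoit-Wolf estimator mentioned above instead chooses that weight from the data:

```python
import numpy as np

rng = np.random.default_rng(1)
# High-dimensional regime: fewer observations (20) than features (50).
n, p = 20, 50
X = rng.normal(size=(n, p))

sample_cov = np.cov(X, rowvar=False)
# With p > n the sample covariance matrix is rank-deficient (singular).
rank = np.linalg.matrix_rank(sample_cov)

# Simple shrinkage: blend the sample covariance with a scaled-identity
# target. alpha is fixed here for illustration; Ledoit-Wolf estimates
# the optimal weight from the data itself.
alpha = 0.3
target = (np.trace(sample_cov) / p) * np.eye(p)
shrunk = (1 - alpha) * sample_cov + alpha * target
# The shrunk matrix is full rank and strictly positive definite,
# so it can be safely inverted (e.g., for a multivariate normal model).
```

Because the identity target contributes a strictly positive amount to every eigenvalue, the blended estimate is well-conditioned even when the raw sample covariance is not.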

Related

Equivariance

A function property where input transformations produce corresponding, predictable transformations in the output.

Generality: 0.69
PCA (Principal Component Analysis)

Dimensionality reduction technique that projects data onto its highest-variance directions.

Generality: 0.87
Cosine Similarity

A measure of angular similarity between two vectors, regardless of their magnitude.

Generality: 0.80
Stochastic

Describing processes or systems that incorporate randomness and probabilistic outcomes.

Generality: 0.75
Variance Reduction Techniques

Methods that decrease estimation variability to improve model accuracy and reliability.

Generality: 0.72
Matrix Models

Mathematical frameworks using parameter-defined matrices to represent and learn complex relationships from data.

Generality: 0.70