A matrix factorization technique that reveals structure for dimensionality reduction and data analysis.
Singular Value Decomposition (SVD) is a fundamental matrix factorization method that decomposes any real or complex matrix A into the product of three matrices: A = UΣVᵀ. Here, U and V are orthogonal matrices (unitary in the complex case, where the conjugate transpose Vᴴ is used) whose columns are the left and right singular vectors respectively, while Σ is a diagonal matrix containing non-negative singular values arranged in descending order. These singular values quantify how much variance or "energy" each corresponding dimension captures, providing a principled way to understand the underlying structure of any dataset represented as a matrix.
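The factorization is available directly in standard numerical libraries. Here is a minimal NumPy sketch; the small matrix is purely illustrative:

```python
import numpy as np

# A small example matrix to decompose.
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# Thin SVD: U is 3x2, s holds the singular values, Vt is 2x2.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(s)  # singular values, in descending order

# The factors reconstruct A exactly: A = U @ diag(s) @ Vt.
assert np.allclose(A, U @ np.diag(s) @ Vt)
```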
The power of SVD in machine learning lies in its ability to produce optimal low-rank approximations of data. By retaining only the top k singular values and their associated singular vectors, practitioners can compress a matrix while minimizing reconstruction error: the Eckart–Young theorem guarantees that the truncated SVD is the best rank-k approximation in both the spectral and Frobenius norms. This truncated form drives Principal Component Analysis (PCA), Latent Semantic Analysis (LSA) for text mining, and collaborative filtering in recommendation systems. In each case, SVD strips away noise and redundancy, exposing the most informative latent dimensions in the data.
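The sketch below illustrates the truncation and checks the Eckart–Young property numerically; the matrix size and the rank k = 10 are arbitrary choices for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 50))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the top-k singular triplets.
k = 10
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: the Frobenius reconstruction error equals the
# l2 norm of the discarded singular values.
err = np.linalg.norm(A - A_k, "fro")
assert np.isclose(err, np.linalg.norm(s[k:]))
```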
SVD has become a cornerstone of modern deep learning infrastructure as well. Truncated SVD is used to compress large weight matrices in neural networks, reducing memory footprint and inference latency with minimal accuracy loss. It also appears in the analysis of training dynamics: researchers use SVD to study how the singular value spectrum of weight matrices evolves during training, offering insight into generalization and optimization behavior. More recently, techniques such as LoRA (Low-Rank Adaptation) for fine-tuning large language models build on the same low-rank approximation idea, learning a small pair of factor matrices instead of updating the full weights.
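As a rough sketch of the compression idea, consider factoring a single dense layer's weight matrix; the 1024×1024 shape and rank 64 here are hypothetical, and in practice the rank is tuned against accuracy:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(1024, 1024))  # stand-in for a trained dense weight

r = 64  # chosen rank: a knob trading accuracy for size and speed
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Factor W into two thin matrices: W ~ A @ B.
A = U[:, :r] * s[:r]  # 1024 x 64 (columns scaled by singular values)
B = Vt[:r, :]         # 64 x 1024

dense_params = W.size            # 1,048,576
lowrank_params = A.size + B.size # 131,072 -> 8x fewer parameters

# The layer now computes x @ B.T @ A.T instead of x @ W.T.
x = rng.normal(size=(32, 1024))
y_approx = x @ B.T @ A.T
```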
Beyond compression, SVD underpins numerical stability in many ML algorithms. Solving least-squares problems, computing pseudoinverses, and diagnosing ill-conditioned systems (the condition number is the ratio of the largest to the smallest singular value) all benefit from its robust decomposition. Its computational cost, traditionally O(min(mn², m²n)) for an m×n matrix, has been addressed by randomized SVD algorithms that deliver approximate decompositions often orders of magnitude faster, making the technique practical even for the massive matrices encountered in contemporary AI workloads.
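For instance, the minimum-norm least-squares solution can be built from the SVD-based pseudoinverse; this is a sketch, with the tolerance rule assumed rather than prescribed (it mirrors the convention NumPy uses for rank determination):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(200, 10))
b = rng.normal(size=200)

# Pseudoinverse via SVD: A+ = V @ diag(1/s) @ U.T, dropping
# near-zero singular values for numerical stability.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
tol = s.max() * max(A.shape) * np.finfo(s.dtype).eps
s_inv = np.where(s > tol, 1.0 / s, 0.0)

x = Vt.T @ (s_inv * (U.T @ b))  # least-squares solution

# Agrees with NumPy's own least-squares solver.
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```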