Decomposes a matrix into two non-negative factors for interpretable, parts-based representations.
Non-Negative Matrix Factorization (NMF) is a dimensionality reduction technique that decomposes a non-negative matrix V into two lower-rank non-negative matrices W and H, such that V ≈ WH. Unlike other factorization methods, NMF requires all three matrices to be non-negative, so it learns purely additive combinations of components — there is no cancellation between positive and negative terms. This property makes NMF especially well-suited to data where values represent quantities that cannot meaningfully be negative, such as pixel intensities, word counts, or gene expression levels.
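The shape arithmetic behind V ≈ WH can be sketched in a few lines of NumPy. The sizes below (a 6×4 matrix and rank 2) are illustrative, not from the text; the point is that a product of non-negative factors is itself non-negative, so no cancellation can occur.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 4, 2          # illustrative sizes: V is m x n, rank is r
W = rng.random((m, r))     # non-negative basis components (m x r)
H = rng.random((r, n))     # non-negative coefficients (r x n)
V_approx = W @ H           # reconstruction is a purely additive mix

assert V_approx.shape == (m, n)
assert (V_approx >= 0).all()   # no negative entries can arise
```

With r much smaller than min(m, n), W and H together store far fewer numbers than V, which is the dimensionality-reduction aspect.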
The algorithm works by iteratively updating W and H to minimize a reconstruction error between V and the product WH, typically measured using squared Euclidean distance or Kullback-Leibler divergence. Multiplicative update rules, introduced by Lee and Seung in 1999, are the most widely used optimization approach because they naturally preserve non-negativity throughout training without requiring constrained optimization solvers. More recent variants incorporate sparsity penalties, online learning, and probabilistic interpretations to improve scalability and robustness.
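The multiplicative updates for the squared Euclidean loss can be sketched directly in NumPy. The function name, sizes, and iteration count below are illustrative choices, not part of the original text; the update rules themselves follow Lee and Seung's scheme, where each factor is rescaled element-wise so non-negativity is preserved automatically.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Factor non-negative V (m x n) into W (m x rank) and H (rank x n)
    by minimizing ||V - WH||_F^2 with multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps   # random non-negative initialization
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H): element-wise, stays non-negative
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # W <- W * (V H^T) / (W H H^T)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
V = rng.random((20, 30))
W, H = nmf_multiplicative(V, rank=5)
err = np.linalg.norm(V - W @ H)
```

Because every update multiplies the current value by a non-negative ratio, no projection or constrained solver is ever needed — exactly the property the text highlights.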
What makes NMF particularly valuable in machine learning is its tendency to produce parts-based representations — decompositions where each component corresponds to a meaningful, localized feature of the data. In face recognition, for example, NMF components often resemble eyes, noses, and mouths rather than the holistic, globally distributed features produced by PCA. In text mining, NMF applied to term-document matrices yields topics composed of co-occurring words, making it a natural competitor to probabilistic topic models like LDA.
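The topic-modeling use of NMF can be sketched with scikit-learn on a hypothetical toy term-document matrix (the vocabulary, counts, and component count below are invented for illustration): rows of H become topics, and the largest entries in each row are that topic's characteristic words.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical tiny corpus: rows = documents, columns = term counts.
vocab = ["goal", "match", "team", "stock", "market", "price"]
V = np.array([
    [3, 2, 4, 0, 0, 0],   # sports-flavored document
    [2, 3, 1, 0, 1, 0],   # sports-flavored document
    [0, 0, 0, 4, 3, 2],   # finance-flavored document
    [0, 1, 0, 2, 4, 3],   # finance-flavored document
], dtype=float)

model = NMF(n_components=2, init="nndsvd", random_state=0, max_iter=500)
W = model.fit_transform(V)   # document-topic weights (4 x 2)
H = model.components_        # topic-term weights (2 x 6)

for k, row in enumerate(H):
    top_words = [vocab[i] for i in np.argsort(row)[::-1][:3]]
    print(f"topic {k}: {top_words}")
```

Each document's row of W says how strongly it mixes the topics — an additive, parts-based description of the corpus, in contrast to the signed loadings PCA would produce on the same matrix.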
NMF has found broad application across computer vision, bioinformatics, audio source separation, and recommender systems. Its interpretability advantage over methods like SVD or PCA comes at a cost: the non-convex optimization landscape means solutions are not unique and results can vary across runs and initializations. Choosing an appropriate rank — the number of components — also remains a practical challenge, since no single value is provably correct for real data. Despite these limitations, NMF remains a foundational tool wherever additive, non-negative structure is a meaningful prior about the data.
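One common heuristic for the rank-selection problem is to fit NMF at several ranks and look for an "elbow" in the reconstruction error. The sketch below (synthetic data with a planted rank of 3; all sizes and the noise level are illustrative assumptions) uses scikit-learn's `reconstruction_err_` attribute to trace that curve.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Synthetic non-negative data with true rank 3, plus a little noise.
V = rng.random((40, 3)) @ rng.random((3, 25)) + 0.01 * rng.random((40, 25))

errors = {}
for r in range(1, 7):
    model = NMF(n_components=r, init="nndsvd", random_state=0, max_iter=500)
    model.fit(V)
    errors[r] = model.reconstruction_err_   # Frobenius-norm residual

# Error should drop sharply up to the planted rank, then flatten.
for r, e in errors.items():
    print(f"rank {r}: error {e:.3f}")
```

Because the objective is non-convex, rerunning with different seeds (or a different `init`) at the chosen rank is a cheap sanity check that the components are stable.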