Skip to main content

Envisioning is an emerging technology research institute and advisory.

LinkedInInstagramGitHub

2011 — 2026

research
  • Observatory
  • Newsletter
  • Methodology
  • Origins
  • Vocab
services
  • Research Sessions
  • Signals Workspace
  • Bespoke Projects
  • Use Cases
  • Readinessfree
impact
  • ANBIMAFuture of Brazilian Capital Markets
  • IEEECharting the Energy Transition
  • Horizon 2045Future of Human and Planetary Security
  • WKOTechnology Scanning for Austria
audiences
  • Innovation
  • Strategy
  • Consultants
  • Foresight
  • Associations
  • Governments
resources
  • Pricing
  • Partners
  • How We Work
  • Data Visualization
  • Multi-Model Method
  • FAQ
  • Security & Privacy
about
  • Manifesto
  • Community
  • Events
  • Support
  • Contact
  • Login
ResearchServicesPricingPartnersAbout
ResearchServicesPricingPartnersAbout
  1. Home
  2. Vocab
  3. Kolmogorov Complexity

Kolmogorov Complexity

The length of the shortest program that produces a given string as output.

Year: 1965Generality: 760
Back to Vocab

Kolmogorov complexity is a measure of the information content of a string or dataset, defined as the length of the shortest computer program—written in some fixed universal programming language—that produces that string as its output. A string that contains regular patterns can be described by a short program, giving it low Kolmogorov complexity, while a truly random string cannot be compressed below its own length and therefore has high complexity. This framework provides a rigorous, mathematical definition of randomness and simplicity that does not depend on any particular statistical model or prior assumptions.

One of the concept's most important properties is its near language-independence: the choice of universal programming language affects the complexity measure by at most a constant, making Kolmogorov complexity an intrinsic property of the data itself rather than an artifact of representation. This invariance theorem is what gives the measure its theoretical power. In practice, however, Kolmogorov complexity is uncomputable—no algorithm can calculate the exact shortest program for an arbitrary string—so researchers work with computable approximations, such as standard compression algorithms like gzip or LZW, which serve as practical upper bounds.

In machine learning and AI, Kolmogorov complexity underpins several foundational ideas. It is central to Solomonoff induction, a theoretically optimal framework for sequence prediction that assigns higher prior probability to simpler hypotheses. It also connects directly to the Minimum Description Length (MDL) principle, which formalizes Occam's razor as a model selection criterion: prefer the model that most compresses the data. These ideas influence modern regularization techniques, Bayesian model comparison, and the theoretical analysis of generalization in learning algorithms.

Beyond formal theory, Kolmogorov complexity shapes intuitions about what it means for a model to truly learn structure versus memorize noise. A model that captures genuine regularities in data effectively compresses it, while one that overfits essentially encodes random fluctuations. This perspective has influenced research into neural network compression, meta-learning, and the study of implicit biases in deep learning, where simpler functions are empirically favored even without explicit regularization.

Related

Related

AIT (Algorithmic Information Theory)
AIT (Algorithmic Information Theory)

A framework quantifying data complexity via the shortest program that produces it.

Generality: 752
Solomonoff Induction
Solomonoff Induction

A universal Bayesian framework for prediction grounded in algorithmic information theory.

Generality: 678
Computational Complexity Theory
Computational Complexity Theory

A framework classifying problems by the computational resources required to solve them.

Generality: 875
Algorithmic Probability
Algorithmic Probability

The probability that a random program produces a specific output on a universal Turing machine.

Generality: 657
MDL (Minimum Description Length)
MDL (Minimum Description Length)

An information-theoretic principle selecting the model that most compresses data plus its own description.

Generality: 692
Linear Complexity
Linear Complexity

An algorithm whose runtime or memory usage scales directly with input size.

Generality: 796