ary/index_
ary/ notes/ kolmogorov-complexity

Kolmogorov complexity, intuitively

seedling planted 2026-04-15 tended 2026-04-16 196 words #math

The Kolmogorov complexity of a string is the length of the shortest program that outputs it. It is a formalisation of the idea that "aaaaaaaaaa" is less complex than "48ka2nxqpw", even though both are ten characters long.

Call it K(x)K(x). Then:

K(x)=min{p:U(p)=x}K(x) = \min \{ |p| : U(p) = x \}

where UU is a fixed universal Turing machine and p|p| is the length in bits of program pp. The choice of UU changes K(x)K(x) by at most a constant — the invariance theorem, and it is what makes the definition useful despite appearing arbitrary.

why this matters

  • K(x)K(x) is uncomputable. You cannot write a function that returns it for arbitrary xx. (Proof: halting-problem reduction.)
  • It gives a rigorous definition of randomness — a string is random iff K(x)xK(x) \approx |x|.
  • Almost every result in algorithmic information theory descends from it.

See also minimum-description-length for the practical approximation, and occams-razor-as-a-prior for the philosophical consequence.

open in my head

I still don’t have a clean intuition for why K(x)K(x) is uncomputable while many weaker compressions (LZ77, gzip) are tractable. The gap feels philosophically important but I can’t articulate it yet.

see this note in the graph view