Kolmogorov Complexity: Measuring the Simplicity of Information

Imagine you want to send a friend a message, and you want the shortest way to describe it. If the message is "AAAAAAAAAA" (ten A's), you could describe it as "the letter A repeated 10 times." For ten letters that saves nothing, but the description barely grows as the message does: a thousand A's, or a million, need only a different number at the end. If the message is "AEKXMZQBPJ," on the other hand, there is probably no description much shorter than writing it out. This is the core idea behind Kolmogorov complexity: a measure of how "simple" or "compressible" a piece of information really is. Formally, the Kolmogorov complexity of an object (like a text, image, or number) is the length of the shortest computer program, in some fixed programming language, that produces that object as its output. It answers a deceptively simple question: what is the minimum amount of code you need to generate something? This concept comes from algorithmic information theory, a field that sits at the crossroads of computer science, mathematics, and philosophy.
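As a rough illustration of "description length," the sketch below measures the length of Python expressions that evaluate to a given string. The helper name description_length is hypothetical, and the numbers are only upper bounds on complexity in one particular language: nothing here shows that a shorter expression does not exist.

```python
def description_length(expression: str, target: str) -> int:
    """Return the length of a Python expression after checking it evaluates to target.

    Hypothetical helper for illustration; eval is used on trusted input only.
    """
    if eval(expression) != target:
        raise ValueError("expression does not produce the target string")
    return len(expression)


for n in (10, 1000):
    ordered = "A" * n
    rule = f'"A" * {n}'        # a short rule that generates the string
    literal = repr(ordered)    # the string written out in full, with quotes
    print(n, description_length(rule, ordered), description_length(literal, ordered))

scrambled = "AEKXMZQBPJ"
# No obvious rule here, so the literal is the best description this sketch can offer.
print(description_length(repr(scrambled), scrambled))
```

The rule grows by two characters when the string grows from 10 to 1,000 letters, while the literal grows by 990.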

Andrey Kolmogorov, a legendary Russian mathematician, introduced this idea in 1963, building on earlier work in information theory and computation. Ray Solomonoff and Gregory Chaitin arrived at closely related ideas independently, which is why it's sometimes called Solomonoff-Kolmogorov-Chaitin complexity. What made the insight revolutionary was that it offered a mathematical way to formalize what we mean by "randomness" and "simplicity." Before this, these ideas were mostly intuitive. A highly random string of characters has high Kolmogorov complexity because you cannot compress it; producing it requires a program nearly as long as the string itself. A highly ordered string, such as a repeating pattern, has low complexity because a short program can generate it. This simple principle connected deep ideas in mathematics, logic, and computer science in a surprising way.

Here's how the definition works. Fix a programming language (Python, C, or any other general-purpose language). The Kolmogorov complexity of a text is the length, in characters or bits, of the shortest program in that language that outputs exactly that text. In practice you can only bound this number from above: any program you write that prints the text shows the complexity is at most that program's length, but there is no general procedure for confirming that no shorter program exists. The catch: different programming languages give different answers, because a text that is short to express in one language may need more code in another. The invariance theorem handles this: for any two languages, the two complexities differ by at most an additive constant that depends on the languages but not on the text (roughly, the length of an interpreter for one language written in the other). So the choice of language does matter in absolute terms, but for long enough texts it does not change the conclusions. A typical random string will have far higher complexity than a highly ordered one of the same length no matter which language you pick.
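The language independence described above is usually written as a single inequality. The notation here is an assumption made for illustration: K_{L_1}(s) and K_{L_2}(s) stand for the complexity of a string s measured in two languages L_1 and L_2, and c_{L_1,L_2} is a constant that depends on the pair of languages only.

```latex
\[
  \bigl|\, K_{L_1}(s) - K_{L_2}(s) \,\bigr| \;\le\; c_{L_1,L_2}
  \qquad \text{for every string } s .
\]
```

Because the constant does not grow with s, it becomes negligible next to the complexity of long strings, which is why statements about complexity are usually made "up to an additive constant."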

Why does this matter beyond abstract theory? Kolmogorov complexity has remarkable power to express deep truths about what can and cannot be computed. It can be used to state and prove impossibility results akin to some of the most famous in mathematics and logic, including Cantor's diagonal argument (which shows that some infinite sets are larger than others), Gödel's incompleteness theorem (about what mathematics can prove), and Turing's halting problem (about what computers can decide). One of the most striking results is Chaitin's incompleteness theorem: a given system of axioms can prove that a string's complexity exceeds some bound only up to a fixed threshold that depends on the system, even though almost all strings lie above that threshold. A related consequence is that no single program can compute the exact Kolmogorov complexity for infinitely many texts. In other words, there's a fundamental limit to how much a finite set of rules or axioms can say about the complexity of information itself. This connects randomness, computability, and the limits of proof in a profound way.

A memorable fact: if you pick a random string of 1,000 bits, its Kolmogorov complexity is almost certainly close to 1,000 bits. There are far fewer short programs than there are 1,000-bit strings, so most strings cannot be compressed by much. By contrast, the first million digits of pi have low complexity, because a short program can compute pi to as many digits as you ask for. The digits look random to us, but they follow a simple rule, which makes them mathematically "simple" even though they appear chaotic. This reveals a hidden truth: looking random is not the same as being random, and many structures that appear chaotic have more order and compressibility than we realize.
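The 1,000-bit comparison above can be tried with an ordinary compressor as a stand-in. This is only a sketch under a loud assumption: zlib is not Kolmogorov complexity, and its output size gives at best a rough upper bound, since a general-purpose compressor only finds certain kinds of pattern. It would not, for example, notice that a file holds digits of pi. The helper name compressed_size is hypothetical.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


ordered = b"A" * 125       # 1,000 bits made of one repeated byte
noise = os.urandom(125)    # 1,000 bits from the operating system's random source

print("ordered:", len(ordered), "->", compressed_size(ordered))
print("noise:  ", len(noise), "->", compressed_size(noise))
```

The repeated byte shrinks to a small fraction of its size, while the random bytes typically come out slightly larger than they went in, because the compressor finds nothing to exploit and still adds its own framing.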

Source: Wikipedia
