Surely you’ve heard of The Infinite Monkey Theorem. You probably don’t believe it. No way could that monkey accidentally type out anything meaningful, much less the complete works of Shakespeare. Well…
In several of his Discworld books, author Terry Pratchett featured something called Library-space, L-space for short. It’s defined as “a dimension that connects every library and book depository in the universe. L-Space is portrayed as a natural outgrowth of the fact that knowledge = power = energy = matter = mass and mass warps space, and therefore, libraries in the Discworld universe are a very dangerous place indeed for the unprepared”.
Somewhere, Pratchett wrote that L-space contains all the books that have been written, all those that will be written, and all those that would have been written but the author thought better of it. Well, how big is L-space?
To over-estimate, suppose L-space contains a billion (10^9) books, each book is 500 pages long, each page contains 4000 characters, and the characters are chosen from an “alphabet” of 500 marks (upper- and lower-case letters, numbers and punctuation marks, all in normal, bold and italic forms in several different fonts). One book would then contain two million marks.
Now, how many possible books are there, including ‘impossible’ character combinations like “zqzqzqzq”? We can construct a “possible” book by choosing some random one of the 500 marks as the first character, the same or a different one as the second character (500×500 = 500^2 = 250,000 possibilities so far) and so on, until we’ve built (or our monkey has typed) a two-million-character book. It could be a book that contains nothing but a string of a million copies of “zq”, but that’s OK, it’s still a possible book. So is the book that contains all the works of Shakespeare and so is a typo version that inconsistently misspells “Romeo.”
On this basis there are some 500^(2,000,000) ≈ 10^(5,397,940) different possible books. L-space with only a billion books is thus very small indeed compared to the number of possible books. Put another way, the set of all possible books (which we can call B-space) could hold 10^(5,397,931) versions of the L-space that initially seemed so immense.
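For anyone who wants to check the arithmetic, here is a minimal Python sketch. The variable names are mine, chosen for illustration. Since 500^(2,000,000) is far too big to hold as an ordinary floating-point number, the sketch works with base-10 logarithms and reports only the exponents.

```python
import math

ALPHABET_SIZE = 500       # marks available for each character
PAGES_PER_BOOK = 500
CHARS_PER_PAGE = 4000
L_SPACE_BOOKS = 10**9     # the deliberately generous billion-book estimate

chars_per_book = PAGES_PER_BOOK * CHARS_PER_PAGE

# log10 of the number of possible books: log10(500^n) = n * log10(500)
log10_b_space = chars_per_book * math.log10(ALPHABET_SIZE)

# Dividing by a billion just subtracts 9 from the exponent.
log10_ratio = log10_b_space - math.log10(L_SPACE_BOOKS)

b_space_exponent = math.floor(log10_b_space)
ratio_exponent = math.floor(log10_ratio)

print(f"characters per book: {chars_per_book:,}")
print(f"possible books: about 10^{b_space_exponent:,}")
print(f"L-spaces that fit in B-space: about 10^{ratio_exponent:,}")
```

The exponents come out as 5,397,940 and 5,397,931, matching the figures above.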
Note that there are two distinct operations involved in the Monkey Theorem’s process:
- Generate a string of characters, and
- Identify a meaningful substring within that.
The monkey doesn’t care, it’s just typing.
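The two operations can be sketched as a toy simulation. This is only an illustration under assumptions of my own: a 27-key typewriter (lower-case letters plus a space), very short target strings, and a hypothetical helper called keystrokes_until. A real B-space book would take unimaginably longer.

```python
import random

def keystrokes_until(target, alphabet, rng, max_keystrokes=1_000_000):
    """Type random keys until `target` appears.

    Returns the number of keystrokes used, or None if the monkey
    gives up after `max_keystrokes` without producing the target.
    """
    window = ""
    for count in range(1, max_keystrokes + 1):
        # Operation 1: generate. The monkey hits a random key.
        # Only the last len(target) characters need to be remembered.
        window = (window + rng.choice(alphabet))[-len(target):]
        # Operation 2: identify. Someone checks the most recent keys.
        if window == target:
            return count
    return None

rng = random.Random(2024)  # seeded so a run can be repeated
alphabet = "abcdefghijklmnopqrstuvwxyz "
for target in ["a", "ap", "ape"]:
    print(repr(target), keystrokes_until(target, alphabet, rng))
```

Each extra character in the target multiplies the typical wait by roughly the size of the alphabet, which is why a two-million-character book is so far out of reach.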
In the second step of the process someone has to recognize Macbeth or The Tempest buried in all the nonsense. If we were walking through the stacks of B-space and pulled a book off the shelf, what are the odds that the book we grabbed belongs to L-space?
The answer is one in 10^(5,397,931). That’s a very small probability, BUT IT’S NOT ZERO. By construction, we’re guaranteed that all the L-space books are in B-space – but we have a vanishingly small chance of finding one of them.
Now for our extremely patient monkey who has been typing for a really, really long time. It’s been at it long enough to produce many, many copies of B-space. After all, even 10^(5,397,940) is a very small number compared to infinity.
The core of the Infinite Monkey Theorem is that with so much opportunity for duplication, we are guaranteed (in the probabilistic sense, that is, with probability 1) that there exists at least one complete and perfect copy of B-space, and so at least one good copy of L-space, and so at least one good copy of all the works of Shakespeare. Also, there’s at least one copy of “zqzqzqzq”.
The challenge is in laying hands on that one good copy. From a physicist’s perspective, it’s such a low-probability event that it can be ignored. On the other hand, the probability of Life arising on Earth was pretty low, too, but I’m glad it happened.
~~ Rich Olcott
* I had a great “Monkeys typing” graphic, but they were chimps. Pratchett’s Discworld Librarian would object, quite firmly, because apes aren’t monkeys.