This reminds me of something that W. R. Bennett wrote in his wonderful article “How Artificial is Intelligence?” (American Scientist 65 (6), 694-702, Nov.-Dec. 1977). It was his take on the question of how long it would take a roomful of monkeys with typewriters to reproduce the works of Shakespeare, only he used computers and a random number generator in place of the monkeys and typewriters. After trying truly random letters, he started using random numbers and a table of probabilities for each letter, which gave somewhat better results. Then he used a table of the probabilities of pairs of letters, then probabilities of three letters, then four letters, and so on. By the time he got to “fifth order monkeys”, he was getting lots of words (often surprisingly obscene), and if he was using a foreign language, it would look like real text to someone who didn’t read it.
Of course, he was using some text to generate the probabilities, and the obvious question was “how many tries until you reproduce the original text?” Unless the order of your monkeys approached the length of the sampled text, he determined that it was highly unlikely that you ever would. There wasn’t enough “noise” and randomness in his pseudo-random number generator to achieve this.
Zeroth-order (totally random letters and spaces) - ‘vydyrchoz e znvdlyvzrawzsuukretahbcwsizalkuvfnmrtuh’
First-order (English probabilities) - ‘d sdho e nyem w sa tuulsetahad ti anlvyennstuhtsedo’
Second-order (Pair probability) - ‘ue pamo wa whofo t rau s thig amad so ang thertusis cap helves’
Third-order (Triplet probability) - ‘rovere plety so dambelis to cardults priferibles hint ifingurn itionly whin’
Fourth-order (Foursome probability) - ‘up said we whole troduck skulledge peral stand ster’
Fifth-order (Fivesome probability) - ‘up the worth the sweat prolonialist hold staring good luckles properly’
Second-order is at least pronounceable (mostly), while third-order and up look distinctly English.
(I used Moby-Dick, Swann’s Way (the English translation) and other out-of-copyright writings from Project Gutenberg. Their age probably has something to do with the way the pseudo-English seems a bit old-fashioned.)
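For anyone who wants to play along, here is a minimal sketch in Python of how an n-th order monkey could be put together. It is not Bennett's program, nor the code used for the samples above; the names (`clean`, `build_table`, `monkey`) and the 27-character alphabet are assumptions made for illustration. Following the numbering above, order 0 is uniformly random, order 1 uses single-letter frequencies, and order n picks each character based on the n-1 characters before it.

```python
import random
from collections import defaultdict

# Assumed alphabet: lowercase letters plus space (a choice made for this sketch).
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def clean(text):
    """Lowercase the text, turn anything outside ALPHABET into a space,
    and collapse runs of spaces into one."""
    kept = "".join(c if c in ALPHABET else " " for c in text.lower())
    return " ".join(kept.split())

def build_table(text, order):
    """Map each context of (order - 1) characters to the list of characters
    that followed it in the sample. Repeats in the list act as weights."""
    k = order - 1
    table = defaultdict(list)
    for i in range(len(text) - k):
        table[text[i:i + k]].append(text[i + k])
    return table

def monkey(sample, order, length, seed=None):
    """Generate `length` characters of order-`order` pseudo-text.
    Assumes the cleaned sample is non-empty and longer than the order."""
    rng = random.Random(seed)
    if order == 0:
        # Zeroth order: every character equally likely, no sample needed.
        return "".join(rng.choice(ALPHABET) for _ in range(length))
    text = clean(sample)
    k = order - 1
    table = build_table(text, order)
    # Seed the output with a context that really occurs in the sample.
    start = rng.randrange(len(text) - k + 1)
    out = text[start:start + k]
    while len(out) < length:
        context = out[len(out) - k:]
        # If this context only occurred at the very end of the sample,
        # fall back to plain letter frequencies for one character.
        followers = table.get(context) or text
        out += rng.choice(followers)
    return out[:length]

if __name__ == "__main__":
    demo = "the quick brown fox jumps over the lazy dog and the cat sat on the mat"
    for n in range(6):
        print(n, repr(monkey(demo, n, 50, seed=1)))
```

To get anything resembling the samples above, pass in the full text of a long book as `sample` instead of the one-line demo string; with a short sample, the higher orders mostly just copy stretches of the original.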
That’s really cool.*
Similar is this early website (still works!) inspired by the Borges story “The Library of Babel.”
Letter combos that happen to be English words are highlighted, if you click “Anglishize” upon reaching a page.
Good luck in finding the answer to the riddle of the universe!
*“Prolonialist” … hmm… Someone who favors Spanish canvas (lona)?