What texts or images have we found in pi?

That’s certainly true of a true random number generator, but is it true for pi? I’m probably wrong, but it seems to me that, by the nature of the way pi is calculated, there could be sequences that couldn’t happen.

See post #6.

It is a hobby, and there is a club of sorts. Finding messages in pi lies at the convergence of recreational mathematics and recreational linguistics, both of which subjects are supported by small but very enthusiastic (not to mention prolific) communities. Among the many books and journals devoted to such things, Word Ways: The Journal of Recreational Linguistics has probably published the most on words embedded in pi and other irrational numbers. Offhand I can think of the following articles, all of which are old enough that they should be free to read on the journal’s archive without a subscription:
[ul]
[li]Base 27: The Key to a New Gematria by Lee Sallows (Vol. 26, No. 2)[/li]
[li]The Pi Code by Mike Keith (Vol. 32, No. 4)[/li]
[li]Cracking The Pi-Code by Dave Morice (Vol. 32, No. 4)[/li]
[li]Further Cracks in the Pi-Code by Mike Keith (Vol. 33, No. 1)[/li]
[li]Pi Words by Howard Bergerson and Mike Keith (Vol. 34, No. 1)[/li]
[li]Embedding Many Words in Pi by A. Ross Eckler (Vol. 39, No. 2)[/li]
[/ul]
I won’t repeat all their findings here, which probably run to a couple dozen pages, though I can promise you that the articles are all quite accessible, and probably quite interesting if you’re into linguistic and mathematical oddities.

Mike Keith, who wrote many of the above articles and who’s a coauthor of mine on some further work in this general area, also maintains a website devoted to recreational mathematics and linguistics. One of the pages on it is devoted specifically to messages in pi: The Pi Code

Don’t image formats tend to have a header? We could look at all the formats in use. If there’s a format that has a header of just a few bytes, followed by the data, we could just search for the header, and take whatever follows as the image data.

ETA:

Fortunately, I have Flash set to require me to click to play.

Gotta practice safe surfing, dontcha know…

Okay, understood.

Well then, when I open the gif image of the smiley in a simple text editor application, it gives me several lines of ASCII code. I’m almost positive a search for that string would easily extend beyond billions, if not quadrillions, of places of pi.

Pi, if it is normal, should contain a googolplex of consecutive zeros.

ASCII is pretty inefficient, though. For starters, one might use a 5-bit code instead, which is sufficient for letters and basic punctuation. A very simple Huffman code would reduce the average to around 4 bits/character. A context-sensitive arithmetic code can reach maybe 2.2 bits/character. That’s already reaching brute-force levels (~50 bits), but the theoretical limit of English text is more like 1 bit/character, which would make this passage easy to find.
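To put rough numbers on those rates, here is a small sketch. The passage length of 23 characters is my assumption (it is what makes 8-bit ASCII come to 184 bits and 2.2 bits/character come to about 50 bits), and the function names are made up for illustration. The "positions to scan" figure is just 2 to the power of the bit count, a back-of-the-envelope estimate for a random-looking stream.

```python
# Bits needed for a short passage under the per-character rates quoted above.
# PASSAGE_LENGTH is an assumption: 23 characters, i.e. 184 bits at 8 bits each.
PASSAGE_LENGTH = 23

# Approximate rates taken from the post; the labels are only descriptive.
BITS_PER_CHAR = {
    "8-bit ASCII": 8.0,
    "5-bit fixed code": 5.0,
    "simple Huffman (approx.)": 4.0,
    "context-sensitive arithmetic (approx.)": 2.2,
    "entropy estimate for English (approx.)": 1.0,
}


def total_bits(n_chars, bits_per_char):
    """Total bits for n_chars characters at the given average rate."""
    return n_chars * bits_per_char


def expected_search_size(bits):
    """Rough count of positions to scan before a specific pattern of
    `bits` bits is expected to appear in a random-looking stream."""
    return 2.0 ** bits


if __name__ == "__main__":
    for name, rate in BITS_PER_CHAR.items():
        bits = total_bits(PASSAGE_LENGTH, rate)
        size = expected_search_size(bits)
        print(f"{name:40s} {bits:6.1f} bits  ~{size:.2e} positions")
```

The jump from 184 bits to roughly 50 bits is the difference between a hopeless search and one that is merely enormous.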

You could interpret the digits of pi so that 2–8 equal shades of gray, 1 and 9 equal white and black respectively and 0 would indicate starting a new row of pixels. That way this would give you a simple B&W smiley face:

“…11111101911910911119019999101111110”
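A quick sketch of how that scheme decodes, if anyone wants to try it. The character ramp used for the gray levels is my own arbitrary choice, and the function names are hypothetical; only the rule (0 ends a row, 1 is white, 9 is black, 2 through 8 are grays) comes from the post.

```python
def decode_rows(digits):
    """Split a digit string into pixel rows.

    '0' ends a row; digits 1-9 are brightness levels (1 = white, 9 = black).
    Empty rows (from consecutive or trailing zeros) are dropped.
    """
    rows = [row for row in digits.split("0") if row]
    return [[int(ch) for ch in row] for row in rows]


def render(rows):
    """Draw the rows as text using an assumed 9-step character ramp."""
    ramp = " .:-=+*%#"  # index 0 is digit 1 (white), index 8 is digit 9 (black)
    return "\n".join("".join(ramp[level - 1] for level in row) for row in rows)


# The digit string from the post, without the leading ellipsis.
SMILEY = "11111101911910911119019999101111110"

if __name__ == "__main__":
    print(render(decode_rows(SMILEY)))
```

Run on the string above, this gives five rows of six pixels: two eyes, then a curved mouth.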

Yes, I’m aware that this all depends on the encoding scheme used, which is precisely why I specified a coding scheme. I could have picked any of a great number of encodings, but chose ASCII because it’s the most standard way of converting text to numbers. In any event, the encoding must be chosen before you start searching, or you can just say after the fact that you’re picking an encoding where “14” means the text of the complete works of Shakespeare.

And while we don’t know that pi is normal, we do have strong reason to suspect it, in that the vast majority of real numbers are normal, and there isn’t any reason to suspect that pi is not normal.

It’s odd to me that people are so interested in finding strings in π. Why not search for strings in the Champernowne constant instead (where you’re guaranteed to find them, and about as quickly as you’d expect from a random string of digits)? Too easy?

Well, “vast majority” according only to some particular distribution which has no particular connection to the distribution of human interest in arithmetic constants. In the same sense, the “vast majority” of real numbers are uncomputable, etc., contra the vast majority of arithmetic constants of longstanding mathematical interest.

There are measures on the real numbers such that the normal numbers have measure zero, so why should we not just as well consider ourselves to have strong reason to suspect pi is non-normal? Put another way, consider a bijection f between the normal and the non-normal numbers (both having, classically, the cardinality beth_1). Can we not just as well apply this kind of argument to suspect f(pi) is normal? But that would amount to suspecting pi is non-normal!

Right. Which is why this problem is actually a bit boring when you think about it. Inefficient codings guarantee that nothing of any interest will be found (184 bits is simply unreasonable, not to mention the bits you need for a paragraph or book). Efficient codings have the problem you describe (hell, we can encode every book with an ISBN number as… the ISBN number). In between we have codings (like Huffman) which are moderately efficient and somewhat general, but are still optimized toward a single language, etc.

And as Indistinguishable says, the Champernowne constant also contains every possible string, with the added advantage that we know exactly where that string starts. That seems even more boring, for some reason… but why? It’s just as good a number as pi.
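For what it’s worth, here is a minimal sketch of the “we know exactly where it starts” point. It indexes the digits after the decimal point of 0.123456789101112… from zero; the function names are mine, and `guaranteed_position` only handles digit strings without a leading zero. It gives the spot where the number itself gets written out, which is an upper bound on the first occurrence (the string may also show up earlier, straddling two numbers).

```python
from itertools import count


def champernowne_digits():
    """Yield the digits of 0.123456789101112... one character at a time."""
    for n in count(1):
        yield from str(n)


def first_occurrence(target, limit=10_000_000):
    """0-based index of the first occurrence of `target`, found by brute
    force; returns None if it is not seen within `limit` digits."""
    window = ""
    for i, digit in enumerate(champernowne_digits()):
        if i >= limit:
            return None
        window = (window + digit)[-len(target):]
        if window == target:
            return i - len(target) + 1


def guaranteed_position(target):
    """0-based index at which the integer `target` itself is written out."""
    if not target.isdigit() or target[0] == "0":
        raise ValueError("target must be a digit string with no leading zero")
    k, n = len(target), int(target)
    # Digits used by all numbers with fewer than k digits...
    shorter = sum(9 * 10 ** (d - 1) * d for d in range(1, k))
    # ...plus the k-digit numbers that come before n.
    return shorter + (n - 10 ** (k - 1)) * k


if __name__ == "__main__":
    for s in ["10", "91", "314"]:
        print(s, first_occurrence(s), guaranteed_position(s))
```

No searching is actually required: the closed-form position always works, which is exactly what makes it feel like cheating.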

Well, yes, but we do have strong reason to suspect (warning: extreme understatement) that pi is computable.

I was not aware of the existence of these measures.

In any event, there is one other line of evidence that suggests pi’s normality: All of the finite sets of digits we’ve collected appear statistically consistent with normality. Now, granted, that’s not proof and can’t really be extended into a proof, since we can only ever sample finite sets and not the full infinite one, but it’s still suggestive.
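If you want to see this kind of check on a small scale, here is a sketch that counts digit frequencies in the first few thousand digits. The digit generator is a standard unbounded spigot (the Gibbons-style one), not anything specific to this thread, and the chi-square helper is a plain textbook statistic with names I made up. A small sample like this proves nothing about normality; it only shows what "statistically consistent" looks like.

```python
from collections import Counter
from itertools import islice


def pi_digits():
    """Unbounded spigot for the decimal digits of pi: 3, 1, 4, 1, 5, 9, ..."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


def digit_counts(n_digits):
    """Count how often each digit 0-9 occurs in the first n_digits digits."""
    return Counter(islice(pi_digits(), n_digits))


def chi_square(counts, n_digits):
    """Chi-square statistic against a uniform distribution over 0-9."""
    expected = n_digits / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


if __name__ == "__main__":
    n = 2000
    counts = digit_counts(n)
    for d in range(10):
        print(d, counts.get(d, 0))
    print("chi-square (9 degrees of freedom):", round(chi_square(counts, n), 2))
```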

Because pi is taught in grade school and is one of the most practical numbers taught so early, making it incredibly widely-known, familiar and easy to understand.

My date of birth occurs 4 times in the first 200 million digits of pi. That clearly means…something. Now I just have to figure out how to translate the rest of it to get the full story.

I think that one reason people are so fascinated by finding things in pi (or other irrational numbers, but pi is the best-known irrational number) is that they think that it would represent a data compression scheme. “Wait, so instead of dealing with all that information, I could just say what digit it starts on, and find it all already there?”. They don’t realize that specifying the digit it starts on would take just as much information as the string itself.
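A rough version of that argument, assuming the digits behave like independent, uniformly random digits (which is what normality would suggest, but is not proven for pi):

```latex
\begin{align*}
\Pr[\text{a given $k$-digit string starts at a given position}] &= 10^{-k} \\
\text{typical position $N$ of the first occurrence} &\sim 10^{k} \\
\text{digits needed to write down $N$} &\approx \log_{10} N \approx k
\end{align*}
```

So the pointer into pi is, on average, about as long as the string it points to, and nothing has been compressed.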

Sure. I mean, I get it. I understand why this is a phenomenon of human behavior.

Still, I think it’s worth noting that 012345678910111213141516… is an even easier to understand string of digits (and, in some sense, more familiar and earlier taught), and guaranteed to contain what you’re looking for. This is my gift to all the tired π-searchers, brows sweaty and arms aching from their silly labor. You can finally relax. Enjoy!

Right, which is why I think pointing out a simple string of digits with manifestly the properties they’re hoping for from pi, and just as manifestly no use in compression, is useful to them.

The Champernowne constant lacks mystery and intrigue. Pi has its association with a pure geometric figure we are drawn to and draw. There’s something compelling about pi… its randomness.
We’re pretty sure every finite string is in there somewhere… but is it really? If I needed or wanted to, for whatever reason, can I find it? Does it speak something about nature itself? And, y’know: Pie.

When you’re searching pi for them, it’s my bet it’ll be the only place you’ll find them. :wink: