Odds concerning monkeys and the lottery

It depends on the “randomness” of whatever is generating your output. With monkeys typing, you could be unlucky: they could produce the same wrong output many times before they ever got around to the right one. In cases where you’re systematically exhausting the possibilities, you’re guaranteed to hit the right output before you reach the time limit, because you never repeat anything.

This is a common metric for people who brute-force encryption keys. If you’re systematically testing each encryption key, then you will eventually find it. You can’t get unlucky and have to wait longer than your max limit, because you design your algorithm not to repeat trials. In these cases, the rule of thumb is that you will probably find the right key when you’ve run through half the key space. You might have to test them all, but you might hit it on the first try. This measure of “likely” is only really relevant if you’re trying to brute-force a lot of keys and want an average requirement. Since you’re going to hit some early and some late, your average requirement is testing half the possibilities.
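If it helps to see the difference, here’s a toy simulation (the 16-bit “key space” is just a stand-in so it finishes in seconds; nothing about real ciphers is implied):

```python
import random

KEY_SPACE = 2**16   # toy key space; real ones are astronomically larger

def exhaustive_trials(secret):
    """Try keys 0, 1, 2, ... in order: no repeats, so it always finishes,
    and on average it hits about halfway through the space."""
    for attempt, key in enumerate(range(KEY_SPACE), start=1):
        if key == secret:
            return attempt

def monkey_trials(secret):
    """Guess uniformly at random, repeats allowed: no upper bound on the wait."""
    attempt = 1
    while random.randrange(KEY_SPACE) != secret:
        attempt += 1
    return attempt

runs = 200
print("exhaustive search, average trials:",
      sum(exhaustive_trials(random.randrange(KEY_SPACE)) for _ in range(runs)) / runs)
print("random guessing, average trials:",
      sum(monkey_trials(random.randrange(KEY_SPACE)) for _ in range(runs)) / runs)
```

The first average comes out near half the key space; the second comes out near the whole key space, and any individual run of it can take arbitrarily long.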

On the other hand, with monkeys typing there is no guarantee. Their quality control is notoriously lax.

You’re correct. No particular sequence must come up in a finite length of time. Even with an infinite amount of time, it isn’t guaranteed (according to traditional probability theory). This was the point ultrafilter stated more formally.

True. I was changing the rules a bit. But then, you’d need a copy of Hamlet anyway to determine if one of the monkeys had actually recreated the play.

The initial argument seemed to be that enough randomness will eventually create any desired result. My addition of “guided randomness” would simply help the process arrive at that result much more quickly.

Besides, you wouldn’t really need a copy of Hamlet to help guide the monkeys along. Even some simple rules would speed things up considerably (such as tossing out a “v” that followed a “g” or a third “p” in a row).
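To make the “simple rules” idea concrete, here’s a toy sketch; the banned digrams are made-up examples rather than the result of any real analysis of English:

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "   # 27 "keys": letters plus space

# Made-up examples of simple rules: digrams to throw out, plus a ban on
# typing the same character three times in a row.
BANNED_DIGRAMS = {"gv", "qx", "jq", "vq", "zx"}

def guided_monkey(length):
    """Random typing, but any keystroke that breaks a rule is discarded and retried."""
    text = ""
    while len(text) < length:
        ch = random.choice(ALPHABET)
        if text and text[-1] + ch in BANNED_DIGRAMS:
            continue            # rule 1: toss out banned digrams
        if text[-2:] == ch * 2:
            continue            # rule 2: no character three times in a row
        text += ch
    return text

print(guided_monkey(60))
```

It still won’t type Hamlet, but it wastes less time on strings no English text could contain.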

Couple of thoughts (I’m not a statistics guru):

  1. How are the odds changed if you give the monkeys a keyboard with only 27 keys on it (letters plus a space bar)?

  2. They’d be just as likely to type Hamlet on their first try as on any other try, correct?

  3. In the “Yoshitsune” example (gotta remember that name), the time given is the max, right? Is there any way to calculate the expected time? I would think that there is a 50% chance your name would be found in half the time. Is this correct?

If you want to actually calculate the odds, you have to define what character set you’re using. If you allow the monkeys to type a lot of characters which aren’t part of the desired output, the odds against you go up. If you limit them to the characters that appear in the output you desire, the odds are better, simply because you’re excluding characters which could only contribute to a failure. Of course, if you’re going to be picky about capitalization and punctuation, you’re dealing with more than 27 characters, more like 56+ depending on the actual punctuation marks which appear in Hamlet.
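To put rough numbers on questions 1 and 2: if every keystroke is an independent, equally likely choice from k characters, the chance that one N-character attempt is a perfect match is (1/k)^N. A quick back-of-the-envelope calculation, using the ~200,000-character figure for Hamlet that comes up later in the thread:

```python
import math

HAMLET_LENGTH = 200_000   # rough character count of Hamlet

for k in (27, 56):
    # (1/k)**N underflows ordinary floats, so report its size as a power of ten:
    # the chance per attempt is roughly 1 in 10**digits.
    digits = HAMLET_LENGTH * math.log10(k)
    print(f"{k}-character keyboard: about 1 in 10^{digits:,.0f} per attempt")
```

So the 27-key keyboard improves each attempt’s odds over a 56-key one by a factor of roughly 10^63,000, every attempt is exactly as (un)likely as every other, and none of them is anywhere near likely.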

Yes.

See my earlier post. The time is the max if you’re preventing the monkeys from ever duplicating their work. This requires highly trained monkeys with good memories. If you’re restricting your trials so they’re unique (no repeats), then there’s a 50% chance you’ll hit it in the first half of the trials, since the right answer is equally likely to turn up at any point in your search order.

You’re absolutely right, the 4 million years was just a rough guess of about how long it would probably take, within an order of magnitude or so. It could take much longer, and there’s no guarantee that it will happen in any finite amount of time.

Regarding ultrafilter’s post: while it’s true that, given an infinite number of possible outcomes, there is a chance that a particular outcome will never happen even given an infinite amount of time, I don’t think it applies here. If you just look at Hamlet-sized blocks of text (about 200,000 characters), then the number of possible samples is finite, though extremely large. Given an infinite amount of time, therefore, a perfect copy of Hamlet would have to come up at some point.
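In symbols, and treating successive Hamlet-sized blocks as independent (ignoring a copy that straddles two blocks), the argument is roughly:

```latex
% p: chance that a single 200,000-character block is a letter-perfect Hamlet
p = 27^{-200\,000} > 0
% chance that none of the first n independent blocks is a copy
P(\text{no copy among the first } n \text{ blocks}) = (1 - p)^n \to 0
  \quad \text{as } n \to \infty
% so
P(\text{a copy appears in some block}) = 1
```

(Whether “probability 1” is the same as “would have to” is exactly what the next few posts chew on.)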

(My favorite short story is Library of Babel. Does it show?)

That’s an interesting point. However, I don’t think the argument works here, since it’s ‘possible’ that the monkey hits ‘a’ every time. Then he never gets Hamlet. IIRC, the monkey typing ‘aaa…’ is an example of what ultrafilter said: it has probability 0, yet it can happen. So is any other particular sequence of letters.

Well, I’m having a hard time getting my mind around this. That seems to violate the nature of infinity that I’ve come to know and love. The idea that something with a probability of 1 may not ever occur in an infinite number of tries is curious enough, but the idea that something with a probability of 0 can somehow occur is just… NO WAY!

The only thing that keeps me from snorting in indignation at the very notion is that I’ve been fooled by the nature of infinity before.

Yeah, it’s bad to intuit about the nature of infinitary processes. They’re weird.

Would it help if I told you that every infinite sequence of characters has probability 0?
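If it helps, here’s the calculation behind that claim, assuming each keystroke is an independent, uniform pick from a k-character keyboard:

```latex
% Fix any particular infinite sequence s.  Producing s requires matching its
% first n characters, and those n keystrokes are independent, so for every n:
P(\text{output} = s) \le P(\text{first } n \text{ characters match}) = k^{-n}
% Since k^{-n} \to 0 as n \to \infty, the only nonnegative value that works is
P(\text{output} = s) = 0
```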

Your chances of winning the lottery are only marginally improved by actually purchasing a ticket.

To the OP, may I suggest reading How to Lie With Statistics? Additionally, The Cartoon Guide to Statistics is very good, and Calculated Risks is super (IMO).
But don’t let that stop you from posting questions!

Let me recount the reasoning in a pseudo-Socratic dialogue because I’m bored :)

Shade: Let’s see. The probability of ‘aaaa…’ must be zero because it’s so unlikely that all the letters are ‘a’.
Shade: Does everything have probability zero?
Shade: No, something must happen. Umm… ummm… yeah, ‘abadfijzasetpq…’ might be better.
Shade: But you’re actually thinking of all sequences like that. If you pick THAT sequence (assuming we know how an infinite sequence ends), it must be probability 0; it’s not more likely that the sequence starts ‘abad’ than ‘aaaa’ [ed: if you don’t get this bit, read one of the many dialogues on the subject :)] so why should it be different for long ones?
Shade: OK, OK, so any sequence is prob 0.
Shade: It must be!
Shade: But… SOMETHING must happen!
Shade: Well, yes. Something with infinitesimal probability.
Shade: Umm…
Shade: Colloquially.
Shade: OK. And all these zeros add to one?
Shade: I don’t like it, but what else could happen?
Shade: AHA! It’s like a line: the length of any point is zero, but they can be added up to get a length!
Shade: Yeah! Right! I think…
Shade: Cool.
Shade: But I bet the maths works out wrong. You can’t deal with infinities that easily.
[Current state of mathematical research in the world ex machina]: Actually, I did the maths earlier, and it wasn’t easy, but eventually I got that answer. Your intuition was, surprisingly, right.
Shade: But… my intuition also says other stuff. Shouldn’t we be dealing with infinitesimal (‘just’ over zero) numbers?
[maths]: Well, maybe. But trust me, the first way works out more useful in the end. Go with it.
Shade: OK.
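For what it’s worth, here’s roughly how the standard setup handles Shade’s “zeros adding to one” worry; this is a sketch that skips all the measure-theoretic fine print:

```latex
% Probabilities are only required to add up across countably many disjoint events:
P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
% The set of infinite letter-sequences is uncountable, so "add up the zero
% probability of each individual sequence" is not a sum this rule applies to,
% and there is no conflict with the whole space having probability 1.
% It is the same picture as the line: each point has length 0, yet an interval
% has positive length, because an interval is an uncountable collection of points.
```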

But it’s not 0, it’s damn close to 0 but not 0. Or to put it in terms of the little calculus that I remember, it approaches zero as the sequence’s length approaches infinity. Add up an infinite number of things damn close to 0 and you get 1, I guess.

Firstly, sorry - my internal dialogue was just supposed to show the thought process, not be perfectly rigorous. Perhaps I should have made clear exactly what is normally accepted.

But I assure you that in by far the most standard and almost universal way of defining probabilities, the probability of an event is a real number.

If so, the probabilities in this case can be consistently defined, and the prob. of ‘aaa…’ would be 0, but it ‘can still occur’. I’m sure you could define probabilities using infinitesimals (numbers ‘damn close to zero’, i.e. not zero, but smaller than every positive real number) somehow, but I suspect that doing so is a can of worms.

Also, remember that the ‘it approaches’ type of argument is risky. For instance, P(first letter is ‘a’), P(first two letters are ‘a’), P(first three letters are ‘a’), etc. approach 0, and P(all letters are ‘a’) is indeed 0. But P(the first three letters contain runs of ‘a’s of every finite length) is 0, and it stays 0 if we replace ‘three’ with any other finite number, yet P(the whole infinite sequence contains runs of ‘a’s of every finite length) is 1. Sorry I don’t have a better example, but basically, I’ve learnt the hard way to be very careful with infinity.
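An attempt to restate that example in symbols (a sketch; A_n and A are just shorthand for the events described above):

```latex
% A_n: "the first n letters contain a run of a's of every finite length".
% A run longer than n can't fit in n letters, so
P(A_n) = 0 \ \text{for every } n, \qquad \lim_{n \to \infty} P(A_n) = 0
% A: "the whole infinite sequence contains a run of a's of every finite length".
% For a fixed run length m, split the sequence into disjoint length-m blocks;
% each block is all a's with probability 27^{-m}, independently, so
P(\text{no all-}a\text{ block among the first } N) = (1 - 27^{-m})^N \to 0
% i.e. runs of every fixed length m appear with probability 1, and intersecting
% over the countably many choices of m still gives
P(A) = 1
```

So the limit of the finite-prefix probabilities is 0, while the probability of the corresponding statement about the whole infinite sequence is 1; that’s the sense in which the ‘it approaches’ argument can mislead.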

I’m afraid I don’t know what you’re saying there, but I do agree that one should be very careful with infinity.