Gambler's fallacy question

I understand the Gambler’s fallacy, but I’ve been working on a little self-imposed probability question, and my answer seems to fall into the Gambler’s Fallacy of a number being “due”. The math seems sound, though, so I don’t know whether my misunderstanding is with the fallacy, the math, or something else.

Say I have a true random number generator that generates (“rolls”) a number between one and a hundred. Let’s say I want to get some specific number (or one of a set of specific numbers, it doesn’t really matter, it just changes the size of the numerator), we’ll say 100 for simplicity. I don’t care how long it takes, I just want to get that number.

So the chance in any one roll of rolling 100 is 1/100. This means there’s a 99/100 chance of NOT rolling a 100.

Now, here’s where I can’t tell if I’m going wrong mathematically. So I want to know how much time to set aside to roll my 100. It seems to me that this logic is sound:

The chance of me rolling two 100’s in a row is (1/100)[sup]2[/sup]. Similarly, the chance of me NOT rolling a 100 in either of two rolls is (99/100)[sup]2[/sup].
The chance of me rolling at least one 100 within two rolls is therefore 1 - (99/100)[sup]2[/sup].

So it seems to me that 1-(99/100)[sup]x[/sup] is the chance that I’ll roll a 100 at least once within “x” rolls. So if I want to set aside enough time for myself to roll that number (for whatever reason), I just solve 1-(99/100)[sup]x[/sup]=.9 for x, and then multiply x by the time it takes per roll. We’ll just say that I assume 90% is enough confidence, and if I’m unlucky then I’ll set aside more time later.
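
Worked out numerically, that comes to about 230 rolls. Here’s the calculation as a quick Python sketch (the 90% threshold is just the figure I picked above; nothing special about it):

```python
import math

# The formula above: chance of at least one 100 within x rolls is 1 - (99/100)^x.
# Setting that equal to 0.9 and solving with logarithms: x = log(0.1) / log(0.99).
p_miss = 99 / 100
confidence = 0.90

x = math.log(1 - confidence) / math.log(p_miss)
print(x)                    # ~229.1, so budget about 230 rolls for 90% confidence
print(1 - p_miss ** 230)    # ~0.90, sanity check
```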

But it seems to me like this is assuming that 100 becomes more “due” over time. In fact, if you take the limit as x goes to infinity, rolling a 100 on a fair generator becomes inevitable given an infinite amount of time (probability of 1).

Again, it seems like this is assuming that rolling the number of my choice becomes “due”, even though each roll is an independent event. Can somebody explain whether it’s my math or my understanding of the fallacy, or what else is wrong please?

I can try.

Your mistake is switching between considering only the next event and considering the whole series.

If you hit two 100’s in a row, the chance AT THAT POINT of getting your next 100 is only 1 in 100. But the chance of that whole string of three in a row happening is 1 / 1000000.

A probability of 1 turns out not to mean inevitability. Conversely, a probability of zero turns out not to mean impossibility. For example, each point on a dartboard has a probability of zero of being hit–but one or the other point is nevertheless going to be hit, despite the zero probability.

As to the substance of your post, it’s true that a greater number of rolls increases the probability that one of those rolls will be 100. But once you start actually executing the rolls, given that none of the rolls so far has been 100, the chance of future rolls being 100 isn’t thereby raised. After each non-100 roll, you have to revise your probability calculation in light of the new information (basically subtracting one from your x after each roll).
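
To put numbers on that “subtract one from x” idea, here’s a small Python sketch (the 230-roll budget is just an assumed figure, taken from solving the OP’s 90% equation):

```python
# After each miss, re-do the calculation with only the rolls that remain.
# The chance of hitting 100 within the ORIGINAL budget keeps shrinking,
# because fewer rolls are left; the misses don't make 100 any more "due".
budget = 230   # assumed: the roll budget from the 90% calculation above
for misses in (0, 50, 150, 229):
    remaining = budget - misses
    print(misses, "misses so far ->", 1 - 0.99 ** remaining)
```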

That sounds about right. It’s the difference between “you know that’s going to take forever, right?” and “This fair die has never rolled a one after 10k rolls, man, use it, you can’t afford to roll a 1!”

Right, I’ve actually done the latter calculation before. If you treat the probability over an area as an integral, the chance of hitting any one exact point is an integral from x to x, which is 0, even though clearly it COULD happen at that point. I didn’t actually know about the “almost surely” terminology, though.

Of course, “infinite monkeys banging on infinite typewriters for all eternity will almost surely produce all the works of William Shakespeare” isn’t quite as catchy.

This is a classic probability problem dating back to the 17th century. Some French degenerate used to hustle his buddies by betting that he could roll a six in four rolls. When they got tired of losing to him, he decided to extend the game to two dice, betting on how many rolls it would take to roll double-sixes. He had to bring in his friend Blaise Pascal to work out the exact mathematical details.

Anyway, for your problem: what would be a fair number of draws before one could reasonably expect to hit the 1/100 event?

0.5 = (.99)[sup]x[/sup]
Taking logarithms of both sides and solving gives:
x = 68.9676.

Each draw is still 1/100 so the odds never improve. The old maxim ‘dice have no memory’ is what is meant by independent trials.
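
If anyone wants to check that ~69-roll figure empirically, here’s a quick simulation sketch in Python (rolls_until_hit is just a helper I made up for this; results wobble a bit from run to run):

```python
import random
import statistics

# Simulate how many 1-100 rolls it takes to hit the target for the first time,
# and check the median against the ~69 figure worked out above.
def rolls_until_hit(target=100, sides=100):
    count = 0
    while True:
        count += 1
        if random.randint(1, sides) == target:
            return count

trials = [rolls_until_hit() for _ in range(100_000)]
print(statistics.median(trials))   # typically right around 69
print(statistics.mean(trials))     # around 100, the expected number of rolls
```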

standingwave, running bad for thirty years…

Rolling the dice 1000000 times without getting a 100 is extremely unlikely. So if you are going to roll the dice 1000000 times, you are “due” to get a 100 unless you are very, very unlucky.

But if you rolled the dice 999999 times and didn’t get a 100, then you have been very unlucky already and there is still only 1/100 chance to get a 100 on the next roll.
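
To put a rough number on just how unlucky that would be, here’s a small Python sketch (it works with logarithms because the probability itself underflows an ordinary float):

```python
import math

# Chance of 999,999 straight misses at 99/100 per roll. It's far too small for
# an ordinary float, so work with the base-10 logarithm instead.
log10_p = 999_999 * math.log10(0.99)
print(f"about 10^{log10_p:.0f}")   # roughly 10^-4365: "very, very unlucky"
# The chance of a 100 on the next roll is still exactly 1/100 regardless.
```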

What I’ve found through actual tests is that the observed frequency always comes out close to the stated probability, given enough samples. Say you do 10k rolls with a 1% chance: you nearly always end up, after the 10k, with close to 100 of the targeted roll.

However, on subsets of the 10k, you can get high variance. One subset of 1k (10 expected hits) might range anywhere from 0 to 20. Also, the sequence of hit counts might look something like 1, 2, 10, 20, 1, 1, 1. Therefore, on subsets of 1000, below-average success is not indicative of above-average success on the next subset, and vice versa.
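
Here’s roughly that experiment as a Python sketch, for anyone who wants to reproduce it (the block counts come out different on every run, naturally):

```python
import random

# 10,000 rolls at a 1/100 chance, counted in blocks of 1,000: the overall total
# lands near 100, but the individual blocks swing a lot from one to the next.
rolls = [random.randint(1, 100) == 100 for _ in range(10_000)]
blocks = [sum(rolls[i:i + 1_000]) for i in range(0, 10_000, 1_000)]
print(blocks)        # e.g. [7, 13, 10, 15, 6, 9, 12, 8, 11, 14]; varies every run
print(sum(blocks))   # usually somewhere near 100
```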

Yes! And similarly, suppose you have a random number generator that chooses a real number uniformly in the range [0..1)

Every number in that range has a 0 probability of being chosen. Yet one of them will be chosen.

Now, I’ve built a random number generator that generates truly random numbers in the range of .99999… to 1 – Let the wagering begin! :stuck_out_tongue:

I think that article is somewhat misleading in not being as clear as it might be about the limitations of that theory. It’s for a specific subset of probability theory, and is not necessarily applicable in real life, or in a great deal of other probability theory.

For example, an event with an actual probability of 0 cannot happen. However, an event with a probability so small we cannot meaningfully calculate it CAN happen. There are not infinitely many ways a dart can hit the dartboard. The number of points at which the dart could penetrate is finite, albeit very large. The probability may be “almost never,” but it is not zero. Likewise, a probability of 1 does mean it is certain, but we know that most events do NOT have a probability of 1, simply a very high probability which approaches 1 to a degree we can’t calculate.

First, the OP’s calculations are a bit off; the right way to set this up is given in any description of the geometric distribution.

This is it exactly. It’s best to think of an infinite sequence of die rolls that you stop looking at after you see the first 100. In any finite prefix of the sequence, anything can happen, but if you look at a long enough stretch of it, the proportion of rolls that come up 100 is going to be pretty close to 1/100. The law of averages doesn’t work by remembering past surprises and canceling them out; it works by swamping them with enough unsurprising outcomes that they become insignificant.
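
Here’s a small Python sketch of that swamping behaviour, using the 1/100 target from the OP (my own illustration, nothing more):

```python
import random

# The running proportion of 100s heads toward 1/100 not because past deficits
# get paid back, but because they get diluted by ever more ordinary rolls.
hits = 0
for n in range(1, 1_000_001):
    hits += (random.randint(1, 100) == 100)
    if n in (1_000, 10_000, 100_000, 1_000_000):
        # the proportion closes in on 0.01, while the raw gap from "expected" need not shrink
        print(n, hits / n, hits - n // 100)
```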

Your confusion arises from mis-defining “over time”. In your head, it seems like it means “My second roll is more probable than my first, and my third even more so.” But in the math, it actually means “Two rolls are more likely to get me a 100 than one, and three rolls even more so.” That depends on you not having rolled yet. If you’ve already rolled the dice once, then the probability of that first roll being a 100 is either 1 or 0. It’s no longer 1/100, so you can’t mathematically treat it as such.

What surprises me is how many people who believe a result is due also believe in hot streaks. The two beliefs rely on contradictory logic. The gambler’s fallacy falsely argues an event is more likely to occur if it hasn’t occurred recently. The hot streak fallacy falsely argues that an event is more likely to occur if it’s recently happened several times. Even if you don’t understand probabilities and believe one of these is true, how can you believe in both of them?

I see. I was calculating “out of k rolls, what’s the probability that one of them is the number 100?” (cumulatively) What I should have been calculating was the geometric distribution ultrafilter linked to, which accurately describes “what is the probability that I need k rolls to generate the number 100?”

Or maybe it would be more accurate to say “what’s the probability that 100 is in the set of generated numbers at least once”?
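
For what it’s worth, those two questions line up exactly; here’s a small Python check using the standard geometric-distribution formulas with p = 1/100 (just my own sketch):

```python
p = 1 / 100

for k in (1, 10, 69, 100, 230):
    pmf = (1 - p) ** (k - 1) * p          # P(the first 100 arrives exactly on roll k)
    cdf = sum((1 - p) ** (i - 1) * p for i in range(1, k + 1))   # P(need at most k rolls)
    at_least_once = 1 - (1 - p) ** k      # the formula from the original post
    print(k, round(pmf, 6), round(cdf, 6), round(at_least_once, 6))
# The cdf and the original formula agree: "needing at most k rolls" is the same
# event as "hitting 100 at least once in k rolls".
```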