I still cannot intuitively see why the chance of at least one occurrence is not 30%. I don’t doubt your math, but each is 1/100. 30 times that is .3 or 30%. I know you are correct, but simply cannot visualize it.
Can you intuitively see why the chance of at least one occurrence in 100 trials is not 100%?
Can you intuitively see why the chance of at least one head in two coin flips is not 100%? After all, the chance for each coin is 1/2, and 2 times that is 1 or 100%.
That’s irrelevant.
If I’m rolling a hundred-sided die, I can expect to roll a 72 roughly one time out of a hundred. The fact that each trial is independent of the previous one doesn’t mean anything.
If it’s a thousand-sided die, I can expect to roll a 72 roughly one time out of a thousand.
And if I’m rolling a 200-million-sided die, I can expect to roll a 72 roughly one time out of 200 million.
The math doesn’t change just because the numbers get bigger.
The die is gonna have to get way bigger, though.
Yes I can. So let’s just keep it at one chance. Let’s take a single occurrence. Is it 1%?
Perhaps you’d find it easier, conceptually, to consider the complement. If the probability of an occurrence on any given try is 1/100, then the probability of a non-occurrence on any given try is 99/100 (basically what GreenWyvern laid out in not so many words).
Thus, the probability of never hitting that 1/100 event in 30 attempts is (99/100)^30, or (rounding) 74/100.
1-74/100 = 26/100 or… 26%
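For anyone who wants to check the arithmetic, here is a minimal Python sketch of the complement calculation above. The function name at_least_one is only an illustrative label, and the sketch assumes every attempt is independent with the same probability.

```python
def at_least_one(p: float, n: int) -> float:
    """Chance of at least one success in n independent tries,
    each with success probability p (complement rule)."""
    p_none = (1.0 - p) ** n  # every single try fails
    return 1.0 - p_none


if __name__ == "__main__":
    # 30 tries at 1% each: roughly 26%, not 30%.
    print(f"{at_least_one(0.01, 30):.1%}")
```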
Well, look at a coin flip. With 1 coin flip there are 2 possible states. 1H, 1T. So 1H over 2 states is 50%.
Flip the coin twice. You have 4 equally likely outcomes: HH, HT, TH, and TT. One of those outcomes has no H, so 1/4 of the time if you flip a coin twice you won’t get an H. The other 3/4 of the time you get at least one, which is 75%, not 100%.
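The same counting can be done by brute force. This is just a sketch that lists every sequence of fair-coin flips and counts how many contain a head; count_with_head is a made-up helper name.

```python
from itertools import product


def count_with_head(flips: int) -> tuple[int, int]:
    """Return (outcomes containing at least one H, total outcomes)."""
    outcomes = list(product("HT", repeat=flips))
    with_head = sum(1 for outcome in outcomes if "H" in outcome)
    return with_head, len(outcomes)


if __name__ == "__main__":
    for flips in (1, 2, 3):
        hits, total = count_with_head(flips)
        print(f"{flips} flip(s): {hits}/{total} outcomes contain a head")
```

Each extra flip doubles the number of outcomes, but exactly one of them (all tails) still has no head, so the fraction creeps toward 1 without ever reaching it.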
Yes. It’s 1%.
Now let’s look at the chance of at least one occurrence in two chances. This can happen in one of three ways:
- Two successes in two chances. The probability of this is 0.01 * 0.01 = 0.0001.
- A success on the first chance, and a failure on the second chance. The probability of this is 0.01*0.99 = 0.0099.
- A failure on the first chance, and a success on the second chance. The probability of this is 0.99*0.01 = 0.0099.
The total probability of at least one success is then 0.0001 + 0.0099 + 0.0099 = 0.0199 = 1.99%, which is less than 2%. This always holds: the probability of at least one success in N chances (for N of 2 or more), each with an independent 1% chance of success, is less than N%.
Another way to think about it is this: imagine I do the same drawing 20,000 times, and I pair these results up (so I have 10,000 pairs.) There should be about 200 successes among these drawings, randomly distributed among the pairs. If there were exactly one success per pair of drawings, then the number of pairs with at least one success would be exactly 2% of the total. But there’s a small chance that two successes ended up in the same pair; and that pair only counts once in my total of “pairs of drawings with at least one success”, not twice. So in reality, the probability of getting at least one success in two drawings is a bit less than 2%.
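Both views of the two-drawing case can be checked with exact fractions. The sketch below assumes independent drawings at exactly 1% each; the variable names are only for illustration.

```python
from fractions import Fraction

p = Fraction(1, 100)  # chance of success on one drawing
q = 1 - p             # chance of failure on one drawing

# The three ways to get at least one success in two drawings.
both = p * p
first_only = p * q
second_only = q * p
at_least_one_in_two = both + first_only + second_only

# Same answer from the complement: 1 minus "both drawings fail".
assert at_least_one_in_two == 1 - q ** 2

# Expected counts when 20,000 drawings are grouped into 10,000 pairs.
pairs = 10_000
expected_successes = 2 * pairs * p                # 200 successes overall
expected_double_pairs = pairs * both              # 1 pair with two successes
expected_pairs_hit = pairs * at_least_one_in_two  # 199 pairs with at least one

if __name__ == "__main__":
    print(at_least_one_in_two, expected_pairs_hit)
```

On average one of the 200 successes lands in a pair that already has one, which is exactly the gap between 200 and 199, or between 2% and 1.99%.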
If you played the Powerball a quadrillion times it would be truly extraordinary if you didn’t win it. That’s effectively impossible.
In fact, on average, with a quadrillion trials, you should expect to win around 5 million times.
So I’d start suspecting some hanky panky with the drawing if I got to a quadrillion plays and still hadn’t won.
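A rough sketch of how extreme that is, assuming the round figure of 1 in 200 million per play used earlier in this thread (an assumption for illustration, not the published Powerball odds). The chance of zero wins is too small for a float, so the code works with its logarithm.

```python
import math

ODDS = 200_000_000  # assumed 1-in-200-million chance per play
PLAYS = 10 ** 15    # a quadrillion plays

expected_wins = PLAYS / ODDS  # average number of wins

# P(no win at all) = (1 - 1/ODDS) ** PLAYS underflows to 0.0,
# so compute log10 of it instead.
log10_p_none = PLAYS * math.log1p(-1 / ODDS) / math.log(10)

if __name__ == "__main__":
    print(f"expected wins: {expected_wins:,.0f}")
    print(f"P(no win) is about 10^{log10_p_none:,.0f}")
```

Under that assumption the exponent comes out around minus two million, so "effectively impossible" is a fair description.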
Keep in mind that the chance of winning can only go up to 100%. With a small number of tries it’s easy to think that 30 tries at 1% gives 30 x 1% = 30%, but that line of thinking clearly breaks down past 100 tries, since the percentage can’t go above 100%. Doing 200 tries doesn’t mean there is a 200 x 1% = 200% chance of winning, because 100% is the max possible. And even with 200 tries there’s always a chance of losing every time, so even then the percentage would not actually be 100%.
Yes and no. It depends on how you frame the problem/at which point you assess probabilities. I’m sure you are completely aware of this and don’t need a lecture from me, but for readers following along at home who may have had less exposure to statistics in school:
If I say “I’m going to pick a marble out of a bag of 100 marbles numbered 1-100, and repeat the process X times” then I can say what the probability is of choosing a particular number at least once. As long as I’m estimating the probability for X trials, that doesn’t change. (We are of course talking about a replenished bag of 100 marbles each time - if on my 2nd draw there are only 99 marbles, 98 on my third, and so on, that obviously changes the situation).
But, if I do one draw from my always replenished bag, now I have X-1 draws left. At that point, I can, if I desire, calculate “what are the probabilities of at least one success in X-1 draws?” The answer will not be the same as in X draws since, as my stats professor often reminded us, “chance has no memory.”
(You’ll notice I’m sticking with narrative and not laying out any actual equations. It’s been 39 years since my university statistics classes. I’m just happy I remember the concepts; I no longer recall any but the most basic formulae.)
In fact, it would actually be around 87%.
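To see where the 87% comes from, here is a small sketch comparing the "just multiply" figure with the complement-rule figure for a few numbers of tries. It assumes independent tries at 1% each; naive_vs_actual is a made-up helper name.

```python
def naive_vs_actual(p: float, n: int) -> tuple[float, float]:
    """Return (n * p, 1 - (1 - p) ** n) for n independent tries."""
    naive = n * p                # "just multiply" guess, can exceed 1
    actual = 1 - (1 - p) ** n    # true chance of at least one success
    return naive, actual


if __name__ == "__main__":
    for tries in (30, 100, 200):
        naive, actual = naive_vs_actual(0.01, tries)
        print(f"{tries:>3} tries: naive {naive:.0%}, actual {actual:.1%}")
```

The two columns are close for small numbers of tries and drift apart as the tries pile up, with the actual value never reaching 100%.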
If we’re talking about drawing marbles from bags, then we have to be clear whether we’re putting the marbles back after each draw or not. If we’re not putting them back, then after 30 draws from a bag of 100 marbles, it is in fact a 30% chance that we’ve got the one we want. After 100 draws, it’s 100%, and there’s no such thing as making more than 100 draws.
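The with/without replacement difference can be checked exactly. This is a sketch under the assumptions stated above (100 marbles, one target marble, draws no larger than the number of marbles); both function names are hypothetical.

```python
from fractions import Fraction


def found_without_replacement(marbles: int, draws: int) -> Fraction:
    """Chance the one target marble turns up within `draws` draws
    when drawn marbles are NOT put back (draws <= marbles)."""
    p_not_found = Fraction(1)
    for already_drawn in range(draws):
        remaining = marbles - already_drawn
        # All but one of the remaining marbles are non-targets.
        p_not_found *= Fraction(remaining - 1, remaining)
    return 1 - p_not_found


def found_with_replacement(marbles: int, draws: int) -> Fraction:
    """Same question when each marble is put back before the next draw."""
    return 1 - Fraction(marbles - 1, marbles) ** draws


if __name__ == "__main__":
    print(found_without_replacement(100, 30))       # exactly 3/10
    print(float(found_with_replacement(100, 30)))   # a bit over 0.26
```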
The gambler’s fallacy part of this is when you think that you are “due”.
If I flip a coin 10 times, and it comes up heads 10 times, then what are the chances that it is tails this time?
Gambler’s fallacy makes us want to think that it should be tails this time, that we are due, when the chance for this next event is still just 50/50.
(Though if you flip a coin 10 times and it comes up heads each time, I may start to think that it’s not a fair coin.)
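A quick way to convince yourself about the 50/50, assuming the coin really is fair and flips are independent: list every equally likely sequence, keep only those that start with the streak, and see how often the next flip is tails. The function name is just for illustration.

```python
from fractions import Fraction
from itertools import product


def p_tails_after_heads_streak(streak: int) -> Fraction:
    """Among all equally likely sequences of streak + 1 fair flips,
    keep those that begin with `streak` heads and return the
    fraction whose final flip is tails."""
    matching = [
        seq for seq in product("HT", repeat=streak + 1)
        if all(flip == "H" for flip in seq[:streak])
    ]
    tails_next = sum(1 for seq in matching if seq[-1] == "T")
    return Fraction(tails_next, len(matching))


if __name__ == "__main__":
    print(p_tails_after_heads_streak(10))  # 1/2
```

Only two of the 2,048 sequences start with ten heads, and they split evenly between heads and tails on the eleventh flip.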
And if it’s not a fair coin, then you should expect it to come up heads again. The Gambler’s Fallacy is not only untrue, it’s the opposite of true.
Then again, if you’re drawing without replacement, then you really can be due for a particular result.
We humans aren’t very good at understanding true randomness, because in most of life, true randomness is very rare, and usually, there is a pattern of some sort that can in principle be found. So we’ve gotten very good at finding those patterns. So good, in fact, that even when we very carefully contrive to eliminate those patterns, we’ll still find patterns anyway even when they don’t exist.
Right, which is why “random” shuffle playlists aren’t truly random: people found that true randomness didn’t look like the sort of random they expected.
There is also the story from WWII of someone creating the cypher key for a one-time pad encryption system. The person in charge didn’t think the numbers looked random enough, and so would remove sequences that didn’t seem random. That left a pattern that could be exploited.
Casinos and lotteries would not exist if people were able to understand randomness.
For intuition:
Imagine that 1 in 100 trials is a success, but that they aren’t random. Instead, precisely the 77th, 177th, 277th, …, 100077th, etc., trials are the successes and all the rest are the failures. Write this sequence out and chop it up into blocks of 30:
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
ooooooooooooooooXooooooooooooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooXooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
ooooooXooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
etc.
The overall rate is 1%, and the rate of 30-trial blocks with at least one success is precisely 30%. This is what you “feel” should happen, but this shows that it’s actually a rather extreme case. There are only so many successes to go around, and the only way to end up with a 30% block-success rate is to never waste a single trial-success with double-ups.
If the successes are random, then you might get something like:
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
oXoooooooooooooooooooooooooooo
oXoooooooooooooooooooooooooooo
ooooooooooXooooooooooooooooooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooo
oXoooooooooooooooXoooooooooooo <--- wasted one!
oooooooooooooooooooooooooooooo
etc.
That 30-trial block with two trial successes counts as just one block success, “wasting” forever that extra successful trial. That wasted trial will never have a block to call its own, so you can never reach a 30% rate of block successes.
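If you want to play with the two pictures above, here is a rough simulation sketch. It builds the evenly spaced sequence and a random one, chops each into blocks of 30, and reports the share of blocks with at least one success. The seed and the number of trials are arbitrary choices, and the helper names are invented for this post.

```python
import random

BLOCK = 30


def block_hit_rate(successes: list[bool]) -> float:
    """Fraction of consecutive 30-trial blocks with at least one success."""
    blocks = [successes[i:i + BLOCK] for i in range(0, len(successes), BLOCK)]
    return sum(any(block) for block in blocks) / len(blocks)


def patterned(n_trials: int) -> list[bool]:
    """Success on exactly the 77th, 177th, 277th, ... trial."""
    return [trial % 100 == 77 for trial in range(1, n_trials + 1)]


def randomized(n_trials: int, rng: random.Random) -> list[bool]:
    """Each trial independently succeeds with probability 1/100."""
    return [rng.random() < 0.01 for _ in range(n_trials)]


if __name__ == "__main__":
    n = 600_000                 # a multiple of both 30 and 100
    rng = random.Random(12345)  # arbitrary seed, for repeatability
    print(f"patterned: {block_hit_rate(patterned(n)):.1%}")
    print(f"random:    {block_hit_rate(randomized(n, rng)):.1%}")
```

The patterned sequence hits 30% exactly because successes are 100 trials apart and can never share a block; the random one should settle near 26% because some blocks soak up two or more successes.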
My favorite example of humans not being good at randomness was an intelligence test on which rats outscored us. The test subject (human or rat) was presented with two buttons and two corresponding lights. On some signal, the subject presses a button, and a light lights up. If the button matches the light, the subject gets a reward, and if not, then the subject gets nothing.
It was actually wired up to randomly select the left light 2/3 of the time, and the right light 1/3 of the time. The rats quickly realized that, and just always pushed the left button, the optimal strategy. But the humans kept on coming up with rules like “if it’s left three times in a row, the next one will be right”, that caused them to deviate from optimal strategy.
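To put numbers on why always pressing left is optimal, here is a small sketch. It assumes the subject picks a button independently of the upcoming light, pressing left with some fixed probability. The "matching" strategy (pressing left 2/3 of the time) is a stand-in I'm using for rule-based guessing, not something the experiment description above specifies.

```python
from fractions import Fraction

P_LEFT_LIGHT = Fraction(2, 3)  # left light comes on 2/3 of the time


def reward_rate(p_press_left: Fraction) -> Fraction:
    """Long-run reward rate when the subject presses left with
    probability p_press_left, independently of which light comes on."""
    hit_left = p_press_left * P_LEFT_LIGHT
    hit_right = (1 - p_press_left) * (1 - P_LEFT_LIGHT)
    return hit_left + hit_right


if __name__ == "__main__":
    print("always left:", reward_rate(Fraction(1)))     # 2/3
    print("matching:   ", reward_rate(Fraction(2, 3)))  # 5/9
    print("coin toss:  ", reward_rate(Fraction(1, 2)))  # 1/2
```

Any mix that presses right some of the time trades a 2/3 shot for a 1/3 shot on those presses, so the reward rate only goes down from 2/3.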
Agree completely about your larger point in both your posts about human pattern matching being far more eager than the real world is.
Specifically as to the humans vs rats test, I wonder how much of that is related to our conditioning to prefer the “tit-for-tat” approach in repeated 1-on-1 competitions? I suspect that’s one of the larger thumbs our subconscious puts on the scale.
Again, I am good with the math, and I think I understand everything that has been said. Yet I continue to be hung up on the 1 trial probability. Take a coin flip. If heads is 50/50 on the first trial, I cannot see why it stops being 50/50 after 10 trials, or 2 trials. If that is true, it shouldn’t be 50/50 on one trial.