Suppose that I have a fair n-sided die, and I want to roll it until I get all n outcomes. This is probably going to take more than n rolls, because I’m probably going to get some repeats. For instance, if n = 10, it’ll take, on average, over 29 rolls in order to get all of them.
But this means that there will probably be a significant span of rolls where I’ve gotten nine of my ten outcomes and am diligently pounding away at that last one. And even once I do get it, I’ll have gotten that one only once, while each of my other outcomes will have come up around three times on average. So it’s going to seem to me like that last outcome, whatever it is, is really rare, while the others will seem fairly common.
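A quick Monte Carlo sketch in Python bears this out (the function name and trial count are just illustrative):

```python
import random

def rolls_to_collect_all(n):
    """Roll a fair n-sided die until every face has appeared; return the roll count."""
    seen = set()
    rolls = 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        rolls += 1
    return rolls

n, trials = 10, 100_000
average = sum(rolls_to_collect_all(n) for _ in range(trials)) / trials
print(f"average rolls to see all {n} faces: {average:.2f}")   # comes out around 29.3
# The face that completes the set has, by definition, shown up exactly once,
# so the other nine faces average roughly (29.3 - 1) / 9, i.e. about 3.1 each.
```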
It’s only an illusion, of course, and it varies from person to person. Like, I might say to someone attempting the same task, “Man, trying to get a three takes forever, doesn’t it!?”, to which they might reply, “What do you mean? I’m always getting threes. It’s sevens that are the rare ones.” In reality, of course, neither is actually rare: all outcomes are always equally likely.
You know that all results are equally likely, but you’re mixing up your experience of past rolls (which have no effect on the future odds) with the “rarity” of waiting to get your last number. By definition, every time you miss, you load up on the numbers you already have, making them seem “common” and your missing number seem “rare”.
I think it’s more a matter of realizing that p(x) > 0 means the event really can occur. For example, the chance of rolling doubles is 1/6, so the chance of rolling doubles 10 times in a row is less than 1.7 x 10^-8, but that still means it can happen. So when it does happen and gets noticed as unusual (rightly or wrongly), I would classify it as confirmation bias.
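For what it’s worth, here’s a two-line check of that figure (taking “doubles” to mean a 1-in-6 event on each roll):

```python
from fractions import Fraction

p = Fraction(1, 6) ** 10          # ten 1-in-6 events in a row
print(p, float(p))                # 1/60466176, about 1.65e-08 -- indeed under 1.7e-08
```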
Isn’t the Gambler’s fallacy the opposite of the OP’s illusion? The gambler actually acknowledges that all outcomes should be equally likely, but mistakenly thinks that previous events should influence future events, so that the “hard to get” number should now crop up more often to balance things out.
The OP, on the other hand, describes a different but equally problematic result of trying to do intuitive statistics: trying to estimate the probabilities of events from a process whose stopping condition inherently biases what you observe.
If there are situations where this activity occurs, there may be a name for it. Is there any activity other than idle dice tossing and bad statistics where you’d try throwing a die until you get all possible results?
I was interested in the relationship between n and E(n), the expected number of rolls required. It’s easy to find a recurrence relation (which gives, e.g., E(10) = 7381/252 = 29.28968…), but I didn’t find a closed form. The sequence 2!E(2), 3!E(3), 4!E(4), … is not in Sloane’s On-Line Encyclopedia.
Asymptotically, I think E(n) ≈ n*ln(n). Surely this problem is well known?
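One way to set the recurrence up (just a sketch): let R[k] be the expected number of further rolls once k distinct faces have been seen; then R[n] = 0 and R[k] = 1 + (k/n)R[k] + ((n-k)/n)R[k+1], which simplifies to R[k] = n/(n-k) + R[k+1], with E(n) = R[0].

```python
from fractions import Fraction

def expected_rolls(n):
    """Unwind R[k] = n/(n-k) + R[k+1] from R[n] = 0 down to R[0] = E(n), exactly."""
    r = Fraction(0)
    for k in range(n - 1, -1, -1):
        r += Fraction(n, n - k)
    return r

print(expected_rolls(10), float(expected_rolls(10)))   # 7381/252  29.2896825...
```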
This is correct. It comes from the harmonic series: The general form is E(n) = Sum(i=1…n) n/i. This is because on your first roll, you’re guaranteed to get a new result: Your expected number of rolls to get a new result is 1 (or n/n). Once you’ve gotten a single result, the probability of getting another new one is (n-1)/n, and so your expected number of rolls to get your second unique result is n/(n-1). For your third one, the probability of a new result is (n-2)/n for an expectation of n/(n-2) rolls, and so on, until when you only have one left, your chance of success on each roll is 1/n, and so your expected number of rolls to get that last one is n.
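In code, that sum is a one-liner; here’s a small sketch comparing it with the n*ln(n) guess from the previous post (the leftover gap is roughly 0.577*n, since H_n ≈ ln(n) plus the Euler-Mascheroni constant):

```python
from math import log

def coupon_expectation(n):
    """E(n) = sum of n/i for i = 1..n, i.e. n times the nth harmonic number."""
    return n * sum(1 / i for i in range(1, n + 1))

for n in (10, 100, 1000):
    print(f"n={n:5d}   E(n)={coupon_expectation(n):9.2f}   n*ln(n)={n * log(n):9.2f}")
# For n=10 this gives 29.29 vs 23.03; the gap grows like 0.577*n, but the ratio
# tends to 1, so E(n) ~ n*ln(n) asymptotically.
```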
And the specific case that prompted my interest was a set of achievements in a computer game. For instance, in one spot, there’s (sometimes) a device that shows up in the game that can give one of ten possible outcomes, and there’s an achievement for having encountered all ten (since this device often doesn’t show up at all, and is in a very large area that takes a long time to search, this is a tedious slog). And there’s also one for finding all of a set of randomly-occurring enemies (though this one isn’t quite the same situation, since some of those enemies really are rarer than others, but the effect would still show up).
Surely it is! Chronos gave a perfect explanation of why E(n) is precisely n times the nth harmonic number, but in case you care to read up on it more, this generally goes by the name “coupon collector’s problem”.