What is the term for this kind of statistical mistake?

Let’s say there is an industry of companies that make widgets. There are actually two types of widget, fluttering widgets and oscillating widgets, but the difference in functionality is subtle and not appreciated by most consumers, or anyone else. Most companies make both kinds of widget anyway, and they are sold in roughly equal numbers.

After receiving scattered reports of deaths due to widget use, the government commissions a study, which concludes that the use of widgets leads to a 40% greater chance of a heart attack in both the widget user and anyone nearby.

Widgets are quickly banned.

However, in reality, the risk of a heart attack is a peculiar consequence of the mode of operation of fluttering widgets only. Fluttering widgets in fact increase your risk of a heart attack by 80%, while oscillating widgets are perfectly safe. Because no one appreciated the significance of the two different types of widget, the study did not even gather data on which deaths were associated with which type of widget, and the most accurate statistical correlation was never discovered.
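To see where the study's 40% figure comes from, here is a minimal simulation of the scenario (the baseline risk and sample sizes are invented for illustration). With the two types sold in equal numbers, an 80% extra risk confined to fluttering widgets dilutes to roughly 40% in the pooled data:

```python
import random

random.seed(0)

BASELINE = 0.05   # assumed baseline heart-attack risk (invented for illustration)
N = 200_000       # users per widget type

def heart_attacks(risk_multiplier, n):
    """Count heart attacks among n users whose risk is BASELINE * risk_multiplier."""
    return sum(random.random() < BASELINE * risk_multiplier for _ in range(n))

fluttering = heart_attacks(1.8, N)   # fluttering widgets: +80% risk
oscillating = heart_attacks(1.0, N)  # oscillating widgets: perfectly safe

print(f"fluttering rate:  {fluttering / N:.4f}  (~1.8 x baseline)")
print(f"oscillating rate: {oscillating / N:.4f}  (~1.0 x baseline)")
print(f"pooled rate:      {(fluttering + oscillating) / (2 * N):.4f}  (~1.4 x baseline, i.e. +40%)")
```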

Is there a name for this kind of statistical error? “Choosing too broad a sampling variable”, or something?

I would say ill-defined variables. As for the fallacy name, this is arguably a subset of hasty generalization.

Is it really as simple as that? It is not simply an ill-defined variable; it is a failure to recognize that a variable exists.

Error in model specification?

I don’t think that it is lack of recognition of the variable’s existence. For example, crooked tails are common in Siamese cats, so I do a study of 100 cats (8 of which are Siamese) and find that 8% of my sample have crooked tails. Are you claiming that I do not recognize that there are Siamese and non-Siamese cats? Instead, I made my variable too general (all cats).
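To spell out the arithmetic, assuming for illustration that every crooked tail in the sample belongs to a Siamese cat:

```python
# 100 cats, 8 of them Siamese; assume all 8 crooked tails are on the Siamese.
total_cats, siamese, crooked = 100, 8, 8

print(f"pooled rate:      {crooked / total_cats:.0%}")        # 8%
print(f"Siamese rate:     {crooked / siamese:.0%}")           # 100%
print(f"non-Siamese rate: {0 / (total_cats - siamese):.0%}")  # 0%
```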

It sounds similar to either the ecological fallacy or the fallacy of division, but it’s not exactly the same as either.

It does not have a fancy name. It is a matter of not accounting for reasonably plausible sources of variability, in this case widget type. In practice, one stratifies on or adjusts for these sources.
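To make "stratifies or adjusts" concrete, here is a rough sketch on simulated data (the rates, column names, and category labels are all invented). Stratifying compares rates within each widget type; adjusting puts widget type into the model as a covariate, here via a logistic regression with statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 60_000

# Hypothetical population: non-users, oscillating-widget users, and
# fluttering-widget users in equal shares; only fluttering raises risk.
exposure = rng.choice(["none", "oscillating", "fluttering"], size=n)
risk = np.where(exposure == "fluttering", 0.09, 0.05)  # 5% baseline, +80% for fluttering
df = pd.DataFrame({
    "exposure": exposure,
    "heart_attack": (rng.random(n) < risk).astype(int),
})

# Stratified rates: the fluttering group stands out; a pooled "widget user"
# category would show only a diluted excess.
print(df.groupby("exposure")["heart_attack"].mean())

# Adjusted analysis: widget type enters the model explicitly, so the excess
# risk is attributed to the right stratum ("none" is the reference level).
fit = smf.logit("heart_attack ~ C(exposure, Treatment('none'))", data=df).fit()
print(fit.params)
```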

Simpson’s paradox (Wikipedia link).

The average of the total can be vastly different from the averages of the subgroups.

Thank you for that. Following links on that page, I found “Omitted-variable bias”, which I think is closest to the situation I described.

It’s definitely not Simpson’s paradox, which refers specifically to the situation described in the very first sentence of the article you linked to: a trend that appears in several groups of data but disappears or reverses when the groups are combined.

Right, Simpson’s paradox requires the aggregate effect to flip relative to the subsets.
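To show the contrast, here is a quick check on the classic kidney-stone figures quoted in the Wikipedia article: each stone-size subgroup favors treatment A, yet the pooled totals favor treatment B. That flip is Simpson’s paradox; the widget scenario only dilutes the effect and never reverses it.

```python
# Classic kidney-stone data (as quoted on the Wikipedia page for
# Simpson's paradox): (successes, cases) per treatment and stone size.
data = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}

# Within each subgroup, treatment A has the higher success rate...
for size in ("small", "large"):
    (sa, na), (sb, nb) = data[("A", size)], data[("B", size)]
    print(f"{size} stones: A {sa / na:.0%} vs B {sb / nb:.0%}")

# ...yet pooled over subgroups, treatment B comes out ahead: the flip.
for treatment in ("A", "B"):
    s = sum(data[(treatment, size)][0] for size in ("small", "large"))
    n = sum(data[(treatment, size)][1] for size in ("small", "large"))
    print(f"pooled {treatment}: {s / n:.0%}")
```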

The error here is not considering the effect of widget type in the analysis.

The idea is that your setup is valid if you’ve specified the underlying variables in an appropriate and complete manner: including the right variables and not including the wrong ones.

Model misspecification covers many problems.

Would you agree that “Omitted-variable bias”, linked above, is an accurate description?

The problem here is the omission of widget type from the analysis; variable omission is the specific fault.
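As a minimal sketch of what omitting a variable does to an estimate (synthetic data; all names and coefficients are invented): when the omitted z both drives the outcome and correlates with x, the short regression attributes part of z’s effect to x.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Think of z as "widget type" and x as "widget use": z drives the outcome
# and is correlated with x. True model: y = 1.0*x + 2.0*z + noise.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 1.0 * x + 2.0 * z + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficient estimates."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

print("x, z included:", ols(np.column_stack([x, z]), y))  # ~ [1.0, 2.0]
print("z omitted:    ", ols(x.reshape(-1, 1), y))         # x coefficient inflated to ~2.0
```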

I prefer a broader term encompassing incomplete or excessive adjustment to multiple terms describing every particular error, though. Too much jargon.

It’s an omitted variable issue for sure, but the word “bias” has a specific technical meaning in statistics that doesn’t fit here.