You keep writing about this as if the investigators, once they have the data in hand, must bring in additional information before they can do the analysis. They don’t.
Once the data are collected, it is wrong to say “they would then need to find out how much location explains the difference in their results.” Their own data would tell them how much of the variation in caseness the indoor/outdoor variable explained. They don’t have to extrapolate.
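To make that concrete, here is a minimal sketch using made-up counts (not the study’s actual data). With a single binary predictor in a logistic model, the fitted probabilities are just the observed row proportions, so a measure like McFadden’s pseudo-R² for the indoor/outdoor variable can be computed directly from the collected cell counts, with nothing extrapolated from outside the data:

```python
import math

def log_lik(cells):
    # cells: {(outdoor, case): count}; log-likelihood of the one-predictor
    # logistic model, whose fitted p is each row's observed case proportion
    ll = 0.0
    for outdoor in (0, 1):
        n_case = cells[(outdoor, 1)]
        n_ctrl = cells[(outdoor, 0)]
        p = n_case / (n_case + n_ctrl)
        if 0 < p < 1:
            ll += n_case * math.log(p) + n_ctrl * math.log(1 - p)
    return ll

def mcfadden_r2(cells):
    # 1 - ll(model)/ll(intercept-only): share of log-likelihood "explained"
    total_case = cells[(0, 1)] + cells[(1, 1)]
    total = sum(cells.values())
    p0 = total_case / total
    ll_null = total_case * math.log(p0) + (total - total_case) * math.log(1 - p0)
    return 1 - log_lik(cells) / ll_null

# hypothetical counts: (outdoor?, case?) -> n
cells = {(0, 0): 612, (0, 1): 300, (1, 0): 68, (1, 1): 40}
print(round(mcfadden_r2(cells), 4))
```

The point is only that this quantity falls straight out of the data in hand; the specific numbers above are illustrative.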
You keep going on about the relative percentages, but that is irrelevant to the ability to conduct the analysis. It’s the absolute numbers in the various cells that are the issue. So, if 10% of the controls were outside, that’s 68 people. Sparse-cell concerns typically arise at counts of 5 or below. A concern would arise here only if sparsity among the various combinations of covariates tested in a given model contributed to error in the estimates from that model. There’s no evidence of such problems here.
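The arithmetic above can be sketched directly; the cell counts besides the 680/68 figure are hypothetical, and 5 is used as the usual rule-of-thumb floor:

```python
# Sparsity depends on absolute cell sizes, not percentages:
# 10% of 680 controls outdoors is 68 people, far above the floor of 5.
controls = 680
pct_outdoor = 0.10
outdoor_controls = round(controls * pct_outdoor)  # 68

# counts per covariate combination (hypothetical case counts)
cells = {
    ("indoor", "control"): controls - outdoor_controls,
    ("outdoor", "control"): outdoor_controls,
    ("indoor", "case"): 300,
    ("outdoor", "case"): 40,
}
sparse = {k: n for k, n in cells.items() if n <= 5}
print(outdoor_controls, sparse)  # 68 {} -> no sparse-cell problem
```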
Specifically, they tested how many people lying about having a gun would influence the results, and found the figure to be 5%. It’s not common for studies to probe “what ifs” about their data models with such specificity, but here the authors provide that information. Rather than treating this as good science, useful information, and objectivity on the part of the investigators, you want to use it to reject the results entirely.
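A sensitivity check of this general kind can be sketched as follows; the counts and the exact procedure are my assumptions for illustration, not the authors’ method. The idea is to reclassify a random 5% of “no gun” answers as “gun” (simulating respondents who lied) and see how much the crude odds ratio moves:

```python
import random

def odds_ratio(a, b, c, d):
    # a: exposed cases, b: unexposed cases,
    # c: exposed controls, d: unexposed controls
    return (a * d) / (b * c)

def perturbed_or(a, b, c, d, lie_rate, rng):
    # move a binomial draw of "liars" from unexposed to exposed in each group
    lie_cases = sum(rng.random() < lie_rate for _ in range(b))
    lie_ctrls = sum(rng.random() < lie_rate for _ in range(d))
    return odds_ratio(a + lie_cases, b - lie_cases, c + lie_ctrls, d - lie_ctrls)

rng = random.Random(0)
base = odds_ratio(150, 190, 120, 560)          # hypothetical counts
shifted = [perturbed_or(150, 190, 120, 560, 0.05, rng) for _ in range(1000)]
print(round(base, 2), round(min(shifted), 2), round(max(shifted), 2))
```

Running something like this shows whether a 5% misreporting rate could plausibly flip the conclusion, which is exactly the kind of “what if” the authors quantified.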
Meeting the investigators’ good-faith objectivity in kind seems like the more appropriate stance to take.
Oh, I’m willing to be persuaded. Unfortunately you haven’t done so because, you know, it’s math.
[/QUOTE]