As I’ve mentioned before, while it’s a popular notion that genetically defining “race” is an important preliminary to figuring out if outcome differences are genetic, it’s actually irrelevant. This argument is advanced in error.
What you need to be able to show is that two groups you are comparing differ in prevalences for gene sets. That’s it. You don’t need to show the two groups are internally related, or have specific markers, or anything else.
It’s not clear to me why folks are so persistently confused about this.
Suppose, for instance, I make this statement: Cystic fibrosis is more common in whites.
I get challenged: “Define whites.”
I reply: “The Self-Identified Race/Ethnicity (SIRE) category ‘White.’”
I make a similar statement for blacks: Black males have higher average testosterone levels than white males.
Now these two physiologic differences (and thousands of others) hold true. They hold true because those two SIRE groups have different prevalences for the relevant genes. Period.
You may complain they are stupid groupings (they are). You may complain that there are no genetic markers, that the categories confuse and conflate hundreds of sub-categories (they do), that there is plenty of genetic admixture blurring the edges (there is), or that there are occasional exceptions (there are).
None of that, none of it, gets around the simple fact that these two SIRE groups, and all the other current standard SIRE groups, have different prevalences for genes. This follows from the general history of how human groups have spread across the world while tending to stay within their own populations.
Whether or not you like the standard SIRE groups ultimately depends on whether you are a splitter or a lumper. Either way, you cannot take any scientific issue with the fact that lumping into SIRE groups creates partitions with different prevalences for genes, and it is those genetic differences that help to drive the phenotypic differences we see.