I haven’t had a chance to fully work out the unified equation, but I did come up with a few thoughts to throw out. (I’m still working on these, too.)
The key concept here seems to be “the universe of (applicable) possibilities”, or the “gross denominator”. When making a chart of possibilities, you must pick the correct rows and columns, or “counting cells” will mislead you.
As I noted earlier, making a matching grid of blacks vs. whites produces a false bias. In such a grid, each additional white person increases the number of cells in each black criminal’s row, but not in any white criminal’s column. But does a black man actually become “more probably guilty” if a white person moves to town?
To remove any obscuring ‘intuitions’, let’s rename the conditions. The town is a former all-boys’ school. Blacks and whites are girls and boys. ‘Criminals’ are ‘drivers’. The crime becomes an accident where a truck hits a car, killing the driver, but not the one passenger. We’ll assume all dating is in-school and heterosexual.
Just as “only criminals commit crimes”, only drivers drive, but a license doesn’t prove you weren’t a passenger. A criminal record doesn’t prove guilt in any later crime.
In this example, it’s easier to see that to assess the odds that a girl died, you don’t multiply by the number of “available boys” (vs. the girls available to the guys). The date happened. There was ample opportunity for it to happen (the number of potential partners amply exceeds the number of drivers). Since it is not a limiting condition, you should leave it alone. (Below, we’ll see how Example C hits a limiting condition.)
Statistical “opportunity” can be illusory: pedestrians aren’t run over 10x as much in cities with 10x more roads; in fact the accident rate is often higher with fewer roads. Dating (or having innocents near your crime) usually isn’t limited by the number of possible partners, so the effect of more potential partners isn’t calculable. When you hear of the accident, knowing that it happened on a date (vs. with a same-sex friend) may affect your assessment on a person-by-person (not gender) basis.
The fraction of boys vs. girls who are licensed (5% vs. 20%) DOESN’T affect the probable gender of the victim. The fraction of drivers who are boys (69.23%) vs. girls (30.77%) DOES.
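The arithmetic behind those two percentages can be sketched as follows. The head counts (900 boys, 100 girls) are my assumption, chosen only because they reproduce the stated 69.23% / 30.77% split from the 5% and 20% licensure rates; the function name `driver_shares` is likewise made up for illustration.

```python
from fractions import Fraction

def driver_shares(boys, girls, boy_license_rate, girl_license_rate):
    """Return (boy_share, girl_share) of all licensed drivers."""
    boy_drivers = boys * boy_license_rate
    girl_drivers = girls * girl_license_rate
    total = boy_drivers + girl_drivers
    return boy_drivers / total, girl_drivers / total

# Hypothetical head counts (not from the post): 900 boys, 100 girls.
boy_share, girl_share = driver_shares(
    900, 100, Fraction(5, 100), Fraction(20, 100)
)
print(f"boys: {float(boy_share):.2%}, girls: {float(girl_share):.2%}")
```

Under these assumed counts there are 45 boy drivers and 20 girl drivers, so a driver picked at random is a boy 45 times out of 65, even though girls are licensed at four times the rate.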
“Licensure rate by gender” (criminality by race) is a sloppily framed statistic that could only be used if we felt we ‘needed’ to judge by raw gender, just as the original scenario was crafted to FORCE us to judge by race: the only answers we’re allowed to give are “black” or “white”.
An insurance company would go broke if it used “accidents per girl” instead of “accidents per girl driver” to calculate rates. The scenario makes it sound like the cop MUST decide based on race, but in fact, he could chase the one who is closer, slower, wearing lighter colored clothing (easier to see at night), looks easier to subdue, is headed toward less concealing cover, or even choose one at random. A cop who sees two fleeing suspects and sees only race is a poor cop indeed.
Now let’s remove race intuitions from Oomphaloskeptic’s most extreme Example C:
It’s a post-Apocalyptic future after a cruel bioweapon killed almost all women. By tradition, all women drive (at first, they didn’t dare ride with a man!) but almost no men are allowed to drive (they might catch the few women). After a century or so, women are no longer afraid; they are worshipped and protected.
It’s very rare to see a man alone with a woman (women are 1/5000th of the city). Yet one day, a paper reports that (horrors!) an accident killed the driver of a car containing a woman. The whole city wants to know: did a woman die?
Like the original scenario, it is an unlikely event cherry-picked to make a point, but does the extreme rarity change the conclusion we established above?
No, it doesn’t! While I agree that, this time, it was probably a woman who was killed (a black who was guilty), the 100% prevalence of driving among women (criminality among blacks) is actually quite irrelevant.
The apparently contradictory finding of Example C is not caused by the HIGH 100% rate, but by the ULTRA-LOW rates: the prevalence of women, and the prevalence of driving males (the small number of blacks, and the almost total noncriminality of whites in Example C).
To see this, let’s see how changing the 100% rate affects the “most probable outcome”:
DROPPING BLACK CRIMINAL RATE FROM 100% to 0.05% DOESN'T AFFECT EXAMPLE C
EVEN AT RATES SO LOW THAT NOT ONE SINGLE BLACK CRIMINAL EXISTS

        TOTAL     CRIMINALS   HONEST     Racial Prevalence   % CRIM
BLACK   200       200         0          1:5000              100%
WHITE   999800    450         999350     1:1.00020004        0.0450090018%
TOTAL   1000000   650         999350                         0.065%
B+W: 200.05   B guilty: 199.96   W guilty: 0.09

        TOTAL     CRIMINALS   HONEST     Racial Prevalence   % CRIM
BLACK   200       20          180        1:5000              10%
WHITE   999800    450         999350     1:1.00020004        0.0450090018%
TOTAL   1000000   470         999530                         0.047%
B+W: 20.086   B guilty: 19.996   W guilty: 0.09

        TOTAL     CRIMINALS   HONEST     Racial Prevalence   % CRIM
BLACK   200       2           198        1:5000              1%
WHITE   999800    450         999350     1:1.00020004        0.0450090018%
TOTAL   1000000   452         999548                         0.0452%
B+W: 2.0896   B guilty: 1.9996   W guilty: 0.09

        TOTAL     CRIMINALS   HONEST     Racial Prevalence   % CRIM
BLACK   200       0.2         199.8      1:5000              0.1%
WHITE   999800    450         999350     1:1.00020004        0.0450090018%
TOTAL   1000000   450.2       999549.8                       0.04502%

        TOTAL     CRIMINALS   HONEST     Racial Prevalence   % CRIM
BLACK   200       0.1         199.9      1:5000              0.05%
WHITE   999800    450         999350     1:1.00020004        0.0450090018%
TOTAL   1000000   450.1       999549.9                       0.04501%
B+W: 0.18998   B guilty: 0.09998   W guilty: 0.09
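The “B guilty” and “W guilty” figures in the tables above appear to follow one rule: each group’s criminal count is multiplied by the other group’s share of the population, i.e. the chance that a criminal’s randomly drawn companion is of the other race. That rule is my reading of the numbers, not something stated outright, and `mixed_pair_weights` is a name made up for this sketch.

```python
from fractions import Fraction

POPULATION = 1_000_000
BLACKS = 200
WHITES = POPULATION - BLACKS   # 999,800
WHITE_CRIMINALS = 450          # held fixed in every table

def mixed_pair_weights(black_criminals):
    """Weights for a black/white pair: (B guilty, W guilty)."""
    black_criminals = Fraction(black_criminals)
    # Assumed rule: criminals of one race times the other race's share.
    b_guilty = black_criminals * Fraction(WHITES, POPULATION)
    w_guilty = WHITE_CRIMINALS * Fraction(BLACKS, POPULATION)
    return b_guilty, w_guilty

for count in ("200", "20", "2", "0.2", "0.1"):
    b, w = mixed_pair_weights(count)
    print(f"black criminals {count:>4}: B guilty {float(b):.5f}, "
          f"W guilty {float(w):.2f}, B+W {float(b + w):.5f}")
```

Under this reading, the table for 0.2 black criminals (the one with no summary line) would come out as B guilty 0.19996, W guilty 0.09, B+W 0.28996.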
As you can see, the “criminality of blacks” is irrelevant in Oomphaloskeptic’s Example C. Even when there is not one single black criminal, but 450 white criminals, the ultra-low white rate makes black men “the likeliest candidate”. I don’t know what “0.1” black criminal is, but it’s way less than “one single criminal”.
Apparently even “bad thoughts” by a black man should affect a cop’s decision of whom to chase, more than 450 actual White criminals, if the “white crime rate” is low enough.
Such ‘small number effects’ are non-linear enough to constitute a deliberately skewed sampling: e.g., a black population this tiny is deprived of the “statistical benefit” of “B/B” scenarios. At a black racial prevalence of 1:5000, the B/B effect is 0.00000004, while 99.98% of white crime is buried in W/W scenarios.
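The 0.00000004 figure is simply the black prevalence squared, i.e. the chance that both members of a randomly drawn pair are black. A minimal check, assuming both members are drawn independently from the whole population (my assumption, implied by the 1:5000 arithmetic rather than stated):

```python
from fractions import Fraction

black_share = Fraction(1, 5000)
white_share = 1 - black_share

# Chance that both members of a random pair are black (the "B/B effect").
bb = black_share ** 2

print(float(bb))            # 4e-08
print(float(white_share))   # 0.9998, the chance a given companion is white
```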
This makes Example C so sensitive to black misdeeds, and so forgiving of the prospect of white misdeeds, that it actually says you should arrest the black when there isn’t a single black criminal, but there are hundreds of white criminals.
In fact, under Example C you could raise the White criminal rate to 100%, and the answer still wouldn’t be “arrest the white man”.
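This can be checked with the weighting the tables seem to use (criminals of one race times the population share of the other), which is my reconstruction rather than a stated formula. In the sketch below the black rate stays at Example C’s original 100%; the function name `weights` is hypothetical.

```python
from fractions import Fraction

POPULATION = 1_000_000
BLACKS, WHITES = 200, 999_800

def weights(black_rate, white_rate):
    """Return (B guilty, W guilty) for a mixed black/white pair."""
    # Assumed rule: criminals of one race times the other race's share.
    b = BLACKS * black_rate * Fraction(WHITES, POPULATION)
    w = WHITES * white_rate * Fraction(BLACKS, POPULATION)
    return b, w

b, w = weights(Fraction(1), Fraction(1))   # both races 100% criminal
print(float(b), float(w), w > b)
```

With both rates at 100% the two weights come out exactly equal (199.96 each), so even then the white suspect never becomes the likelier of the two.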
Interestingly, the most extreme case of an “Example C” scenario is “the only Chinese in town”. If that person has a criminal record, Example C says he should be chased, because his visible ethnicity has 100% criminality. Yet in reality, the policeman should chase the white suspect: he can pick up the Chinese man later, but the white suspect is still unidentified.
Just because a quantity can be calculated doesn’t make it a sufficient basis for a decision. Personally, I’m more inclined to ‘follow the math’ than any other single factor, but I encounter situations daily where the math simply fails to provide the best solution in cases of limited information.
I find this problem interesting mathematically, but I think there is substantial reason to say that the cop’s knowledge of racial statistics is no more relevant than a thousand other details he would also have seen. Some cops are known for giving speeding tickets to sports cars or even just “red” cars, but despite studies indicating that these are worse offenders, I would argue that targeting this “high risk population” would be a poorer global practice than ticketing ‘at random’.
[I put “red” in quotes because I don’t have a cite on speeding rates by color.]