Statistics & profiling problem

I’m trying to write a more detailed explanation of my original (Bayesian) answer, but for now let me try to explain why I think your 70% solution is wrong:

This is fine so far. In the absence of any other information, his universe of suspects is all criminals in town, ~70% of whom are white.

This is where I think the problem lies. Once he sees the two people running from the scene, these are his two suspects (in reality, these would be “primary” suspects and there would still be some suspicion cast on criminals not at the scene, but I’ve ignored this). He’s not choosing any more whether to chase all 450 white criminals or all 200 black criminals, just whether to chase this one particular white (who may be a criminal) or this one particular black (who may be a criminal).

I hesitate to bring up a different example, because I think analogies usually just confuse the issue, but here’s my attempt (NB: this is primarily aimed at the specific problem I see with the reasoning above, and not as a complete analogy):

If B = W = 1, then both men are criminals, so chasing either man could result in catching a criminal, if not the perpetrator of this crime. So the best choice is to chase one of them–perhaps the one who is running more slowly.

If B = W = 0, then neither man is a criminal, but either may be a witness. Again, the best choice is to chase one of them.

So it looks to me like case 1 contains case 4, and cases 2 and 3 should be taken to exclude B = W.

But I do agree that any solution must have this property.

Okay, I agree, ultrafilter. I was thinking that the cop assumed that exactly one of the two was bad, in which case Case 4 would contradict this assumption. But I realize now that this assumption is not necessary.

I haven’t had a chance to fully work out the unified equation, but I did come up with a few thoughts I thought I’d throw out. (I’m still working on these, too)

The key concept here seems to be “the universe of (applicable) possibilities” or the “gross denominator”. When making a chart of possibilities, you must pick the correct rows and columns, or “counting cells” will mislead you.

As I noted earlier, making a matching grid of blacks vs. whites produces a false bias. In such a grid, each additional white person increases the number of cells in each black criminal’s row, but not any white criminal’s column. But does a black man actually become “more probably guilty” if a white person moves to town?

To remove any obscuring ‘intuitions’, let’s rename the conditions. The town is a former all-boys school. Blacks and whites are girls and boys. ‘Criminals’ are ‘drivers’. The crime becomes an accident where a truck hits a car, killing the driver, but not the one passenger. We’ll assume all dating is in-school and heterosexual.

Just as “only criminals commit crimes”, only drivers drive, but a license doesn’t prove you weren’t a passenger. A criminal record doesn’t prove guilt in any later crime.

In this example, it’s easier to see that to assess the odds that a girl died, you don’t multiply by the number of “available boys” (vs. the girls available to the guys). The date happened. There was ample opportunity for it to happen (the number of potential partners amply exceeds the number of drivers). Since it is not a limiting condition, you should leave it alone. (Below, we’ll see how Example C hits a limiting condition.)

Statistical “opportunity” can be illusory: pedestrians aren’t run over 10x as much in cities with 10x more roads; in fact the accident rate is often higher with fewer roads. Dating, or having innocents near your crime, usually isn’t limited by the number of possible partners, so the effect of more potential partners isn’t calculable. When you hear of the accident, knowing that the accident was on a date (vs. with a same-sex friend) may affect your assessment on a person-by-person (not gender) basis.

The fraction of licensed boys vs. girls (5% vs. 20%) DOESN’T affect the probable gender of the victim. The fraction of drivers who are boys (69.23%) vs. girls (30.77%) DOES.
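To make the distinction between the two kinds of percentage concrete, here is a minimal sketch. It assumes the thread's counts of 450 and 200 drivers (criminals); the group sizes of 9000 and 1000 are not stated here but are back-computed from the 5% and 20% rates, and all variable names are illustrative.

```python
# Counts carried over from the thread: 450 boy drivers, 200 girl drivers.
# Group sizes are an assumption, back-computed from the 5% and 20% licensure rates.
boys, girls = 9000, 1000
boy_drivers, girl_drivers = 450, 200

# "Licensure rate by gender": drivers as a fraction of each group.
licensure_rate_boys = boy_drivers / boys
licensure_rate_girls = girl_drivers / girls

# "Fraction of drivers who are boys/girls": each group as a fraction of all drivers.
drivers = boy_drivers + girl_drivers
share_boys = boy_drivers / drivers
share_girls = girl_drivers / drivers

print(f"licensure rate: boys {licensure_rate_boys:.0%}, girls {licensure_rate_girls:.0%}")
print(f"share of drivers: boys {share_boys:.2%}, girls {share_girls:.2%}")
```

The first pair of numbers depends on the group sizes; the second pair depends only on the driver counts.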

“Licensure rate by gender” (criminality by race) is a sloppily framed statistic which could only be used if we felt we ‘needed’ to judge by raw gender, just as the original scenario was crafted to FORCE us to judge by race: the only answers we’re allowed to give are “black” or “white”

An insurance company would go broke if they used “accidents per girl” instead of “Accidents per girl driver” to calculate rates. The scenario makes it sound like the cop MUST decide based on race, but in fact, he could chase the one who is closer, slower, wearing lighter colored clothing (easier to see at night), looks easier to subdue, is headed toward less concealing cover, or even choose one at random. A cop who sees two fleeing suspects and sees only race is a poor cop indeed.

Now let’s remove race intuitions from Oomphaloskeptic’s most extreme Example C:

It’s a post-Apocalyptic future after a cruel bioweapon killed almost all women. By tradition, all women drive (at first, they didn’t dare ride with a man!) but almost no men are allowed to drive (they might catch the few women). After a century or so, women are no longer afraid; they are worshipped and protected.

It’s very rare to see a man alone with a woman (women are 1/5000th of the city). Yet one day, a paper reports that -horrors- an accident killed the driver of a car containing a woman. The whole city wants to know: did a woman die?

Like the original scenario, it is an unlikely event cherry-picked to make a point, but does the extreme rarity change the conclusion we established above?

No, it doesn’t! While I agree that, this time, it was probably a woman who was killed (a black who was guilty), the 100% prevalence of driving among women (criminality among blacks) is actually quite irrelevant.

The apparently contradictory finding of Example C is not caused by the HIGH 100% rate, but by the ULTRA-LOW rates: prevalence of women, and prevalence of driving males (small number of blacks, and almost total noncriminality of whites in Example C).

To see this, let’s see how changing the 100% rate affects the “most probable outcome”:



DROPPING BLACK CRIMINAL RATE FROM 100% to 0.05% DOESN'T AFFECT EXAMPLE C
EVEN AT RATES SO LOW THAT NOT ONE SINGLE BLACK CRIMINAL EXISTS

          TOTAL   CRIMINALS    HONEST    Racial Prevalence   % CRIM
BLACK       200         200         0    1:5000              100%
WHITE    999800         450    999350    1:1.00020004        0.0450090018%
TOTAL   1000000         650    999350                        0.065%

   B+W: 200.05      B guilty: 199.96       W guilty: 0.09

          TOTAL   CRIMINALS    HONEST    Racial Prevalence   % CRIM
BLACK       200          20       180    1:5000              10%
WHITE    999800         450    999350    1:1.00020004        0.0450090018%
TOTAL   1000000         470    999530                        0.047%

   B+W: 20.086      B guilty: 19.996       W guilty: 0.09

          TOTAL   CRIMINALS    HONEST    Racial Prevalence   % CRIM
BLACK       200           2       198    1:5000              1%
WHITE    999800         450    999350    1:1.00020004        0.0450090018%
TOTAL   1000000         452    999548                        0.0452%

   B+W: 2.0896      B guilty: 1.9996       W guilty: 0.09

          TOTAL   CRIMINALS    HONEST    Racial Prevalence   % CRIM
BLACK       200         0.2     199.8    1:5000              0.1%
WHITE    999800         450    999350    1:1.00020004        0.0450090018%
TOTAL   1000000       450.2  999549.8                        0.04502%

          TOTAL   CRIMINALS    HONEST    Racial Prevalence   % CRIM
BLACK       200         0.1     199.9    1:5000              0.05%
WHITE    999800         450    999350    1:1.00020004        0.0450090018%
TOTAL   1000000       450.1  999549.9                        0.04501%

   B+W: 0.18998     B guilty: 0.09998      W guilty: 0.09
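The "B guilty" and "W guilty" figures under each table appear to be each group's criminal count weighted by the other group's share of the population. That reading is an assumption inferred from the numbers, not something stated in the post; the sketch below (with made-up names) reproduces the printed values under it.

```python
POP = 1_000_000
BLACKS, WHITES = 200, 999_800
WHITE_CRIMINALS = 450

def table_row(black_criminals):
    """Return (B+W, B guilty, W guilty) as the tables above seem to compute them."""
    # Assumed reading: criminal count of one group times prevalence of the other.
    b_guilty = black_criminals * WHITES / POP
    w_guilty = WHITE_CRIMINALS * BLACKS / POP
    return b_guilty + w_guilty, b_guilty, w_guilty

for n in (200, 20, 2, 0.1):
    total, b, w = table_row(n)
    print(f"{n:>6} black criminals: B+W {total:.5g}  B guilty {b:.5g}  W guilty {w:.5g}")
```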

As you can see, the “criminality of blacks” is irrelevant in Oomphaloskeptic’s Example C. Even when there is not one single black criminal, but 450 white criminals, the ultra-low white rate makes black men "the likeliest candidate". I don’t know what “0.1” black criminal is, but it’s way less than “one single criminal”.

Apparently even “bad thoughts” by a black man should affect a cop’s decision of whom to chase, more than 450 actual White criminals, if the “white crime rate” is low enough.

Such ‘small number effects’ are non-linear enough to constitute a deliberately skewed sampling: e.g. a black population this small is deprived of the “statistical benefit” of “B/B” scenarios. At a black racial prevalence of 1:5000, the B/B effect is 0.00000004, while 99.98% of white crime is buried in W/W scenarios.

This makes Example C so sensitive to black misdeeds, and so forgiving of the prospect of white misdeeds that it actually says you should arrest the black when there isn’t a single black criminal, but there are hundreds of white criminals.

In fact, under Example C you could raise the White criminal rate to 100%, and the answer still wouldn’t be “arrest the white man”.

Interestingly, the most extreme case of an “Example C” scenario is “the only Chinese in town”: if that person has a criminal record, Example C says he should be chased because his visible ethnicity has 100% criminality. Yet, in reality, the policeman should chase the white suspect: he can pick up the Chinese man later, but the white suspect is still unidentified.

Just because a quantity can be calculated, doesn’t make it a sufficient basis for a decision. Personally, I’m more inclined to ‘follow the math’ than any other single factor, but I encounter situations daily when the math simply fails to provide the best solution in cases of limited information.

I find this problem interesting mathematically, but I think there is substantial reason to say that the cop’s knowledge of racial statistics is no more relevant than a thousand other details he would have also seen. Some cops are known for giving speeding tickets to sports cars or even just “red” cars, but despite studies indicating that these are worse offenders, I would argue that targeting this “high risk population” would be a poorer global practice than ticketing ‘at random’.

[I put red in quotes, because I don’t have a cite on speeding rates by color.]

>> Just because a quantity can be calculated, doesn’t make it a sufficient basis for a decision.

Well, I may agree but I am not having to make any decision. I’d just like to know the answer. I have been stumped by probability problems before but this has to be in lesson 1 of Probability 101. It is the simplest probability case you can imagine, with just two variables. I’m pretty sure there’s a correct answer hidden in there somewhere.

I’ll agree with Achernar that the limit cases should behave as he specified, so that provides a guide for ruling out some solutions.

I just thought of a way to refute this. Suppose 10% of people are Scorpios, and, no surprise, 10% of crimes are committed by Scorpios. The cop shows up at the scene, and using this logic, concludes that it’s 90% likely that it was not committed by a Scorpio. Checking two suspects’ IDs, he sees that one is a Scorpio and the other is a Capricorn. Should he suspect the Capricorn more?

You see a black guy and a white guy. There are exactly 9,000,000 possible black guy/white guy pairs.

Of those 9 million happy couples:

6,840,000 are an honest white guy and an honest black guy
1,710,000 are an honest white guy and a dishonest black guy
360,000 are a dishonest white guy and an honest black guy
90,000 are a dishonest white guy and a dishonest black guy

I believe we are assuming that we have one honest man and one dishonest man, so we can throw out 6,930,000 cases of honest/honest and dishonest/dishonest.

That leaves 2.07 million cases of which about 82.6% contain a dishonest black man.

Chase the black dude.
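The pair counts above can be rebuilt by multiplication. This is a sketch under one assumption: the group sizes (9000 whites of whom 450 are dishonest, 1000 blacks of whom 200 are dishonest) are inferred from the four pair counts rather than quoted from this post.

```python
# Assumed group sizes, inferred from the pair counts above (9000 x 1000 = 9,000,000 pairs).
whites, white_criminals = 9000, 450
blacks, black_criminals = 1000, 200

honest_w = whites - white_criminals
honest_b = blacks - black_criminals

# Keys are (white man, black man).
pairs = {
    ("honest", "honest"): honest_w * honest_b,
    ("honest", "dishonest"): honest_w * black_criminals,
    ("dishonest", "honest"): white_criminals * honest_b,
    ("dishonest", "dishonest"): white_criminals * black_criminals,
}

# Keep only the pairs with exactly one dishonest man, as this post does.
exactly_one = pairs[("honest", "dishonest")] + pairs[("dishonest", "honest")]
p_black = pairs[("honest", "dishonest")] / exactly_one

print(pairs)
print(f"exactly one dishonest: {exactly_one:,} pairs, black dishonest in {p_black:.1%}")
```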

Lance Turbo:

You absolutely CANNOT throw out the dishonest/dishonest cases, and get the correct numeric answer; those cases are intrinsic to the problem. However, thus far, it seems likely to me they may not affect your final decision under an algebraic model, except possibly under very extreme discrepancies in population size or prevalence. The question looks to be even trickier under a discrete math model (criminals and citizens must come in integer units).

Well, technically, Probability 101 does tell you that “Probability is only good for predicting the behavior of a large number of samples. It can’t predict single incidents, and is poor at small sample sizes.” Also, this is not a two-variable problem. That is at the root of the original “paradox” you cited. It’s at least a three-variable problem, where the third variable must be calculated by using two of the variables you provided. (See below.)

I do understand what you mean, of course, but Probability is always a matter of inexact knowledge. I’ve been trying to prove to myself whether the ‘Probability 101’ model is the best possible approximation, the conditions where it is weakest (if any), and whether it either over-assumes or under-uses all available data. I had assumed that was the primary thrust of the thread, since the resolution to your OP has already been given. However, re-reading the thread, it’s clear that not everyone is debating the same issues.

I apologize if my focus has caused confusion (apparently it’s confused me at least once!), and I’ll concede there’s more than a small measure of ‘Devil’s Advocate’ in it (I was taught that critical analysis is essential). I don’t do it to annoy or mislead; it’s actually a fair amount of work!

To make up for that, here’s the derivation of…

The “Probability 101” answer:
We have T<w> white candies and T<b> black candies. Some of each are milk chocolate (M) inside and some are dark chocolate (D).

For a randomly selected candy of color c, M<c>:D<c> = odds that it has milk chocolate. The probability P<c> = M<c>/T<c>.

The number of milk chocolates in each color M<c> = P<c>*T<c>

HOWEVER, since M<c>=P<c>*T<c>, M<w> can be greater than M<b> even if P<b> is greater than P<w>, if and only if T<w>/T<b> is greater than P<b>/P<w> [i.e. more crimes can be committed by the less criminal group, if it is large enough]. This is the resolution of the apparent paradox in the OP. (This is a 3-variable problem: you must know P<w>, P<b> and T<b>/T<w>.)
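A quick numeric check of that condition, as a sketch only. The values (9000 white and 1000 black candies, with rates of 5% and 20%) are assumed for illustration, chosen to match the town discussed in the thread; the function name is made up.

```python
def milk_counts(t_w, t_b, p_w, p_b):
    """Number of milk chocolates of each colour: M<c> = P<c> * T<c>."""
    return p_w * t_w, p_b * t_b

# Assumed illustrative values: T<w> = 9000, T<b> = 1000, P<w> = 5%, P<b> = 20%.
t_w, t_b, p_w, p_b = 9000, 1000, 0.05, 0.20
m_w, m_b = milk_counts(t_w, t_b, p_w, p_b)

size_ratio = t_w / t_b    # T<w>/T<b>
rate_ratio = p_b / p_w    # P<b>/P<w>

print(f"M<w> = {m_w:.0f}, M<b> = {m_b:.0f}")
print(f"T<w>/T<b> = {size_ratio:.1f}, P<b>/P<w> = {rate_ratio:.1f}")
# The two comparisons agree: M<w> > M<b> exactly when T<w>/T<b> > P<b>/P<w>.
print((m_w > m_b) == (size_ratio > rate_ratio))
```

Here the white group is 9 times larger but only 4 times less likely to be milk chocolate, so it still contributes more milk chocolates.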

Independence is not a trivial issue in problems like these. The “Monty Hall paradox” hinges on the issue of whether seemingly independent consecutive options are genuinely independent, and therefore whether they have equal probability.

If I draw a white candy and then a black candy from the bowl, they are independent events. Neither selection affects the other. The odds of each candy being milk chocolate are given by its respective ratio M<c>:D<c> in the bowl. P<c> can also be used.

If I draw two candies together, and then return them to the bowl, until I get a black and a white together, the two candies are still independent draws and the chances of each color being milk chocolate are still given by its respective P<c>.

HOWEVER, if I do the above until I have a black-white pair AND exactly one milk chocolate between them, the probabilities of the two colors being milk chocolate are no longer independent. In this case:

P<wb, md> = [P<w>(1-P<b>)] + [P<b>(1-P<w>)]

HOWEVER, this does NOT represent the crime case correctly. One of the parties committed the crime and must be a criminal, BUT the other party can be either ‘criminal’ or ‘honest’ [i.e. his ‘criminality’ is immaterial]. We can no longer use the marbles or candies that elementary probability texts so adore.

P = [P<w>(1)] + [P<b>(1)] - [P<b>P<w>]
because [P<w>(1)] + [P<b>(1)] double-counts the instances where both men are criminals (once in P<w> and once in P<b>).

This tells us the probability of the situation, but it does not yet tell us exactly how to apportion the chances of guilt between the two suspects. Since the equation (and especially the third term) is symmetric with respect to P<w> and P<b>, we might think:

P<white guilty> = P<w> - [P<b>*P<w>]/2
P<black guilty> = P<b> - [P<b>*P<w>]/2
Note that the chances of black and white guilt are not independent.
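As a worked instance of the split above, here is a small sketch. The rates P<w> = 5% and P<b> = 20% are assumed from the town used elsewhere in the thread, and the final normalisation (dividing each share by P) is an added step for comparison, not part of the derivation.

```python
def p101_split(p_w, p_b):
    """Apportion guilt as in the symmetric split above: halve the both-criminal term."""
    both = p_w * p_b
    p_situation = p_w + p_b - both   # at least one of the two men is a criminal
    white = p_w - both / 2
    black = p_b - both / 2
    return p_situation, white, black

# Assumed rates: P<w> = 5%, P<b> = 20%.
p, white, black = p101_split(0.05, 0.20)
print(f"P = {p:.3f}, white = {white:.3f}, black = {black:.3f}")

# Added step (an assumption): rescale so the two shares sum to one.
print(f"normalised: white {white / p:.2%}, black {black / p:.2%}")
```

The two shares always add up to P, because the both-criminal term is split evenly between them.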

I’m not 100% certain that this is the best possible answer, but it’s definitely as far as you’d get in Probability 101. There are several potential issues that this simple derivation does not address [such as the surprising ‘fractional black criminal’ in my last post]. When you get away from simple models (e.g. by moving from continuous to discrete mathematics) interesting results often fall out of the cracks.

Lance Turbo, I believe your analysis of counting possible pairs is correct but I do not think you should leave out bad-bad pairs so I would redo it like this:

You have 1,710,000 + 90,000 = 1,800,000 cases with a bad black guy.
You have 360,000 + 90,000 = 450,000 cases with a bad white guy
Therefore the probabilities are exactly 80% and 20% respectively. I believe this is the correct solution until someone can point out why it is wrong (and I am sure someone will come along shortly and do just that).

KP, I think you can throw out the dishonest/dishonest cases for the same reason that you can throw out the honest/honest cases. In both those situations, it doesn’t matter who you chase. When making a decision based on probabilities, there is no reason to look at cases in which your decision is irrelevant.

Lance Turbo, no, you can’t do that. You have to take into account the bad+bad cases. We can disregard the good+good cases just because it is assumed in the definition of the problem I gave that there is a bad guy, but the definition does not say there cannot be two bad guys. Those cases do affect the answer, as I have already shown in my math in the previous post. I think your analysis is good except for that point.

The definition of the problem could have used a little work, but that is not really important right now. (An unambiguously defined problem would be something like, “If at least one of the two men were involved in the crime, what is the chance that the police officer has apprehended a guilty man if he captured the white man.”)

The question you asked was…

The answer is the black guy, and I’m pretty sure that that is no longer in dispute.

However, the question of whether or not to include bad/bad cases is still unanswered, but it doesn’t matter if all you are trying to do is decide who to chase.

Without including bad/bad cases:

82.6% chase black
17.4% chase white

With including bad/bad cases:

79.2% chase black
16.7% chase white
4.2% it doesn’t matter

In both scenarios it is clear you should chase the black guy. The important thing is that odds of success for chasing black are exactly 4.75 times greater than the odds for chasing white in both situations. No matter what percentage of cases involve bad/bad chase black is always 4.75 times more likely than chase white.

If the question was, “If at least one of the two men were involved in the crime, what is the chance that the police officer has apprehended a guilty man if he captured the white man,” you would have to include the bad/bads to answer correctly: 20.8%.

Also I should add that you calculated your percentages incorrectly when you included the bad/bads. It should be 83.3% chase black and 20.8% chase white. They add up to more than 100% precisely because they both include cases in which your decision doesn’t matter. (You included the 90000 bad/bads in your denominator twice.)

Your error should underline the unimportance of including the bad/bads to decide who to chase. Including them once leads to the answer - chase black. Leaving them out leads to the answer - chase black. Including them 1.5 times (like you sort of did) leads to the answer - chase black. Including them ten times leads to the answer - chase black.
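The claim that the weight given to bad/bad cases never changes the decision can be checked directly. A minimal sketch, using the three case counts from the thread; the `bad_bad_weight` parameter is a hypothetical knob for "how many times the bad/bads are counted".

```python
GOOD_W_BAD_B = 1_710_000   # good white guy, bad black guy
BAD_W_GOOD_B = 360_000     # bad white guy, good black guy
BAD_BAD = 90_000           # both bad: the choice makes no difference

def chase_shares(bad_bad_weight):
    """Shares of cases where chasing black, or chasing white, is the only winning choice."""
    total = GOOD_W_BAD_B + BAD_W_GOOD_B + bad_bad_weight * BAD_BAD
    return GOOD_W_BAD_B / total, BAD_W_GOOD_B / total

# Count the bad/bads zero times, once, 1.5 times, and ten times.
for weight in (0, 1, 1.5, 10):
    black, white = chase_shares(weight)
    print(f"weight {weight}: black {black:.1%}, white {white:.1%}, ratio {black / white:.2f}")
```

The bad/bad weight only changes the shared denominator, so it cancels in the ratio of the two shares.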

Ok, so let’s refine & improve the question asked in the OP and word it: “What are the probabilities that the black man did it and what are the probabilities that the white man did it?” While the original wording was not as clear, I think people understood that this was the question being asked.

No, I believe you are wrong. The probabilities that the black man did it and the probabilities that the white man did it have to add up to one because we are certain of that. They cannot add to more than one, which would be meaningless. You cannot have a higher degree of probability than 100%.

I do not agree there is an error and I think anyone knowledgeable about probabilities will back me up. Anyone? (long silence ensues. . . )

1,710,000 cases have a good white guy and a bad black guy
360,000 cases have a bad white guy and a good black guy
90,000 cases have a bad white guy and a bad black guy

I think we agree on this.

Question 1: What are the probabilities that the black man did it?

A total of 1710000 + 90000 = 1800000 cases have a bad black man.
A total of 360000 cases have a good black man.

There are 1710000 + 90000 + 360000 = 2160000 cases in all.

The black man is bad in 1800000/2160000 cases (about 83.3%)
The black man is good in 360000/2160000 cases (about 16.7%)

These add up to 1 (100%) because the black man is either good or bad. We are certain of this.

Question 2: What are the probabilities that the white man did it?

A total of 360000 + 90000 = 450000 cases have a bad white man.
A total of 1710000 cases have a good white man.

There are 1710000 + 90000 + 360000 = 2160000 cases in all.

The white man is bad in 450000/2160000 cases (about 20.8%)
The white man is good in 1710000/2160000 cases (about 79.2%)

These add up to 1 (100%) because the white man is either good or bad. We are certain of this.
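The two questions can be answered from the same three counts. The sketch below is only a restatement of the arithmetic above in code; it also prints the both-bad share, which is the overlap between the two answers.

```python
# (white, black) case counts, given at least one bad guy at the scene.
cases = {("good", "bad"): 1_710_000, ("bad", "good"): 360_000, ("bad", "bad"): 90_000}
total = sum(cases.values())

# Question 1: cases in which the black man is bad.
p_black_bad = sum(n for (w, b), n in cases.items() if b == "bad") / total
# Question 2: cases in which the white man is bad.
p_white_bad = sum(n for (w, b), n in cases.items() if w == "bad") / total
# Cases counted by both questions.
p_both_bad = cases[("bad", "bad")] / total

print(f"black bad {p_black_bad:.1%}, white bad {p_white_bad:.1%}, both bad {p_both_bad:.1%}")
# Subtracting the overlap once brings the sum of the two answers back to 1.
print(p_black_bad + p_white_bad - p_both_bad)
```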

I hate to bring this up again, but I am wondering if we have reached an agreement.

Well, I agree with your analysis. But you probably noticed that my numbers of 19/23=82.6% and 4/23=17.4% (computed assuming that exactly one criminal is present) agree with your numbers for the same case, so that won’t surprise you much.

I still disagree with the result. Knowing only one of them did it, the probabilities that one and the other did it have to add to one. We agree that

Then it is obvious to me that P1 = 1800000 / (1800000+450000) = 80% and P2 = 450000 / (1800000+450000) = 20%

I am sticking to that answer as I find it totally obvious.

If you know that only one of them did it (I am interpreting this as “we are given that there was exactly one criminal at the scene”–if this is not what you mean, could you clarify?) then why are you including the 90000 cases where both the black man and the white man are criminals?

We have this table (these are the numbers originally presented by Lance Turbo)


                Black innocent  Black criminal
White innocent     6840000         1710000
White criminal      360000           90000

If you know that there is exactly one criminal at the scene, then the only two possible cases, of the four in this table, are the table’s upper-right corner (white innocent; black criminal) and lower-left corner (white criminal; black innocent). The 90000 cases where both are criminal are not relevant, since they are not possible in this scenario; even if they were relevant they should not be double-counted as you are doing.

The relevant fractions are then the ones Lance Turbo first calculated:
  P1 = 1710000/(1710000+360000) = 82.6% and P2 = 360000/(1710000+360000) = 17.4%.
Why do you find your answers more obvious than these?

This is misleading. The criminality of blacks is not irrelevant in my Example 3 (is this what you’re calling Example C?). From your examples it may appear so, but this is only because you haven’t lowered the “black crime rate” enough–even in your final case, with 0.1 black criminal, the black crime rate is higher than the white crime rate, and so it should not be surprising that the probability is greater than 50% that the black man is the criminal. If you had lowered it to 0.05 or 0.01, you would have found (going through the same math as in my original reply) that the officer should chase the white man: not surprising, since at that point the black crime rate would be less than the white crime rate.

Quantitatively:
 With 0.1 black criminal, the black crime rate is 0.050% (higher than the white crime rate of 0.045%); the probability that the white guy is guilty and the black guy is innocent is 47.4%.
 With 0.01 black criminal, the black crime rate is 0.005% (lower than the white crime rate of 0.045%); the probability that the white guy is guilty and the black guy is innocent is 90.0%.
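These two figures can be reproduced with a short calculation. The sketch assumes the probability is conditioned on exactly one of the two men being a criminal, with the two men's statuses treated as independent; that assumption matches the quoted numbers, but the original reply's math is not shown here, so the function below is a reconstruction.

```python
BLACKS, WHITES, WHITE_CRIMINALS = 200, 999_800, 450

def p_white_only(black_criminals):
    """P(white guilty, black innocent | exactly one of the pair is a criminal)."""
    p_b = black_criminals / BLACKS
    p_w = WHITE_CRIMINALS / WHITES
    white_only = p_w * (1 - p_b)   # white is a criminal, black is not
    black_only = p_b * (1 - p_w)   # black is a criminal, white is not
    return white_only / (white_only + black_only)

for n in (0.1, 0.01):
    print(f"{n} black criminal: {p_white_only(n):.1%}")
```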

The “bad thoughts” remark is just silly, and the remarks about fractional criminals are a red herring. You introduced the fractional criminals, so claiming that you don’t know what 0.1 criminal means gets no sympathy from me. But a value of “0.1 criminal” makes sense, and doing the above calculations is still correct, if (for one example–there are other interpretations) the table shows the results of averaging data over the past 100 months. The table can be viewed, when properly rescaled, as a table of joint probabilities for two random variables (RACE and HONESTY); in this formulation there’s no reason that the values in the table need to be integral.

This is, once again, only true as long as the black crime rate is higher than the white crime rate. I fail to see why this is so counterintuitive.

Is this really so surprising? I mean, Example 3 was the case where all blacks were criminals. If all whites are criminals too, well, it doesn’t much matter who we go after, does it? And, in fact, if you work through my original analysis in this case… that’s what it will tell you too! (NB: if you actually make everyone a criminal the particular analysis above will actually produce ill-defined behavior; but if you make rates of black and white crime identical, for any crime rate between zero and one they will produce the same, completely unsurprising, result: that the two men are equally likely to be guilty.)
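The equal-rates case and the ill-defined all-criminal case can both be seen in a few lines. This is a sketch under the same assumption of exactly one criminal in the pair; the function name and the choice to raise an error are illustrative, not taken from the original analysis.

```python
def p_black_guilty(p_w, p_b):
    """P(black is the criminal | exactly one of the pair is a criminal)."""
    black_only = p_b * (1 - p_w)
    white_only = p_w * (1 - p_b)
    if black_only + white_only == 0:
        # Happens when both rates are 0 or both are 1: no pair has exactly one criminal.
        raise ValueError("undefined: no pair can contain exactly one criminal")
    return black_only / (black_only + white_only)

for rate in (0.01, 0.3, 0.9):
    print(rate, p_black_guilty(rate, rate))   # equal rates give 0.5 each time

# p_black_guilty(1.0, 1.0) raises ValueError: the ill-defined all-criminal case.
```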