Statistics & profiling problem

While reading a thread going on in GD, this question occurred to me.

The town of X has a population of 10,000, of which 10% are black and 90% white.
Of blacks, 20% are criminals and 80% are honest while of whites 5% are criminals and 95% are honest.


        Honest    Bad   Total 
Whites:   8550    450    9000 
Blacks:    800    200    1000 
Totals:   9350    650   10000 

The bad guys are about equally bad so that about 70% of crimes are committed by white bad guys and 30% by black bad guys.
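For anyone who wants to check the arithmetic, here is a small Python sketch (the variable names are mine) that rebuilds the table from the stated percentages and confirms the roughly 70/30 split of crimes:

    # Rebuild the contingency table from the percentages given in the problem.
    population = 10_000
    blacks = population * 10 // 100          # 10% of the town: 1000
    whites = population - blacks             # 9000

    black_criminals = blacks * 20 // 100     # 20% of blacks: 200
    black_honest = blacks - black_criminals  # 800
    white_criminals = whites * 5 // 100      # 5% of whites: 450
    white_honest = whites - white_criminals  # 8550

    total_criminals = black_criminals + white_criminals   # 650
    total_honest = black_honest + white_honest             # 9350

    print("Honest:", total_honest, "Bad:", total_criminals)
    print("Crimes by whites:", round(white_criminals / total_criminals, 3))  # ~0.692
    print("Crimes by blacks:", round(black_criminals / total_criminals, 3))  # ~0.308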

A crime has been committed and an officer arrives on the scene and sees a white guy and a black guy leaving the scene in opposite directions. He cannot go after both so he has to choose which one to go after.

Hmmm, he thinks, 70% of crimes are committed by white guys which is more than twice the percentage of crimes being committed by black guys. I should therefore go after the white guy.

Oh, but wait, taken individually the white guy has a probability of 5% of being a criminal but the black guy has a probability of 20% which is 4 times higher. I should therefore go after the black guy.

What is the true, correct and definitive answer? Who should the officer chase?

He should wound one in the leg and then chase the other.

This looks like a problem that can be addressed by Bayes’ theorem, but I’m not 100% sure how to apply it here.

First the math, then the explanation of why the math is probably not applicable anyway.

So: mathematically, assuming that the black man is randomly selected from the set of all black people in the town of X, then the probability that he is a “bad guy” is 20% as you’ve stated. So in that sense, the black guy is more likely to be a “bad guy” than the white guy (assuming that the white guy is a randomly selected white guy as well). The fact that there are more white bad guys than black ones means that if you select a bad guy at random then he’s more likely to be white than black, but that’s not what’s happening in the situation you describe.

But of course, these people aren’t randomly selected are they? It’s not as if the census bureau picked one white man and one black man at random and dropped them at the scene. So statistical conclusions that assume the people are randomly selected, like the one in the previous paragraph, are of questionable merit at best. For all the cop knows there’s a secret white bad guy convention going on down the street.

The cop really has no information to suggest which suspect is more likely to be a criminal, without making an unfounded statistical assumption.

Well, while the cop is doing the math, both of them will get away. I suggest that we agree with don’t ask. Too bad that we have a racial system in the US. Cops shouldn’t be asked to judge between black and white.

>Hmmm, he thinks, 70% of crimes are committed by white guys which is more than twice the percentage of crimes being committed by black guys. I should therefore go after the white guy.

I think this is a red herring. It would have influenced how likely it would be that one white and one black are leaving the scene, but that’s a given and so the statistic is not useful here.

>Oh, but wait, taken individually the white guy has a probability of 5% of being a criminal but the black guy has a probability of 20% which is 4 times higher. I should therefore go after the black guy.

I think this is the relevant statistic. It’s also the correct conclusion, if we’re given that the cop should take advantage of the statistical information and also that he has no other criteria to use. These aren’t necessarily trivial points.

>The cop really has no information to suggest which suspect is more likely to be a criminal, without making an unfounded statistical assumption.

I think this isn’t correct, in two ways. The statistical information is certainly “information to suggest which suspect is more likely to be a criminal”, though it is certainly weak information that still leaves the criminal’s identity quite uncertain. Weak information is certainly more useful than none at all. And, I don’t hear anything about assumptions here, founded or not.

BTW, sailor, you’re not afraid to ask the tough questions, are you? Perhaps we should substitute “army men” for “white men” and “navy men” for “black men”. After all, why offend anybody needlessly?

It’s not quite right to suggest that just because you don’t know everything you can’t know anything. In the absence of complete information you can only make guesses, of course, and these guesses will sometimes be wrong; but this is a far cry from having no information at all. Using probability theory is one way of trying to make better guesses: i.e. guesses which are, statistically, more likely to be correct. Of course this relies on various assumptions (e.g. statistical independence between various random variables), and if your assumptions are badly wrong then you might make worse guesses instead.

A Bayesian might analyze the situation as follows:

Initially the officer sees two men, X and Y, running from the scene, and assigns each of them equal prior probabilities of being the criminal:
    P(Y,¬X) = 1/2        (prior probability that Y is criminal and X is not)
    P(X,¬Y) = 1/2        (prior probability that X is criminal and Y is not)
As he approaches, he sees further that X is black (X[sub]b[/sub]) and Y is white (Y[sub]w[/sub]). He now updates his priors with the new information, using Bayes’ Law (linked by ultrafilter above):
    P(Y,¬X | X[sub]b[/sub],Y[sub]w[/sub]) = P(X[sub]b[/sub],Y[sub]w[/sub] | Y,¬X) P(Y,¬X) / P(X[sub]b[/sub],Y[sub]w[/sub])
    P(X,¬Y | X[sub]b[/sub],Y[sub]w[/sub]) = P(X[sub]b[/sub],Y[sub]w[/sub] | X,¬Y) P(X,¬Y) / P(X[sub]b[/sub],Y[sub]w[/sub])
The denominator (basically a normalization factor) is computed by summing over all possibilities:
    P(X[sub]b[/sub],Y[sub]w[/sub]) = P(X[sub]b[/sub],Y[sub]w[/sub] | Y,¬X) P(Y,¬X) + P(X[sub]b[/sub],Y[sub]w[/sub] | X,¬Y) P(X,¬Y) .

Now how can we compute P(X[sub]b[/sub],Y[sub]w[/sub] | Y,¬X) (the probability that X is black and Y is white, given that Y is criminal and X is not)? We might assume (in the absence of more comprehensive statistical information) that the probability that X is black does not depend on whether Y is criminal, i.e. that the actions of X and Y are basically independent of each other. In this case we can write
    P(X[sub]b[/sub],Y[sub]w[/sub] | Y,¬X) = P(X[sub]b[/sub] | ¬X) P(Y[sub]w[/sub] | Y) .
Now P(X[sub]b[/sub] | ¬X) (the probability that X is black, given that he is not a criminal) and P(Y[sub]w[/sub] | Y) (the probability that Y is white, given that he is a criminal) are given in the statistical tables provided:
    P(X[sub]b[/sub] | ¬X) = 800/9350 = 16/187
    P(Y[sub]w[/sub] |   Y) = 450/650   = 9/13
So
    P(X[sub]b[/sub],Y[sub]w[/sub] | Y,¬X) =   (16/187) (9/13) = 144/2431
and similarly
    P(X[sub]b[/sub],Y[sub]w[/sub] | X,¬Y) = (171/187) (4/13) = 684/2431
so
    P(X[sub]b[/sub],Y[sub]w[/sub]) = (144/2431)(1/2) + (684/2431)(1/2) = 414/2431
and the updated posterior probabilities are

    P(Y,¬X | X[sub]b[/sub],Y[sub]w[/sub]) = (144/2431)(1/2) / (414/2431) =   72/414 =  4/23        (posterior probability that Y is criminal and X is not)
    P(X,¬Y | X[sub]b[/sub],Y[sub]w[/sub]) = (684/2431)(1/2) / (414/2431) = 342/414 = 19/23        (posterior probability that X is criminal and Y is not)

This approach, of updating a priori probabilities to reflect new information using Bayes’ Law, is called Bayesian inference. It’s an extremely useful statistical technique, though (as with all statistical techniques) it relies on having valid data and assumptions.
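If it helps, here is the same update written out as a short Python sketch (my own, using exact fractions so nothing is lost to rounding; the independence assumption discussed above is marked in the comments):

    from fractions import Fraction as F

    # Conditional probabilities read off the table.
    p_black_given_criminal = F(200, 650)    # P(X_b | X is a criminal)
    p_black_given_honest   = F(800, 9350)   # P(X_b | X is honest)
    p_white_given_criminal = F(450, 650)    # P(Y_w | Y is a criminal)
    p_white_given_honest   = F(8550, 9350)  # P(Y_w | Y is honest)

    # Equal priors: either fleeing man is equally likely to be the (sole) criminal.
    prior_Y_guilty = F(1, 2)   # Y criminal, X honest
    prior_X_guilty = F(1, 2)   # X criminal, Y honest

    # Likelihood of observing "X is black and Y is white" under each hypothesis,
    # assuming the two men's races and actions are independent of each other.
    like_Y_guilty = p_black_given_honest * p_white_given_criminal   # 144/2431
    like_X_guilty = p_black_given_criminal * p_white_given_honest   # 684/2431

    evidence = like_Y_guilty * prior_Y_guilty + like_X_guilty * prior_X_guilty  # 414/2431

    posterior_Y_guilty = like_Y_guilty * prior_Y_guilty / evidence  # 4/23
    posterior_X_guilty = like_X_guilty * prior_X_guilty / evidence  # 19/23

    print(posterior_Y_guilty, posterior_X_guilty)   # 4/23 19/23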

Depending on how you define “better,” it may be appropriate to consider factors besides the probabilistic results; some of these other factors come into arguments against profiling. (The game-theoretic aspects of policing, for example, mean that the actions the officer takes in this round may affect the approaches taken by the parties in future rounds.)

The assumption I was referring to was the statistical assumption that the men at the scene were randomly selected from the set of all such men in the city. It’s unstated, but it’s not possible to compute probabilities (such as Omphaloskeptic has done) without making some such assumption, as Omphaloskeptic has correctly noted.

Well, yes, but I think your post was unnecessarily pessimistic with its claim that the cop has no information to go on without making an unfounded statistical assumption.

Of course all assumptions are mathematically "unfounded"; that’s why they’re assumptions. But some assumptions are more reasonable than others. If we require that the officer just throw up his hands in defeat, or maybe flip a coin, unless he can determine with certainty which one is the criminal, then we may as well disband the police forces. Even if he captures both suspects (as don’t ask suggests) and questions them both, and one admits guilt and the other protests his innocence, … well, maybe they’re both good liars. He still has “no information” unless he makes the (unfounded) assumption that they don’t have some bizarre reason to conspire to fool him. I think it’s reasonable to consider this last scenario rather unlikely, though, and more generally, reasonable to make inferences based on statistical information. Mathematical certainty is never going to be possible here; all you can hope to do is use all the information you have and try to quantify your errors.

So which one do you shoot in the leg?

Well, the black one, of course. It’s well known that they are bred to run faster.
:eek: :slight_smile:

Omphaloskeptic, the more I think about it the more confused I get; I keep arriving at contradictory results, none of which match what you say.

One analysis: The cop arrives on the scene and sees a crime was committed. He sees no one and thinks correctly that the chances are 70% that it was a white bad guy. Now he sees the black and the white guy. This yields no new information so he has better chances of catching the criminal if he goes after the white guy. Right now I think this is correct.

No, the presence of these two people at the scene does in fact yield new information: that they are the most likely suspects. Before you saw these two, your list of suspects was the 10000 people in the city, each with probability 0.01%. Now it’s (simplifying by pretending these are the only two suspects left) just these two, each with (for now ignoring the information about their race) 50% probability.

Beforehand, your suspicion that it was probably a white guy was weighted by the fraction of white suspects (large relative to the fraction of black suspects). But the other 9998 people in town are no longer suspects, so they don’t weight the results any longer. What’s important now is how likely each of these two individuals is to be a criminal.

Or here is another analysis, if I understand **Omphaloskeptic**'s post correctly.

With no other information than the fact that a crime was committed, you would assume the perpetrator was 70% likely to be white, because whites commit 70% of the crimes. However, you then note that one of the two individuals leaving the scene is black, and by your stats any given black guy is about four times as likely to be a criminal as any given white guy.

This seems to agree with the result above: a 4/23 chance that the white guy is the perp vs. a 19/23 chance that the black guy is.
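One way to convince yourself is a quick Monte Carlo check (a sketch of my own, resting on the same independence assumption Omphaloskeptic spelled out): draw one random black resident and one random white resident, keep only the pairs in which exactly one is a criminal, and see how often that criminal is the black one.

    import random

    def fraction_black_guilty(trials=1_000_000, seed=0):
        """Among random black/white pairs with exactly one criminal,
        how often is the criminal the black member of the pair?"""
        rng = random.Random(seed)
        pairs_with_one_criminal = 0
        black_guilty = 0
        for _ in range(trials):
            black_is_bad = rng.random() < 200 / 1000   # 20% of blacks are criminals
            white_is_bad = rng.random() < 450 / 9000   # 5% of whites are criminals
            if black_is_bad != white_is_bad:           # exactly one criminal in the pair
                pairs_with_one_criminal += 1
                black_guilty += black_is_bad
        return black_guilty / pairs_with_one_criminal

    print(fraction_black_guilty())   # ~0.826, i.e. about 19/23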

I’m going to have to think about that. I say this because I remember stumbling over another probability problem in another thread a long time ago: the Monty Hall three-door problem. I argued adamantly, and then suddenly a little light turned on and I realized I was wrong. But it took quite some time and effort. This time I am going to be more prudent, especially because I am even less sure of myself. Let me give it some thought.

Actually, there are a whole passel of assumptions here. Please indulge me as I toy with some of them. If you just want the “meat” (the genuine objection), skip to the bolded “Final Analysis”.

ISSUE #1: you presume “Criminals” are always guilty and “Honest men” never are - all criminals are born guilty, and all honest men are forever sainted. This is necessarily false, and it weakens the derivation above. In real life, of course, all men are born innocent, and some become guilty at some point. Without knowing the rate of (convicted) first offenses - the rate at which honest men become criminals - we can only guess at (or approximate) the actual probabilities.

ISSUE #2: what do real cops think if they see two men running away from the scene of a crime? They think both are guilty! I’ll get back to this point.

ISSUE #3: If all guilty men are “always guilty” then we get some truly funky situations. If a penny is stolen from the till, every criminal in the vicinity must be guilty - a neat trick, and a ludicrous assumption for the real world.

You may argue that the unstated principle (“assumption” is more like it) that only one man is guilty is implicit in the formulation of this problem (and similar problems of its class). But that forces a revision of the math.

You may say “the crime was murder, and only one gunshot was heard” (we’ll ignore the crime of ‘fleeing the scene’, which only casts more doubt on the presumption that “only criminals commit crimes”). We still have to refine the derivation.

If both men were chosen at random, then a Real Cop’s initial assumption (both are criminals for fleeing the scene) is correct 1% of the time, and the “reader’s assumption” (there is only one criminal) is correct 23% of the time - but 76% of the time there’d be no victim at all!

The apparently relevant denominator is ambiguous. It’s either
A) “all cases where at least one man is a criminal” (which I think is the only physically arguable case); or
B) “all cases where only one man is a criminal” (whose only merit is conforming to a common presumption).

Pa = 450/9000 + 200/1000 - [(450/9000) * (200/1000)] = 24%
(the subtracted term removes the ‘double-counted’ overlap)
Pb = 450/9000 + 200/1000 - 2*[(450/9000) * (200/1000)] = 23%
(the subtracted term removes both counts of the overlap)
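These figures, and the 1% / 23% / 76% split mentioned a few paragraphs up, all fall straight out of the stated rates; here is a short sketch (again my own) for anyone who wants to reproduce them:

    p_white_bad = 450 / 9000    # 0.05
    p_black_bad = 200 / 1000    # 0.20

    p_both        = p_white_bad * p_black_bad                 # 0.01: both fleeing men are criminals
    p_neither     = (1 - p_white_bad) * (1 - p_black_bad)     # 0.76: no criminal in the pair at all
    p_exactly_one = 1 - p_both - p_neither                    # 0.23: the "reader's assumption"

    p_a = p_white_bad + p_black_bad - p_both        # 0.24: at least one criminal (case A)
    p_b = p_white_bad + p_black_bad - 2 * p_both    # 0.23: exactly one criminal (case B)

    print(p_both, p_exactly_one, p_neither)   # 0.01 0.23 0.76 (up to float rounding)
    print(p_a, p_b)                           # 0.24 0.23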

In either case, we can only rely on the prevalence of “guilty” men in each racial subgroup: 20% (B) vs. 5% (W). But the analysis is completely flawed, because in a random matching of candidates the actual murderer would be white roughly 450/650 of the time. Why were we wrong?

**FINAL ANALYSIS**

The flaw is this: crimes are committed solely by criminals. Any number which includes innocent people merely obfuscates the issue. This includes statistics like “percentage of criminals” or “total population”, which are affected by the number of innocents. Changing the number of innocents does not affect the probability that a given man is guilty.

The statistically valid denominator is the number of potential murderers in town. All the Chinese in China, or all the innocent Chinese in town, are irrelevant.

ISSUE #4: the problem was set up to demand a black man and a white man at the scene. The universe of random pairings, however, does not reflect the underlying events: each black man, guilty or innocent, is “forced to flee” 9 to 10.6875 times as many hypothetical crime scenes as each white man. This dramatically overestimates the probability of black guilt.

EFFECT 1: In the universe of cases where a black man is guilty, the analysis provided pairs him against 9000 white men [case A] or 8550 innocent white men [case B], while each white man is mathematically paired against only 1000 or 800 black men. If you wrote out a chart of every “random pairing”, each black man’s name would appear either 9000 or 8550 times, while each white man’s name would appear only 1000 or 800 times. Throw a dart at this chart and the result isn’t “fair”: the black men have 9000/1000 or 8550/800 times as many slips in the hat.

**To illustrate this, make the numbers more extreme. Say that there are only 10 blacks in the city (and 1 black criminal) while there are 10 million whites (and 50,000 white criminals). By (improper) Bayesian analysis, that black man must commit virtually all the crime that occurs in his vicinity, while the 50,000 white criminals sit on their hands. Within a year (before new stats can be issued), every black person would be shot many times, and no white man would ever be caught in this situation.**
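Plugging those extreme numbers into the same style of update (my own sketch, with the same independence assumption as before) shows just how lopsided the posterior becomes:

    # Extreme illustration: 10 blacks (1 criminal), 10 million whites (50,000 criminals).
    p_black_bad = 1 / 10                  # 0.1
    p_white_bad = 50_000 / 10_000_000     # 0.005

    # Posterior that the black man is the guilty one, given a mixed pair
    # containing exactly one criminal (same independence assumption as above).
    like_black_guilty = p_black_bad * (1 - p_white_bad)
    like_white_guilty = p_white_bad * (1 - p_black_bad)
    posterior_black = like_black_guilty / (like_black_guilty + like_white_guilty)

    print(round(posterior_black, 3))   # ~0.957: nearly all mixed-pair crime gets pinned on the one black criminal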

EFFECT 2: Relying on prevalence in subpopulations will skew all future statistics, even if the population sizes are equal. In cases where men of both races are suspected, the black criminals will always be caught [and be counted], and the white criminals will always escape [and not be counted]. This effect grows as the profiled population’s share of the total increases.

Effect 1 increases as the black fraction of the total population decreases. Effect 2 increases as the black fraction increases. Racial profiling is a Big Lose for blacks, guilty or innocent.
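Effect 2 can be made concrete with a toy simulation (mine, and an admittedly crude model: every crime scene is assumed to present a mixed pair, and the officer always chases the black suspect). The arrest records then bear little resemblance to who actually committed the crimes:

    import random

    def arrest_statistics(crimes=100_000, seed=1):
        """Toy model: each crime has one perpetrator drawn from the town's 650
        criminals; an innocent bystander of the other race is also present,
        and the officer always chases the black member of the pair."""
        rng = random.Random(seed)
        committed = {"white": 0, "black": 0}
        arrested = {"white": 0, "black": 0}
        for _ in range(crimes):
            perp = "white" if rng.random() < 450 / 650 else "black"
            committed[perp] += 1
            if perp == "black":          # the profiling rule only ever catches black perps
                arrested[perp] += 1
        return committed, arrested

    committed, arrested = arrest_statistics()
    print("crimes committed:", committed)   # roughly 69% white, 31% black
    print("arrests recorded:", arrested)    # 100% black: the future statistics are badly skewed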

Well, at least the rest of us understand the problem as it was enunciated even if we’re still having problems finding or understanding the solution.

This is true, and it is undeniably a problem with profiling schemes. It would be nice to have an equal chance of catching all criminals (whether black or white) and to prevent white criminals from gaming the system by committing their crimes when blacks are around (thus hassling innocent blacks), but it would also be nice to maximize our chances of catching criminals. These two desires are in conflict here, and there’s no simple resolution; they can’t simultaneously be maximized for the problem stated.

There’s a reason my first response stopped where it did (with the computation of the posterior probabilities) and not with a complete answer to the OP’s question

which is, as you point out (I only mentioned it in passing), a much more difficult question, and not one with a GQ answer. What do you want to maximize?

Well, yes, but black crime is (in the real world; the statistics in the OP don’t cover victims) also a Big Lose for blacks, guilty or innocent. It’s not clear, to me at least, which causes worse problems in practice.

My point is that the problem, as it is enunciated, is flawed.

The “understanding” you cite is what creates the confusion. My final analysis points out why.

Sorry about the rest of the stuff, if it didn’t interest you. I probably should have taken the racial example less literally. However, picking assumptions apart is not irrelevant; it is an essential first step in mathematical analysis. If this were a more strictly mathematical forum, the assumptions would have been picked apart a lot more already. That’s just part of how the game of math (vs. arithmetic) is played.

I don’t think the problem is necessarily flawed. If you wanted to use this Bayesian analysis as a Rigorous Proof of The Efficacy And Rightness of Profiling, well, that would be a problem, yes. But the question, at least as I understood it, was somewhat more limited in intent than that: just a tool for understanding the mathematical reasons behind profiling.