Probability Question

Suppose that there are two populations in the universe, X and Y. There are x X’s and y Y’s.

Characteristic H – let’s call it “height” – can be measured in each individual. Each population has a normal distribution of characteristic H.

Although each population has the same standard deviation, the mean height for population X is exactly one standard deviation higher than the mean height for population Y.

What is the probability that the tallest individual in the universe comes from population X?

No, this is not a homework assignment.

First let’s say that the mean of population Y is 0. Its exact value won’t matter. Now let z be the common standard deviation of the two populations. By assumption it is also the mean of population X. Then any individual in population Y has a height with mean 0 and standard deviation z while any individual in population X has a height with mean z and standard deviation z.

So the probability that a given person in population Y is shorter than t is N(t/z) where N( ) is the standard cumulative normal distribution. Assuming that the heights are independent, the probability that everyone in population Y is shorter than t is [N(t/z)]^y.
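As a quick illustration of the [N(t/z)]^y step, here is a minimal sketch in Python. The function names and the example values (z = 1, t = 2) are my own choices for illustration, not part of the question, and N is computed from the standard library's error function.

```python
import math

def norm_cdf(s):
    """Standard normal cumulative distribution N(s), via the error function."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def prob_all_y_shorter(t, z, y):
    """Probability that y independent heights with mean 0 and
    standard deviation z are all below t, i.e. [N(t/z)]^y."""
    return norm_cdf(t / z) ** y

# Hypothetical example values: z = 1, threshold t = 2, several population sizes.
for y in (1, 10, 100):
    print(y, prob_all_y_shorter(2.0, 1.0, y))
```

Only the ratio t/z enters, so rescaling t and z together leaves the result unchanged.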

Still assuming independence, the probability (density) that “Jim” is the tallest person in population X and is exactly t is the probability that the x-1 others are shorter than t, [N((t-z)/z)]^(x-1), multiplied by the probability (density) that Jim is t, n((t-z)/z)/z. Here n() is the standard normal density function, n(s) = exp(-s^2/2)/sqrt(2pi), and the factor 1/z converts it into a density in t. We have to subtract z from t since t-z is the “distance” of the height from the mean of the X population. There are x different people in population X, so the probability (density) that the tallest person in X is t is x[N((t-z)/z)]^(x-1)*n((t-z)/z)/z.

Assuming independence across the populations, the probability (density) that the tallest person in X is t and no one in Y is taller is [N(t/z)]^y * x[N((t-z)/z)]^(x-1)*n((t-z)/z)/z. We now have to integrate this expression with respect to t from minus infinity to infinity. Substituting s = t/z shows that the result does not depend on z.
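The integral has no simple closed form that I know of for general x and y, but it is easy to evaluate numerically. Below is a rough sketch, not a polished implementation: it measures heights in units of z (so s = t/z and the standard deviation is 1) and applies a plain trapezoid rule. The function name, the integration limits, the step count and the `shift` parameter (the gap between the means in standard deviations, 1 in this problem) are all assumptions of mine.

```python
import math

def norm_cdf(s):
    """Standard normal cumulative distribution N(s)."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def norm_pdf(s):
    """Standard normal density n(s)."""
    return math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)

def prob_tallest_from_x(x, y, shift=1.0, lo=-12.0, hi=13.0, steps=20000):
    """Trapezoid-rule estimate of the integral over s of
    N(s)^y * x * N(s - shift)^(x-1) * n(s - shift),
    where s is height in units of the common standard deviation."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = lo + i * h
        f = (norm_cdf(s) ** y) * x * (norm_cdf(s - shift) ** (x - 1)) * norm_pdf(s - shift)
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * f
    return total * h

# Hypothetical population sizes, chosen only for illustration.
print(prob_tallest_from_x(1, 1))
print(prob_tallest_from_x(1000, 1000))
```

Two sanity checks: with x = y = 1 the answer should equal N(1/sqrt(2)), because the difference of the two heights is normal with mean z and standard deviation z*sqrt(2); and with `shift=0` the two populations are identical, so the answer should be x/(x+y).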

I can’t exactly see anything wrong there, and I haven’t calculated it all the way, but at least intuitively it seems like for t to be the height of the tallest person, the probability that t<z (for X) must be zero. Otherwise you are assuming a priori that the population fits a distribution that does not fit the population.

Maybe it works out, I’m not sure. I’m also not sure if the result would be valid if you merely integrated from z to infinity (but it might be).

That is true (sort of) if z is the mean of the actual x people. But that interpretation makes no sense for this problem. If there are only x people, then the actual distribution cannot be normal. You’d need an infinite number (actually a continuum) of people to have a true normal distribution. The way to interpret this problem (and most probability problems) is that each person is assigned a height at random from a normal population with a given mean and variance. Then it is perfectly possible that everyone is assigned a height less than the mean of the distribution from which the heights were drawn. In fact, for the normal distribution or any continuous distribution symmetric about its mean, the probability that n draws are all below the mean is (1/2)^n.
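The (1/2)^n claim is easy to check by simulation under the "each person is assigned a random height" interpretation. This is only a sketch: the function name, the trial count and the seed are arbitrary choices of mine.

```python
import random

def frac_all_below_mean(n, mean=0.0, sd=1.0, trials=200_000, seed=0):
    """Fraction of trials in which n independent normal draws
    all land below the mean of the distribution they came from."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        if all(rng.gauss(mean, sd) < mean for _ in range(n)):
            hits += 1
    return hits / trials

# Compare the simulated fraction with (1/2)^n for a few small n.
for n in (1, 2, 3, 5):
    print(n, frac_all_below_mean(n), 0.5 ** n)
```

The simulated fractions should come out close to (1/2)^n, up to sampling noise.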