A question about probability and statistics.

Emilio_Lizardo · November 8, 2005, 5:32pm

In general, given a set of n members, the probability that a given member has properties A and B can be expressed as the product of those properties’ frequencies, i.e. p(A)/n * p(B)/n, correct? Now what happens if I don’t know the exclusivity of those properties? For example, suppose I have a group of 10 people, and I know that 5 of them are men and 3 of them are blonde. Can I say anything at all about the probability that any one of those people is blonde AND male? It’s certainly solvable mathematically (3/10 * 5/10), but that answer doesn’t make any sense because I could have no blonde men, 3 blonde men, or anywhere in between. Is there any way to talk about probabilities under these conditions?

Giles · November 8, 2005, 5:44pm

Generally, if the probability of A and B occurring together is the product of their separate probabilities, then the events A and B are said to be independent. And in a specific population, A and B may not be independent even if (among a larger population) they are. Indeed, in a small population such as the one described, it may be impossible for the two events to be independent (since you cannot have 1.5 blond men in a population of 10).
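
To make that last point concrete, here is a small Python sketch that tries every possible number of blond men in the 10-person group (5 men, 3 blonds) and checks whether the product rule holds exactly. The variable names are only illustrative, and exact fractions are used so that rounding cannot blur the comparison.

```python
from fractions import Fraction

N = 10        # group size from the original question
MEN = 5       # number of men in the group
BLONDS = 3    # number of blond people in the group

p_male = Fraction(MEN, N)
p_blond = Fraction(BLONDS, N)

# The number of blond men is bounded below by how many blonds
# cannot fit among the women, and above by min(MEN, BLONDS).
low = max(0, BLONDS - (N - MEN))
high = min(MEN, BLONDS)

independent_counts = []
for blond_men in range(low, high + 1):
    p_both = Fraction(blond_men, N)
    is_independent = (p_both == p_male * p_blond)
    if is_independent:
        independent_counts.append(blond_men)
    print(blond_men, p_both, is_independent)

print("product of marginals:", p_male * p_blond)
print("counts giving exact independence:", independent_counts)
```

The product of the marginals is 3/20, which would need 1.5 blond men, so no whole-number count satisfies it and the list of independent cases comes out empty.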

Thudlow_Boink · November 8, 2005, 5:48pm

First a nitpick about “p(A)/n * p(B)/n”: if p(A) stands for “the probability of A,” as it commonly does in such contexts, then the dividing-by-n part is already part of the probability: you don’t need to do it again.

Now, I think the answer to your main question is no. Without knowing how many of the men are blond, the best you can do is give a range of possible probabilities, as you have stated. If A and B are independent—that is, if each one has no effect on the other’s probability/frequency—then P(A and B) = P(A) * P(B). In fact, this relation can be taken as the definition of, or test for, independence.

If they’re not independent, you can still say that P(A and B) = P(A) * P(B | A), where P(B | A) denotes the probability of B given A (e.g. the probability that a person is blond if that person is a man). But you do have to know this probability.
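
As a rough illustration of the two paragraphs above, the sketch below computes P(male and blond) as P(male) * P(blond | male) for the 10-person group. The function name `p_blond_and_male` and the choice of "2 blond men" are hypothetical, picked only to show what happens once the missing piece of information is supplied.

```python
from fractions import Fraction

N, MEN, BLONDS = 10, 5, 3   # figures from the original question

def p_blond_and_male(blond_men):
    """P(male and blond) computed as P(male) * P(blond | male)."""
    p_male = Fraction(MEN, N)
    p_blond_given_male = Fraction(blond_men, MEN)
    return p_male * p_blond_given_male

# Without knowing how many men are blond, only a range is available.
possible = [p_blond_and_male(k) for k in range(0, min(MEN, BLONDS) + 1)]
print(min(possible), max(possible))

# Hypothetical extra information: suppose 2 of the 5 men are blond.
print(p_blond_and_male(2))
```

With no extra information the answer can be anywhere from 0 to 3/10; with the assumed count of 2 blond men it is pinned down to 1/5.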

Pedro · November 8, 2005, 6:27pm

You’re mixing things quite a lot. What is “exclusivity” in this context? Someone more knowledgeable than me will come along shortly but a few points. First you should get acquainted with the basic concepts of probability, such as an event, the probability axioms, conditional probability, etc.

If two events are independent you have P(A and B) = P(A)*P(B).

P(.) is just a function that follows certain axioms. There are a few interpretations of probability, one of them being “frequentist”, that is,

P(A) = n(A)/N.

Define the random experiment “pick a person randomly out of ten”. Define the events:

M - person is male
B - person is blonde.

According to your OP we have:

P(M) = 0,5
P(B) = 0,3

It’s reasonable in this case to assume these events are independent, unless told otherwise. So

P(M and B) = .5 x .3 = 0,15 = 15%

Now why wouldn’t this answer make sense? If you pick a person randomly from a population with these characteristics (and only these), on average 15 out of 100 will be male and blonde, if you assume independence. You have to define the random experiment carefully. Why is this confusing you? You are working with the information you have to make an educated guess about information you don’t have. The number of blonde men in this population is 1.5 on average. No problems there.

I think that what is confusing you is that you are not considering a different (random) population of ten individuals (half male and three tenths blonde) each time you do the experiment. You can only test a specific group of ten people once under these conditions. If you keep testing the same group of ten people, that is a different random experiment, and 15% only represents the probability of maleness and blondeness IF you sample with replacement on each draw and each individual is indistinguishable except for maleness and blondeness.
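
One way to read the "1.5 on average" figure is sketched below in Python. It assumes, purely for illustration, that the 3 blond people are a uniformly random subset of the 10; the number of blond men then follows a hypergeometric distribution, which can be computed exactly. The helper name `p_blond_men` is made up for this example.

```python
from fractions import Fraction
from math import comb

N, MEN, BLONDS = 10, 5, 3   # figures from the original question

def p_blond_men(k):
    """P(exactly k blond men), ASSUMING the blond people are a
    uniformly random subset of the group (hypergeometric)."""
    ways = comb(MEN, k) * comb(N - MEN, BLONDS - k)
    return Fraction(ways, comb(N, BLONDS))

# Distribution over the possible numbers of blond men (0 to 3).
dist = {k: p_blond_men(k) for k in range(BLONDS + 1)}
for k, p in dist.items():
    print(k, p)

# Average number of blond men over all such random groups.
expected = sum(k * p for k, p in dist.items())
print("expected blond men:", expected)

# Chance that one randomly picked person is a blond man, averaged
# over the random groups; this matches P(M) * P(B).
print("averaged probability:", expected / N)
```

Under that assumption any particular group has 0, 1, 2, or 3 blond men, never 1.5, but the average over many such groups is 3/2, and dividing by 10 recovers the 15% figure.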

ultrafilter · November 8, 2005, 6:31pm

In general, P(A or B) + P(A and B) = P(A) + P(B).

Chronos · November 8, 2005, 6:34pm

I don’t think it’s a given that hair color and gender are independent (one gender might be more likely to dye, for instance). But if they are, then yes, the product of the probabilities is the probability that you’ll have a blond man. This does not guarantee that there’s a blond man in your group, but then, since when is probability about guarantees?

Shalmanese · November 8, 2005, 9:22pm

Pedro:

Just curious, why do you occasionally use . as the decimal delimiter and other times use , ?

Pedro · November 8, 2005, 9:37pm

Sorry, I didn’t even notice that. I’m Portuguese and around here we use a comma as the decimal delimiter, so I keep slipping back to that, even though I try to use a dot. This reminds me, I once lost an entire day debugging a piece of C code for this very reason. I was going truly mad.

cerberus · November 9, 2005, 4:01am

For a pair of jointly observed events A, B, the probability of observing both is

Pr{A and B} = Pr{A|B}Pr{B} = Pr{B|A}Pr{A}, where Pr{*|A} is the conditional probability of observing * given that A will occur with certainty, and Pr{*|B} is the conditional probability of observing * given that B will occur with certainty.

In the special case of A and B being jointly independent, we have that Pr{A|B}=Pr{A} and Pr{B|A}=Pr{B}; hence, Pr{A and B}=Pr{A}*Pr{B}.

As for events A and B in general, Pr{A or B} = Pr{A} + Pr{B} - Pr{A and B}. In the case of independence, Pr{A or B} = Pr{A} + Pr{B} - Pr{A}*Pr{B} = (1-Pr{B})*Pr{A} + Pr{B}.

These formulas work better when you use Venn diagrams…
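
In the spirit of a Venn diagram, the identities above can be checked by counting set members. The sketch below uses a made-up group of ten people labelled 0 to 9, in which people 0 to 4 are men and people 3, 4, and 5 are blond; that particular overlap (2 blond men) is an assumption chosen only for illustration.

```python
from fractions import Fraction

people = set(range(10))   # hypothetical group, labelled 0..9
men = {0, 1, 2, 3, 4}     # assumed: 5 men
blonds = {3, 4, 5}        # assumed: 3 blonds, 2 of whom are men

def pr(event):
    """Probability of picking a member of `event` uniformly at random."""
    return Fraction(len(event), len(people))

def pr_given(event, condition):
    """Conditional probability Pr{event | condition}, by counting."""
    return Fraction(len(event & condition), len(condition))

p_and = pr(men & blonds)   # intersection: "A and B"
p_or = pr(men | blonds)    # union: "A or B"

# Pr{A and B} = Pr{A|B} Pr{B} = Pr{B|A} Pr{A}
print(p_and, pr_given(men, blonds) * pr(blonds), pr_given(blonds, men) * pr(men))

# Pr{A or B} = Pr{A} + Pr{B} - Pr{A and B}
print(p_or, pr(men) + pr(blonds) - p_and)
```

Both lines print matching values (1/5 for the intersection, 3/5 for the union), whether or not the two events happen to be independent in this group.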
