More probability questions

When I was in high school and college, probability seemed so simple. It does not seem so simple now. I clearly am not getting any smarter.

What probability rule answers the question: What is the probability of being female AND being blonde?

What is the probability of being female OR being blonde?

What is the probability of dyeing your hair blonde 100 or more times during one’s lifetime among females?

The possibilities are:

Addition rule
Multiplication rule
Marginal probability rule
Conditional probability rule
Subtraction rule.

Most people would probably tell you that, by the multiplication rule, the probability of being female and blonde is the product of the probability of being female and the probability of being blonde. This assumes that blondeness is independent of sex (that is, that men and women are equally likely to be blonde), which is at best a questionable assumption. In fact, assuming that someone is female if you hear that they’re blonde seems to be a pretty good rule of thumb.

The probability of being female or of being blonde is the probability of being female plus the probability of being blonde minus the probability of being female and blonde. I don’t know which of your rules covers that; the standard term is inclusion-exclusion.

Here are the basic rules of probability, from which everything else can be figured out:

  0. Equivalent properties have the same probability
  1. Probabilities are nonnegative
  2. Anything guaranteed to be true has probability 1
  3. If A1, A2, A3, …, are such that any two of them are incompatible, then P(A1 OR A2 OR A3 OR …) = P(A1) + P(A2) + P(A3) + … [This is the addition rule]
  4. P(A AND B) = P(A) * P(B | A), where the second term means “The probability of B given that A is true”. [This is the conditional probability rule]

We can derive some other things from these rules. For example, 1 = P(A OR NOT A) by rule 2, and this in turn is equal to P(A) + P(NOT A) by rule 3. So, P(NOT A) = 1 - P(A). [This is what I’m guessing you mean by the subtraction rule]

What’s P(A OR B) in general? Well, A OR B is logically equivalent to (A AND B) OR (A AND NOT B) OR (B AND NOT A), all of which are pairwise incompatible, so by rules 0 and 3, P(A OR B) = P(A AND B) + P(A AND NOT B) + P(B AND NOT A). This in turn is equivalent to P(A) + P(B) - P(A AND B), since A is equivalent to (A AND B) OR (A AND NOT B) and the “mirror-image” equivalence holds for B. [This is the inclusion-exclusion rule ultrafilter mentions]
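To make the decomposition above concrete, here is a minimal Python sketch that checks it on a tiny made-up population. The population size and the membership of each set are purely illustrative assumptions, not figures from this thread.

```python
from fractions import Fraction

# Hypothetical population of 20 people, labeled 0..19 (made-up numbers).
population = set(range(20))
female = set(range(0, 10))   # people 0-9
blonde = set(range(7, 13))   # people 7-12, so 7, 8 and 9 are in both sets


def prob(event):
    """Probability that a uniformly chosen person belongs to `event`."""
    return Fraction(len(event), len(population))


# The three pairwise incompatible pieces of "female OR blonde".
both = female & blonde
female_only = female - blonde
blonde_only = blonde - female

# Addition rule applied to the three incompatible pieces.
p_or_pieces = prob(both) + prob(female_only) + prob(blonde_only)
# Inclusion-exclusion form.
p_or_incl_excl = prob(female) + prob(blonde) - prob(both)
# Direct count of the union, as a cross-check.
p_or_direct = prob(female | blonde)

print(p_or_pieces, p_or_incl_excl, p_or_direct)
```

All three quantities agree, which is just the argument above carried out by counting.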

So those are some nice rules. But what about P(A AND B)? Can we figure it out just from knowing P(A) and P(B)?

No, in general, we cannot. Just knowing P(A) and P(B) doesn’t tell us anything about the relationship between A and B; they might be incompatible, or they might be equivalent, or they might be independent, or any number of other things. We need to know which of these is the case in order to figure out P(A AND B). [By definition, A and B are independent if P(B | A) = P(B); i.e., being given A as true doesn’t change the probability of B].

If A and B are independent, then, by rule 4, we have that P(A AND B) = P(A) * P(B). [This is the multiplication rule]
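As an illustration of why P(A) and P(B) alone don't pin down P(A AND B), here is a small Python sketch with three hypothetical joint distributions. All three have P(A) = P(B) = 1/2, yet P(A AND B) differs in each; the numbers are made up for the example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ZERO = Fraction(0)

# Keys are (A is true, B is true); values are probabilities.
# Every scenario below is hypothetical and has P(A) = P(B) = 1/2.
scenarios = {
    "incompatible": {(True, True): ZERO, (True, False): HALF,
                     (False, True): HALF, (False, False): ZERO},
    "equivalent": {(True, True): HALF, (True, False): ZERO,
                   (False, True): ZERO, (False, False): HALF},
    "independent": {(a, b): Fraction(1, 4)
                    for a in (True, False) for b in (True, False)},
}


def p_a(joint):
    """Marginal probability of A."""
    return sum(p for (a, _), p in joint.items() if a)


def p_b(joint):
    """Marginal probability of B."""
    return sum(p for (_, b), p in joint.items() if b)


def p_a_and_b(joint):
    """Joint probability that A and B are both true."""
    return joint[(True, True)]


def is_independent(joint):
    """True when P(A AND B) = P(A) * P(B), i.e. the multiplication rule applies."""
    return p_a_and_b(joint) == p_a(joint) * p_b(joint)


for name, joint in scenarios.items():
    print(name, p_a(joint), p_b(joint), p_a_and_b(joint), is_independent(joint))
```

Only the third scenario satisfies P(A AND B) = P(A) * P(B); in the other two, multiplying the marginals gives the wrong answer.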

So… to answer your specific questions:
P(female AND blonde) = P(female) * P(blonde | female) = P(blonde) * P(female | blonde). If we know that being female and being blonde are independent traits, then this equals P(female) * P(blonde).

P(female OR blonde) = P(female) + P(blonde) - P(female AND blonde)

P(dyeing your hair blonde 100 or more times during your lifetime among females) = whatever it is. It’s not something we can derive automatically from P(blonde) and P(female). Conceivably, if you have a particular probabilistic model for hair-dyeing in mind, we could come up with some formula for it in terms of other, simpler quantities, but you haven’t given any such model.
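For anyone who wants to see the first two answers with numbers plugged in, here is a rough Python sketch using a hypothetical table of 100 people. The counts are invented purely for illustration; they are not real statistics about hair color.

```python
from fractions import Fraction

# Hypothetical counts for 100 people (illustrative only).
counts = {
    ("female", "blonde"): 12,
    ("female", "not blonde"): 38,
    ("male", "blonde"): 8,
    ("male", "not blonde"): 42,
}
total = sum(counts.values())

n_female = counts[("female", "blonde")] + counts[("female", "not blonde")]
n_blonde = counts[("female", "blonde")] + counts[("male", "blonde")]
n_both = counts[("female", "blonde")]

p_female = Fraction(n_female, total)
p_blonde = Fraction(n_blonde, total)
p_blonde_given_female = Fraction(n_both, n_female)
p_female_given_blonde = Fraction(n_both, n_blonde)

# P(female AND blonde), computed both ways via conditional probability.
p_and_via_female = p_female * p_blonde_given_female
p_and_via_blonde = p_blonde * p_female_given_blonde

# What you would get by (wrongly, for these counts) assuming independence.
p_and_if_independent = p_female * p_blonde

# P(female OR blonde) by inclusion-exclusion.
p_or = p_female + p_blonde - p_and_via_female

print(p_and_via_female, p_and_via_blonde, p_and_if_independent, p_or)
```

With these made-up counts the two conditional routes agree, while the naive product of the marginals does not, because the traits are not independent in this table.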

(Corrections in bold)

To re-use my example from last time:


[Diagram: a 10 × 10 grid of colored circles; the colors were not preserved]

If female is blue and blonde is orange, the probability of being female and blonde is the intersection (green), the probability of being female OR blonde is all of the colored area (all three colors.)

What probability rule answers the question: What is the probability of being female AND being blonde? Multiplication Rule (does not require independence of the events, but you may need some conditional probabilities).

What is the probability of being female OR being blonde? Addition Rule

What is the probability of dyeing your hair blonde 100 or more times during one’s lifetime among females? Depends on the assumptions, perhaps binomial
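If one did go with a binomial model, the calculation might look roughly like the Python sketch below. It assumes a fixed number of independent "opportunities" to dye, each with the same probability, which is a strong simplification. The function name `binomial_tail` and the values of `n_opportunities` and `p_dye` are hypothetical and chosen only to show the mechanics.

```python
from math import comb


def binomial_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p): at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs, not taken from the thread:
# 600 opportunities over a lifetime, each with a 0.15 chance of dyeing blonde.
n_opportunities = 600
p_dye = 0.15

p_at_least_100 = binomial_tail(n_opportunities, p_dye, 100)
print(p_at_least_100)
```

Different choices of the two inputs give wildly different answers, which is the sense in which the result depends on the assumptions.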

Hm… given how easy it is to misinterpret them (as saying P(A AND B) = P(A) * P(B) and P(A OR B) = P(A) + P(B)), the terms “multiplication rule” and “addition rule” are perhaps dangerously ambiguous.

When replying to this thread, it suddenly occurred to me that my rules 0, 2, and 3 above can be collapsed into a single fairly intuitive one. Thus, I present the following slightly revised list of rules:

A: Probabilities are nonnegative
B: Given a sequence of exhaustive, mutually exclusive outcomes (i.e., one and only one of them will be true), the sum of their probabilities is 1
C: We define the conditional probability P(B | A) so that P(B | A) * P(A) = P(A AND B)

[Rule 2 above follows immediately from Rule B by considering a sequence containing just one guaranteed outcome. We can derive P(A OR NOT A) = 1 directly from Rule B as well. Rule 3 above then follows from the derived negation rule and Rule B by considering the sequence NOT (A1 OR A2 OR A3 OR …), A1, A2, A3, … . And, finally, rule 0 follows from rule B and our derived negation rule by considering the sequence X, NOT Y, where X and Y are logically equivalent: exactly one of the two is true, so P(X) = 1 - P(NOT Y) = P(Y).]
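As a sanity check on the revised list, here is a minimal Python sketch that tests Rules A and B on a small made-up distribution and defines conditional probability the way Rule C does. The outcome names and probabilities are hypothetical.

```python
from fractions import Fraction

# Hypothetical distribution over four exhaustive, mutually exclusive outcomes.
dist = {
    "w": Fraction(1, 2),
    "x": Fraction(1, 4),
    "y": Fraction(1, 8),
    "z": Fraction(1, 8),
}
outcomes = set(dist)


def prob(event):
    """Probability of an event, given as a set of outcomes."""
    return sum((dist[o] for o in event), Fraction(0))


def cond(b, a):
    """P(B | A), defined so that P(B | A) * P(A) = P(A AND B); needs P(A) > 0."""
    return prob(a & b) / prob(a)


# Rule A: probabilities are nonnegative.
rule_a_holds = all(p >= 0 for p in dist.values())
# Rule B: the exhaustive, mutually exclusive outcomes sum to 1.
rule_b_holds = sum(dist.values()) == 1

A = {"w", "x"}
B = {"x", "y"}

# Derived negation rule: P(NOT A) = 1 - P(A).
negation_holds = prob(outcomes - A) == 1 - prob(A)
# Rule C: the defining identity for conditional probability.
rule_c_holds = cond(B, A) * prob(A) == prob(A & B)

print(rule_a_holds, rule_b_holds, negation_holds, rule_c_holds)
```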