# Are there any statisticians in the house?

In the “Comments on Cecil’s Columns” forum, we are discussing the “boy or girl” problem. I am getting the idea that many, maybe most, people who think about the right way to do statistics and probability would say that the usual way of phrasing the problem is significantly ambiguous.

The phrasing I’m talking about boils down to:

“There is a family. It has two children. At least one is a boy. What is the probability that the family has a girl?”

That is supposed to assume, of course, that boys and girls generally occur with equal probability, and that the probability attached to the gender of a child is independent of the probability attached to the gender of its siblings.

Anyway, if there are people who really do know what they are talking about in regards to the question of whether this question is ambiguous, could you answer here or in the thread from the “Cecil’s Columns” forums? We’re kind of just going around in circles.

I am maintaining it’s not ambiguous; another fellow (I mean to use the term in a gender-neutral fashion) is maintaining (along with, for example, Martin Gardner and the editors of Scientific American in 1959) that it is ambiguous. He/they are saying the problem leaves open the question of how it was discovered that at least one child is a boy, and the answer to this question changes the answer to the problem, meaning the problem is ambiguous. I am saying that, while answering that question would change the answer to the problem, nevertheless, as stated the problem implies a certain so-to-speak “default” method of discovery which allows for only a single answer.

-FrL-

It works like this: Assume the odds of having a boy or girl are 50/50. If that’s the case, then the possible combinations of siblings, with equal probability for each, are:

BB
BG
GG
GB
So, since we know there’s a boy, that leaves the following combinations:

BB
BG
GB

2 of the 3 have a girl, so given that you know at least one sibling is a boy, 2 out of 3 times the other sibling will be a girl.
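For anyone who wants to check the counting mechanically, here is a minimal Python sketch of the enumeration above. It assumes exactly what the post assumes (four equally likely birth-order combinations); the variable names are just illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely (elder, younger) combinations: BB, BG, GB, GG.
families = list(product("BG", repeat=2))

# Keep only the families with at least one boy.
with_boy = [f for f in families if "B" in f]

# Among those, keep the families that also include a girl.
with_girl = [f for f in with_boy if "G" in f]

p_girl = Fraction(len(with_girl), len(with_boy))
print(p_girl)  # 2/3
```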

I would say that this phrasing is not ambiguous. The answer in this case is 2/3. However, I can envision more lax phrasings which might be ambiguous.

“There is a family. It has two children. [del]At least[/del] [O]ne is a boy. What is the probability that the family has a girl?”

This could be interpreted as follows:
A family has two children X and Y. X is a boy. What is the probability that Y is a girl?

The answer in this case is 1/2.

In contrast, the former statement would be interpreted as:
A family has two children X and Y. X or Y is a boy. What is the probability that the other is a girl?

Exactly as I, too, think.

But the claim on the table is that the question, as I posed it, could be interpreted as you do, or alternatively, in a different way which makes the answer 1/2. I am arguing that there is a single natural interpretation–the one you give. But in the thread in question, it is being argued by someone else that statisticians and others who work with probabilities consider the wording of the problem to be ambiguous.

He’s produced some citations in support of this claim, but it’s not clear yet that any of them are authoritative in any final sense.

I am hoping there are some experts or semi-experts here who might be able to shed light on the issue: is it generally thought by people who make a living at stuff like this that a question like the one under discussion is ambiguous?

-FrL-

Here’s the argument that the question as I phrase it in my OP is ambiguous, near as I can make it out anyway:

When we encounter the sentence “at least one is a boy,” we don’t know how it was determined that at least one is a boy. If it was determined by an examination of the pair, with a report being made that “at least one is a boy” just exactly in case, in fact, at least one is a boy, then the answer is 2/3. But if it was determined by, for example, an examination specifically of the elder child, with a report of “at least one is a boy” resulting from the elder child’s being a boy and with a report of “at least one is a girl” resulting from the elder child’s being a girl, then the answer is 1/2.

Since we don’t know how the determination was made, and since different ways of making the determination lead to different answers, it follows that the problem is significantly ambiguous.
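To make the two methods of determination concrete, here is a rough Python sketch that computes the exact conditional probability under each one. The function names (`p_girl_given_report`, `whole_pair`, `elder_only`) are made up for illustration, and the model assumes the four birth-order combinations are equally likely and that the reporter follows the stated rule without exception.

```python
from fractions import Fraction
from itertools import product

# (elder, younger) combinations, each with prior probability 1/4.
families = list(product("BG", repeat=2))

def p_girl_given_report(says_boy):
    """P(family has a girl | the report was 'at least one is a boy').

    says_boy(family) is the probability that the reporter says
    'at least one is a boy' when looking at that family.
    """
    weight = {f: Fraction(1, 4) * says_boy(f) for f in families}
    total = sum(weight.values())
    girl = sum(w for f, w in weight.items() if "G" in f)
    return girl / total

# Method 1: examine the pair; report "boy" exactly when at least one is a boy.
def whole_pair(family):
    return Fraction(1) if "B" in family else Fraction(0)

# Method 2: examine only the elder child; report "boy" exactly when the elder is a boy.
def elder_only(family):
    return Fraction(1) if family[0] == "B" else Fraction(0)

print(p_girl_given_report(whole_pair))  # 2/3
print(p_girl_given_report(elder_only))  # 1/2
```

The only thing that differs between the two runs is the reporting rule, which is the whole point of the ambiguity argument.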

My response to this argument is to say that by saying just “at least one is a boy,” the puzzle naturally implies that it was the first method of determination which was used. In the absence of reasons to suspect otherwise, we should take reports (especially in the context of a puzzle like this) to be as informative as (the speaker thinks) they need to be for the listener’s purpose. In this case, the purpose is the solution of the puzzle. Now, if the second method of determination had been used, then a report that, for example, “The eldest child is a boy” would have been more informative than the report that “at least one child is a boy” and so we should have expected that that is the report that would have been given. Since that is not the report that was given, we should conclude that the speaker doesn’t know anything but that the family is such that it does not include two girls.

I know that’s kind of a complicated paragraph, but I do think it’s right.

Anyway, any help from people knowledgeable about probability problems like this would be appreciated. (Or if there’s such a thing as “expertise” in the conventions of communication which I’ve just discussed in a prior paragraph, and if someone here should by an amazing coincidence have training or expertise in that area, then I’d like to hear from them as well.)

-FrL-

Let’s start by saying I completely understand this problem, and am well aware that the typical answer (which I support) is 2/3 based on the most common reading of the problem.

The ambiguity is as follows:
[ul]
[li]Did the poser specifically look for a two-child family with at least one boy, e.g. could he/she have first found an all-girl family and skipped it in his/her search for a family with a boy? In this case we assume the poser could only have stated “one of the children is a boy”.[/li]
[li]Or, did the poser take the first family he/she found and randomly tell you the gender of one of the children? In this case, the poser could have stated “one of the children is a girl” or “one of the children is a boy”.[/li]
[/ul]

In the first scenario, the answer is 2/3. As has been repeatedly explained in several threads, the reason for this is that one of the family possibilities (girl-girl) is completely eliminated before the question is posed. Of the other three possibilities (boy-girl, girl-boy, and boy-boy, in birth order), it is 2:1 in favor of a mix, so it’s 2:1 in favor of guessing “girl”.

In the second scenario, the answer is 1/2. The poser has a 50-50 chance of choosing a family with same-sex children as mixed sex, and this probability is passed directly on to the person charged with guessing (again, we assume the poser could have just as easily said “one of the children is a girl” if circumstances warranted it). The first scenario tilted the odds because the poser eliminated one family type in advance.
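If the argument doesn't convince, a quick simulation might. The sketch below is only one way to model the two scenarios: in the first, all-girl families are skipped; in the second, the poser reveals a child chosen at random and we keep only the trials where the announcement happened to be “boy”. The function name, trial count, and seed are arbitrary choices, not anything from the original problem.

```python
import random

def simulate(trials=200_000, seed=0):
    """Estimate P(family has a girl | poser said 'one is a boy') in both scenarios."""
    rng = random.Random(seed)
    s1_total = s1_girl = 0
    s2_total = s2_girl = 0
    for _ in range(trials):
        family = [rng.choice("BG"), rng.choice("BG")]
        has_girl = "G" in family

        # Scenario 1: the poser skips all-girl families, so every family
        # that is kept leads to the statement "one of the children is a boy".
        if "B" in family:
            s1_total += 1
            s1_girl += has_girl

        # Scenario 2: the poser reveals the gender of a randomly chosen child.
        # Only trials where that child is a boy match the statement we heard.
        if rng.choice(family) == "B":
            s2_total += 1
            s2_girl += has_girl

    return s1_girl / s1_total, s2_girl / s2_total

scenario_1, scenario_2 = simulate()
print(f"scenario 1: {scenario_1:.3f}")  # should land near 2/3
print(f"scenario 2: {scenario_2:.3f}")  # should land near 1/2
```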

BTW this ambiguity is included in the original “Monty Hall” problem with the doors. In this problem, we always assume Monty knows where the prize is and shows you the door where the prize is not. Remove this restriction (Monty can open either unopened door) and the odds float back to 50-50 on switching.
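The Monty Hall comparison can be checked the same way. This is a sketch under stated assumptions: in the “ignorant Monty” variant, he opens one of the two unchosen doors at random, and trials where he accidentally reveals the prize are thrown out, since the puzzle only describes the case where a non-prize door was shown. `switch_win_rate` is a hypothetical helper name.

```python
import random

def switch_win_rate(monty_knows, trials=200_000, seed=0):
    """Fraction of kept trials in which switching doors wins the prize."""
    rng = random.Random(seed)
    wins = kept = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        others = [d for d in range(3) if d != pick]

        if monty_knows:
            # Monty deliberately opens an unchosen door that hides no prize.
            opened = rng.choice([d for d in others if d != prize])
        else:
            # Monty opens either unchosen door at random.
            opened = rng.choice(others)
            if opened == prize:
                continue  # prize revealed; this trial does not match the puzzle

        kept += 1
        switched = next(d for d in others if d != opened)
        wins += (switched == prize)
    return wins / kept

print(f"Monty knows:   {switch_win_rate(True):.3f}")   # should land near 2/3
print(f"Monty guesses: {switch_win_rate(False):.3f}")  # should land near 1/2
```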

Which scenario is implied by the problem? My personal answer is the first, though I recognize the ambiguity enough to agree the problem is poorly phrased.

Sometimes, questions involving probability make more sense when restated in terms of frequency.

Like, “Of all two-child families that have at least one boy, what fraction (or proportion, or percentage) of these have a girl?”

I think that removes any ambiguity that might have been in the original question.