Monty Hall Question

Here’s another explanation.

Here are all the possibilities as to gender, birth order, and our being told about a single sibling’s gender. Bold means we were told about that sibling’s gender.

**G**G 1/8
G**G** 1/8
**G**B 1/8
G**B** 1/8
**B**G 1/8
B**G** 1/8
**B**B 1/8
B**B** 1/8

Now, as the problem has been set up, we would never be told about a boy’s gender: the problem states from the outset that we are told about a girl’s gender. So we can remove all the bolded B’s. Furthermore, when we do so, the probability associated with that possibility gets absorbed into the complementary situation in which the genders are the same but we are told about the other sibling’s gender instead. I’m having a hard time figuring out how to explain why. If you don’t see why, please respond to this post and I’ll try again to explain it.

So anyway, we get:

**G**G 1/8
G**G** 1/8
**G**B 1/4
B**G** 1/4

Of course this doesn’t add up to 1. This is because in making my eliminations, I eliminated both possibilities in which both siblings were boys, since both of these were ones in which we are told about a boy’s gender. Removing both left no complementary situation to “absorb” their probabilities.

But that’s okay. What we’ve arrived at is a correct list of the proportions between the probabilities of all the possible events. The list sums to 3/4, so multiplying each number by 4/3 gives us a list adding up to 1:

**G**G 1/6
G**G** 1/6
**G**B 1/3
B**G** 1/3
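For anyone who would rather check the bookkeeping than follow the words, here is a minimal Python sketch of the table. It assumes the rule described in this post (the informant only ever reports a girl, and a case that would have reported a boy hands its 1/8 to the same family's other case). The names `cases`, `absorbed`, and `posterior` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Eight equally likely cases: (first child, second child, index of the sibling we are told about).
cases = [(a, b, i) for a, b in product("GB", repeat=2) for i in (0, 1)]

# Assumption: we are only ever told about a girl. A case that would report a boy
# passes its 1/8 to the same family's other case, if that case reports a girl.
absorbed = {}
for a, b, i in cases:
    family = (a, b)
    if family[i] == "G":
        key = (a, b, i)
    elif family[1 - i] == "G":
        key = (a, b, 1 - i)
    else:
        continue  # both boys: nothing can absorb this weight, so it is dropped
    absorbed[key] = absorbed.get(key, Fraction(0)) + Fraction(1, 8)

total = sum(absorbed.values())                      # 3/4, the list that "doesn't add up to 1"
posterior = {k: w / total for k, w in absorbed.items()}
p_boy = sum(w for (a, b, _), w in posterior.items() if "B" in (a, b))

for (a, b, i), w in posterior.items():
    print(a + b, "told about child", i + 1, w)
print("chance of a son:", p_boy)                    # 2/3
```

Under those assumptions the normalised weights come out as 1/6, 1/6, 1/3, 1/3, and the two mixed cases together give 2/3.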

Still not a clear explanation, I guess. This is hard!

-FrL-

Here is your error.

Your outcomes (2) and (3) are the same outcome.

If someone comes and says “I had a girl, and then had another girl,” which group would you put them in - 2 or 3?

I’ll try to explain again…

The error here is that you can’t eliminate #4 as untrue without also removing its paired outcome. Outcome #4 is paired with #3 if Child A is male, and with #1 if Child B is male. Given the statement “family has a daughter” (and “a daughter” doesn’t make this a trick question), we can make either every Child A or every Child B a girl.

It could be implied in Cecil’s “possible gender combinations” that there are four possible children…but if that were the case the possible outcomes and their frequencies would be:

G1 and G2 16.67%
G1 and B1 16.67%
G1 and B2 16.67%
G2 and B1 16.67%
G2 and B2 16.67%
B1 and B2 16.67%

We know that this is not true because there can only be four outcomes (two outcomes of Child A TIMES two outcomes Child B). In fact, if you lumped them together, G2 and B2 would only exist if paired with the corresponding same sex sibling. So there are two separate events producing the results for Child A and Child B. Both events are independent of each other, with each having only two outcomes.

With two columns we can show the outcome that we expected.

Column A…/… Column B
Girl…/… Girl
Or …/… Or
Boy…/… Boy

Producing

If A is a girl, B is a girl.
If A is a girl, B is a boy.
If A is a boy, B is a girl.
If A is a boy, B is a boy.

This is fine as long as everything is random… but we are told that one is guaranteed to be a girl. Cecil stated that we should remove the possibility of “BB” and then the remaining three results are equally possible. But what is given is that there is at least one daughter, and that….and only that… is what makes two boys impossible.

As I stated in my earlier post, to solve with a “given daughter” you would have

Column A…/…Column B
…/… Girl
Given Girl…/… Or
…/… Boy

Producing
A is a girl, B is a girl.
A is a girl, B is a boy.

OR

Column A…/…Column B
Girl…/…
Or…/… Given Girl
Boy…/…

Producing
A is a girl, B is a girl.
A is a boy, B is a girl.

Both scenarios give a boy a 50/50 chance.

Cecil’s approach continues the original formula, but one of two things must happen.
1) If A is a boy, B MUST be a girl. (The reverse is also true: if B is a boy, A MUST be a girl.)
2) If two boys are received we must disregard and try again.

In both scenarios we are asked to change the results. In scenario #1 we are allowing the results of one child to affect the other, which is not possible. In scenario #2 we disregard the results, but these results are impossible in the first place if we are given “a daughter.”

Cecil removing “Boy and Boy” is true because one of the boys can’t exist, but what Cecil didn’t account for is that this boy doesn’t exist with a girl either. If Child A can’t be a boy then it rules out #3 and #4 of Cecil’s list. If Child B can’t be a boy then it rules out #1 and #4. :)

You’re misunderstanding the problem. There is no pairing because there is no specificity: we’re not talking about Child A, and we’re not talking about Child B. The “trick” to this problem, if there is one, is that the following two problems are not equivalent, and have different probabilities:

a) There is a family with two children. You have been told this family has a daughter. What are the odds they also have a son?

b) There is a family with two children. You have been told this family’s tallest child is a daughter. What are the odds they also have a son?

The second problem contains more information, specifying which child is “a” daughter, which changes the probabilities. The arguments you’re making apply to the second problem statement, not the first.

To make it even more clear how the two are different problems, the first can be restated this way:

a) There is a family with two children. You have been told either this family’s tallest child or this family’s shortest child is a daughter. What are the odds they also have a son?

(I here assume the two are not identical in size, but a formulation allowing for that possibility is easy to construct.)

Juxtapose that with

b) There is a family with two children. You have been told this family’s tallest child is a daughter. What are the odds they also have a son?

and you can see that a) and b) are clearly different problems, in which different information has been given, and regarding which different sets of calculations will need to be performed.

-FrL-

I still prefer to think of the possible populations. Let’s say you have 100 families with two kids, in a typical arrangement of Child A (older) and B (younger):

Group 1, 25 families: A=boy, B=boy
Group 2, 25 families: A=boy, B=girl
Group 3, 25 families: A=girl, B=boy
Group 4, 25 families: A=girl, B=girl

The information that you’re given, “this family has a daughter,” has ruled out Group 1, but has done nothing else. What is the probability, based on counting up the numbers, that the other is a boy? Obviously it’s 50/75, or 2/3.
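The head count above is easy to reproduce. This is just a sketch of the same arithmetic, using the 25-families-per-group figures from this post; the dictionary and variable names are my own.

```python
from fractions import Fraction

# 100 two-child families, keyed as (Child A, Child B), 25 in each group.
families = {("B", "B"): 25, ("B", "G"): 25, ("G", "B"): 25, ("G", "G"): 25}

# "This family has a daughter" rules out Group 1 and nothing else.
with_daughter = {kids: n for kids, n in families.items() if "G" in kids}
also_a_son = sum(n for kids, n in with_daughter.items() if "B" in kids)

chance = Fraction(also_a_son, sum(with_daughter.values()))
print(also_a_son, "of", sum(with_daughter.values()), "=", chance)  # 50 of 75 = 2/3
```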

For ease of explaining I will raise it to 1000 families.
Group 1, 250 boys
Group 2, 125 boys and 125 girls
Group 3, 125 girls and 125 boys
Group 4, 250 girls

If you gather these children together and ask the boys to leave, you will be left with 500 girls. If you then ask the girls if they have any brothers:
250 will say YES
250 will say NO

50/50
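To make the two ways of counting concrete, here is a small sketch that tallies the same population once per girl and once per family. It assumes 125 families in each group, which is what the 250/125/125/250 child counts above work out to; `PER_GROUP` and the other names are illustrative only.

```python
from fractions import Fraction

PER_GROUP = 125  # assumption: 125 families per group, matching the child counts above
GROUPS = [("B", "B"), ("B", "G"), ("G", "B"), ("G", "G")]

# Count per girl: every girl gets asked whether she has a brother.
girls = 0
girls_with_brother = 0
for family in GROUPS:
    for i, child in enumerate(family):
        if child == "G":
            girls += PER_GROUP
            if family[1 - i] == "B":
                girls_with_brother += PER_GROUP
per_girl = Fraction(girls_with_brother, girls)

# Count per family: every family with at least one daughter gets asked once.
qualifying = [f for f in GROUPS if "G" in f]
families_with_son = sum(PER_GROUP for f in qualifying if "B" in f)
per_family = Fraction(families_with_son, PER_GROUP * len(qualifying))

print("per girl:  ", girls_with_brother, "of", girls, "=", per_girl)
print("per family:", families_with_son, "of", PER_GROUP * len(qualifying), "=", per_family)
```

The per-girl count gives 1/2 and the per-family count gives 2/3, so the disagreement is over which of the two the question is asking for.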

Yes, twice as many households will have “a boy and a girl” compared to having “two girls”, but is that what we are being asked? Are we not being asked what is the sex of the child?

Nope.

By doing the above, you are unfairly giving mixed families one “vote” (so to speak) and the all-girl families two.

John, are all the girls equally able to be the “given” daughter? There is nothing in the original question that said that they weren’t. The fact that the family with two girls has two daughters does, in fact, give this family twice the probability. Without being given more information about how this family and this daughter are chosen, it would be wrong to assume that the family is random but the daughters are not.

I’m not very good at explaining my arguments, so I did a Google search and found this description that can explain it better than I can. The part I am referring to is near the bottom of the page.

(Start at “Why are these two probabilities different?”)

The original question is: “There is a family with two children. You have been told this family has a daughter. What are the odds they also have a son, assuming the biological odds of having a male or female child are equal?”

If the family has been chosen by first looking at the children, that’s stacking the deck.

Pretty much, yes!

No, in fact we’re not. We are being asked about sex, but not about the sex of any particular child. To put it in your terms: There is no such thing as the “given” girl. When we are told “one of the children is a girl,” we are not being told, about any particular child, that it is a girl. Rather, we are being told, about the pair, that it includes one girl.

We are being told that either the one is the girl, or the other is the girl. To be told this is different than being told about any particular child that it is a girl.

Notice the following are two different problems:

A. Either child one is a girl, or child two is a girl. Now, what’s the chance the other is a boy?

B. Child one is a girl. Now, what’s the chance the other is a boy?

These are two different problems with two different answers. You are trying to answer a version of problem B. But the problem given in Cecil’s article is a version of problem A.
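If it helps, the two problems can be enumerated directly. This is only a sketch, assuming the four birth orders are equally likely; `chance_of_a_boy` is a made-up helper name.

```python
from fractions import Fraction
from itertools import product

PAIRS = list(product("GB", repeat=2))  # (child one, child two), four equally likely pairs

def chance_of_a_boy(statement):
    """Among the pairs for which the statement is true, the share that include a boy."""
    consistent = [p for p in PAIRS if statement(p)]
    with_boy = [p for p in consistent if "B" in p]
    return Fraction(len(with_boy), len(consistent))

problem_a = chance_of_a_boy(lambda p: p[0] == "G" or p[1] == "G")  # either child is a girl
problem_b = chance_of_a_boy(lambda p: p[0] == "G")                 # child one is a girl

print("Problem A:", problem_a)  # 2/3
print("Problem B:", problem_b)  # 1/2
```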

-FrL-

When you’re asked the probability, it’s asking if you repeated this situation many many times, what proportion of them could you expect to come out one way vs. the other? The situation that we’re given is that you are like a census taker, about to go up to a house which someone has already told you has two kids, and at least one daughter. It’s the number of households that matters in the way this question is asked. In your example number sets, there are 750 families that qualify, and 2/3 of those also have a boy.

There is no “given” daughter here, and that’s what trips many people up. All you know is that the number of daughters is greater than zero.

For this to be true two things need to be assumed. First, the family is chosen randomly. If we survey parents as they drop their daughters off at the 8th grade dance, we will find that the other child is as likely to be a girl as a boy.

  1. Girl has a sister
  2. Girl has a brother
  3. Boy has a sister
  4. Boy has a brother

Why? If we only ask the girl’s parents we are limiting ourselves to possibility #1 or #2. Because of this limitation, the subset is not a truly random sample.

CurtC’s example of the census-taker uses a random family. Because it’s a random family, if you ask if they have a daughter they are 75% likely to say, “Yes.” If they do, they are then 66.67% likely to have a boy also. They would also say they had a boy 75% of the time if asked. Either way, if we use this example then, yes, 66.67% does answer the question of the other child. But what is missing here is that this is a random family that could (25% of the time) say they don’t have any daughters. If that is the case then we automatically know what the make-up of the family is. If we disregard this family and find one with girls, we don’t have a random sample anymore. While the 66.67% would still be the result, you would be modifying the problem to fit your answer. I would say that academic America wouldn’t want your services, but you would be a perfect fit for the U.S. Census Bureau.

The second, equally important assumption that must be met to make Cecil’s statement truthful is that the information about the daughter MUST be generic and non-specific. The source of the information must know the makeup of both children. This is important because if he only knows of one girl then… that girl is specific, and the question can be answered relating to her.
Child A is the girl with a sister.
Child B is the girl with a sister.
Child A is a boy with this sister.
Child B is a boy with this sister.

If the man gained his knowledge of the daughter from:

  1. Seeing the father on the phone with his daughter
  2. Seeing a picture of a daughter on the father’s desk
  3. He saw the father buy a birthday present for his daughter
  4. Someone told him they saw the father with a daughter.

Any of these events would apply to any daughter, not to only one child in each family. Therefore, the daughter he speaks of isn’t a girl in general but a specific girl. All these examples are “relative” to the daughter, not the man. The possible combinations allow this daughter to be either child, leaving a 50/50 chance that the other child is a boy or a girl. But… if our source has knowledge of both children, then unless he is being overly cryptic, none of this would apply, because at least one daughter appears in three possibilities and he wouldn’t need to be specific. And if he had knowledge of all the children, then the probability of the other child being a boy is 66.67%.

If I meet a family that has two kids, with a daughter being present, then two boys were never even possible. The possibilities are then limited to what child combinations can be made with that daughter, not with the family. The daughter is then random and the possibilities pivot around her. The same criteria you use to separate GB from BG are the same criteria you would use to separate two possible GGs. But if this same family told us that they had one daughter, then because they know both children and they are not being specific, the other is likely to be a boy.

There are several versions of this question and each answer is based on how the problem is presented. The problem that Cecil described didn’t include a reference to a random family. The statement about a daughter also didn’t say that the family, or someone who knew the make-up of the family, told us that. Based on the problem we have no reason to believe the source didn’t know just of the daughter and nothing more. Without assuming that the source knew both children, the child known was specified. If our source knew only of a girl then he is equally likely to know any of the girls, including the two in the same family. If the child is specified then it is randomly found in any of the four places that girls reside in the scenario.

Corrected Question:
You meet a family. In conversation you find out that they have two children of which at least one is a daughter. What are the odds they also have a son, assuming the biological odds of having a male or female child are equal? (Answer: 2/3.)

This is true because the source here is the family and the daughter need not be fixed. The daughter is generic in this problem. This is not true in the original problem because the source knew of a daughter, not necessarily both children.

I can not see what the difference is between “one of them is a” and “at least one of them is a.” I can see how one or the other might be more appropriate in different conversational contexts, but I can not see how the context of the question is such a context.

Just a query for information which I think might be relevant to what’s going on here: Are you a native English speaker?

-FrL-

Right! And both of these assumptions are by far the most natural ones to make given the question being asked! (And especially given the nature of “probability riddles” in general.)

In communicating, we (at least English speakers, probably all human beings) try to portray ourselves as being as informative as possible under the circumstances. For this reason, when someone says “one of them is a…,” we naturally assume, since they are being as informative as possible, that all they know is that one of the children is a girl, and that they don’t know which one. If they knew which one, they would have told us.

-FrL-

Yes, Frylock, English is my native language, but surprisingly my college English teachers had their doubts. The difference between Cecil’s original question and the corrected one is the source of the information. As I tried to point out in my post, if the source only knows of one daughter that daughter is specific. There could be some highly unusual event that allows him to know of a generic daughter, but that is unlikely and not given. In the corrected version, it is the family that tells him of a daughter. The family would know the sex of both children automatically.

For a better illustration

Since we are given a family with two kids, one being a daughter, it should be easy enough to make an experiment that fits the problem. Get the four Aces from a deck of cards. Put one black and one red Ace in one pile and the same in the other. Have a friend draw one card from each pile. Possible combinations:
Red Ace, Red Ace
Red Ace, Black Ace
Black Ace, Red Ace
Black Ace, Black Ace

Experiment #1 (representing someone that knows one daughter):
Have him look at one card and tell you what color it is. Your chance to guess the other is 50/50. No matter which card he tells you about, the other card has the same chance of being black or red. It wouldn’t matter to you which card he looked at; the odds would be the same.
Experiment #2 (representing someone that knows both children):
Have him look at both cards. Then have him tell you the color of one card. The chance of guessing the opposite color is 66.67%.

The difference between the two experiments is when a pair comes up. In the first experiment, at all times, the friend is talking about one card only. In the second experiment the friend is talking about both cards 50% of the time (for Two Reds or Two Blacks). This changes the whole dynamic of the problem.
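Both experiments are small enough to enumerate exactly. The sketch below is one reading of them, not the only one: in Experiment #1 the friend looks at one card picked at random and reports it, and in Experiment #2 I assume he says “red” whenever at least one of the two cards is red. That second rule is my assumption; the post does not say how he picks which color to announce. Function names are invented for illustration.

```python
from fractions import Fraction
from itertools import product

DRAWS = list(product(["red", "black"], repeat=2))  # one card from each pile, 4 equally likely draws

def experiment_one():
    """Friend looks at ONE card chosen at random and reports its color."""
    told_red = Fraction(0)
    other_is_black = Fraction(0)
    for draw in DRAWS:
        for seen in (0, 1):
            weight = Fraction(1, 8)  # 1/4 for the draw times 1/2 for which card he looks at
            if draw[seen] == "red":
                told_red += weight
                if draw[1 - seen] == "black":
                    other_is_black += weight
    return other_is_black / told_red

def experiment_two():
    """Friend looks at BOTH cards. Assumption: he says "red" whenever at least one is red."""
    told_red = Fraction(0)
    other_is_black = Fraction(0)
    for draw in DRAWS:
        weight = Fraction(1, 4)
        if "red" in draw:
            told_red += weight
            if "black" in draw:
                other_is_black += weight
    return other_is_black / told_red

print("Experiment #1:", experiment_one())  # 1/2
print("Experiment #2:", experiment_two())  # 2/3
```

If the friend in Experiment #2 instead announces the color of a randomly chosen one of the two cards he saw, the arithmetic is the same as in `experiment_one`, so the 2/3 depends on the announcing rule and not only on how many cards he has seen.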

Anyway, that, I believe, is the main problem with Cecil’s question. The source MUST know both children to get a 66.67% result. As I stated, the other versions of this question floating around the web always speak of a parent being the source, not a third person. The devil is in the detail. :)

This is true assuming you told him which card to look at, or that he told you which card he is looking at. Otherwise this is not true. The problem of the OP, as worded, is not analogous to this experiment on either of the two assumptions I just named.

Much of the above sounds just about right, except when you say that in the problem as originally worded, it’s not given that the child is unspecified. It is given that “the daughter” is not specified. It is given, because “the daughter” isn’t specified in the problem! Were we supposed to think that “the daughter” had been specified, then “the daughter” would have been specified in the problem.

You’re right that someone somewhere along the chain of reporting must have known the gender of both children, in order for the report that “one of them is a daughter” to come into existence. But it’s not true that the very source spoken of in the problem must be this person. The person reporting the details in the problem could well be a “third person” not personally acquainted with the family, who only knows the bare fact that “one of the children is a girl.”

And besides, I can’t see the relevance of the fact in the first place. It doesn’t matter what the reporter of the information in the problem knows because what they know is not given as part of the problem. All that is given is what they say and what they say is “one of the children is a girl.” This creates knowledge in you the recipient of the information which is not specific to any one child but only to the pair of children in question. This is true whether the person reporting the fact knows only information about the pair, or knows specific information about which child is the girl. This is the case because that reporter hasn’t said which child is the girl, even if she does know.

Anyway, just to be clear, are we agreed that the following is true?:

I think you and I agree to the above. Where we disagree is on whether the OP should be interpreted as giving a situation equivalent to the case above where you are told “I don’t know,” or whether the OP is ambiguous between the two cases.

-FrL-

This is true regardless of what card he picks and regardless if I know which card he picks.
The card choice is irrelevant just as long as he tells you the results of just one card.

An even better illustration

Same problem but my friend tells me he has a red card.
If he only saw one card then it wouldn’t matter to me which pile he drew from; the other pile has one red and one black card. The card is specific because it can only come from one pile, no matter which pile. So if the combinations were before he drew:

Pile 1…………………Pile 2
Red……………………Red……………possibility #1
Red……………………Black………….possibility #2
Black………………….Red……………possibility #3
Black …………………Black………….possibility #4

Knowing that he has a red card, either #1 and #2 are possible OR #1 and #3 are possible…NOT both. I don’t even need to guess which pile because it will all be the same. In either case the other card can be either red or black, so your chance of guessing the other card is 50/50.

If your friend saw both cards and tells you one card is red, it becomes tricky. While it’s still either possibilities #1 and #2 OR #1 and #3, because of #2 or #3, his response rate differs. In the single card draw, 50% of the time for #2 or #3 he would have said he had a black card. By seeing both cards he will automatically be able to state (if he has #2 or #3) “I have a red card,” or “I have a black card.” This doubles how often these possibilities come up.

Now the original problem stated, “You have been told this family has a daughter.” From the statement we can conclude that it wasn’t someone in the family that told us. We can also conclude that the source of this information would have told us about the other child if he had known. From the statement, there is no reason to believe this source didn’t have the same information that we have been given. If the source knew of only one child, for whatever reason, that child is specific.

As for your example, Frylock, yes, we are in agreement; I just don’t think it requires the distinction to be so obviously specific. Any distinction, no matter how small, is enough; this includes someone who has met just one child.

Okay gang, I think I had an epiphany. I must eat crow, because I assumed that the 2/3 was right when both children are known to the source. I was wrong along with Cecil and Marilyn vos Savant.

This is the normal distribution:

  1. Girl, Girl……… 25%
  2. Girl, Boy……. 25%
  3. Boy, Girl……. 25%
  4. Boy, Boy…….25%

Or

50% paired
50% split

No matter how we guess, we don’t affect these results.

If we follow Cecil’s 66.67% rule this is what happens:

We find four families and each family is one of the possible combinations.

Family #1 has two girls:
What is given to me is that they have a girl, so I choose the opposite, a boy. WRONG.

Family #2 has a girl and a boy:
Either child can be given to me, and choosing the opposite will give me a CORRECT answer.

Family # 3 has a boy and a girl:
Like in second family my choice is CORRECT.

Family #4 has two boys:
Like the first family, this time I am given a boy and I choose the opposite and am WRONG.

These cover all four possible combinations, and using the 2/3 rule I got only 50% right. These four families are equally likely to come up. I could go on all day and the outcome will be the same. According to Cecil I should use distribution #1, #2, and #3 with a girl and #2, #3, and #4 with a boy. Now if I were accurate 2/3 of the time by choosing the opposite, then the real distribution would be:

  1. Girl, Girl……16%
  2. Girl, Boy……33%
  3. Boy, Girl……33%
  4. Boy, Boy……16%.

But that is wrong, because as you see in the four families, I used the rule and was only right half the time. Why is this happening? Should I not be getting 66% instead of 50%? This happens because in a random probability with two even events and two even factors, half the time a pair will appear. This contradicts Cecil and vos Savant’s theory. Cecil says that if we are given a girl, we should then remove family possibility #4, and then the remaining possibilities will be equally likely.

  1. Girl, Girl…….33%
  2. Girl, Boy…….33%
  3. Boy, Girl…….33%

Now two of the remaining will include a boy, so the likelihood of a boy is 2/3.
But in fact this is an error. The distribution after eliminating family possibility #4 is:

  1. Girl, Girl……50%
  2. Girl, Boy……25%
  3. Boy, Girl……25%

This is true because, again, in random probability with two even events and two even factors, half the time a pair will appear. You can work this out in a survey, with cards, coins, or whatever else you can dream up; the results will still be the same. The four families’ example covers all the possibilities and you are still hitting a 50/50 chance.
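Since the post invites a survey, here is a rough simulated one. It is a sketch under stated assumptions: each child is independently a girl or a boy with equal odds, and “guess the opposite” is scored as correct exactly when the family is mixed. It scores the rule twice, once over every family (as in the four-family walk-through, where all-boy families are included) and once over only the families that have at least one girl. The function name, trial count, and seed are arbitrary choices of mine.

```python
import random

def survey(trials=100_000, seed=0):
    """Return the hit rate of 'guess the opposite' over all families and over girl-having families."""
    rng = random.Random(seed)
    all_total = all_hits = 0
    girl_total = girl_hits = 0
    for _ in range(trials):
        family = (rng.choice("GB"), rng.choice("GB"))
        mixed = family[0] != family[1]   # guessing the opposite sex is right only for mixed families
        all_total += 1
        all_hits += mixed
        if "G" in family:                # keep only families that could be described as "has a daughter"
            girl_total += 1
            girl_hits += mixed
    return all_hits / all_total, girl_hits / girl_total

over_all, over_girl_families = survey()
print(f"all families:            {over_all:.3f}")
print(f"families with a girl:    {over_girl_families:.3f}")
```

The first figure should land near one half and the second near two thirds, so which number you get depends on whether the all-boy families are left in the pool.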

Wissdok, I don’t know what you’re talking about regarding correct and incorrect “choices” using a “2/3 rule.” You may need to do a little more work to explain what you have in mind there.

You said at the end that if you do this as an actual experiment, then the answer will clearly be 1/2 instead of 2/3.

Is 100 families a good enough sample size?

From http://www.random.org/nform.html I generated the following list:

1 1
1 1
1 1
0 0
0 1
1 1
1 1
0 1
1 0
1 1
1 1
1 0
1 1
1 1
1 1
0 1
1 0
1 1
0 0
0 1
1 0
1 1
0 0
1 0
1 0
1 0
0 1
0 0
1 1
0 0
0 1
0 0
0 1
0 0
1 1
0 1
0 1
1 1
1 0
1 0
0 0
0 1
0 1
1 0
1 1
1 1
0 0
0 0
0 1
0 0
0 1
0 0
0 0
0 0
1 1
0 1
1 1
0 1
1 1
0 0
0 1
1 0
0 0
1 1
0 1
0 1
0 0
1 1
0 1
1 0
1 1
0 1
1 0
1 0
1 1
1 1
0 0
1 1
0 0
1 0
0 1
1 0
1 0
0 0
1 1
1 0
1 0
1 1
0 0
1 0
0 0
1 0
1 1
0 1
0 0
1 1
1 1
0 1
0 0
1 0

where I interpret each row as a pair of children from a family, and I interpret a “1” as a girl and a “0” as a boy.

This is a random selection of 100 families. Now I’ve been told, of the family in question, that “one of them is a girl.” So let’s go through and select just those families out of this random selection which have at least one girl:

1 1
1 1
1 1
0 1
1 1
1 1
0 1
1 0
1 1
1 1
1 0
1 1
1 1
1 1
0 1
1 0
1 1
0 1
1 0
1 1
1 0
1 0
1 0
0 1
1 1
0 1
0 1
1 1
0 1
0 1
1 1
1 0
1 0
0 1
0 1
1 0
1 1
1 1
0 1
0 1
1 1
0 1
1 1
0 1
1 1
0 1
1 0
1 1
0 1
0 1
1 1
0 1
1 0
1 1
0 1
1 0
1 0
1 1
1 1
1 1
1 0
0 1
1 0
1 0
1 1
1 0
1 0
1 1
1 0
1 0
1 1
0 1
1 1
1 1
0 1
1 0

The above are the 76 families left once we eliminate the 24 all-boy families. Having been told that “one of the children is a girl,” I know that the above list includes all the families out of the original hundred which I could possibly be being told about.

Now, I’m asked, “what are the chances the other is a boy?” In other words, out of the list of 76 just given, how many have a “0” included?

The families out of the list of 76 which also include a “0” are:

0 1
0 1
1 0
1 0
0 1
1 0
0 1
1 0
1 0
1 0
1 0
0 1
0 1
0 1
0 1
0 1
1 0
1 0
0 1
0 1
1 0
0 1
0 1
0 1
0 1
0 1
1 0
0 1
0 1
0 1
1 0
0 1
1 0
1 0
1 0
0 1
1 0
1 0
1 0
1 0
1 0
1 0
0 1
0 1
1 0

That’s 45 families.

So here’s what I know: given that “one child is a girl,” I can expect the probability that “the other child is a boy” to approximate 45/76, or about 0.59. That’s closer to 2/3 than it is to 1/2, though with only 100 families the estimate is a rough one.

“Given that one child is a girl” tells me I need to look at the list of 76 families generated from the original list of 100 by taking from that 100 only those families which have at least one girl, and use the number of families in that list (76) as the denominator for my probability fraction.

“What’s the chance the other is a boy” tells me I need to look at that list of 76 families, count the number which have a boy in them, and use that number as the numerator in my probability fraction.
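For anyone who wants to redo the tally by machine instead of by eye, here is a minimal Python sketch. `ROWS` is my own transcription of the 100 generated rows from earlier in this post, ten families per line, with each two-digit token standing for one family (“1” a girl, “0” a boy). The variable names are made up for illustration.

```python
from collections import Counter

# Transcription of the 100 generated rows (assumed to be copied correctly),
# ten families per line. Each token is one family: "1" = girl, "0" = boy.
ROWS = """
11 11 11 00 01 11 11 01 10 11
11 10 11 11 11 01 10 11 00 01
10 11 00 10 10 10 01 00 11 00
01 00 01 00 11 01 01 11 10 10
00 01 01 10 11 11 00 00 01 00
01 00 00 00 11 01 11 01 11 00
01 10 00 11 01 01 00 11 01 10
11 01 10 10 11 11 00 11 00 10
01 10 10 00 11 10 10 11 00 10
00 10 11 01 00 11 11 01 00 10
""".split()

tally = Counter(ROWS)
all_boy = tally["00"]
with_girl = len(ROWS) - all_boy                  # denominator: families with at least one girl
with_girl_and_boy = tally["01"] + tally["10"]    # numerator: those that also have a boy

print("families:", len(ROWS))
print("all-boy families:", all_boy)
print("families with at least one girl:", with_girl)
print("of those, families that also have a boy:", with_girl_and_boy)
print(f"ratio: {with_girl_and_boy / with_girl:.3f}")
```

Swapping in a longer list of rows is the obvious next step, since a sample of 100 leaves a fair amount of wobble in the ratio.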

-FrL-