Boy/Girl probability in Monty Hall

You have changed the problem to: “A family is selected at random, and then one of the children in the family is selected at random and found to be female. What is the probability that the other child is male?”

The answer to this is indeed 50%, but this is not the same as the original problem.

Xema, I disagree: there is but one problem here, not two. The families are random when the problem starts, but once a child is given (or met), the child becomes the focus and the families are no longer random. This given child is equally likely to be any child of that same sex. If we are given a girl (like Cecil’s puzzle), then the girl can be either of the girls in the family with two (25% each) or either of the girls in the two families that also have a son (25% each). If we are given a boy (like Marilyn’s puzzle), then that boy can be either of the boys from the family with two (25% each) or either of the boys from the two families that also have a girl (25% each).

As an example of proof that 2/3 is wrong, this problem cannot be repeated and guarantee a 2/3 result. As I explained above Cecil chose girls and Marilyn chose boys, these opposite events cannot be right. Above, Gazpacho spoke of 1000 families. If we repeat this puzzle to all 1000 families using the 2/3 “rule” of choosing the opposite, we should be right 66.67% of the time.
But you can easily see that it wouldn’t be 66.67% because:
On all GG families we would be wrong. (25% of the families)
On all GB families we would be right. (25% of the families)
On all BG families we would be right. (25% of the families)
On all BB families we would be wrong. (25% of the families)

We would be right 50% of the time and wrong 50%.

As Martin Gardner (correct spelling) explained in his opinion of this puzzle, this can only be 2/3 if we assume that we are given a child and that all families with a child of that sex always tell (or are given) that sex of child. That assumption is not listed in the puzzle.

They’re just different ways of phrasing the same problem - rather like the difference between b = a + 2 and y = x + 2.

I believe this can easily be done.

This would be correct if BB families were included in the problem; in fact, they are eliminated by the condition that the selected family includes a daughter.

We thus are left with the first three of the four cases you list - 750 of the original 1000 families. In 500 of these (66.67%), the daughter has a brother.

We are not given a child - we are given a family and some information about the family: that it includes at least one daughter (which is exactly equivalent to saying that this is not a BB family). The family does not tell us anything, nor do we meet anybody. And yet the answer is 2/3.

Once you start specifying the order of the children, you change the problem. In the problem you are given a family with one daughter and one unknown child. That’s it. One quarter of all possible families have two boys, one quarter have two girls, and one half have a boy and a girl. Eliminate the 2B families and you have 2/3 BG or GB families. All the problem does is eliminate the 2B families. It doesn’t do anything else.

The problem could have been phrased “Given the set of all two child families with at least one daughter, what is the probability that a family picked at random will have a son?” answer: 2/3.
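The rephrased version can be checked by listing the families. This is only a sketch, assuming the four birth-order combinations are equally likely; the names `families` and `prob_mixed` are made up for the illustration.

```python
from fractions import Fraction
from itertools import product

# All two-child families in birth order, assumed equally likely: GG, GB, BG, BB.
families = ["".join(p) for p in product("GB", repeat=2)]

def prob_mixed(pool):
    """Fraction of the families in `pool` that have one boy and one girl."""
    mixed = [f for f in pool if "G" in f and "B" in f]
    return Fraction(len(mixed), len(pool))

# No information at all: half of all families are mixed.
all_families = prob_mixed(families)
# Keep only families with at least one daughter (this drops BB and nothing else).
with_daughter = prob_mixed([f for f in families if "G" in f])

print(all_families)   # 1/2
print(with_daughter)  # 2/3
```

The filter picks families, not daughters, which is the reading of the problem this post argues for.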

wissdok, you’re right that if I pick a daughter at random the probability that she belongs to a GG family as opposed to a {GB, BG} family is 1/2 (each GG family contributes two daughters, each mixed family only one), cancelling the effect of eliminating the BBs and giving a probability of 1/2 that she has a brother. But that’s not the problem we were given. We’re picking families, not daughters.

The key point here is that the initial question “Is one of your children a boy/girl”, weeds out any two-child family where the answer is “no”. It is therefore not surprising (assuming two-child families are split evenly between BB, BG, GB, and GG–written in birth order) that once your initial question eliminates either BB or GG, the odds are in favor of guessing the opposite sex for the other child since 2 of the 3 remaining possibilities are mixed-gender families. By this time, I’d think this logic would be obvious.

What interests me is how this “weeding out” requirement can be short-circuited without careful consideration. Suppose instead of my asking specifically “Is one of your children a boy?” I instead ask “What is the gender of one of your children?” It seems to be a similar process, but it’s not. By assumption, half of all two-child families are BB or GG. In either case, assuming the opposite gender for the other child is a loser, but you win if you happen to catch a parent with BG or GB, i.e. half the time. In essence, the answer to the gender question didn’t give you any information to weed out possibilities.

As Larry Borgia points out, another way to short-circuit the value of this weeding-out is to specify the gender of the oldest child only. Suppose you ask “Is your oldest child a boy?” An answer of “yes” here weeds out both GG and GB. If you follow the “guess the opposite gender” strategy, this weeding eliminated half your winners and half your losers. Unlike the previous case–where the strategy failed because no weeding out was actually performed–this question does cause some possibilities to be weeded out, but does so ineffectively for the strategy you adopted.

The correct approach relies on effectively weeding out only GG families by specifically asking “Is one of your children a boy”. This then eliminates 1/4 of guesses which would have resulted in a loss by guessing the opposite gender. All my winning chances remain, half my losers are weeded out, so I have a 2:1 incentive to switch.
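The three ways of asking can be put side by side. A rough sketch, assuming equally likely families and, for the “gender of one of your children” question, a parent who names one of the two children at random (that last part is my assumption, not something the puzzle states). `win_rate` and `weight` are names invented for the example.

```python
from fractions import Fraction

FAMILIES = ["GG", "GB", "BG", "BB"]  # birth order, each assumed to have probability 1/4

def win_rate(weight):
    """Chance that guessing 'the other child is a girl' wins after hearing 'boy'.

    weight(family) is the assumed probability that this family gives the
    'boy' answer under the question being asked.
    """
    total = sum(Fraction(1, 4) * weight(f) for f in FAMILIES)
    wins = sum(Fraction(1, 4) * weight(f) for f in FAMILIES if "G" in f)
    return wins / total

# "Is one of your children a boy?": every family with a boy says yes.
at_least_one = win_rate(lambda f: Fraction(1) if "B" in f else Fraction(0))
# "What is the gender of one of your children?": parent names a random child.
random_child = win_rate(lambda f: Fraction(f.count("B"), 2))
# "Is your oldest child a boy?": only the first letter matters.
oldest = win_rate(lambda f: Fraction(1) if f[0] == "B" else Fraction(0))

print(at_least_one, random_child, oldest)  # 2/3 1/2 1/2
```

Only the first question weeds out GG while keeping every mixed family, which is why it is the only one of the three that favors guessing the opposite gender.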

I appreciate the problem for the attention it brings to subtlety in language (balanced, of course, by the number of posts required to elucidate that subtlety:))

The Riddler, I’m sorry, but you’re wrong. If you think the problem is equivalent to flipping one coin, observing the outcome, and then flipping another, then you’re misunderstanding the problem Cecil is trying to state.

Have you read my explanations above? If so, and you disagree with them, please quote the part you think is wrong and explain why.

Anyway, here’s why it’s not equivalent to flipping one coin and flipping another, explained as well as I can.

In your coin flip example: You flip one coin and get heads. Then you flip another coin, and there’s a 50-50 chance it’s heads.

In the children example: The couple has two kids. Then, after both kids are born, they tell you that one of them is a girl. That’s not the same as telling you the first one was a girl.

To be equivalent to the coin flip, the couple would tell you they had had a girl after having the first child, and then would ask you to guess the odds of the second child being a girl. In that case, it would be 50-50. But do you see how that’s different from the stated problem?

Here’s a description of a coin flip problem that’s equivalent: Flip a pair of coins and write down the outcome of both. 25% of the time, they will both be heads. Do you see why? Likewise, 25% of the time, they will both be tails. So that means 50% of the time, you get a heads and a tails. Please tell me what you disagree with in this description, if anything.

Assuming you agree with the above, then consider this: You get a pair that contains heads and tails twice as often as you get a pair where both are tails. This follows from assuming the outcomes of the two flips are independent. So that means, if you tell me that your pair of coins had at least one come up tails, then I can consider it more likely that you got heads-tails (in some order) than tails-tails, since heads-tails happens more often.

Please explain what if anything you think is wrong with this argument. If it still doesn’t make sense, I urge you to get a pair of coins and flip them 100 times. I promise you it will come up with one heads and one tails more often than it comes up two tails. If you don’t believe it, see for yourself.
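If flipping real coins sounds tedious, the same experiment can be approximated in a few lines. This is just a sketch with a seeded pseudo-random generator standing in for fair, independent coins; `flip_pairs` is a made-up helper name.

```python
import random

def flip_pairs(n, seed=0):
    """Flip n pairs of fair coins and count HH, TT, and mixed pairs."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = {"HH": 0, "mixed": 0, "TT": 0}
    for _ in range(n):
        pair = rng.choice("HT") + rng.choice("HT")
        key = pair if pair in ("HH", "TT") else "mixed"
        counts[key] += 1
    return counts

counts = flip_pairs(100_000)
# Keep only the pairs where at least one coin came up tails.
with_tails = counts["mixed"] + counts["TT"]
share_mixed = counts["mixed"] / with_tails

print(counts)
print(round(share_mixed, 3))  # should land near 2/3
```

The exact counts depend on the seed, but the mixed pairs should outnumber the tails-tails pairs by roughly two to one.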

I stand by my earlier post: Cecil’s puzzle included a girl, Marilyn’s a boy. Their argument is that it doesn’t matter which child we are given; it is the same for either sex. As we saw with the earlier 1000 families, the 2/3 “rule” can’t be correct for both boys and girls at the same time. While the families with two boys or two girls will always give you the same “clue”, the families with both a boy and a girl can’t give an answer that makes both Cecil’s and Marilyn’s puzzle true.

With Cecil’s puzzle it is possible for 2/3 to be true IF we assume that all families with girls tell us of them. But at the same time that would mean that Marilyn would be wrong, because only the family with two boys would tell us of boys. The reverse is true as well: when Marilyn’s answer is true, Cecil’s answer must be wrong. This is rather a unique scenario, as more often than not, neither could be true. If the families are random and we are told of a random child of that family, then most likely the families with both a boy and a girl would tell you of the girl half the time and of the boy the other half. Because of randomness, half of the 500 families in the earlier group (that have both sexes as children) would tell of girls and the other half, boys.
250 Family Group #1 GG—always gives the clue girls
125 Family Group #2 GB—gives the clue girls
125 Family Group #2 GB—gives the clue boys
125 Family Group #3 BG—gives the clue girls
125 Family Group #3 BG—gives the clue boys
250 Family Group #4 BB—always gives the clue boys

In true randomness, families in group #2 or #3 will be equally likely to tell you of either child. The result is that, in either Cecil’s or Marilyn’s puzzle, only around 500 of the families will ever give the same clue. Of those 500, half will automatically be from the families with two boys or two girls.

I again repeat: Cecil’s answer (with girls) and Marilyn’s (with boys) cannot ever be correct simultaneously. In fact, either can only be true when both families with mixed children tell of a child of the same sex AND we just so happen to be looking for that same type of child.
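The “telling” model described in this post can be written with one knob. A sketch only: `q` is a hypothetical parameter for the chance that a mixed family mentions its girl rather than its boy, and the families are assumed to split 25/50/25 between GG, mixed, and BB.

```python
from fractions import Fraction

def p_mixed_given_clue(q):
    """P(family is mixed | clue is 'girl') under the telling model.

    q is the assumed chance that a mixed family mentions its girl.
    """
    gg = Fraction(1, 4)          # GG families always give the clue "girl"
    mixed = Fraction(1, 2) * q   # GB and BG families say "girl" with chance q
    return mixed / (gg + mixed)

always_girl = p_mixed_given_clue(Fraction(1))    # every mixed family mentions the girl
coin_toss = p_mixed_given_clue(Fraction(1, 2))   # mixed families mention either child
never_girl = p_mixed_given_clue(Fraction(0))     # every mixed family mentions the boy

print(always_girl, coin_toss, never_girl)  # 2/3 1/2 0
```

For the clue “boy” the same formula applies with `1 - q` in place of `q`. Within this model, 2/3 shows up only at `q = 1`, and the 250/125/125 split in the list above corresponds to `q = 1/2`. Whether the puzzle describes a family that volunteers a clue at all is the point the rest of the thread argues about.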

Diagram of the puzzle:
Step #1
Choose a random family with two children. The family possibilities with two children are:

  1. Girl, Girl
  2. Girl, Boy
  3. Boy, Girl
  4. Boy, Boy
    The families are random but the children themselves are irrelevant other than to help separate the families.

Step #2
We are given (meet, are told, etc.) the sex of a child in a two-child family. The focus here is on the child, and the families are no longer equal. As both Cecil and Marilyn pointed out, one type of family (GG or BB) can no longer be possible. The randomness is now on the child. The child can be any of the children of that given sex.
Type of family             If we are given a girl    If we are given a boy
                           (% chance that each child, first / second, is the child given)
Family #1 - Girl, Girl          25 / 25                    0 / 0
Family #2 - Girl, Boy           25 / 0                     0 / 25
Family #3 - Boy, Girl            0 / 25                   25 / 0
Family #4 - Boy, Boy             0 / 0                    25 / 25

Here again the families with two girls or two boys are twice as likely to be the family the child is from.

Let’s talk red and black cards rather than genders.

If you set up, face down, two columns of cards in the following pattern:

R R
B B
R B
B R

You have the four possible distributions.

If you randomly select one card and turn it over, what is the likelihood the paired card is the same color?

THESE COMBINATIONS ARE PAIRED! That’s the key statistical concept many people are missing here.

It doesn’t matter which side of the pair you choose to reveal, but once you’ve picked a side, that determines which other pairs are left in the possible solution set. Many of you folk want to keep all “girl” rows once you “turn over” a “girl” card. Doesn’t work that way. Once you select from one column, you have to consider only those rows from that column that match what you picked.

If you picked a B card out of the first column, you’re left with

B B
and
B R as the possible solution sets; or 50-50. You can’t keep the R B row!

If you picked an R card out of the second column, you’re left with

R R
and
B R as the possible solution sets; or 50-50. You can’t keep the R B row here either!

And yes, I TOOK Statistics at the University of California, but I’ve never taught it.

The key point you seem to be missing here is that, as the original problem is stated, you do not know whether the child identified by gender is the older or younger child. In your card scenario, this is equivalent to not knowing whether you’ve turned over a card from the first column or the second column.

In your card problem, I agree that the odds are 50-50 (as long as you can guarantee the specific distribution of cards in column 1 and column 2). However, suppose you ask your buddy to pick a card, tell you the color, but NOT tell you whether he picked from the first or second column. Suppose he says he picked B. The only three combinations available then are RB, BR, or BB: 2 to 1 in favor of the other card in the pair being the opposite color. Note that if you further ask him “Which column is the B card in?”, and he says “the first column”, you can eliminate RB as well and drop the odds back to 50-50.

This is really not that hard, and the subtlety of the language has been explained every which way known to man. The conundrum is in the hidden assumptions, and whether or not they are reasonable. The analogy of picking cards in this way with identifying one of two unspecified children’s gender does not hold up because there are different assumptions written into the rules of each game.

Fifty percent, as you say, but you’re not solving the same problem as the one stated.

To translate from the original Boy/Girl Problem, you are not looking at any particular card. You are presented with a pair of cards, randomly chosen from the four you list above, and then told that there’s at least one red card in the pair. (So the B-B pair is actually eliminated from consideration.) You’re then asked what the probability is that there’s a black card in the pair. This probability is 2/3.

It can.

The problem is perfectly symmetrical. If we randomly select a family and are told that it includes at least one son, we can say that the chance this boy has a sister is 2/3. If we are instead told that it includes a daughter, the probability of this girl having a brother is again 2/3.

There is no contradiction or difficulty here.

Sorry: I’m not making my point very well.

Speaking from the standpoint of statistics as a science (and I have to warn the listening audience my UC diploma was signed by a California Governor who was an actor and his first name isn’t Arnold… so it’s been a few decades), there are two different concepts here; possible outcomes and distributions.

The possible outcomes (sticking with cards) are B B, R B, and R R. These are not order sensitive and do not address how frequently a particular outcome comes up.

The distributions are
25% B B
25% R R
25% R B
and
25% B R

These ARE order sensitive. They have to be, in order to accurately reflect the distribution.

You can choose to label the columns anything you want (oldest, first in the door, what my friend chose, etc.), but if you’re attempting to analyze statistical odds you have to deal with the distribution in the manner I suggest. Look at it this way; consider the label of the first column “what was revealed to me”.

If you still disagree, I’d like to meet you on a streetcorner for a game of 3-card Monte; bring lots of 20s.

:)

I think you are introducing an unnecessary and unhelpful addition with this concept of the family “telling” something. Is the family truthful? Do all families know the sex of all their children?

Irrelevant! The family is chosen at random, and you are told (assumption: truthfully, by whom it doesn’t matter) that at least one of their children is a daughter. This allows you to conclude that they are not among the 25% of families that have two sons, and thus there is a 2/3 probability that their daughter has a brother.

No; we could just as easily stick with the non-order sensitive grouping B B, R B, and R R and say the distributions are (based on our knowledge of how the groups are created):

25% B B
25% R R
50% R B

You agree these are the distribution percentages, right? From here it’s obvious that for a group with, say, R in it (R R or R B), the mismatch occurs twice as often, so it’s 2:1 in favor of claiming the unseen card is the opposite.

Knowledge of how the cards are distributed helped us determine the percentages, but once these are determined we no longer care in what order the cards were dealt.

This would be true only if you could guarantee that the “revealer” will always choose from the same column first, which is equivalent to always identifying the gender of the older or younger child in the original problem. The whole point of the original problem is that it is set up such that you cannot specify in advance which child (or which card column) is chosen first.

Once again, from your distribution, pick all pairs which have at least one black card. Two of these pairs are matched with a red card, one is not. Thus, if you do not know in advance which column is chosen first, the odds are 2:1 in favor of saying the matched card is red.

Sure thing; be prepared to lose your shirt!

But what was revealed was exactly this: One, or the other, or both children are girls. Your analysis isn’t properly dealing with these three possibilities, each of which is equally probable.

For Riddler, wissdok and adbadqc (and anyone else who’d like to have a go):

Suppose I take 4 sheets of paper, number them 1 through 4, and write the following on them:
Sheet 1: BB
Sheet 2: GG
Sheet 3: BG
Sheet 4: GB

I proceed to take sheet #1 and burn it, taking care to stomp the ashes into the ground.

I then ask a friend to randomly select one of the remaining sheets of paper by a process that makes each of the three selections equally likely.

Two questions to answer:

  1. What is the probability that he selects a sheet of paper with two different letters on it?

  2. How does this differ from the original problem?

Agreed with this bit.

And here I think is the snag. No specific card has been revealed to you. No specific card has been identified as red. You are never given any information about any particular, designated card at all. You are only given aggregate information about a pair of cards: that there’s at least one red card in there, somewhere. Could be they both are. And the only thing this information lets you do is eliminate the B-B pair from consideration.

Out of the allowed population of possible card-pairs, exactly 1/3 are R-R (and have no black), and 2/3 are mixed (with one black). You are being asked what the probability is that there’s one black card in the pair — or effectively, the probability that the pair is mixed.

As an aside . . .

I don’t hang around city street corners much (well not anymore; please don’t ask) but is there really any probability analysis to be done for Three Card Monte? It’s purely a confidence game, I thought.

Even an “honest” game of Three Card Monte would just come down to a contest of physical skill: the dealer’s dexterity versus the player’s eye.

Yeah; you got me on the 3-card-monty(monte?).

Consider this;

Once you know you have one girl (let’s use a capital G for the “known girl”, and lower case for the unknowns), these are the following possibilities (using the column example):

25% G g
25% g G
25% b G
25% G b

The known girl “card” can be either in the first or the second column, and can be paired with either an unknown boy or an unknown girl in the other column… it’s a different distribution than the original 4, but it still results in a 50-50 split.