Statistical question about having 2 boys if one is born on Tuesday

Not sure what you’re getting at. If only one of them is a boy, then the odds of the other being a boy are 0%.

So who do you mean by “the boy” in your earlier statement “we know you have one boy, but we don’t know whether the boy is the first or second child” ?

It seems to indicate that just one of them is a boy, that being the meaning of the word “the”. This is the crux of the apparently counter-intuitive probability - confusion over whether we are excluding the possibility of both being boys, or not.

Well, disregarding the specific wording, surely the correct point EdwardLost is noting is that an event only has a probability relative to a probability distribution, which certainly can be thought of as a description of multiple possible outcomes, even if only one is ultimately realized in the actual world. When the boy-girl problem is specified, it’s often done so in a way which makes the relevant probability distribution highly ambiguous, thus explaining the differing answers.

Were the probability distribution made explicit (“Start with boy-boy, boy-girl, girl-boy, and girl-girl equiprobable; then restrict to the subset with at least one boy”), the answer would be both obvious and unsurprising.
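For anyone who wants to check that explicit version by brute force, here is a minimal Python sketch. It assumes exactly the distribution quoted above (four equiprobable birth orders); the variable names are just for illustration.

```python
from fractions import Fraction
from itertools import product

# The four birth orders (older, younger), assumed equally likely.
families = list(product("BG", repeat=2))

# Restrict to the subset with at least one boy.
at_least_one_boy = [f for f in families if "B" in f]

# Within that subset, count the families where both children are boys.
two_boys = [f for f in at_least_one_boy if f == ("B", "B")]

p_two_boys = Fraction(len(two_boys), len(at_least_one_boy))
print(p_two_boys)  # 1/3
```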

Of course, in practice, there are situations where people get tripped up in the choice of what’s the most relevant probability distribution to use to model some real-world situation, which is fine, but not really the thing I think people often get tripped up on in presentations of the boy-girl problem.

That and I of course agree with everyone who has noted that the business about the non-referential “the other boy” is really unfair in the posing of the question.

Anyway, in response to the OP, though it’s already essentially been pointed out, here’s what’s going wrong in your reasoning: “at least one child is a boy” is not equal to “at least one child is a boy born on Sunday” + “at least one child is a boy born on Monday” + … + “at least one child is a boy born on Saturday”.

What do I mean by the + signs there? Think of these propositions as a way of assigning a number to cases; 1 when the proposition is true and 0 when it is false. To say that A = B + C is to say that the number assigned by A is always the sum of the numbers assigned by B and by C; in other words, that whenever B or C occurs, so does A, and, conversely, whenever A occurs, precisely one of B or C occurs.

There is a rule in probability that says that, if L = A + B + … + G, then the probability of X given L is always a weighted average of the probability of X given A, the probability of X given B, …, through the probability of X given G (each weighted by the probability of that case given L).
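Here is a small Python sketch of that rule in a case where it does apply. The split used here is my own illustrative choice, not something from the original question: “at least one boy” is broken into the two non-overlapping cases “the older child is a boy” and “only the younger child is a boy”, with the four birth orders assumed equally likely.

```python
from fractions import Fraction
from itertools import product

families = list(product("BG", repeat=2))  # (older, younger), equally likely

def prob(event, given):
    """P(event | given), by counting equally likely families."""
    pool = [f for f in families if given(f)]
    return Fraction(sum(1 for f in pool if event(f)), len(pool))

both_boys = lambda f: f == ("B", "B")
at_least_one_boy = lambda f: "B" in f

# Two cases that never overlap and together make up "at least one boy".
older_boy = lambda f: f[0] == "B"             # BB, BG
only_younger_boy = lambda f: f == ("G", "B")  # GB

parts = [older_boy, only_younger_boy]

# Average of P(X | part), each weighted by P(part | L).
weighted_average = sum(
    prob(both_boys, part) * prob(part, at_least_one_boy) for part in parts
)
direct = prob(both_boys, at_least_one_boy)

print(weighted_average, direct)  # 1/3 1/3
```

The two numbers agree only because the cases are disjoint; the weights are 2/3 and 1/3, applied to the conditional probabilities 1/2 and 0.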

In particular, if “at least one child is a boy” were the sum of “at least one child is a boy born on Sunday” through “at least one child is a boy born on Saturday”, then we could conclude that the probability of both children being boys given that at least one child is a boy is an average of the probabilities of both children being boys given that at least one child is a boy born on [specific day]. Each of those is 13/27, so the average would be 13/27 as well.

However, it’s not true that “at least one child is a boy” is the sum of “at least one child is a boy born on Sunday”, “at least one child is a boy born on Monday”, etc., because it could be the case that more than one of the latter is true at the same time; for example, if there is one boy born on Sunday and one boy born on Monday.

Because of that possibility of overlap, you can’t get probabilities conditioned on “at least one child is a boy” just by averaging probabilities conditioned on “at least one child is a boy born on [specific day]”.
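To make the overlap concrete, here is a rough Python sketch that enumerates every family. It assumes each child is independently a boy or a girl with equal chance and is equally likely to be born on any of the seven days; the day numbering is arbitrary.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)  # 0 = Sunday ... 6 = Saturday (labelling is arbitrary)
children = list(product("BG", DAYS))          # 14 equally likely (sex, day) pairs
families = list(product(children, repeat=2))  # 196 equally likely families

def has_boy_on(family, day):
    return any(child == ("B", day) for child in family)

def has_boy(family):
    return any(sex == "B" for sex, _ in family)

def both_boys(family):
    return all(sex == "B" for sex, _ in family)

def prob(event, given):
    pool = [f for f in families if given(f)]
    return Fraction(sum(1 for f in pool if event(f)), len(pool))

# Probability of two boys, conditioned on each per-day proposition in turn.
per_day = [prob(both_boys, lambda f, d=d: has_boy_on(f, d)) for d in DAYS]

# Probability of two boys, conditioned on plain "at least one boy".
overall = prob(both_boys, has_boy)

# How many of the seven per-day propositions can be true at once?
max_true = max(
    sum(has_boy_on(f, d) for d in DAYS) for f in families if has_boy(f)
)

print(per_day[0], overall, max_true)  # 13/27 1/3 2
```

Every per-day answer is 13/27, yet the overall answer is 1/3, and `max_true` being 2 is the overlap that breaks the averaging.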

(I’d like to word this better, but that’ll have to do for now)

Real-life implications of this.

If you meet a woman who tells you she has two kids, and a boy walks into the room, the chance of the other child being a boy is 50%.

However, if you meet a woman and she tells you she has two kids, and she answers positively on the question “is at least one of your kids a boy?”, then there is a 2/3 chance the other is a girl.

It’s all in the specific wording.

You shouldn’t say this, because you have more information than just that the family has at least one boy. You have the further information that the particular child who happened to walk out to greet you is a boy. This further information changes things so that the probability of the family having two boys is, on the most relevant probability distribution we might want to assume, 50%.

However, to answer the question you were getting at:

Suppose I randomly pick a number out of {1, 2, 3}. If you were to learn that the number was >= 2, there’d be a 50% chance the number was actually equal to 2. If you were instead to learn that the number was <= 2, there’d also be a 50% chance the number was equal to 2. And it’s always true that the number is either >= 2 or <= 2. But it’s not true from the start that there’s a 50% chance the number is equal to 2.

What goes wrong in trying to average the probability conditioned on >= 2 and the probability conditioned on <= 2 to get the overall probability is that >= 2 and <= 2 are overlapping cases. The same thing happens when trying to split “at least one child is a boy” up into “at least one child is a boy born on Sunday” through “at least one child is a boy born on Saturday”.
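The {1, 2, 3} example is small enough to check in a few lines of Python. This is only a sketch, assuming the three numbers are equally likely; the helper name is made up.

```python
from fractions import Fraction

numbers = [1, 2, 3]  # each assumed equally likely

def prob_is_two(condition):
    """P(number == 2 | condition), by counting."""
    pool = [n for n in numbers if condition(n)]
    return Fraction(pool.count(2), len(pool))

p_given_ge = prob_is_two(lambda n: n >= 2)  # conditioned on >= 2
p_given_le = prob_is_two(lambda n: n <= 2)  # conditioned on <= 2
p_overall = prob_is_two(lambda n: True)     # no condition at all

# The two conditions are not disjoint: 2 satisfies both.
overlap = [n for n in numbers if n >= 2 and n <= 2]

print(p_given_ge, p_given_le, p_overall, overlap)  # 1/2 1/2 1/3 [2]
```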

Easier way to understand the last question.

There are twice as many 2-child families with exactly one boy as there are families with two boys. Does everyone agree on this?

When a child walks out the door, there is a bigger chance that it is a child from a one-boy family than that it’s a child from a two-boy family.

However, if it’s a child from a one-boy family, there is only a 50% chance a boy would actually walk out the door. So when a boy actually walks out, there is still only 50% chance that you are visiting a two-boy family.

If you want the 2/3 chance (of the other child being a girl), you have to ask the family “if you have at least one boy, have him walk out the door”. No one does that :stuck_out_tongue:
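A quick Python sketch of the two door situations, in case counting helps. It assumes the four birth orders are equally likely and, in the first case, that the child who walks out is picked at random from the two; both are modelling assumptions, not facts about real families.

```python
from fractions import Fraction
from itertools import product

birth_orders = list(product("BG", repeat=2))

# Case 1: a randomly chosen child walks out.
# Outcomes are (family, index of the child who walks out), all equally likely.
outcomes = [(fam, i) for fam in birth_orders for i in (0, 1)]
boy_walks_out = [(fam, i) for fam, i in outcomes if fam[i] == "B"]
from_two_boy_family = [(fam, i) for fam, i in boy_walks_out if fam == ("B", "B")]
p_random_child = Fraction(len(from_two_boy_family), len(boy_walks_out))

# Case 2: the family is told "if you have at least one boy, have him walk out",
# so seeing a boy only tells you the family has at least one.
families_with_boy = [fam for fam in birth_orders if "B" in fam]
p_boy_sent_out = Fraction(families_with_boy.count(("B", "B")), len(families_with_boy))

print(p_random_child, p_boy_sent_out)  # 1/2 1/3
```

So the chance of a two-boy family is 1/2 in the first case and 1/3 in the second, which is the same as a 2/3 chance that the other child is a girl.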

I think I can explain everything even more intuitively.

If you ask someone “is at least one of your kids a boy?”, then your very specific wording selects this subset.

http://dl.dropbox.com/u/81226/1.png

And 1/3 of this subset has the other kid as a boy.

http://dl.dropbox.com/u/81226/2.png

In the same way, if you ask someone “is at least one of your kids a boy born on a Tuesday?”, then this very specific wording will select this subset.

http://dl.dropbox.com/u/81226/3.png

And 13/27 of this subset has the other kid as a boy.

http://dl.dropbox.com/u/81226/4.png

And why does the kid being born on a Tuesday matter? It just reduces that “overlap” in the subset selected by your very specific question, the overlap that prevents the probability from being 50/50.

It’s all in the wording of the question and what subset that question selects for you. That’s why it never works in any of the practical situations, except the situation where you get to ask someone that specific question.
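Here is a Python sketch that counts those subsets directly and shows the overlap shrinking as the extra detail gets rarer. The function `p_two_boys` and the idea of `n_labels` equally likely “labels” per child (1 label means no extra detail, 7 means day of the week, 365 stands in for a birthday) are my own illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def p_two_boys(n_labels):
    """Count families with at least one boy carrying label 0, and how many
    of those have two boys. Each child is assumed equally likely to be any
    (sex, label) pair."""
    children = list(product("BG", range(n_labels)))
    families = list(product(children, repeat=2))
    selected = [f for f in families if ("B", 0) in f]
    both = [f for f in selected if f[0][0] == "B" and f[1][0] == "B"]
    return len(both), len(selected), Fraction(len(both), len(selected))

print(p_two_boys(1))    # (1, 3, Fraction(1, 3))
print(p_two_boys(7))    # (13, 27, Fraction(13, 27))
print(p_two_boys(365))  # (729, 1459, Fraction(729, 1459))
```

The only family counted once instead of twice is the one where both boys carry the label, so the rarer the label, the closer the answer gets to 1/2.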

It’s not a question of whether this “works”, whatever that means. The whole point of this exercise is to show that conditional probabilities are very sensitive to the information that you’re conditioning on. It’s a nice example of exactly how careful you have to be in conditional problems.

Well, it “works” in the sense that those probabilities arise if you get to ask that exact question. It’s just that just about no real-life situations correspond to that exact question.

Much better wording:

Suppose you wanted to count how many families there were in the world with exactly two boys.

One way you could do it would be to, well, just count up the number of families in the world with exactly two boys.

Another way you might think you could do it would be to first count up the number of families in the world with exactly two boys and at least one boy born on a Sunday, then count up the number of families in the world with exactly two boys and at least one boy born on a Monday, etc., and then add these seven numbers together.

You might think you could do it that way, but, of course, you can’t: that double-counts any family with two boys born on different days of the week. If you want to do it that way, you have to furthermore account for the double-counting by then subtracting out the number of families with two boys born on different days of the week; this subtraction is what causes the probability of two boys given at least one boy to be different from the probability of two boys given at least one boy born on Tuesday.
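The double-counting can be seen with a tiny Python sketch. Instead of real families it uses one representative two-boy family per pair of birth days (49 in all), which is an assumption made purely to keep the numbers small.

```python
from itertools import product

DAYS = range(7)

# One representative two-boy family per (day of first boy, day of second boy).
two_boy_families = list(product(DAYS, repeat=2))

# Way 1: just count them.
direct_count = len(two_boy_families)

# Way 2: count those with at least one boy born on each day, then add up.
per_day_counts = [sum(1 for f in two_boy_families if d in f) for d in DAYS]
naive_sum = sum(per_day_counts)

# Families whose boys were born on different days get counted twice.
double_counted = sum(1 for a, b in two_boy_families if a != b)
corrected = naive_sum - double_counted

print(direct_count, per_day_counts, naive_sum, double_counted, corrected)
# 49 [13, 13, 13, 13, 13, 13, 13] 91 42 49
```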

That’s the phenomenon you are stumbling upon.

It’s similar to how you can’t count up the number of families with two boys by taking the number of families with an older boy and the number of families with a younger boy and adding them together, without further taking the step of subtracting out the double-counted number of families with two boys, thus making the probability of two boys given at least one boy different from the probability of two boys given an older boy. [In the context of families with two non-twin children, of course]

OK, let me pose one last scenario. I take you to see some randomly selected 2 child family. The mother walks out and says “One of my children is a boy…” and before she finishes I say “I calculate a 33% chance that the other is a boy.” Then the mother finishes with “… and he was born on a Tuesday.” Should I then correct my earlier statement with “make that 48%”?
It seems to me that the initial calculation is based on “Out of all pairs of siblings in which at least one is a boy…”, and the correction is based on “Out of all pairs of siblings in which at least one is a boy born on a Tuesday…”.
The thing is, she could have said any day of the week. Sorry for being so thick if there is something that has already been said that I have missed.

Missed edit window
Made a chart similar to Danger Man: https://dl-web.dropbox.com/get/statq.JPG?w=704a71ed
My problem is any boy will be born on some day of the week. So then I can change the odds to reflect that of the 2nd chart?

“He”? Who’s “he”?

the child that is a boy.
My point is, no matter what day the child is born on, I could construct a chart that shows the odds are 13/27. So initially the mother tells me that one of her kids is a boy (33% chance of 2 boys) and then tells me that that kid was born on (insert day of week) so the odds change to 13/27. Clearly something is wrong here. What is the solution to this paradox?

Which one? There isn’t necessarily “the child” in this situation; it might be “the children”.

As long as all the mothers you would potentially meet would say that exact sentence if it was true, then your reasoning is correct.

The problem with the bold words is that they seem to refer to a specific child, which you don’t want to do. But that’s kind of distracting from your main point, and easily fixed by using the word who instead.

Danger Man has the right answer, but I think someone needs to expand on it. I was going to, but I’ve realised I just don’t have the motivation to do a good enough job.

OK, let me try throwing out some scenarios leading up to the woman saying “One of my children is a boy… who was born on a Tuesday.”

A) I get a large selection of two-children families, and tell the mother “when Saffer comes to the door, pick one of your children (e.g. by flipping a fair coin) and tell him the sex and day-of-the-week that child was born”.

B) I get a large selection of two-children families, and tell the mother “when Saffer comes to the door, pick one of your sons (or pick your son if you only have one) and tell him the sex and day-of-the-week that he was born. If you don’t have a son, tell him to go away.”

C) I get a large selection of two-children families, and tell the mother “when Saffer comes to the door, pick one of your sons born on a Tuesday (or pick your only son born on a Tuesday if you only have one) and tell him the sex and day-of-the-week that he was born. If you don’t have a son, tell him to go away. If you have a son, but not one born on a Tuesday, tell him you have a son and what day he (or one of them) was born.”

D) I get a large selection of two-children families with at least one son born on a Tuesday, and tell the mother “when Saffer comes to the door, pick one of your sons born on a Tuesday (or pick your only son born on a Tuesday if you only have one) and tell him the sex and day-of-the-week that he was born. If you don’t have a son born on a Tuesday, tell him to go away.”

For all scenarios, I tell Saffer what the scenario is.

In A), Saffer doesn’t learn anything about whether the woman has a second son after the woman speaks. It’s a 50% chance she has two boys, not 33%, since half the women who have a boy and a girl will say they have a girl instead of saying they have a boy.

In B), once the woman says “One of my children is a boy…” Saffer knows there’s a 33% chance she has two boys. He learns nothing when she continues on to say “who was born on a Tuesday”. Her son had to be born on some day.

In C), once the woman says “One of my children is a boy…” Saffer knows there’s a 33% chance she has two boys. At this point, he doesn’t know if one of her sons was born on a Tuesday, all he knows is she has at least one son. When she continues on to say “who was born on a Tuesday”, then he knows there’s a 48% chance she has two boys.

In D), once the woman says “One of my children is a boy…” Saffer knows there’s a 48% chance she has two boys. This is because he knows she would have said “Go away” if she didn’t have a boy born on a Tuesday. When she does continue on to say “who was born on a Tuesday”, he doesn’t learn anything new. He still knows there’s a 48% chance she has two boys.

In the set-up for C), Tuesday is different from all the other days. That’s why it matters when she says “who was born on Tuesday.” In B), there’s nothing special about any of the days, and it’s just random chance when she says “who was born on Tuesday.”
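The four scenarios can be checked exactly with a Python sketch along these lines. It is only a model under stated assumptions: sexes and days are independent and equally likely, a mother who has to choose among children or sons picks at random, and scenario C draws from all two-children families (which is what the discussion of C relies on). All function names are made up.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 2  # arbitrary label for one of the seven days
CHILDREN = list(product("BG", range(7)))
FAMILIES = list(product(CHILDREN, repeat=2))  # 196 equally likely families

def sons(family):
    return [c for c in family if c[0] == "B"]

def uniform(choices):
    return [(c, Fraction(1, len(choices))) for c in choices]

# Each scenario maps a family to (statement, probability) pairs.
# A statement is the (sex, day) she reports, or None for "go away".
def scenario_a(family):  # report a randomly chosen child
    return uniform(list(family))

def scenario_b(family):  # report a randomly chosen son, else go away
    return uniform(sons(family)) if sons(family) else [(None, Fraction(1))]

def scenario_c(family):  # prefer a Tuesday son, else any son, else go away
    if ("B", TUESDAY) in family:
        return [(("B", TUESDAY), Fraction(1))]
    return scenario_b(family)

def scenario_d(family):  # speak only if there is a Tuesday son
    if ("B", TUESDAY) in family:
        return [(("B", TUESDAY), Fraction(1))]
    return [(None, Fraction(1))]

def p_two_boys(scenario, heard):
    """P(two boys | the statement so far satisfies `heard`)."""
    total = hits = Fraction(0)
    for family in FAMILIES:
        for statement, p in scenario(family):
            if statement is not None and heard(statement):
                total += p
                if len(sons(family)) == 2:
                    hits += p
    return hits / total

said_boy = lambda s: s[0] == "B"                  # "One of my children is a boy..."
said_tuesday_boy = lambda s: s == ("B", TUESDAY)  # "...born on a Tuesday."

scenarios = {"A": scenario_a, "B": scenario_b, "C": scenario_c, "D": scenario_d}
results = {
    name: (p_two_boys(fn, said_boy), p_two_boys(fn, said_tuesday_boy))
    for name, fn in scenarios.items()
}
for name, (before, after) in results.items():
    print(name, before, after)
# A 1/2 1/2
# B 1/3 1/3
# C 1/3 13/27
# D 13/27 13/27
```

The first number in each row is the probability of two boys after “One of my children is a boy…”, the second after she adds the Tuesday part.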

Hmmm, so maybe Danger Man’s answer wasn’t quite right. But his point stands: The set-up is critical. That’s also why Chronos is giving you a hard time about the “… and he was born on a Tuesday.” wording. The set-up is critical.

Hope I got all that right. And that it’s not just muddying the waters.

ETA: Actually, now I think Danger Man’s answer was precise enough that it is correct.

The easiest way of putting the scenarios is to have Saffer be the one asking the questions, and just receiving Yes/No answers.

“Do you have exactly two children?”
“Yes” [1/4 probability of two boys]
“Do you have at least one boy?”
“Yes” [1/3 probability of two boys]
“Do you have at least one boy born on Tuesday?”
“Yes” [13/27 probability of two boys]

The last question provides relevant information because families with two boys are more likely to have at least one boy born on Tuesday than families with merely one boy are (since, well, the two-boy families have twice as many tickets for the calendar lottery, so to speak); accordingly, even out of two-children families who have at least one boy, having two boys is positively correlated with having at least one boy born on Tuesday, so that learning one of these makes the other more probable.
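The “calendar lottery” point can be turned into a short Bayes calculation. This Python sketch assumes the same model as the Yes/No exchange above (independent, equally likely sexes and days) and starts from the point where the family is already known to have at least one boy.

```python
from fractions import Fraction

# Chance of at least one Tuesday-born boy:
# two-boy families hold two tickets, one-boy families hold one.
p_tue_given_two_boys = 1 - Fraction(6, 7) ** 2  # 13/49
p_tue_given_one_boy = Fraction(1, 7)            # 7/49

# Probabilities after "Do you have at least one boy?" is answered "Yes".
prior_two_boys = Fraction(1, 3)
prior_one_boy = Fraction(2, 3)

# Bayes' rule after "Do you have at least one boy born on Tuesday?" "Yes".
posterior_two_boys = (prior_two_boys * p_tue_given_two_boys) / (
    prior_two_boys * p_tue_given_two_boys + prior_one_boy * p_tue_given_one_boy
)

print(p_tue_given_two_boys, p_tue_given_one_boy, posterior_two_boys)
# 13/49 1/7 13/27
```

Since 13/49 is nearly double 1/7, a “Yes” to the Tuesday question shifts weight toward the two-boy families, moving 1/3 up to 13/27.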