Boys and Girls, there's lies, damn lies, and statistics. From the Monty Hall Problem column

ultrafilter · January 17, 2010, 10:26pm

The experiment is invalid from the start. Your responses to the poll aren’t a random sample, and you can’t draw statistical inferences from them.

bup · January 17, 2010, 10:36pm

Oh, it’s fine. The only thing that matters is if a person with two kids who participates at the SDMB is going to have more or fewer sons than the population at large. That’s all we’re asking. I’m willing to risk it.

BigT · January 18, 2010, 7:17am

The ruling them “out of hand” is intentional, as that’s exactly what would have to happen in real life. (Chronos gives a good description.) At first, I was going to sample 10000 groups of two children, and just throw out the girl-girl combinations, but I thought it would make more sense for each removed sample to have to be replaced. Just move looper++; back where you had it to fix it.

As far as I can tell, your original javascript would have meant picking a random male, and then adding either a male or female. (The if statement at the beginning only randomly flips the sample.) This doesn’t accurately reflect the problem, as there is no indication that the first child is the one the parent is talking about. In fact, it pretty much begs the question, as it assumes BG = GB, which is what you are trying to prove.

bup · January 19, 2010, 4:44pm

Hello? Tao? Poll question? I’ll do it if you don’t want to - you have me piqued.

zut · January 19, 2010, 5:21pm

I don’t know why a poll is any more useful than simply going through the expected numbers.

Start with this: There are exactly 100 families with exactly two children. Exactly how many families would one expect to have which sexes of children, exactly what poll question would be asked, and exactly what would one expect the responses to be?

ultrafilter · January 19, 2010, 7:50pm

It’s not, because all of the statistical theory that people will want to apply is fundamentally based on the assumption that the poll respondents are a random sample of the population, and is not robust with respect to violations of that assumption.

bup · January 19, 2010, 8:29pm

It’s not, but it’s good enough. I refuse to believe that anything about the genders of children people have will affect their likelihood to participate in the poll (except as the people with no boys will be asked not to participate).

zut - Look, and I agree that a bullet fired level from a gun will strike the ground at the same instant as a bullet dropped from the height of the gun barrel, but it was still cool when they did it on Mythbusters.

bup · January 19, 2010, 8:44pm

Here I did it.

Quercus · January 19, 2010, 9:35pm

zut:

Pretty much what everyone else said, but more specifically, this quote right here is the issue. What you’re doing is answering a different question than the one in Cecil’s column.

The difference between the two questions is that, in the one you’re illustrating, the two children are differentiated. In this case they’re differentiated not by age, but by one child being identified by gender and the other one not. As an example:

1. You’re at a party with 100 couples, each of whom has two children. You ask each couple: “Choose one of your children, but don’t tell me which one. Now: is the child you chose a girl?” Of those couples that say, “yes,” how many have a son?

2. You’re at a party with 100 couples, each of whom has two children. You ask each couple: “Do you have any daughters?” Of those couples that say, “yes,” how many have a son?

These two questions are different. Cecil asked the second, you answered the first. It’s not surprising you have different answers.

First, I don’t think Cecil did a very good job of stating his question, so I can’t blame anyone who thought he was asking #1.
But let me try to rephrase them:

1. You [randomly] flip a penny and a quarter. If the penny doesn’t come up heads, flip again without counting it. Out of the times the penny is heads-up, how often will both coins be heads? (This is the same as just leaving the penny heads-up, and counting how often the quarter comes up heads.)
2. You flip two coins, and if neither coin comes up heads, flip them again without counting that try. Out of the times that they’re not both tails, how often will both coins be heads?

Run a simulation if you want, but the answer to #1 is clearly 1/2 the time, whereas the answer to #2 is 1/3 of the time.
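
For anyone who does want to run it, here is a minimal JavaScript sketch of the two coin games. The function names (pennyHeadsGame, anyHeadsGame), the optional rng parameter, and the trial count are assumptions made for illustration; they are not from the column or from anyone's posted code.

```javascript
// Game 1: only count flips where one specific coin (the penny) is heads.
function pennyHeadsGame(trials, rng = Math.random) {
  let counted = 0;
  let bothHeads = 0;
  while (counted < trials) {
    const penny = rng() < 0.5;   // true = heads
    const quarter = rng() < 0.5;
    if (!penny) continue;        // penny was tails: flip again without counting
    counted++;
    if (quarter) bothHeads++;
  }
  return bothHeads / counted;
}

// Game 2: only count flips where at least one of the two coins is heads.
function anyHeadsGame(trials, rng = Math.random) {
  let counted = 0;
  let bothHeads = 0;
  while (counted < trials) {
    const a = rng() < 0.5;
    const b = rng() < 0.5;
    if (!a && !b) continue;      // both tails: flip again without counting
    counted++;
    if (a && b) bothHeads++;
  }
  return bothHeads / counted;
}

console.log(pennyHeadsGame(100000).toFixed(3)); // should land near 1/2
console.log(anyHeadsGame(100000).toFixed(3));   // should land near 1/3
```

The only difference between the two functions is the line that decides which flips get thrown away, which is the whole disagreement in this thread.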

Frazzled · January 19, 2010, 10:14pm

Just for kicks and giggles, I wrote this program in C# (It’s what I use at work and I like it).

I created a class called Family which contained 2 children of random gender.
I then created 1000 families, excluding those which did not have a daughter, and printed the breakdown:
Girl, Girl: 346
Boy, Girl: 324
Girl, Boy: 330

Percent of families with boy: 65%

Looks pretty definitive, and agrees with Cecil.

BigT · January 19, 2010, 10:18pm

Because two different results have been offered, and which one matches real life expectations has been disputed.

Frazzled · January 20, 2010, 12:13am

I added a new function into my code and applied The Tao’s Revenge’s logic to it and came up with the following result set:

Girl, Girl: 496
Boy, Girl: 239
Girl, Boy: 265

Percent of families with boy: 50%

When I looked at his code, I realized why we have different numbers. He runs a 50 / 50 comparison at the start to determine if the daughter is the older child or the younger child, then he determines what the gender of the other child is. In other words, in a set of 1000 families, about 500 have the daughter as the older child, and about 500 have the daughter as the younger child.

I wanted to test this, so I just ran my program one last time, creating 500 families with an older daughter, and 500 families with a younger daughter:

Girl, Girl: 533
Girl, Boy: 234
Boy, Girl: 233

Percent of families with boy: 47%

The Tao’s Revenge has built some bias into his program. Run your program again allowing the program to randomly select the gender for both children, then just discard the boy / boy case and you’ll see the same results as I did in my first run, and the same results Cecil gets in his column.
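
A rough JavaScript sketch of the two sampling approaches compared above, for anyone who wants to reproduce the numbers without the C# program (which was not posted). Everything here is an assumption made for illustration: the function names, the 'G'/'B' string encoding with the older child first, and the sample size.

```javascript
// Draw one child: 'G' or 'B' with equal probability.
const drawKid = (rng) => (rng() < 0.5 ? 'G' : 'B');

// Approach 1: both children are random; families with no daughter are
// discarded and redrawn until n families have been kept.
function sampleByDiscarding(n, rng = Math.random) {
  const families = [];
  while (families.length < n) {
    const family = drawKid(rng) + drawKid(rng); // older child first
    if (family !== 'BB') families.push(family);
  }
  return families;
}

// Approach 2: a daughter is placed in a random slot first, then the other
// child is drawn. Girl-girl can be reached from either slot.
function sampleByPlacingDaughter(n, rng = Math.random) {
  const families = [];
  for (let i = 0; i < n; i++) {
    const other = drawKid(rng);
    families.push(rng() < 0.5 ? 'G' + other : other + 'G');
  }
  return families;
}

// Fraction of families that include at least one boy.
function shareWithBoy(families) {
  return families.filter((f) => f.includes('B')).length / families.length;
}

console.log(shareWithBoy(sampleByDiscarding(100000)).toFixed(3));      // near 2/3
console.log(shareWithBoy(sampleByPlacingDaughter(100000)).toFixed(3)); // near 1/2
```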

Frazzled · January 20, 2010, 3:55am

The Tao’s Revenge, I now see just what it is that is ultimately causing your program to give you erroneous results.

In case 1 you expect the girl to come first, and then the second child could be boy or girl.
So Girl Girl, or Girl Boy

In case 2 you expect the girl to come second, then the first could be a boy or girl.
So Girl Girl or Boy Girl

Do you see the problem? You have Girl Girl twice, this is why Girl Girl is 50%, with the other 2 options each being 25%.

If you take out one of your Girl Girl choices (because it’s a duplicate), then each choice will be split into thirds, which is what the initial prediction stated.

md2000 · January 20, 2010, 6:54pm

The problem with your program is that you get a do-over on the gender of the first child if it’s wrong…

var tester = Math.round(Math.random());

if (tester==1) {
	var kid1 = 1;
	var kid2 = Math.round(Math.random());
}
else {
	var kid1 = Math.round(Math.random());
	var kid2 = 1;
}

Assuming you meant “tester” (not “testes”?) is the gender of the first child…
If it’s 1/True/Boy, then the next child is random.
If it’s 0/Girl/False, you proceed to force the second child to be a boy (kid2=1) but then you pick another random choice for kid one. Wrong. You’ve just established kid1 is a girl. The code should read:

else {
	var kid1 = 0;
	var kid2 = 1;
}

I bet that gives the correct result.

CurtC · January 20, 2010, 8:11pm

I’d like to point out that when figuring probability, it’s valid to list out all the possibilities and count them up, as long as each possibility that you’ve listed is equally likely.

Here, you consider that F-M and M-F are the same set, which is OK for you to view it that way, but if you’re wanting to do probability calculations, it doesn’t work. The reason that Cecil and others list them separately is that M-M, M-F, F-M, and F-F are four equally likely possibilities. If you want to do it your way, you can list out M-M, M-F, and F-F as the three possibilities, but that way, the M-F group contains twice as many, so it makes the calculation more involved than simply counting.
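
A small JavaScript sketch of that counting argument, conditioning on "at least one boy" as an example. The names (ORDERED, groupWeights, mixedGivenBoy) and the choice of condition are assumptions made for illustration.

```javascript
// The four ordered outcomes (older child first), each with probability 1/4.
const ORDERED = ['MM', 'MF', 'FM', 'FF'];

// Collapse ordered outcomes into unordered groups and record how many
// ordered outcomes each group contains.
function groupWeights(outcomes) {
  const weights = {};
  for (const o of outcomes) {
    const key = o.split('').sort().join(''); // 'MF' and 'FM' both become 'FM'
    weights[key] = (weights[key] || 0) + 1;
  }
  return weights; // MM: 1, FM: 2, FF: 1
}

// P(one of each | at least one boy), computed three ways.
function mixedGivenBoy() {
  // Simple counting over equally likely ordered outcomes.
  const withBoy = ORDERED.filter((o) => o.includes('M'));
  const byCounting = withBoy.filter((o) => o.includes('F')).length / withBoy.length;

  // Three unordered groups, weighted by how many outcomes each holds.
  const weights = groupWeights(ORDERED);
  const groups = Object.keys(weights).filter((k) => k.includes('M'));
  const totalWeight = groups.reduce((sum, k) => sum + weights[k], 0);
  const byWeights = weights['FM'] / totalWeight;

  // Three unordered groups counted as if equally likely (the mistake).
  const naive = groups.filter((k) => k.includes('F')).length / groups.length;

  return { byCounting, byWeights, naive };
}

console.log(mixedGivenBoy()); // byCounting and byWeights are 2/3, naive is 1/2
```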

md2000 · January 20, 2010, 11:43pm

Actually, on reflection - I lied. This will not give a correct answer either.
It corresponds to the situation - “if the first child is a girl, adopt a boy”.
This results in a 75-25 distribution.

The programs above (Tao and BigT) have it right: generate two sequential random numbers, for child 1 and child 2, and discard any pair that is girl-girl.
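
Since each version uses at most two coin-flip values, the three variants can be checked exactly by walking through all four combinations instead of sampling. This is only a sketch: the function names (taoOriginal, adoptABoy, discardGirlGirl, distribution) are made up for illustration, and the 1 = boy, 0 = girl encoding follows the snippet quoted earlier in the thread.

```javascript
// Each generator takes two equally likely bits and returns [kid1, kid2],
// or null when the draw is thrown away.

// The original snippet: the first bit picks which slot holds the forced boy.
const taoOriginal = (tester, r) => (tester === 1 ? [1, r] : [r, 1]);

// The proposed fix: "if the first child is a girl, adopt a boy".
const adoptABoy = (tester, r) => (tester === 1 ? [1, r] : [0, 1]);

// Two independent children, discarding girl-girl.
const discardGirlGirl = (a, b) => (a === 0 && b === 0 ? null : [a, b]);

// Exact distribution over kept families, from all four bit combinations.
function distribution(generator) {
  const shares = {};
  let kept = 0;
  for (const x of [0, 1]) {
    for (const y of [0, 1]) {
      const family = generator(x, y);
      if (family === null) continue; // discarded draw
      const key = family.map((k) => (k === 1 ? 'B' : 'G')).join('');
      shares[key] = (shares[key] || 0) + 1;
      kept++;
    }
  }
  for (const key of Object.keys(shares)) shares[key] /= kept;
  return shares;
}

console.log(distribution(taoOriginal));     // BB 1/2, BG 1/4, GB 1/4
console.log(distribution(adoptABoy));       // GB 1/2, BG 1/4, BB 1/4
console.log(distribution(discardGirlGirl)); // GB, BG, BB at 1/3 each
```

The first gives a 50/50 split between boy-boy and mixed, the second gives 75/25 in favor of mixed, and only the third gives the 2/3 mixed share.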

bup · January 21, 2010, 1:21am

Right now in this poll, we have 96 observations, exactly 64 mixed and 32 daughter-onlies.

96 observations give us a standard error of 4.8%, so going out 2 of those, there’s a 95% probability the real population split is between 57.1 (that is, 66.7-9.6) and 76.3 (66.7+9.6) percent. Going out one more standard error in each direction takes us past 99% confidence (about 99.7%), and the range (52.3 to 81.1 percent) still does not include Tao’s predicted 50/50 split.
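
The arithmetic can be checked with a few lines of JavaScript. This sketch assumes the usual normal-approximation interval for a proportion (p plus or minus z standard errors) and treats the poll as a random sample, which was disputed earlier in the thread. The function name proportionInterval is made up for illustration.

```javascript
// Normal-approximation interval for a sample proportion:
// p +/- z * sqrt(p * (1 - p) / n)
function proportionInterval(successes, n, z) {
  const p = successes / n;
  const se = Math.sqrt((p * (1 - p)) / n);
  return { p, se, low: p - z * se, high: p + z * se };
}

// 64 mixed families out of 96 responses, at 2 and 3 standard errors.
const atTwoSe = proportionInterval(64, 96, 2);
const atThreeSe = proportionInterval(64, 96, 3);

console.log((atTwoSe.se * 100).toFixed(1) + '% standard error');
console.log(atTwoSe.low.toFixed(3), atTwoSe.high.toFixed(3));
console.log(atThreeSe.low.toFixed(3), atThreeSe.high.toFixed(3));
console.log('3 SE range excludes 0.5:', atThreeSe.low > 0.5);
```

Small differences in the last digit against the figures above come from rounding the standard error to 4.8% before multiplying.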

The_Tao_s_Revenge · January 21, 2010, 2:38am

66%, can’t argue with results. I was so sure too, I drew a diagram showing the “flawless” logic of why it’s 50%, but learning something new is worth the flavor of my own hat, and a wasted half hour on the diagram.
Thanks bup, bigT, and all.

DSYoungEsq · January 21, 2010, 2:39am

Was there any doubt? :smack:

That observed results should mirror probability calculations is not a shock. Tao made an incorrect restatement of the problem in coding his simulation. blah blah blah (darn simulposts!).

DSYoungEsq · January 21, 2010, 2:40am

I hope you do like I do and wear chocolate ones, just in case!
