The experiment is invalid from the start. Your responses to the poll aren’t a random sample, and you can’t draw statistical inferences from them.
Oh, it’s fine. The only thing that matters is if a person with two kids who participates at the SDMB is going to have more or fewer sons than the population at large. That’s all we’re asking. I’m willing to risk it.
Ruling them “out of hand” is intentional, as that’s exactly what would have to happen in real life. (Chronos gives a good description.) At first, I was going to sample 10000 groups of two children and just throw out the girl-girl combinations, but I thought it would make more sense for each removed sample to be replaced. Just move looper++; back where you had it to fix it.
As far as I can tell, your original JavaScript would have meant picking a random male, and then adding either a male or female. (The if statement at the beginning only randomly flips the sample.) This doesn’t accurately reflect the problem, as there is no indication that the first child is the one the parent is talking about. In fact, it pretty much begs the question, as it assumes BG = GB, which is what you are trying to prove.
Hello? Tao? Poll question? I’ll do it if you don’t want to - you have me piqued.
I don’t know why a poll is any more useful than simply going through the expected numbers.
Start with this: There are exactly 100 families with exactly two children. Exactly how many families would one expect to have which sexes of children, exactly what poll question would be asked, and exactly what would one expect the responses to be?
It’s not, because all of the statistical theory that people will want to apply is fundamentally based on the assumption that the poll respondents are a random sample of the population, and is not robust with respect to violations of that assumption.
It’s not, but it’s good enough. I refuse to believe that anything about the genders of children people have will affect their likelihood to participate in the poll (except as the people with no boys will be asked not to participate).
zut - Look, I agree that a bullet fired level from a gun will strike the ground at the same instant as a bullet dropped from the height of the gun barrel, but it was still cool when they did it on Mythbusters.
First, I don’t think Cecil did a very good job of stating his question, so I can’t blame anyone who thought he was asking #1.
But let me try and rephrase them:
1. You [randomly] flip a penny and a quarter. If the penny doesn’t come up heads, flip again without counting it. Out of the times the penny is heads-up, how often will both coins be heads? (This is the same as just leaving the penny heads-up, and counting how often the quarter comes up heads).
2. You flip two coins, and if neither coin comes up heads, flip them again without counting that try. Out of the times that they’re not both tails, how often will both coins be heads?
Run a simulation if you want, but the answer to #1 is clearly 1/2 the time, whereas the answer to #2 is 1/3 of the time.
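For anyone who does want to run it, here is a rough sketch of such a simulation in JavaScript. The names makeRng and simulateCoins are made up for this example, and the seeded generator is only there so the run is repeatable (Math.random() would do just as well). Discarding the uncounted flips is treated as equivalent to "flip again without counting".

```javascript
// Small seeded PRNG (mulberry32) so runs are repeatable.
// This is an illustrative choice; Math.random() would work too.
function makeRng(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Flip a penny and a quarter `trials` times and score both questions.
function simulateCoins(trials, rng) {
  let pennyHeads = 0, pennyHeadsBoth = 0; // question #1: penny is heads
  let anyHeads = 0, anyHeadsBoth = 0;     // question #2: not both tails
  for (let i = 0; i < trials; i++) {
    const penny = rng() < 0.5;
    const quarter = rng() < 0.5;
    if (penny) {
      pennyHeads++;
      if (quarter) pennyHeadsBoth++;
    }
    if (penny || quarter) {
      anyHeads++;
      if (penny && quarter) anyHeadsBoth++;
    }
  }
  return {
    case1: pennyHeadsBoth / pennyHeads, // should land near 1/2
    case2: anyHeadsBoth / anyHeads      // should land near 1/3
  };
}

const coinResult = simulateCoins(200000, makeRng(12345));
console.log(coinResult);
```

The only difference between the two counts is which flips get thrown away, which is the whole argument in miniature.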
Just for kicks and giggles, I wrote this program in C# (It’s what I use at work and I like it).
I created a class called Family which contained 2 children of random gender.
I then created 1000 families, excluding those which did not have a daughter and printed the break down:
Girl, Girl: 346
Boy, Girl: 324
Girl, Boy: 330
Percent of families with boy: 65%
Looks pretty definitive, and agrees with Cecil
Because two different results have been offered, and which one matches real life expectations has been disputed.
I added a new function into my code and applied The Tao’s Revenge’s logic to it and came up with the following result set:
Girl, Girl: 496
Boy, Girl: 239
Girl, Boy: 265
Percent of families with boy: 50%
When I looked at his code, I realized why we have different numbers. He runs a 50 / 50 comparison at the start to determine if the daughter is the older child or the younger child, then he determines what the gender of the other child is. In other words, in a set of 1000 families, 500 have a daughter as the older child, and 500 have the daughter as the younger child.
I wanted to test this, so I just ran my program one last time, creating 500 families with an older daughter, and 500 families with a younger daughter:
Girl, Girl: 533
Girl, Boy: 234
Boy, Girl: 233
Percent of families with boy: 47%
The Tao’s Revenge has built some bias into his program. Run your program again allowing the program to randomly select the gender for both children, then just discard the boy / boy case and you’ll see the same results as I did in my first run, and the same results Cecil gets in his column.
The Tao’s Revenge, I now see just what it is that is ultimately causing your program to give you erroneous results.
In case 1 you expect the girl to come first, and then the second child could be boy or girl.
So Girl Girl, or Girl Boy
In case 2 you expect the girl to come second, then the first could be a boy or girl.
So Girl Girl or Boy Girl
Do you see the problem? You have Girl Girl twice, this is why Girl Girl is 50%, with the other 2 options each being 25%.
If you take out one of your Girl Girl choices (because it’s a duplicate), then each choice will be split into thirds, which is what the initial prediction stated.
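The double-counting can be checked without any randomness by walking every equally likely path through each procedure. This is only a sketch: forcedGirlProcedure and discardProcedure are names invented here, and the first one is my reading of the "pick which child is the girl, then randomize the other" logic described above, not the original program.

```javascript
const GENDERS = ["G", "B"];

// Biased procedure: a coin picks which slot is forced to be a girl,
// then a second coin picks the other child's gender.
function forcedGirlProcedure() {
  const counts = {};
  for (const slot of ["older", "younger"]) {
    for (const other of GENDERS) {
      // Family label is older child first, younger child second.
      const family = slot === "older" ? "G" + other : other + "G";
      counts[family] = (counts[family] || 0) + 1;
    }
  }
  return counts;
}

// Unbiased procedure: randomize both children, discard boy/boy.
function discardProcedure() {
  const counts = {};
  for (const older of GENDERS) {
    for (const younger of GENDERS) {
      const family = older + younger;
      if (family === "BB") continue; // no daughter, so not counted
      counts[family] = (counts[family] || 0) + 1;
    }
  }
  return counts;
}

console.log(forcedGirlProcedure()); // { GG: 2, GB: 1, BG: 1 }
console.log(discardProcedure());    // { GG: 1, GB: 1, BG: 1 }
```

Girl Girl is reached by two of the four paths in the first procedure, which is where the 50% comes from; in the second, each surviving family is reached exactly once.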
The problem with your program is that you get a do-over on the gender of the first child if it’s wrong…
var tester = Math.round(Math.random());
if (tester==1) {
  var kid1 = 1;
  var kid2 = Math.round(Math.random());
}
else {
  var kid1 = Math.round(Math.random());
  var kid2 = 1;
}
Assuming you meant “tester” (not “testes”?) is the gender of the first child…
If it’s 1/True/Boy, then the next child is random.
If it’s 0/Girl/False, you proceed to force the second child to be a boy (kid2=1) but then you pick another random choice for kid one. Wrong. You’ve just established kid1 is a girl. The code should read:
else {
  var kid1 = 0;
  var kid2 = 1;
}
I bet that gives the correct result.
I’d like to point out that when figuring probability, it’s valid to list out all the possibilities and count them up, as long as each possibility that you’ve listed is equally likely.
Here, you consider that F-M and M-F are the same set, which is OK for you to view it that way, but if you’re wanting to do probability calculations, it doesn’t work. The reason that Cecil and others list them separately is that M-M, M-F, F-M, and F-F are four equally likely possibilities. If you want to do it your way, you can list out M-M, M-F, and F-F as the three possibilities, but that way, the M-F group contains twice as many, so it makes the calculation more involved than simply counting.
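To make the "twice as many" point concrete, here is a small sketch that does the calculation both ways for the at-least-one-boy version of the question. The variable names and the weights array are illustrative only.

```javascript
// Way 1: four equally likely outcomes, so plain counting works.
const fourWay = ["MM", "MF", "FM", "FF"];
const atLeastOneBoy = fourWay.filter((f) => f.includes("M"));
const pBothBoysCounting =
  atLeastOneBoy.filter((f) => f === "MM").length / atLeastOneBoy.length;

// Way 2: three groups, but the mixed group must carry double weight.
const threeWay = [
  { kind: "M-M", weight: 1 },
  { kind: "mixed", weight: 2 }, // covers both M-F and F-M
  { kind: "F-F", weight: 1 }
];
const withBoy = threeWay.filter((g) => g.kind !== "F-F");
const totalWeight = withBoy.reduce((sum, g) => sum + g.weight, 0);
const pBothBoysWeighted =
  withBoy.find((g) => g.kind === "M-M").weight / totalWeight;

console.log(pBothBoysCounting, pBothBoysWeighted); // both are 1/3
```

Dropping the weight of 2 and counting the three groups as equals is exactly what produces the mistaken 1/2.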
Actually, on reflection - I lied. This will not give a correct answer either.
It corresponds to the situation - “if the first child is a girl, adopt a boy”.
This results in a 75-25 distribution.
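A quick way to see the 75-25 split is to enumerate the four equally likely paths through that "fixed" branch logic. This is a sketch with made-up names (adoptABoyOutcomes, coin), using the same 1 = boy, 0 = girl convention as the code above.

```javascript
// Enumerate every (tester, coin) pair for the "adopt a boy" variant:
// tester == 1 -> first child is a boy, second child is random;
// tester == 0 -> first child is a girl, second child forced to be a boy.
function adoptABoyOutcomes() {
  const counts = { BB: 0, BG: 0, GB: 0, GG: 0 };
  for (const tester of [0, 1]) {
    for (const coin of [0, 1]) {
      let kid1, kid2;
      if (tester === 1) {
        kid1 = 1;
        kid2 = coin;
      } else {
        kid1 = 0;
        kid2 = 1; // the coin is ignored on this branch
      }
      const label = (kid1 ? "B" : "G") + (kid2 ? "B" : "G");
      counts[label]++;
    }
  }
  return counts;
}

console.log(adoptABoyOutcomes()); // { BB: 1, BG: 1, GB: 2, GG: 0 }
```

Three of the four paths end in a mixed family and one in two boys, so mixed families show up 75% of the time instead of the 2/3 the real problem gives.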
The programs above (Tao and BigT) have it right: generate two sequential random numbers, for child 1 and child 2, and discard the one that is girl-girl.
Right now in this poll, we have 96 observations, exactly 64 mixed and 32 daughter-onlies.
96 observations give us a standard error of 4.8%, so going out 2 of those, there’s a 95% probability the real population split is between 57.1 (that is, 66.7-9.6) and 76.3 (66.7+9.6) percent. Going out one more standard error in each direction takes us past 99% confidence, and the range still does not include Tao’s predicted 50/50 split.
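For anyone checking the arithmetic, here is a short sketch using the usual standard error of a proportion, sqrt(p(1-p)/n). The variable names are mine, and the unrounded figures differ slightly from the rounded ones quoted above.

```javascript
const n = 96;          // poll responses so far
const mixed = 64;      // families with one boy and one girl
const pHat = mixed / n; // observed share of mixed families, about 0.667

// Standard error of a sample proportion.
const se = Math.sqrt((pHat * (1 - pHat)) / n); // about 0.048

// Range k standard errors either side of the observed share.
function interval(k) {
  return [pHat - k * se, pHat + k * se];
}

console.log("SE:", (se * 100).toFixed(1) + "%");
console.log("2 SE:", interval(2).map((x) => (x * 100).toFixed(1)));
console.log("3 SE:", interval(3).map((x) => (x * 100).toFixed(1)));
```

Even the three-standard-error range has a lower end above 50%, which is the point being made about the 50/50 prediction.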
66%, can’t argue with results. I was so sure too, I drew a diagram showing the “flawless” logic of why it’s 50%, but learning something new is worth the flavor of my own hat, and a wasted half hour on the diagram.
Thanks bup, bigT, and all.
Was there any doubt? :smack:
That observed results should mirror probability calculations is not a shock. Tao made an incorrect restatement of the problem in coding his simulation. blah blah blah (darn simulposts!).
I hope you do like I do and wear chocolate ones, just in case!