How to model the length of the average unbroken maternal/paternal line of descent?

I’m trying to figure out how many generations we would ‘expect’ a matrilineal or patrilineal line to persist, given n children each generation.
So if a given person has n offspring, and each of the same-sex offspring has n offspring, etc., how many generations on average before only opposite-sex children are born and the line is broken?

It’s not quite the odds of consecutive coin flips because you get multiple chances to get a successful flip at each generation. Maybe it’s just like assigning a 1/(2^n) chance of the line dying out at every generation, but that doesn’t feel right somehow - like it doesn’t capture branches. That’s just a gut reaction, though.

The 1/(2^n) thing would suggest that for n=2 children the odds of a line dying out are 1/4 at every generation, so the expected length of a same-sex lineage would be 4 generations. With four kids, it would be 16 generations. Is that correct?

What are you considering a line? If your patriarch has three sons, and the first two sons each have another son, and those grandsons each have a son, but the third son is childless, is that a line broken after two generations (for the third son), or broken after four (for the first two)?

And of course, anywhere you start is part of a succession of previous lines, too.

We’ll have to settle on a numbering convention. Let’s call the patriarch generation zero for the moment. I consider that a line only ends when every branch has terminated. If the great-grandsons don’t have sons, then the patriarch’s line has ended at three generations (son, grandson, great-grandson).

The patriarch had a father and grandfather, but I’m only interested in modeling forward from a fixed point. Every man comes from an unbroken paternal line of arbitrarily long length, but there’s no guarantee of that continuing into the indefinite future.

Using that convention and assuming everyone always has n kids …
G0 has 1 member; the patriarch/matriarch.
G1 has n members of indeterminate sex.
G2 has n^2 members of indeterminate sex.
G3 has n^3 members of indeterminate sex.
etc.

Failure is defined as the first time we encounter zero of the “correct” sex at any level.

And because of the way the exponents run, the first time any branch is trimmed, their entire would-be downline must be eliminated from consideration. Which is totally NOT what a naive 1/(2^n) calc will provide.

At least that’s true if I understand the OP correctly. IOW e.g. if n = 3 and if #3 son of the patriarch has 3 girls, it doesn’t matter how many unbroken generations of boys later flow from those girls; those many later boys do not count.

If that’s an accurate understanding, I do not see a way to compute this except by Monte Carlo simulation.
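For what it's worth, here is a minimal Monte Carlo sketch of that setup. It assumes every person has exactly n children, each independently of the "correct" sex with probability 1/2, and it uses the generation-zero numbering from above. The names `line_length` and `fraction_ended_before`, and the caps `max_generations` and `max_alive`, are made up for illustration. Lines that hit a cap are treated as still alive, so this estimates how often a line has ended by a given generation rather than a true average length.

```python
import random

def line_length(n, rng, max_generations=60, max_alive=2000):
    """Return the last generation that still has a same-sex member.

    Generation 0 is the founder. A return value of max_generations means
    the run was cut off (censored), not that the line actually ended there.
    """
    alive = 1  # same-sex members of the current generation
    for generation in range(max_generations):
        children = alive * n
        # Each random bit is one fair coin flip: 1 = same sex as the founder.
        same_sex = bin(rng.getrandbits(children)).count("1")
        if same_sex == 0:
            return generation      # nobody carries the line forward
        if same_sex > max_alive:
            return max_generations # assumption: too many branches to bother tracking
        alive = same_sex
    return max_generations

def fraction_ended_before(n, k, trials=20_000, seed=1):
    """Estimate the chance that generation k has no same-sex member."""
    rng = random.Random(seed)
    ended = sum(1 for _ in range(trials) if line_length(n, rng) < k)
    return ended / trials

if __name__ == "__main__":
    for k in (1, 2, 5, 10, 20):
        print(k, fraction_ended_before(2, k))
```

For k = 1 the estimate should sit near 1/(2^n), since that is just the chance the founder has no same-sex child at all.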


IMO the real problem with the OP’s question is that the assumption of everybody every time in every generation having n kids is utterly unrealistic. So any insight you might expect your calcs to give you as you vary n would be false insight. The answer you seek is highly dependent on the variance around n for each of the parents in every generation. Which variance differs for large n versus small.

It’s also the case in the real world that the male / female mix of kids is not constant as the number of kids grows. Lotta 2-kid households are 1 each. Rather few e.g. 6 kid households are 3 & 3. Because people engage in a form of sex selection.

Even assuming nothing actively evil like sex-selective abortion or infanticide, many people will keep trying until they get a boy, or until they get one of whatever they don’t have yet. So lots of BBG, or GGB, but very little BBGX or GGBX. The crowd chasing a boy or at least one of each who start with a matched pair stop after they get their targeted sex. If that happens on #3, there is no number 4.

I agree that there are a lot of simplifying assumptions here; the model of each of n offspring having n offspring is not realistic. Still, given that framework, I was wondering if there was a direct, analytical solution to the question rather than it requiring simulation.

Your understanding of what I intended is correct as far as I can see it.

In a slightly more realistic model in which families ‘stop’ after they get the targeted sex (let’s assume B for historical reasons), it’s not clear to me whether they prolong the line or shorten it, since going GGB beats GG but gets you fewer male offspring than GGBGBGG. Tricky to model either way.

I recall we had a thread, perhaps on sex selective whatever, on exactly this subtopic. IOW, how does a widespread (or even universal) “keep trying until you get a boy” policy affect the population-level outcome of the B vs. G mix?

Somebody, possibly @Chronos, had a well thought out math explanation of the issue. But damned if I can remember the conclusion. And my recollection is too vague for a search. If it was him, perhaps he can dredge it, or his memory, up.

If I remember correctly, there was a discussion about China’s (at the time) policy of one child, but (sometimes) allowing a second if the first one was a girl, and what I pointed out was that, if there’s any genetic bias in the parents towards one gender or the other, such a policy would, in the long run, lead to more girls.

If literally every generation kept going until they got a boy, no matter how long that took, then all male lines would be infinite. If it’s instead “try up to n times until you get a boy, and always stop at your first boy”, then the average length of a line would be about 2^n, since 1/2^n would be the probability of the line failing at any given generation. If it’s “Have two children, and then if you don’t have a boy, keep going for up to (n-2) more children until you do get a boy, then stop”, then I’m pretty sure that the average length of a line would be infinite (some lines would still be finite, of course), since branches double (from the first two children both being boys) more often than they end (from reaching n children without a single boy).
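The "two children, then keep going up to n until a boy" rule can be checked by writing out the distribution of the number of sons per family. This is only a sketch, assuming n >= 2 and a fair, independent coin for each birth; `sons_distribution` and `mean_sons` are names invented here.

```python
from fractions import Fraction

def sons_distribution(n):
    """Number of sons under: have two children; if neither is a boy,
    keep going (up to n children in total) and stop at the first boy.
    Assumes n >= 2 and each birth is a boy with probability 1/2."""
    half = Fraction(1, 2)
    dist = {
        2: half * half,      # BB
        1: 2 * half * half,  # BG or GB
        0: Fraction(0),
    }
    no_boy_yet = half * half  # GG after the first two children
    for _ in range(3, n + 1):
        dist[1] += no_boy_yet * half  # first boy arrives now, family stops
        no_boy_yet *= half            # another girl, keep going
    dist[0] = no_boy_yet              # reached n children with no boy
    return dist

def mean_sons(n):
    return sum(k * p for k, p in sons_distribution(n).items())

if __name__ == "__main__":
    for n in (2, 3, 4, 6):
        print(n, sons_distribution(n), mean_sons(n))
```

The mean works out to 5/4 - 1/2^n sons per family, which is above 1 for every n > 2. In branching-process terms a mean above 1 leaves a positive chance that the line never dies out, which fits the "infinite average" conclusion.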

But back to the OP’s problem: Assume, for simplicity, that every generation has exactly two children, and that each child has a completely random chance of 0.5 to be male. It feels like the problem should be doable by setting up some sort of recursion, where you define the average line length to be x, and then assume that your male children will each have a line of average x also, and thus, if you have male children, you should have a line of x+1. But the difficulty comes from the possibility of having two boys, in which case what’s relevant isn’t the average length of line, but the average length of the longer of two randomly-selected lines. A lot of things in statistics are easy to calculate with means, but that’s not one of them.
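One way around the "longer of two lines" obstacle is to recurse on a probability instead of a mean. Let q_k be the chance that generation k has no same-sex member. The longest of several independent lines has ended by generation k only if every one of them has, so probabilities multiply where means do not. Under the thread's assumptions (exactly n children, each same-sex with probability 1/2), each child independently fails to carry the line to generation k with probability 1/2 + q_(k-1)/2, which gives q_k = ((1 + q_(k-1))/2)^n with q_0 = 0. A sketch, with invented function names:

```python
def ended_by(n, generations):
    """q[k] = probability that generation k has no same-sex member,
    with generation 0 being the founder (so q[0] = 0)."""
    q = [0.0]
    for _ in range(generations):
        # Every one of the n children must either be the wrong sex
        # or have a line that is already gone one generation earlier.
        q.append(((1 + q[-1]) / 2) ** n)
    return q

def truncated_mean_length(n, generations):
    """Sum of P(line reaches generation k) for k = 1..generations.
    This is the mean line length cut off at `generations`."""
    q = ended_by(n, generations)
    return sum(1 - qk for qk in q[1:])

if __name__ == "__main__":
    print(ended_by(2, 5))
    for horizon in (10, 100, 1000, 10000):
        print(horizon, truncated_mean_length(2, horizon))
```

For n = 2 this gives q_1 = 1/4 and q_2 = 25/64, so the chance of failure is not a constant 1/4 per generation. The truncated mean for n = 2 keeps creeping upward as the horizon grows instead of settling near 4, which suggests there is no finite average in that case.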

I don’t know the thread you’re referring to, but there’s an old puzzle that asks, what would be the effect on the population sex ratio if a law were enacted requiring couples to stop having children as soon as they bear a girl. So you might have families with any number of boys, or any number of boys plus one girl, but never more than one girl. Assuming both sexes are equally likely to be born, the counterintuitive answer is it would have no effect whatsoever. There would still be approximately the same number of boys and girls in every generation.

Intuitive analysis without the math: each sex is equally likely. So a couple’s first child is either a boy or a girl with equal probability, and adding that child to the population does not skew the overall ratio. Now if it’s a girl, they’re done. If it’s a boy, they get to have another child, but the same analysis applies: the second child is equally likely to be a boy or girl, so adding that child to the population again does not skew the ratio. And so on, for every child.
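A quick simulation of the stop-at-first-girl rule, as a sanity check on that argument. It assumes every birth is an independent fair coin flip and that every family keeps going until its first girl; the function name is made up.

```python
import random

def stop_at_first_girl(families, seed=0):
    """Each family has children until its first girl, then stops.
    Returns (boys, girls) totalled over all families."""
    rng = random.Random(seed)
    boys = girls = 0
    for _ in range(families):
        while True:
            if rng.random() < 0.5:
                girls += 1
                break  # the rule: stop as soon as a girl is born
            boys += 1
    return boys, girls

if __name__ == "__main__":
    boys, girls = stop_at_first_girl(100_000)
    print(boys, girls, boys / (boys + girls))
```

Every family ends up with exactly one girl, and the boys still come out at roughly half of all births.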

Thank you. That was the issue and now that you explain it that way it makes complete sense.

A rigid rule like that alters which particular parents have the various numbers of kids from one on up to “lots”. But doesn’t alter the collective outcome of all the parents’ babymaking.

If, however, the assumptions aren’t true, then the conclusion fails: if some men are genetically predisposed to father girls, and some to father boys (rather than each man having a truly equal chance of either), then the number of girls might increase sharply over the short term (in the long term, since I’m assuming the tendency is passed down by men, lineages with a strong tendency to produce boys would die out).

To answer the OP: read up on Galton-Watson processes, the branching model originally devised for exactly this question of whether family lines die out.
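In a Galton-Watson process the chance that a line eventually dies out is the smallest non-negative solution of q = f(q), where f is the probability generating function of the number of same-sex children per person. Iterating f starting from 0 converges to that solution. A rough sketch, assuming the number of same-sex children is Binomial(n, 1/2) as in the OP's model; the function names and the iteration count are arbitrary choices.

```python
from math import comb

def binomial_sons(n, p=0.5):
    """offspring[k] = probability of exactly k same-sex children out of n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def mean_offspring(offspring):
    return sum(k * prob for k, prob in enumerate(offspring))

def extinction_probability(offspring, iterations=10_000):
    """Iterate q <- f(q) from q = 0, where f(q) = sum_k offspring[k] * q**k."""
    q = 0.0
    for _ in range(iterations):
        q = sum(prob * q**k for k, prob in enumerate(offspring))
    return q

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        dist = binomial_sons(n)
        print(n, mean_offspring(dist), extinction_probability(dist))
```

With a mean of at most 1 same-sex child (n = 1 or n = 2 here) the line dies out with probability 1, although for n = 2 the iteration approaches 1 very slowly. For n = 3 the extinction probability is sqrt(5) - 2, about 0.236, so roughly three lines in four never end, and no finite average length exists.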