Pooled sample statistics

To cut down on testing load, samples are sometimes pooled: maybe 10 samples are mixed and tested once. If a pool tests hot, the 10 individual samples are then run separately.

If the testing equipment breaks after the pooled samples are run, is there a way to back out a likely number of infections? Say 200 samples are split into 20 pools of 10 samples each, and half the pools are positive. With 10 positive pools, that could mean anywhere between 10 and 100 actual infections. I’m guessing the most likely number is somewhere in between. My unjustified hunch is that it’s closer to 10 than to 100. A general expression as a function of pool and population size (or ratio) would be more interesting.

It’s been too many decades since I’ve done much with basic statistics.

Somebody check my mathematical reasoning:

Suppose p is the proportion of infected people in the general population being tested (or, equivalently, the probability of an individual being positive).
Then q = 1-p is the probability of being negative.
The probability of a pool of 10 testing negative, then, would be q^{10}.
If half the pools are positive, that means the other half are negative, so q^{10} = 0.5.
Then q = (0.5)^{(1/10)} = \sqrt[10]{0.5}, which is approximately 0.933.
So p = 1-q = 1-0.933 = 0.067 (about 6.7%).
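
The same steps can be written for any pool size. A sketch, using symbols that are not in the original post: k for the pool size, m for the number of pools, and x for the number of positive pools. It assumes independent samples and a test with no false results.

```latex
\hat{q}^{\,k} = 1 - \frac{x}{m}
\quad\Longrightarrow\quad
\hat{p} = 1 - \left(1 - \frac{x}{m}\right)^{1/k}
```

With k = 10, m = 20, and x = 10 this gives back 1 - \sqrt[10]{0.5}, or about 0.067. The expected number of infections among all mk samples is then mk\hat{p}.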

I didn’t use the fact that there were 200 samples split into 20 pools, but I would if I tried to calculate a confidence interval or margin of error rather than just a single point estimate of p.
If you just want the likely number of infections out of the 200 people tested, that would be 6.7% of 200, which is 13.4.
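
For anyone who wants to try other numbers, here is a rough Python sketch of the point estimate plus one possible interval. The function names are made up for illustration, and the choice of a Wilson score interval on the fraction of positive pools (then mapped onto p) is my assumption; it is only one of several reasonable ways to get a margin of error. It also assumes independent samples and a perfect test.

```python
import math

def estimate_prevalence(n_pools, n_positive, pool_size):
    """Point estimate of p from q^k = (negative pools) / (all pools)."""
    if n_positive >= n_pools:
        raise ValueError("every pool is positive; the estimate is not informative")
    frac_negative = (n_pools - n_positive) / n_pools
    return 1.0 - frac_negative ** (1.0 / pool_size)

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    denom = trials + z * z
    centre = (successes + z * z / 2) / denom
    half = z * math.sqrt(successes * (trials - successes) / trials + z * z / 4) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def prevalence_interval(n_pools, n_positive, pool_size, z=1.96):
    """Map the interval for the positive-pool fraction onto p (the map is monotone)."""
    lo, hi = wilson_interval(n_positive, n_pools, z)

    def to_p(frac_positive):
        return 1.0 - (1.0 - frac_positive) ** (1.0 / pool_size)

    return to_p(lo), to_p(hi)

# The example from this thread: 20 pools of 10, half of them positive.
p_hat = estimate_prevalence(20, 10, 10)
p_lo, p_hi = prevalence_interval(20, 10, 10)
print(f"p is about {p_hat:.3f}, rough 95% interval ({p_lo:.3f}, {p_hi:.3f})")
print(f"expected infections among 200: {200 * p_hat:.1f}")
```

With only 20 pools the interval is wide, roughly 3.5% to 11%, so the 13.4 should be read as a ballpark figure.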

I’m not the best checker since I’m the one who needed help in the first place, but that seems a logical way to go about it. And I got the same number. Thanks.

This calculation is based on the assumption that the samples in each pool are independent. This may or may not be the case: For instance, if samples from the same geographic area are chosen for a pool, then they’re more likely to be correlated.

Yes, good point.

Does that actually help or just generate a ton of false positives? Or more to the point, if someone gets a positive pooled result, how worried should they be?

They should get retested and find out. How worried they should be depends on a lot of factors: how many people are in the pool, how much contact the people in the pool have with each other, and the number of past positives in the pools.

My wife is tested weekly in a pool at work; so far she hasn’t been in a pool that tested positive.

Why do you think there’s a bunch of false positives? When a pool tests positive, you have to go back to the original samples and test them individually.

If positive pooled results are very common, it may not be worth it to do it that way.

This was meant to be a more abstract question about the math, but it was prompted by a friend whose kid’s school was doing pooled testing and bungled something and couldn’t tell which kids were in which pool. So we were wondering about statistics. No doubt false positives make the math more exciting, but we hadn’t gotten that far.

I felt less bad about not knowing what to do because she majored in math and couldn’t puzzle it out either.

That’s kind of where I am, too. I was prompted to think about this by real life circumstances, but I’m also curious about the math.

Our school district decided to offer weekly pooled testing; it seemed like a good idea - doing our part to keep everybody safe and stuff. (We’re also all fully vaxed.) Saturday morning I get a text that my 8th-grader’s pooled test contained at least one positive sample and that more information and advice would follow. When 8th-grader asks me that afternoon for permission to go to a friend’s house, I still haven’t gotten additional information or advice, so I tell him he can go but has to stay outside. Since there are no video games outside, he stomps to his room and connects with the friend digitally. He vehemently insists that he doesn’t have covid and demonstrates his healthy lung capacity by Sighing Deeply all weekend. Monday morning I call the school, and they say to send him in and they give him an individual test (which turns out to be negative) there. Of course, I’m relieved that he’s healthy, but there’s a teensy part of me that’s annoyed that he was proven right after being so condescending. And I definitely wish the school had communicated better about this, and I’m wondering if I just signed up for a weekly fight.

But going back to the math. Let’s say we have the original 200 samples pooled into 20 tests. If exactly one sample in each pool is positive, we then have to test everyone individually anyway. We end up doing 220 tests instead of 200. How often would something like that happen?
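
One way to put a number on "how often" is a small Python sketch under the same assumptions as the estimate earlier in the thread: each sample is positive independently with probability p, and the test is perfect. The function names are hypothetical. It gives the chance that every pool comes back positive (which is what forces all 220 tests) and the narrower chance that each pool holds exactly one positive.

```python
def prob_all_pools_positive(p, pool_size, n_pools):
    """Chance that every pool contains at least one positive sample."""
    q = 1.0 - p
    prob_pool_positive = 1.0 - q ** pool_size
    return prob_pool_positive ** n_pools

def prob_exactly_one_per_pool(p, pool_size, n_pools):
    """Chance that every pool contains exactly one positive sample."""
    q = 1.0 - p
    prob_one_in_pool = pool_size * p * q ** (pool_size - 1)
    return prob_one_in_pool ** n_pools

# Prevalence at which half the pools of 10 are expected to be positive.
p_half = 1.0 - 0.5 ** (1.0 / 10)
all_positive_at_half = prob_all_pools_positive(p_half, 10, 20)

# A surge-level prevalence of 40%.
all_positive_at_surge = prob_all_pools_positive(0.4, 10, 20)

print(all_positive_at_half, all_positive_at_surge)
```

At about 6.7% prevalence, all 20 pools testing positive is roughly a one-in-a-million event (0.5 to the 20th power). At 40% it becomes the usual outcome, close to 89% of the time.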

I suspect that happens a lot when test positivity is high in an area during a surge. Sometimes positivity rates hit 40% or higher. At that point, 40% of all your tests are positive and I think pooling would be a waste of time and resources. On the other hand, if your positivity rate drops to below 10%, you have savings.
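
To check where pooling stops paying off, here is a hedged Python sketch of the expected number of tests per person. It treats the positivity rate as the per-sample probability p, and assumes independent samples, a perfect test, and a simple two-stage scheme (one pooled test, then individual retests of any positive pool). The function names are made up for illustration.

```python
def expected_tests_per_person(p, pool_size):
    """One pooled test shared by the pool, plus a retest for everyone if the pool is positive."""
    prob_pool_positive = 1.0 - (1.0 - p) ** pool_size
    return 1.0 / pool_size + prob_pool_positive

def break_even_prevalence(pool_size):
    """Prevalence at which pooling averages exactly one test per person."""
    # Solve 1/k + 1 - q^k = 1 for q, which gives q = (1/k)^(1/k).
    return 1.0 - (1.0 / pool_size) ** (1.0 / pool_size)

for p in (0.01, 0.067, 0.10, 0.20, 0.40):
    per_person = expected_tests_per_person(p, 10)
    print(f"p = {p:.3f}: about {200 * per_person:.0f} tests for 200 people")

print(f"break-even for pools of 10: p is about {break_even_prevalence(10):.3f}")
```

For pools of 10 the break-even comes out near 20%, so under these assumptions pooling saves tests at 10% and costs extra at 40%.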

I’m surprised that the lab reported a positive pool before retesting. You’d think they would retest right away, before any report. Maybe they’re so overwhelmed they can’t keep up.