Interpreting Rare Disease Statistics

Suppose a rare disease has shown a stable occurrence rate in the US, over a number of years, of 1 case per year per million population. Now suppose that we find somewhere in the US a community of 15,000 which has 2 cases of this disease in one year. What can statistics tell us about the possibility that something special is going on in this town versus the possibility that it is simply a chance occurrence of a rare event?

I calculate (using the binomial distribution) the probability of 2 cases in a group of 15,000 to be about 1 in 9000. But I also figure that the probability is 87% that, of the over 18,600 groups of size 15,000 in the US population of 280 million, at least one group of 15,000 will have 2 cases. The first calculation says this is a rare event, and something special is going on in this town. The second calculation says 7 times out of 8 we’ll find at least one group of size 15,000 with 2 cases, so it’s nothing out of the ordinary.
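For anyone who wants to check the two figures, here is a minimal Python sketch of both calculations. It assumes each person independently has a 1-in-a-million chance of becoming a case in a given year, and that the US is split into non-overlapping, independent groups of 15,000. The variable names and the helper `binom_pmf` are illustrative only, not part of the original calculation.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

rate = 1e-6             # 1 case per year per million people (from the post)
town = 15_000           # size of the community
us_pop = 280_000_000    # US population figure used in the post

# First calculation: exactly 2 cases in one specific group of 15,000.
p_two = binom_pmf(2, town, rate)

# Second calculation: at least one of the non-overlapping groups has 2 cases.
# Assumption: groups are independent and all have the same probability.
n_groups = us_pop // town
p_any = 1 - (1 - p_two) ** n_groups

print(f"P(2 cases in one given group) = {p_two:.3e}, about 1 in {1 / p_two:,.0f}")
print(f"Number of groups of 15,000    = {n_groups:,}")
print(f"P(at least one such group)    = {p_any:.1%}")
```

With these inputs the first probability comes out near 1 in 9000 and the second near 87%, matching the figures quoted above.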

What do SD statisticians say about these 2 seemingly competing interpretations? (By the way this is based on an actual event.)

That’s the difficulty of statistics: interpretation is hell. You’d probably need to do more research in this case.

They aren’t conflicting interpretations.

Your first calculation shows that you would need a whole lot of communities for there to be a large chance of finding 2 cases in one of the communities.

However, the data in your second example show that there are indeed a whole lot of communities in the US: over 18,600. Since there are a whole lot of communities, it’s not surprising that you find a low-probability event happening in one of them.
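To put a rough number on "a whole lot of communities", the sketch below shows how the chance of seeing at least one 2-case community grows with the number of communities examined. It assumes the same simple model as the original question (independent communities of 15,000, each person with a 1-in-a-million yearly chance), and the function name `chance_somewhere` is made up for illustration.

```python
from math import comb, log, ceil

rate = 1e-6      # yearly chance per person (from the original question)
town = 15_000    # community size

# Probability that one given community has exactly 2 cases (binomial).
p_two = comb(town, 2) * rate**2 * (1 - rate) ** (town - 2)

def chance_somewhere(n_communities):
    """Chance that at least one of n independent communities has 2 cases."""
    return 1 - (1 - p_two) ** n_communities

for n in (1, 100, 1_000, 10_000, 18_666):
    print(f"{n:>6} communities: {chance_somewhere(n):.1%}")

# Smallest number of communities giving at least an even chance.
n_half = ceil(log(0.5) / log(1 - p_two))
print(f"Communities needed for a 50% chance: {n_half:,}")
```

Under these assumptions it takes a few thousand communities before the event becomes more likely than not, and the US has several times that many.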

Unlike pure mathematical statistics, epidemiologic stats aren’t an exact science. Epidemiologists pull their numbers from medical records that are coded according to specific rules. One of these rules states that a person’s record may be coded as having a disease when in fact that person may have been admitted for testing to rule it out. The codes are then culled and recorded.

The OP didn’t mention specifics, like what disease, or if these were two actual cases, or if one was real and the other person under surveillance. But that’s how it works.

Robin

I posted this to get some background to interpret a situation where 2 cases of Creutzfeldt-Jakob Disease (CJD) are present in a town of 15,000 in Michigan. One of the cases is a childhood friend of my friend from college. We had been discussing the possibility of Mad Cow’s presence in the US when all of a sudden 4 days ago he learned of his friend’s CJD condition, and then 3 days later the friend succumbed. The victim’s family had learned of another CJD case in the same community and yet another in a different Michigan town some distance away, but we know no details of the other cases.

My college friend, an insightful physicist, is convinced that this situation points to a high likelihood that Mad Cow disease (BSE) and/or its human manifestation, new variant CJD (nvCJD), is now here in the US. I calculate that, although the chance of having 2 cases in that particular town is 1 in 9000, there’s almost surely (well, 87% sure) at least one group of 15,000 somewhere in the US with 2 cases. I contend that he just happens to know one of them, and that he can’t interpret this as a strong chance of nvCJD’s appearance in the US.

Surely an epidemiologist would want to know more details of the 2 cases, but my intent in posting is to see what rational interpretation can be made from these statistics alone. Still looking for help.