How many people have to be affected by something before you're statistically likely

. . . to know one of them?

Sorry, ran out of room in the subject line.

I’ve never taken a class in statistics or anything so I’m not even sure if this is a valid question.

Let’s keep it within America for simplicity. For instance, X-Hugenumber people have died from AIDS, but I don’t know any of them. X-Hugernumber people have died from cancer, and I know several of them. Is the answer somewhere in the difference between those two numbers, or am I a statistical anomaly for not knowing anyone who has died from AIDS?

Also, if the first question can be answered, what are some things we statistically are and aren’t likely to know someone who has been affected by?

Cigarette deaths
Car accident deaths
Murders
Animal attack deaths
Lottery jackpot winners
Fame (there’s got to be some estimate out there of how many celebrities there are in America, though it would obviously have to be a very rough one)
Etc, etc.

I’m also curious about things like, in a city as large as Mumbai, how likely is a random person to have known one of the victims of the recent attacks? Or how likely was a New Yorker to have known someone in the World Trade Center?

Well, I doubt it’s very useful to necessarily say “within America how likely am I…” because some of those things are more location-dependent than others. I would guess, just off the top of my head, that you’d be more likely to have known somebody who died of AIDS if you were living in New York, San Francisco, or another large cosmopolitan city during the 80’s and 90’s. Likewise you might be more likely to know somebody who won the lottery if you don’t live in South Carolina, as we only got the lottery a few years ago.

You also have to define “know” pretty carefully. Is that someone you are friends with? On a first name basis with? Work with? A relative of? Does it count if they’re the relative of someone you work with? How about if you went to school with them but haven’t kept track? Same church or PTA or bowling league? Bump into them at the supermarket or bar or barber shop?

There are a million definitions for “know,” which is why silly sayings like six degrees of separation are thrown about all the time by people who use any definition they feel like and mush them all together.

I’m too lazy to look up the other statistics, but NYC has a population of around 8,247,500. Nearly three thousand people died in the WTC attacks. Divide 8,247,500 by 3,000 and you get about 2,749. This means that roughly one out of every 2,749 people in NYC died in the WTC attacks. So if you live in NY and know around 3,000 people, you’re likely to know someone who died in the attacks.

This is ignoring other factors like victims of the WTC attacks that don’t live in NYC. Also people living in downtown Manhattan are more likely to know someone working in the WTC than people living elsewhere.
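
For anyone who wants to put a number on “likely,” here is a rough sketch of that calculation in Python. It assumes every acquaintance is an independent random draw from the city’s population, which the caveats above already say is not really true. The function name and the list of acquaintance counts are made up for illustration.

```python
def prob_know_at_least_one(population, affected, acquaintances):
    """Chance that at least one acquaintance is among the affected.

    Assumption: each acquaintance is an independent random draw from
    the population, so location and social clustering are ignored.
    """
    rate = affected / population
    return 1.0 - (1.0 - rate) ** acquaintances


nyc_population = 8_247_500  # figure quoted in the post
wtc_deaths = 3_000          # "nearly three thousand", rounded

# Population divided by deaths: the "one out of every N people" figure.
one_in = nyc_population / wtc_deaths

# Illustrative acquaintance counts, including N itself.
for known in (500, round(one_in), 5_000, 10_000):
    chance = prob_know_at_least_one(nyc_population, wtc_deaths, known)
    print(f"know {known:>6} people -> {chance:.1%} chance of knowing a victim")
```

Under that assumption, knowing exactly as many people as the one-in-N figure gives you a bit under a two-in-three chance rather than a sure thing, because some people will know two or more victims while others know none.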

I thought it was a given that these’d have to be really simplified.

Like in math class we did those problems where you tag 75 deer and then later you go back and catch 20 deer and 3 of them have tags, how many deer are there? They didn’t ask if you entered on the east side of the forest this time and if it was Deer Ramadan and Venus was in the house of Joe’s neighbor’s cousin, because eventually you’re just asking a meaningless question. I’m curious about averages, likelihoods, you know?
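
For reference, the deer problem works out like this. It is the standard simplified mark-and-recapture estimate (usually called the Lincoln-Petersen estimator), which assumes the tagged share of the second catch matches the tagged share of the whole herd. The function name is just for illustration.

```python
def estimate_population(tagged_first, caught_second, tagged_in_second):
    """Simplified mark-and-recapture estimate of total population.

    Assumption: tagged_in_second / caught_second is the same fraction
    as tagged_first / total, so total = tagged_first * caught_second
    / tagged_in_second.
    """
    if tagged_in_second == 0:
        raise ValueError("no tagged animals recaptured; cannot estimate")
    return tagged_first * caught_second / tagged_in_second


# 75 deer tagged, 20 caught later, 3 of those carrying tags.
print(estimate_population(75, 20, 3))  # 500.0
```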

I’ve often wondered about this myself, specifically because I encountered so many people immediately after 9/11 who “knew” somebody who worked at the WTC.

For the sake of facilitating discussion, I’ll propose the following definition for ‘know.’ You know a person if there is mutual recognition upon meeting that person, and there has been at least one instance of prior communication between the two of you.

I once read about a study constructed to test the concept of six degrees of separation. It involved giving a package to a person on the west coast, along with the name of a complete stranger on the east coast and the name of the city in which that person lived. The person on the west coast was instructed to send the package to whichever acquaintance they felt had the greatest probability of knowing the stranger. The recipient was given the same instructions, and so was each subsequent recipient. It turned out that sometimes the packages actually did arrive at their intended destinations in six transactions or fewer.

Yep, I’ve thought about that too. Seemed like every time the subject came up for the first few months, someone within earshot knew someone in the towers. And I’ve also wondered about second hand smoke deaths. According to one thing I read, 69,000 people die from it every year. That’s over 1.8 million people since I was born, and they should be somewhat evenly distributed. I’ve never known, met, or heard one single personal account of any of them :dubious:.

On the other hand, I’ve known a really unfortunate number of people who have died in car wrecks.

Ok, then. Mr. Simple is here.

I’ve heard a number of times that the average person by late adulthood has had personal interactions of some sort (enough to know or have known their name) with about 10,000 people. Let’s assume that about half of those are people you “know” for the definition of this question. Note that this is a memory on top of an estimate on top of a guess, wrapped in an enigma…but 5000 is a nice round number, so it has to be right.

After that, it’s just math. Let’s assume we’re in the US, and all 5K of our acquaintances are, as well. There are 300 million or so folks in the US, so a quick division tells us that our acquaintance web is about one person in 60,000.

Thus, assuming condition X affects 60,000 people in the US, (or equivalently, one in 5000 people), on average, you’ll know one. Note that this never gets to certainty: If 299,995,000 people in the US have condition X, there’s always a small chance that your 5000 friends, by coincidence, don’t have it.
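
Here is a rough sketch of that math in Python, for anyone who wants to play with the numbers. It assumes your 5000 acquaintances are a random sample of the 300 million, drawn without repeats, which is the same simplification as above. The function names are made up for illustration. The chance of knowing nobody is returned as a base-10 logarithm, because in the extreme case it is far too small to fit in an ordinary floating-point number.

```python
import math


def expected_known(population, affected, acquaintances):
    """Average number of affected people among your acquaintances."""
    return acquaintances * affected / population


def log10_prob_know_none(population, affected, acquaintances):
    """log10 of the chance that none of your acquaintances is affected.

    Assumption: acquaintances are a random sample without replacement,
    so the chance is the product over i of
    (unaffected - i) / (population - i).
    """
    unaffected = population - affected
    if acquaintances > unaffected:
        return float("-inf")  # not enough unaffected people to go around
    total = 0.0
    for i in range(acquaintances):
        total += math.log10(unaffected - i) - math.log10(population - i)
    return total


us_population = 300_000_000
friends = 5_000

# Condition X affects 60,000 people: you know one on average.
print(expected_known(us_population, 60_000, friends))
print(10 ** log10_prob_know_none(us_population, 60_000, friends))

# Extreme case from the post: 299,995,000 people affected.
print(log10_prob_know_none(us_population, 299_995_000, friends))
```

Under these assumptions, “you’ll know one on average” still leaves better than a one-in-three chance of knowing nobody with condition X, and the “small chance” in the extreme case comes out as a number with tens of thousands of zeros after the decimal point.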

Note that for many conditions, just because you know someone with condition X doesn’t mean that you’ll KNOW you know them. HIV, AIDS, cancer, etc. are all very treatable diseases that often have no visible symptoms, and carry some degree of social stigma or some other reason for privacy. The numbers would have to be much higher in that case to include only acquaintances that would tell you about the condition. (Consider the oft-quoted statistic that one in four women of college age has been raped. These things are very hard to research, much less casually know, because many of the victims won’t tell you. Given the numbers above, with half of your 5000 acquaintances being women, do you think you know ((5000/2) * .25) = 625 rape victims? And a similar number of rapists (somewhat fewer, since some have multiple victims)? There’s no good way to know, but most people who don’t work in prisons wouldn’t guess anywhere near that high.)

Of course, this is only based on people’s best guesses as to who would be the best person to pass it along to. And if the only information they had was a name and a city, I wouldn’t expect those guesses to be very good. Sure, you could try going by geographical proximity, but if you knew (say) the person’s profession, then you might have better luck passing it to someone you know in the same field. Or, of course, there might be a completely nonobvious connection that you’d never find without an exhaustive search: Maybe Mr. Los Angeles and Mr. New York both have second cousins who happen to work at the same office in Tokyo, but how would either of them realize that?

Given the huge variations based on location and other factors, even if we did somehow come up with an average, it’d be a meaningless number because it wouldn’t apply to anyone.

Here’s another variable: I am a gay man who lived in NYC from 1970 to 1995. I knew literally *thousands* of people who died from AIDS. Now of course many of them knew each other, and that would affect the statistics *while they were alive,* but now that they are gone, those connections are no longer statistically relevant. The only relevant statistics are those of us who are survivors.

Technically speaking under some versions of “technically”, I “know” somebody who died in the terrorist attacks of 9/11. Now, I don’t live in New York or know many people (or any, honestly, unless you count people on the SDMB) who currently live in New York, and in fact my “connection” was that much more unlikely because it’s somebody who died on one of the planes. An immediate relative of somebody I kind of know by sight at the church I went to as a child that my parents are still very active members of. So on the one hand I’m a statistical improbability (outside of New York and a plane victim) but on the other hand it’s a pretty slim connection.

It’s possible that I’ve known somebody who had AIDS or even died of it, but if so I am unaware that a) they have it, or b) that they died, or c) that they died of AIDS. For all I know, McCree O’Kelley, who I have not seen since eighth grade who went to New York, last I heard, to find his fortune on Broadway has won the lottery, died of AIDS, or married Donald Trump. Would that count?

On the other hand, I know tons of people who have had cancer and several who have died of it; I’d be very surprised if any normal adult didn’t know anybody who had died of cancer. (By normal adult I mean somebody with the average number of social and family connections.) I don’t think my “cancer death total” is at all unusual - several acquaintances, an aunt, and an uncle. My “cancer so-far survivor” total is probably similarly common - an aunt, a friend, several acquaintances, and probably several people I don’t know had cancer.

ETA - oh, wait, are they a “cancer survivor” if something else killed them? Add my grandfather, if so. Do precancerous things on your skin count? 'Cause my dad has had some of them.

No it isn’t. Say you have two 40-year-olds and two 45-year-olds in a room. The number 42.5 doesn’t apply to any of them, but it’s not a meaningless number. If things worked the way you’re saying, statisticians, mathematicians, scientists, and actuaries all over the world would be out of business.

We could tell you the number of people for any particular thing and divide that into the number of people in the U.S. but you’ve specifically rejected that approach.

So either, as you indicated in your OP, it’s not in fact a valid question or you haven’t formulated it in any way that the rest of us can understand.

I’m currently of the opinion that it’s not a valid question because there isn’t an answer outside of the one already given, but if you want to try to rethink it I’ll try to reanswer it.

I guess now’s a bad time to mention my background in mathematics, statistics and actuarial science, huh?

The average is an appropriate measure of central tendency when you have independent observations from a normal distribution, or some other nice situation. What you’re talking about is certainly not normal, and certainly very highly autocorrelated. The techniques you learned in your intro stats class that apply to the former situation are not good here.

For the first, it depends. My Dad apparently got a favor from an aneurism, but at the time he happened to be on his death watch… the immediate cause of death was the aneurism, but if his aorta had been in perfect shape he would simply have lasted a few more weeks of agony. The immediate cause of death was the aneurism, the deep cause was that his genes’ expiration date had come up.

Would your Dad have died of the cancer without the immediate cause of death? Was he in remission/cured of the cancer at the time of death? If no and yes, then he can be counted as a cancer survivor.

Grandpa had that “takes forever” kind of prostate cancer that a zillion old men have, and that many of them die without even knowing they have. So that would really screw up the numbers - are you a cancer survivor if nobody even knew you had it? :slight_smile:

Anecdote only – but I was shocked when I was at my 20th year high school reunion to learn that one of my very good high school friends had died of AIDS some years earlier. So it’s possible that it’s happened already and you don’t even know it.

Ed

That’s the origin of the six degrees of separation concept, from a Stanley Milgram experiment. Note that there were many flaws with the experiment. It was more “something neat” than an actual scientific finding.