How do I calculate a really simple (and stupid) confidence interval?

Ok, I’m completely at a loss here. I was trying to make a joke about a statistical survey (read as: doomed from the start) that was very small, and thus wildly inaccurate, but I can’t seem to figure out how to calculate the confidence interval for my “survey.”

My survey size is 2. Myself, and the first person to walk through my door after I discovered a certain fact about the company we work for. He did not previously know the fact either.

The population size is 250, the approximate number of people working at my company.

I want to know the percentage of people who work at my company who were not yet aware of this fact, to a 95% confidence level. I expect that the range will be huge, but I have no idea how to calculate it. I found several formulae and online calculators, but they all seem to break down on such a small survey. Little help?

The standard deviation is too large in small samples for confidence limits to be calculated.

Your data simply cannot be used… or it generates only obvious conclusions:

  1. 0.4% (i.e., only you) know the fact.
  2. 99.6% (i.e., only your coworker didn’t) know the fact.
  3. Something, anything, in the middle.

What’s the fact?

I think there’s a slight confusion (ambiguity in how I phrased the question): I’m saying that neither of us knew the fact. Or perhaps I should phrase it as “neither of us were told the fact on starting our employment here.” So, with that definition, the standard deviation of the sample is zero: we both answered the same way.

The fact is a particular of how our health plan covers eye exams. Not particularly interesting, except to those who could have saved money using it and didn’t know about it.

I suppose that I could say that the lower limit is 2/250 people not knowing this, and the upper is all 250 people, but it seems like there ought to be some statistical math involved somewhere (even if it is small enough to reduce to zero in this case).

Obviously, if I polled three people, the chances of a meaningful answer are still small, but if I polled 20, then the confidence interval ought to be tighter than just 20/250 to 250/250, right? So, what’s the term in the calculation that drops away for a sample of 2 but is important for one of 20?

I don’t know how useful this will be, but my tables for small-sample statistics give the following.

90% confidence level. Sample size of 5. With 0 successes in the sample the success rate for an infinite population is 0 to 41%.

95% confidence level. Sample size of 10. The success rate would be 0 to 31%.
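For what it’s worth, numbers in this ballpark can be reproduced with the exact (Clopper-Pearson) binomial interval, which has a simple closed form when the sample contains zero successes. This is only a sketch assuming an effectively infinite population, and a given printed table may differ by a few percentage points depending on the exact method it uses:

```python
# Exact (Clopper-Pearson) two-sided interval for a proportion when 0 of n
# trials "succeed": the lower limit is 0 and the upper limit has a closed form.
def zero_success_interval(n, confidence):
    alpha = 1.0 - confidence
    upper = 1.0 - (alpha / 2.0) ** (1.0 / n)   # solves P(X = 0 | p) = alpha/2
    return 0.0, upper

for n, conf in [(5, 0.90), (10, 0.95), (2, 0.95)]:
    lo, hi = zero_success_interval(n, conf)
    print(f"n = {n:2d}, {conf:.0%} interval: {lo:.0%} to {hi:.0%}")
```

For n = 2 with zero successes, the same formula gives roughly 0 to 84%, which is about as uninformative as you would expect.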

Obviously the upper bound is 250, that is, everyone is ignorant. For a lower bound, you want an n such that n/250 * (n-1)/249 = 0.05, which works out to n = 56.29 ~ 56. So if only 56 people are ignorant of the fact, there is only about a 5% chance that you two would both be ignorant (assuming independence, blah blah).

Thus, your confidence interval is roughly [22.5%, 100%].
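For anyone who wants to check that arithmetic, here is a minimal Python sketch of the same calculation, under the same assumptions (a population of 250, both sampled people ignorant, sampling without replacement, and a one-sided 5% cutoff for the lower bound):

```python
import math

POPULATION = 250   # company size
ALPHA = 0.05       # one-sided 5% cutoff -> 95% lower confidence bound

# Solve n/250 * (n-1)/249 = ALPHA for n, i.e. n^2 - n - ALPHA*250*249 = 0.
c = ALPHA * POPULATION * (POPULATION - 1)
n = (1.0 + math.sqrt(1.0 + 4.0 * c)) / 2.0

def prob_both_ignorant(k):
    """Chance that two people drawn without replacement are both among the
    k ignorant people in the population."""
    return (k / POPULATION) * ((k - 1) / (POPULATION - 1))

print(f"n = {n:.2f}  ->  lower bound about {n / POPULATION:.1%}")
print(f"P(both ignorant | 56 ignorant) = {prob_both_ignorant(56):.3f}")
print(f"P(both ignorant | 57 ignorant) = {prob_both_ignorant(57):.3f}")
```

The cutoff falls between 56 and 57 ignorant people, so the lower limit of the interval comes out around 22.5%.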

Thanks Shalmanese, that was exactly the kind of calculation I was looking for.

Since you do not have a random sample, you cannot even think in terms of a formal inference. Additionally, since your “sample size” is only 2, the only possible estimates for your sample proportion are 0, .50, or 1.00. The standard errors for these sample proportions are, respectively:

sd_p = sqrt(p*(1-p)/n) = sqrt(0*1/2) = 0
sd_p = sqrt(p*(1-p)/n) = sqrt(.5*.5/2) = .5/sqrt(2)
sd_p = sqrt(p*(1-p)/n) = sqrt(1*0/2) = 0
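A quick numerical check of those three values (just a sketch of the same sqrt(p*(1-p)/n) formula with n = 2):

```python
import math

n = 2
for p in (0.0, 0.5, 1.0):
    se = math.sqrt(p * (1 - p) / n)   # standard error of the sample proportion
    print(f"p = {p:.2f}: sd_p = {se:.3f}")   # 0.000, 0.354 (= .5/sqrt(2)), 0.000
```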

At n=2, the usual stochastic properties of the sample proportion do not apply.
You need a proper random sample to use confidence estimation, and that excludes convenience sampling.

Even the conditionally exact, permutation-based methods fall short at n=2, even if you did manage to draw a proper random sample.

cerberus, I’m afraid I don’t understand completely. Are you saying that Shalmanese’s calculations are incorrect, or are you just adding further information about the statistics?

I realize that my “sample” was not random. But let’s assume for a moment that I really did take a very small random sample. There should still be a calculable 95% interval, even if it is very wide, right?

It would be highly uninformative … think about n=2:

There are three possibilities: 0 of 2, 1 of 2, 2 of 2. In the extreme cases, you get a lack of variation in the sample, hence the zero standard error. In the middle case, you get an interval centered on p=.50.

The methodology for n=2 is utterly, completely useless, because the minimum resolution of a sample of 2 is 50%.
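To make that concrete, here is a sketch of what the usual normal-approximation (Wald) interval does at n = 2 for each of the three possible outcomes (clipping at 0 and 1):

```python
import math

n, z = 2, 1.96   # sample size and the 95% normal quantile

for successes in range(n + 1):
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    lo, hi = max(0.0, p_hat - z * se), min(1.0, p_hat + z * se)
    print(f"{successes} of {n}: p_hat = {p_hat:.2f}, interval [{lo:.2f}, {hi:.2f}]")
```

Two of the three cases collapse to a zero-width interval and the middle one spans essentially the whole range, which is exactly the uselessness being described.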

This is computing a confidence interval post-experimentation. You already have the result. Your calculations assume normality, which is not valid in this case. With a sample size of 2, you can go all the way back to first principles and work it out from there.
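Working it out from first principles is short enough to do by brute force. Here is a sketch, under the assumption that the two respondents really were a random draw from the 250 employees: for every possible number of ignorant people k, compute the chance that two random draws are both ignorant, and keep the values of k that the observed data do not rule out at the 5% level.

```python
POPULATION = 250
ALPHA = 0.05

def p_data(k):
    """Chance that two people drawn at random (without replacement) are both
    among the k ignorant employees."""
    return (k / POPULATION) * ((k - 1) / (POPULATION - 1))

# Every hypothetical count k that the observed data (2 of 2 ignorant) do not
# rule out at the 5% level.
plausible = [k for k in range(POPULATION + 1) if p_data(k) >= ALPHA]
print(f"plausible counts: {plausible[0]} to {plausible[-1]} "
      f"({plausible[0] / POPULATION:.1%} to {plausible[-1] / POPULATION:.1%})")
```

The scan keeps everything from 57 up to 250 people, i.e. roughly 23% to 100%, which agrees with the earlier calculation to within one person.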

The approach fails, since it is not based on a random sample from a well-defined population. And again, even if you did have an appropriate random sample, the only available point estimates for the proportion based on a sample of 2 are: 0, .50, 1.00. Useless.