How to determine sample size for a binary variable

I am very rusty at statistics. I am trying to determine how to calculate the sample size for a given confidence level and confidence interval (margin of error). I know the population size. The variable is binary. My statistics text talks about sample size to determine the mean of a statistic normally distributed in a population (like average scores for third graders taking standardized tests) but not my situation. Because the variable is binary, the type of distribution doesn’t really apply.

The scenario is how many data values from a database must be sampled to determine the density of errors in the database. A value either has an error or it doesn’t; we don’t define multiple errors in the same data value.

As an example let’s take 10 million values, 99% confidence level, and margin of error of ±5%.

I learned this stuff in school but never used it again. A link to a site with a good explanation would be sufficient if you know of one. All I could find by searching was similar to what’s in the text.

Binomial distribution?

I think this page describes it pretty well. If the errors are rare, you’d probably be better off using the Wilson score interval, but it’s not an unforgivable sin if you don’t.
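To show why the Wilson interval matters when errors are rare, here is a minimal Python sketch that compares it with the plain normal-approximation (Wald) interval. The function names and the 95% default are my own choices for illustration, not something from the linked page.

```python
from math import sqrt
from statistics import NormalDist  # Python 3.8+


def _z(confidence):
    # Two-sided critical value of the standard normal distribution.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def wald_interval(errors, n, confidence=0.95):
    """Plain normal-approximation interval: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    p_hat = errors / n
    half = _z(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(errors, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = _z(confidence)
    p_hat = errors / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical sample: 0 errors found in 100 values.
print(wald_interval(0, 100))    # collapses to a zero-width interval
print(wilson_interval(0, 100))  # still gives a usable upper bound
```

With zero observed errors the Wald interval has no width at all, which is clearly too optimistic, while the Wilson interval still reports an upper bound of a few percent.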

This looks like just what I need, but it doesn’t take into account population size, only sample size. How can that work? Does it assume an infinite population?

Population size doesn’t matter unless your sample is a very large fraction of it.
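To put numbers on this using the example from the question (10 million values, 99% confidence, ±5%), here is a rough sketch of the usual normal-approximation sample-size formula with an optional finite population correction. The function name and parameters are hypothetical, and p = 0.5 is assumed as the worst case because the true error rate is unknown.

```python
from math import ceil
from statistics import NormalDist  # Python 3.8+


def sample_size(confidence, margin, p=0.5, population=None):
    """Sample size for estimating a proportion to within +/- margin.

    Uses n0 = z^2 * p * (1 - p) / margin^2 (normal approximation).
    If a population size is given, applies the finite population
    correction n = n0 / (1 + (n0 - 1) / N).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z**2 * p * (1 - p) / margin**2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return ceil(n0)


print(sample_size(0.99, 0.05))                         # infinite population
print(sample_size(0.99, 0.05, population=10_000_000))  # the example above
print(sample_size(0.99, 0.05, population=1_000))       # small population
```

For 10 million values the correction changes nothing after rounding (both come out at 664), because the sample is a tiny fraction of the population. It only starts to reduce the required sample when the population is within an order of magnitude or so of the uncorrected figure.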

I think this is a little different than what I need. This answers the following question:

If the probability of finding an attribute in a population is x%, what is the probability of finding that attribute in n members of a sample of size s?

My question is the reverse:

If an attribute is found in n members of a sample size s, what is the confidence level of finding that attribute in the population in the same proportion within a given confidence range?
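One way to read that reverse question: given n hits in a sample of size s and a chosen margin, what confidence level does the interval "observed proportion ± margin" roughly carry? A sketch under the normal approximation, assuming a large population and a proportion that is not too close to 0 or 1 (the function name is made up for this example):

```python
from math import sqrt
from statistics import NormalDist  # Python 3.8+


def confidence_for_margin(hits, sample, margin):
    """Approximate confidence that the population proportion lies
    within +/- margin of the observed proportion hits / sample."""
    p_hat = hits / sample
    se = sqrt(p_hat * (1 - p_hat) / sample)  # standard error of p_hat
    if se == 0:
        # The normal approximation breaks down when no hits
        # (or only hits) are observed.
        raise ValueError("normal approximation needs 0 < hits < sample")
    z = margin / se
    return 2 * NormalDist().cdf(z) - 1


# Hypothetical data: 50 errors in 1,000 sampled values, margin of +/- 2%.
print(confidence_for_margin(50, 1000, 0.02))
```

This is the sample-size calculation run backwards, so it inherits the same weakness for rare errors; in that case inverting the Wilson interval would be the safer route.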

Never mind, I made a mistake. See the next post.

Actually, just look at the links posted by ultra filter.