How to Make a Point Regarding Statistics/Surveys

Hey All! I need some help trying to articulate a particular position with regards to surveys. Here is the quick background:

I have been tasked with managing surveys at my work. I have been provided a survey tool, so I just need to create the survey and send it out. Typically we send it out to 1500-2000 people and get a 5-10 percent response rate. I know for a fact that every targeted person receives my survey; I just have no control over whether they complete it.

There are people who want to use the low response rate as an argument that the work resulting from the survey data is not valid. I feel that the average of the responses from 5 percent will be similar to the results from 50 percent. Am I correct in this assumption? If so, how do I argue my point?

You don’t have a point. People who voluntarily return surveys are not a randomized sample, so their opinions about anything else do not meet the standards of sample randomness.

The low response rate is heavily weighted toward people whose personality is amenable to filling out surveys, which also colors other aspects of their personality, which are either over- or under-represented in the survey results…

Even offering an incentive would skew the response. Say you offered a spa membership in a drawing of completed surveys: you would mostly get responses from people who want a spa membership – a self-selecting, non-random sample.

Not valid for what? As a piece of other data involved in a research program, or as the one, single factor on which a decision is made?

5-10 percent is actually a pretty good response rate for an unsolicited contact. BUT, you’re only getting responses from people who are likely predisposed to talk to you about that particular topic anyway. It is, as the statisticians say, a self-selected response group, like jtur88 pointed out before I could post this.

Does that mean the other 90 percent of your targets have no interest at all in whether the survey leads to a change in business practice, whether a product is discontinued, how much of a tax increase they’d be willing to pay, etc? Of course not, it just means they can’t be bothered to respond at this particular time. And that doesn’t help your argument.

If you wanted a better argument for your survey’s accuracy, I’d contact a statistically significant number of people who **don’t** respond to the survey and compare their responses to the ones who did respond. (How many is statistically significant? I dunno, maybe 50 completed calls out of the 2,000 names.)
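One way to make that comparison concrete is a two-proportion z-test. The sketch below is only an illustration: the function name and the counts are made up, and it assumes the follow-up calls reach a random sample of the non-respondents.

```python
import math

def two_proportion_z_test(yes_a, n_a, yes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, not real data:
# 90 of 150 survey respondents answered "yes", 20 of 50 follow-up calls did
z, p_value = two_proportion_z_test(90, 150, 20, 50)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value would suggest that respondents and non-respondents answer differently, which is evidence of self-selection. With only about 50 follow-up calls, a test like this can only pick up fairly large differences.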

The other thing you can do is show how well previous surveys have actually predicted final results.

If you’re talking about the survey being a part of a bunch of different research tools, focus groups, pilot trials, historical trends and what have you, then, again, 5-10 percent is a pretty fair response rate. What kind of rate do you get for things like direct marketing campaigns, mail-in rebates and other stuff that requires a customer to take action?

Thanks to both of you for your responses. I totally get the self-selecting thing and all. I don’t want to go into too much detail, but the survey is targeted to a specific group of our employees who all have the same job code. The survey asks whether or not they are doing the tasks we currently train them for, so it is more of a yes/no. We use those results as a part of the overall analysis. Since the target audience all share the same job code, I would expect that the self-selecting variable does not apply as much as if it were a behavior- or opinion-based survey.

I would not. Whether or not they are doing the task trained for seems to me to be very likely to affect whether or not they choose to answer.

If they are employees, can’t they be told to complete it?

The only real way to find out is to get off your proverbial ass and go and ask them face to face.

I agree with njtt. There’s no way of telling what’s going through the mind of an employee who reads that question. “Are they going to give me even more to do? Are they going to use my lack of training as an excuse when the next round of layoffs comes around? I don’t really believe my response will be anonymous.”

One bias that often creeps in is that people are more likely to complete a survey if they feel like there’s something to complain about. That could bias you towards employees saying they weren’t adequately trained.

Yeah, with only a 5% response rate there is clearly a large self selection issue, so looking at an individual number may not be all that useful. However this data could still be useful in determining trends. If the number of respondents who say they use their training goes up from 60% to 70% you can conclude that the use of training has probably increased, even though in reality it might have gone from 40% up to 53%.
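A rough sketch of how numbers like those can arise: suppose employees who use their training are some fixed multiple more likely to respond than those who don't. The function name and the 2.25 ratio are assumptions chosen for illustration, not anything measured, and the trend argument only holds if that ratio stays about the same from one survey to the next.

```python
def observed_share(true_share, response_ratio):
    """Share of 'yes' answers among respondents.

    true_share: actual fraction of the population that would answer 'yes'.
    response_ratio: how many times more likely a 'yes' person is to respond
                    than a 'no' person (assumed constant).
    """
    yes_weight = true_share * response_ratio
    no_weight = 1 - true_share
    return yes_weight / (yes_weight + no_weight)

RESPONSE_RATIO = 2.25  # assumed value, for illustration only

for true_share in (0.40, 0.53):
    seen = observed_share(true_share, RESPONSE_RATIO)
    print(f"true {true_share:.0%} -> observed {seen:.1%}")
```

Under that assumption, a true 40% shows up as 60% in the responses and a true 53% shows up as roughly 72%. The level is off in both cases, but the direction of the change survives.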

I’m convinced that 97.3% of people who give surveys don’t understand jtur’s point.

No one has addressed this point yet: The Law of Large Numbers says that you can’t assume the average of a small sample will be similar to that of a larger sample. Even if you didn’t have the bias mentioned by others up-thread, you’d expect that as the sample got larger, the responses of your sample would get ever closer to the average of the population.

That’s not a correct interpretation of the law of large numbers. All it says is that the sample average converges to the population average as the sample gets large. It says nothing about what happens for small samples.
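A quick simulation shows what that convergence looks like. This is just a sketch: the 60% population rate is an arbitrary assumption, and it draws truly random samples, which is exactly what a self-selected survey does not give you.

```python
import random

def sample_mean(population_rate, sample_size, rng):
    """Fraction of 'yes' answers in a random sample of the given size."""
    hits = sum(rng.random() < population_rate for _ in range(sample_size))
    return hits / sample_size

POPULATION_RATE = 0.6  # assumed true rate, for illustration only
rng = random.Random(0)  # fixed seed so the run is repeatable

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: sample mean = {sample_mean(POPULATION_RATE, n, rng):.3f}")
```

The large samples land very close to 0.6. The small ones may or may not; the law makes no promise either way about them.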

Anyway, the discussion in this thread is generally spot on. The right way to handle this problem is to hire an outside firm to do interviews with a random sample of the employees. I don’t know of any, but this is a common enough problem that they’ve got to be out there.

I’m tempted to say “self reported data is useless”. Outside of a few instances, like exit polling, even if your sample isn’t biased (which takes a lot of work and sophisticated sampling techniques or else it almost certainly is), you’re only getting the answers people want you to hear, to the specific questions you asked. If your research is “how do people answer these particularly worded questions?” that’s one thing. If you’re trying to find out “what actually IS the answer to these questions?”, asking people and tallying up their answers is probably the worst way to find out.

So the OP can assume that a sample of 5% to 10% will have the same results as a sample of 50%? :dubious:

How about I change my first sentence to: “An outcome of the Law of Large Numbers is that you can’t assume the average of a small sample will be similar to that of a larger sample.”

I’m not saying the two results will be different, just that because of the convergence to the population that occurs with the larger sample, there is a realistic likelihood the results will change.

Using this sample size calculator, I determined that if 2000 is your population, you get 200 responses, and you want a confidence level of 95 percent, then your margin of error is about seven percent. That is pretty big, but your survey may not be completely useless.
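For anyone who wants to check that figure by hand, here is a minimal sketch of the usual formula with a finite population correction. I don't know exactly what the calculator does internally; this assumes a simple random sample, the worst-case proportion of 0.5, and z = 1.96 for 95 percent confidence.

```python
import math

def margin_of_error(sample_size, population_size, z=1.96, p=0.5):
    """Margin of error for a proportion from a simple random sample."""
    # Standard error of a proportion, ignoring population size
    se = math.sqrt(p * (1 - p) / sample_size)
    # Finite population correction: shrinks the error when the sample
    # is a sizeable fraction of the population
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * se * fpc

moe = margin_of_error(sample_size=200, population_size=2000)
print(f"margin of error: {moe:.1%}")
```

That works out to roughly 6.6 percent, which rounds to the seven percent above.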

Assuming a true random sample, that’s correct. If your sample isn’t random, you can’t calculate the margin of error. That’s why everyone here is talking about how self-selected samples are probably introducing bias.

That’s the thing with sample sizes: Polling 1000 people will give pretty good margins of error no matter what the population size, provided the sample is randomly chosen from the population. Poll 1000 voters in the United States before an election, and you know within a few percentage points what the popular vote is going to be, provided the sample is randomly chosen from people who are going to vote. The hard part of surveys is not in actually polling people, it’s in finding out who to poll without introducing bias. Do you call telephone numbers? In 1936, Republican voters were more likely to have telephones than Democrats. What percent of land-lines vs. cells do you call? And importantly, key to this discussion, even if you reach out to a truly representative sample, do the people who respond do so randomly? Are people dissatisfied with the current president more likely to respond than people who support him?
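To see how little the population size matters, here is a small sketch that holds the sample at 1000 and varies the population. The population sizes are arbitrary round numbers picked for illustration, and the formula again assumes a simple random sample with the worst-case proportion of 0.5.

```python
import math

def moe_for_population(population_size, sample_size=1000, z=1.96, p=0.5):
    """95% margin of error for a random sample drawn from a finite population."""
    se = math.sqrt(p * (1 - p) / sample_size)
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * se * fpc

# Arbitrary population sizes, for illustration only
for population in (10_000, 1_000_000, 300_000_000):
    print(f"population {population:>11,}: +/- {moe_for_population(population):.2%}")
```

All three come out at about three percentage points. None of that helps if the people who answer differ systematically from the people who don't.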