Need help interpreting polls

I feel like I’ve painted myself into a corner. A few weeks back, I conducted a series of polls in IMHO. I kept the purpose of the polls a secret so as not to affect the outcome, but my only purpose was to determine whether there might be one or more participants whose worldviews were compatible with my own. Basically, I wondered, “Does anybody think the way I do?”

And now, after trying several times to interpret the responses, I feel like I might have messed up, and I was hoping the board’s resident stats experts could tell me whether that’s true. For one thing, I did multiple polls (three total), reasoning that each succeeding one would build on nuances from the previous one. For example, after determining that a person was not a naturalist, I wanted to know whether he was an existentialist. That way, anyone who, for example, believed in a supernatural entity with contingent existence would be eliminated.

Unfortunately, I did two things that make the tally very difficult: (1) I allowed people to participate in polls II and III who did not participate in I; and (2) I allowed people to respond without explanations, reasoning that if we both believed the same thing, the reason we believed it didn’t matter.

But it does, as I see now. For example, one of the choices was between a quote by Eddington and a quote by Kant. One of the people chose Eddington (which is agreeable to me) but only because she felt Kant was generally incoherent. Therefore, even though we agree on the particular item, we would not necessarily agree on something else where Kant was right and Eddington wrong.

And as far as people participating in one poll but not another, it seems that I can’t tell whether the person who matched me almost exactly in poll III would have matched me at all in I and/or II.

Is this salvageable, or were the design flaws too great? Is there a way to determine whether one compatibility is any greater than another?

If you’re trying to analyze the reasons why people picked a given answer, you’re really out of the realm of statistics and into psychology/philosophy/what have you. I don’t think we can really help you with that.

The other problem is at least a statistical problem, although it’s not one of those things where everyone agrees on the right way to handle it. For what you’re doing, the best options are to either throw out the subjects who didn’t respond to poll I, or add a third outcome of unknown for your reporting. Trying to guess someone’s answers to poll I is going to involve multiple regressions, and there may be some independence issues with the confidence intervals that could trip you up.
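The “unknown” outcome could be bookkept like this (a toy Python sketch — the respondent names and answers here are invented, not anything from the actual polls):

```python
# Tally each respondent against my own answers, poll by poll.
# Respondents who skipped a poll get "unknown" for it rather than
# being dropped or having their answer guessed.
# All names and answers below are hypothetical.

my_answers = {"poll_I": "A", "poll_II": "B", "poll_III": "A"}

responses = {
    "user1": {"poll_I": "A", "poll_II": "B", "poll_III": "A"},
    "user2": {"poll_II": "A", "poll_III": "A"},  # skipped poll I
}

def tally(respondent):
    """Classify each poll as match / mismatch / unknown vs. my answers."""
    result = {}
    for poll, mine in my_answers.items():
        theirs = respondent.get(poll)
        if theirs is None:
            result[poll] = "unknown"
        elif theirs == mine:
            result[poll] = "match"
        else:
            result[poll] = "mismatch"
    return result

for name, answers in responses.items():
    print(name, tally(answers))
```

The point is just that “unknown” stays a visible category in the report, instead of being imputed.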

Unless you randomly sampled your responses from a well-defined population, no self-respecting statistician will go near your “poll.” You can analyze non-scientific “polls” pretty much any way that you please, since any analysis ultimately only describes your respondents, and no-one else. Sorry to be mean about it, but helping in the analysis of such a flawed approach only enables and encourages more abuses of proper sampling theory.

Actually, I’m glad you were mean. Based on what you both said (and no one has dropped in to contradict you), I think my problem is that I wasn’t taking a poll, I was giving a questionnaire. I’m really not trying to determine what percentage of the general (or even a specific) population sees eye-to-eye with me. Rather, I’m trying to determine whether anyone who happens by the thread sees eye-to-eye with me. For that, I should give a single questionnaire (or if multiple, not allow add-ins or drop-outs), and the questions should be about degree of agreement or disagreement, instead of choosing A or B. Am I correct?
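Concretely, I imagine scoring the degree-of-agreement version something like this (a rough Python sketch with invented ratings on a 1–5 scale, just to illustrate the idea):

```python
# Score how compatible another respondent's ratings are with mine.
# Items are rated 1 (strongly disagree) to 5 (strongly agree).
# All ratings and user names below are made up.

my_ratings = [5, 1, 4, 2]          # my answers to four items

others = {
    "user_a": [5, 2, 4, 2],
    "user_b": [1, 5, 2, 4],
}

def compatibility(theirs, mine=my_ratings):
    """Mean absolute difference, rescaled so 1.0 means identical
    answers and 0.0 means maximally opposed (4 points apart) on
    every item."""
    diffs = [abs(a - b) for a, b in zip(theirs, mine)]
    return 1.0 - sum(diffs) / (4 * len(mine))

for name, ratings in others.items():
    print(f"{name}: {compatibility(ratings):.2f}")
```

That way “almost agrees” shows up as a high score instead of being lumped in with total disagreement, which the A-or-B format forced.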

The problem is more fundamental; it’s that your respondents are self-selected, not randomly sampled. Unless you know that EVERYONE who read the thread responded (and even that’s not enough, but let’s go with it), you can’t trust any results.

For example, if the question implied (even slightly) that you supported one answer over another, it will skew whether or not people choose to answer. Many people don’t like to state an unpopular opinion, and will simply skip such a question. Others are generally argumentative, and will oppose almost any implied “right answer.” It’s unlikely these two groups cancel each other out, so rather than sampling everyone who read the thread, you’re really only sampling everyone who read the thread and chose to answer.

Doing it on the internet also skews results, since for issues of importance you get ballot-stuffing (people asking their friends to vote, who otherwise wouldn’t have known about the poll, and bias toward the opinion of the one asking), plus sock-puppets and the like. The SDMB is better than most places with regard to those effects because of strong moderation, but in general self-selected samples are worthless unless your question has something to do with self-selection itself and you have access to the population that didn’t select.

So, just to be sure I understand you correctly: if someone, say User007, responded to the questionnaire, answering every question the same way I would, I still could not know whether User007 sees the world as I do. Is that right? I guess that would be a question of his/her honesty. If so, what does sampling have to do with it other than increasing the likelihood that such an honest user will drop by? If I can’t tell the honest ones from the dishonest ones, it seems to me that I still wouldn’t know even if every human alive responded.

Well, in this case, you’re right… sampling problems aren’t a big deal for you, because you’re not trying to figure out the viewpoints of a population… you just want to find people whose views are similar to yours. You’re not going to say, based on the poll, “15% of the SDMB agrees with me.” (In other words, you’re not really even trying to get a population sample.)

But, that being said, like Timewinder said, you have to watch out for bias in terms of the way you asked your questions, because that will influence the responses.

Right.

If inference is not the goal, then the key issue is clarity and balance in the survey items. Write the items in a neutral way, so that responses aren’t forced or influenced.

If you want a simple analysis, go with predefined categories - open response items are a royal bitch to analyze.

Predefined categories? Open response items? Could you please clarify in layman’s terms? Incidentally, I greatly appreciate your bothering to help me with this, and your patience in doing so.

An open response question is a question that the person can answer any way he or she wants…like

“What do you think is the biggest problem facing America today?” The person you ask can say anything, can make his comment as long or short as he wants, etc.

I think when he’s referring to a “predefined category”, he’s talking about a closed response…something like this:

“I’m going to give you a list of problems many people think are facing America today. Please tell me which, in your opinion, is the biggest problem facing America today:

a. terrorism
b. the economy
c. the erosion of civil liberties
d. the environment
e. the erosion of moral values”

In a question like that, you’ve set up categories and are making the person choose between them.
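And tallying such a closed-response item is just counting categories — for instance, in Python (the responses below are invented, keyed by the letters in the example question):

```python
from collections import Counter

# Hypothetical closed responses to the "biggest problem" item above,
# recorded as the letter each respondent picked.
answers = ["b", "a", "b", "e", "b", "c", "a"]

counts = Counter(answers)
print(counts.most_common())   # categories sorted by frequency
```

That's why predefined categories are so much easier to analyze than open responses: the analysis reduces to counting.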

Thank you! Thanks everyone. I believe I understand now.