The weirdness of polls... What's up?

You might think of an election as involving three variables if you want to analyze things the way OMG has: the size of each group (e.g., the overall number of Republicans), the turnout of each group (e.g., how many Republicans show up to vote), and the split within each group (e.g., what share of Republicans vote for Romney).

You cannot ignore any of these variables, because a candidate could be getting walloped on two of the measures but win easily because of the third. That’s why it’s perfectly possible for Obama to win even though he takes a lower percentage of Republicans and independents than last time and merely holds his level among Democrats steady. All it requires is for there to be somewhat more Democrats, or at least more of them showing up at the polls.
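As a rough illustration (every number below is made up, not taken from any actual poll), here’s how those three variables multiply out. In this hypothetical, Obama loses independents and gets almost no Republicans, yet still edges ahead on the two-party vote because the Democratic group is larger:

```python
# Hypothetical illustration: a candidate's vote total is the sum over groups of
# (group's share of adults) x (group's turnout rate) x (candidate's split in that group).
groups = {
    #               share, turnout, Obama split, Romney split  (made-up numbers)
    "Democrats":    (0.36, 0.62, 0.92, 0.07),
    "Republicans":  (0.30, 0.66, 0.06, 0.93),
    "Independents": (0.34, 0.55, 0.45, 0.50),
}

obama  = sum(share * turnout * d for share, turnout, d, r in groups.values())
romney = sum(share * turnout * r for share, turnout, d, r in groups.values())
two_party = obama + romney

print(f"Obama  {obama / two_party:.1%}")   # ~50.7%
print(f"Romney {romney / two_party:.1%}")  # ~49.3%
```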

That’s why the key premise of OMG’s argument is the enthusiasm gap (which manifests as both a likely-voter gap and a self-ID gap). This is the hardest stuff to measure, not only because it involves more slippery questioning but also because it moves most quickly and fluidly with the campaign (whereas deciding whom to vote for does not move that much). The whole “oversampling” nonsense was largely a matter of enthusiasm surges showing up as differential self-ID. You’ll note that after the first debate, when Romney got a surge of enthusiasm, the so-called oversampling seemed to fade away.

The bottom line is that there is one reliable way to assess all of these variables at once: ask people how they identify, whether they are likely to vote, and whom they are likely to vote for. You still have to adjust the samples to meet stable demographic targets (which is the sophisticated version of the oversampling argument), but the potential for systematic error here is lower, because the demographic make-up of the electorate is much, much more stable than self-ID and enthusiasm.
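For the curious, here’s a minimal sketch of what that demographic adjustment can look like, weighting on a single variable (age bracket) against assumed census targets. Real pollsters weight on several variables at once; the respondents and targets below are invented for illustration:

```python
from collections import Counter

# Minimal sketch of post-stratification weighting on one demographic (age bracket).
# The respondents and the census targets below are invented for illustration.
respondents = [
    {"age": "18-29", "candidate": "Obama"},
    {"age": "30-64", "candidate": "Obama"},
    {"age": "65+",   "candidate": "Romney"},
    {"age": "65+",   "candidate": "Romney"},
    # ...imagine several hundred more
]
census_targets = {"18-29": 0.21, "30-64": 0.53, "65+": 0.26}  # assumed population shares

# Each respondent's weight = (target share for their bracket) / (bracket's share of the sample).
sample_counts = Counter(r["age"] for r in respondents)
for r in respondents:
    sample_share = sample_counts[r["age"]] / len(respondents)
    r["weight"] = census_targets[r["age"]] / sample_share

weighted = Counter()
for r in respondents:
    weighted[r["candidate"]] += r["weight"]
total = sum(weighted.values())
for candidate, w in weighted.items():
    print(f"{candidate}: {w / total:.1%}")  # weighted toplines
```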

Well, for one thing, Obama beat McCain 53-46, so it’s quite possible for him to lose a whole bunch of support from 2008 and still beat Romney by half a point. In fact, quickly doing the math in my head, the numbers in that post are pretty much what you would expect.
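Spelling that arithmetic out (the size of the swing here is hypothetical, picked just to land on a half-point margin):

```python
# 2008 was roughly Obama 53, McCain 46 (a 7-point margin).
# Suppose (hypothetically) 3.25 points' worth of voters switch sides.
obama_2008, mccain_2008 = 53.0, 46.0
switchers = 3.25

obama_2012  = obama_2008  - switchers    # 49.75
romney_2012 = mccain_2008 + switchers    # 49.25
print(obama_2012 - romney_2012)          # 0.5 -> Obama still up by half a point
```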

What I object to is the dichotomy of Gallup vs. “everyone else”. Gallup is part of everyone else, and the average of all polls would include Gallup.

If you set up Gallup vs. everyone else, of course you would be wiser to rely on everyone else. But there’s no reason to isolate Gallup.

Any random collection of statistical samplings will include outliers, and just eliminating or underweighting the outliers is not a valid statistical methodology, unless you have reason to believe there’s actually something funky about the outlying samples. Otherwise, the correct approach is to include all samples and let the best guess reflect all of them.
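A toy simulation makes the point (the poll count and sample size are assumed): even when every poll is an honest random sample of an electorate that is exactly 50/50, a few individual polls land a couple of points off, yet the average of all of them, outliers included, stays very close to the truth.

```python
import random

random.seed(1)
true_support, n_polls, sample_size = 0.50, 20, 1000   # assumed numbers

# Each "poll" is an honest random sample of the same 50/50 electorate.
polls = [sum(random.random() < true_support for _ in range(sample_size)) / sample_size
         for _ in range(n_polls)]

worst_miss = max(abs(p - true_support) for p in polls) * 100
average    = sum(polls) / n_polls * 100
print(f"biggest single-poll miss: {worst_miss:.1f} points")
print(f"average of all {n_polls} polls: {average:.1f}%")
```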

ISTM that NS attempted to suggest that there is indeed something funky about the Gallup results, both by picking a few other examples in which Gallup was ostensibly an outlier and by observing some examples in which their polls had a high degree of variance which (he claims) derives from “an endemic issue with their methodology”. Based on this, he derives his “context” which he presents in the form of a “Gallup vs. everyone else” dichotomy.

This is what I object to.

Of course there’s a reason to isolate Gallup! The only way to assess whether a poll is an outlier is to compare it to the average. Is your objection that he would want to label something an outlier? He does so all the time on both sides of the outlier spectrum.

The “everyone else” line you quote is him saying, “It’s much more likely that Gallup is wrong and everyone else is right than the other way around.” That’s what it means to call something an outlier, an assessment with which you seemed to agree above.

Are you suggesting Nate did anything other than what you describe as the correct approach? If so, I suggest you re-read the article.

Ok, this objection I understand. You think he has only pointed out a few times when Gallup was wrong, and that this doesn’t conclusively show that Gallup has a bad methodology. In order to establish that reliably, you’d need some kind of pollster rating that compares the relative frequency of wrongness across a large sample of polls. If only someone out there compiled such a rating…

And, in any event, this is a separate argument from your earlier claim that “Silver is being misleading in that article, in comparing Gallup to ‘the average’” and that “The idea that Gallup is some sort of unique outlier has not been established by anything in that article.”

And in any event, it isn’t even necessary to assume that Gallup is doing anything malicious, incompetent, or otherwise wrong. Like I said, outliers happen, and even a polling firm that does everything right will occasionally, just by dumb luck, end up significantly off. Which is what I think is mostly happening here.

The weirdness continues: Gallup (likely voters) has Romney up by 7, and the Investor’s Business Daily poll (also likely voters) has Obama up by 5.7. That’s a 12.7-point difference between two polls taken on the same day.
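For a sense of scale, here’s a rough back-of-the-envelope check (the sample size is assumed, since it isn’t given above) of how far apart two honest likely-voter polls of the same race could land from sampling noise alone:

```python
import math

n = 750                                   # assumed respondents per poll
se_share  = math.sqrt(0.5 * 0.5 / n)      # std. error of one candidate's share
se_margin = 2 * se_share                  # the lead (margin) moves twice as much as a share
se_gap    = math.sqrt(2) * se_margin      # gap between two independent polls' leads

print(f"95% range for the gap between two polls: +/- {1.96 * se_gap * 100:.1f} points")
# ~ +/- 10 points at this sample size, so a 12.7-point gap takes more than noise alone:
# different likely-voter screens and house effects have to be doing part of the work.
```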

I’m not sure if the Likely Voter models add anything but confusion.

Right now the pollsters are just saying, “Um, I really have no idea”.

You’ve split one argument into three. But it’s all one issue. Again in brief, there’s no basis for downgrading the outlier unless you have some reason for believing its methodology is suspect other than the fact of its being an outlier. Otherwise it should be included in the average and weighted as much as anything else.

As Nate Silver put it:

Wise words.

Since no one is downgrading the outliers, least of all Nate Silver, I can only surmise you’re retreating from your earlier statements calling his post misleading.

No.

I don’t think he numerically downgrades Gallup in his model, but after a lot of discussion of how much weight to assign them, the rest of his post downgrades them as being frequently wrong, erratic, etc.

So you concede that he doesn’t treat Gallup differently in the model. Instead, you meant to use the term “downgrade” to mean criticize?

I’m sorry, **F-P**, but your argument is either a moving target or hopelessly unclear. Instead of using your own idiosyncratic definitions for terms like “downgrade,” why don’t you just state your position in plain English? You think it was misleading of Nate to single out Gallup for criticism as an outlier? Is that it? (I honestly don’t think that’s it, since you haven’t disputed the fact that he singles out outliers all the time, but it’s the best I can come up with from what you’ve written.)

No, I think he meant to say that numerically he doesn’t treat Gallup differently in his model, but he discounts it by suggesting that you should view the Gallup poll as being in a category of its own, distinct from the group of “everyone else”. As he puts it:

I don’t know what to tell you, I think I’ve been clear enough as well as consistent, and that my understanding of NS’s blog post is correct. It’s not a big enough deal to argue about forever. I don’t have anything to add at this time.

Polls never make sense to me. Obama was at something like 89% just before the first debate and now it’s down to 69%… based on just that first debate? I don’t buy it. He didn’t do well in the debate, but it’s not like he came out and called Mitt a “cracker ass cracker” or something.

I don’t believe a single debate changes the mood of a nation THAT much. So why the huge spike in such a short period? Were the previous samples doing something different? Did the first debate really turn an Obama voter into a Romney voter? One debate did that?

Based on the trend, that had to have happened, and in large numbers. But I just can’t see how that many pre-debate Obama supporters watched it and said, “screw it, I’m voting Romney.”

Of course the only poll that matters is the one on Nov 6. So go vote!!

No. The polls reflect changes in enthusiasm of already-committed voters much more than they reflect switchers or the deciding of the undecided. The first debate popped the Democratic enthusiasm bubble and energized conservatives, making the former less likely to vote and the latter more likely. That was the principal effect.

I think it was probably a combination of the first debate and Obama’s post-convention bounce fading.

One thing to keep in mind is that “how likely is he to win?” and “how much of the country supports him?” are two different questions.

Suppose, just as an example, that Obama was supported by 51% of likely voters. But suppose also that the uncertainty on that number was very, very small, such that we were 99% sure Obama had more than 50% support. Then we could say* that Obama had a 99% chance of winning the election (if it were held today).

Then the debate happens, and it shifts 2% of the electorate to Romney, so now we’re 99% sure Romney would win the election (if held today).

This is a very exaggerated, unrealistic example, but it illustrates a point, which is that a small shift in the electorate can sometimes result in a big shift in the likelihood of victory. On the flip side, if 90% of the voters supported Obama, then a huge drop, to, say, 60% support, wouldn’t do much to change the likely outcome of the election. (A sketch of this calculation follows the footnote below.)

  • ignoring the difference between popular vote and electoral college, and ignoring any questions about what “likely voters” means or how good we are at identifying them
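Here’s a quick sketch of that calculation, assuming the polling average is normally distributed around the true level of support (the numbers are the toy ones from the post above):

```python
from statistics import NormalDist

def win_probability(support, std_error):
    """Chance that true support exceeds 50%, given a polling average and its standard error."""
    return 1 - NormalDist(mu=support, sigma=std_error).cdf(0.50)

print(win_probability(0.51, 0.004))   # 51% with tiny uncertainty -> ~99% chance of winning
print(win_probability(0.49, 0.004))   # shift two points to Romney -> ~1%
print(win_probability(0.51, 0.03))    # same 51%, but realistic uncertainty -> only ~63%
```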

I get that to an extent, but the polls don’t poll unlikely voters. So pollsters are now calling folks that a month ago were all Obama, but now are …Romney? Are they just calling different people, and the new people don’t like Obama??

A lack of enthusiasm would hinder the actual voting, but enthused or not, if someone asks who you are voting for, why wouldn’t you say “Obama” if you said the same thing four weeks ago??

my head, it hurts…

That’s where the difference between “Likely Voter” and “Registered Voter” comes into play. If a pollster called you right after the Dem convention and asked you who you were voting for and how enthused you were to vote, and your answer was “Obama, and 100%,” then they put you in their results as a “Likely Voter”. A few weeks later, Obama just got hammered in Debate #1 and the pollster calls you back. “How enthusiastic are you to vote?” Now you feel a little glum, so you say, “um … I don’t know, not very”. Now you’re not considered a “Likely Voter”, so the polls of “Likely Voters” show a swing towards Romney.

What the polls show is quite possible. Your mistake is thinking that those were “Obama supporters” that switched their opinion. In fact, those were most probably people who were going to vote for Obama because he was a known thing, with only a very lukewarm level of enthusiasm, and who needed some idea of what Romney was like in order to switch their “support”.

No, it works like this: Pollsters call a whole ton of people, and only the ones who pass the likely voter screen get counted. A pollster calls and asks “How likely are you to vote in November?” They might give you some options, ranging from not at all likely to definitely voting.

If you’re an Obama voter after the first debate, maybe your response shifted from “definitely voting” to “pretty sure I’ll vote.” Conversely, if you’re a Romney voter, you might have gone from “somewhat likely” to “I already voted!” These shifts affect the likely voter numbers quite a bit.
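A toy version of that screen (the cutoffs and respondents are made up) shows how the topline can swing without a single person switching candidates:

```python
# Only respondents who clear the (assumed) likelihood cutoff are counted.
LIKELY = {"definitely voting", "already voted", "pretty sure I'll vote"}

def obama_share(respondents):
    counted = [r for r in respondents if r["likelihood"] in LIKELY]
    return sum(r["candidate"] == "Obama" for r in counted) / len(counted)

before_debate = [
    {"candidate": "Obama",  "likelihood": "definitely voting"},
    {"candidate": "Obama",  "likelihood": "definitely voting"},
    {"candidate": "Romney", "likelihood": "somewhat likely"},    # screened out
    {"candidate": "Romney", "likelihood": "definitely voting"},
]
after_debate = [
    {"candidate": "Obama",  "likelihood": "definitely voting"},
    {"candidate": "Obama",  "likelihood": "not very likely"},    # now screened out
    {"candidate": "Romney", "likelihood": "already voted"},      # now counted
    {"candidate": "Romney", "likelihood": "definitely voting"},
]

print(f"Obama share of likely voters before: {obama_share(before_debate):.0%}")  # 67%
print(f"Obama share of likely voters after:  {obama_share(after_debate):.0%}")   # 33%
```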

Relatedly, you might just be more interested in participating in a poll in the first place (instead of hanging up on them) if you’re feeling enthusiastic about the election.

That phenomenon doesn’t entirely explain Romney’s bounce. But it explains a very large part of it.