Catty:
That link is missing data on the probability, under Silver’s model, that data releases are falsified. The strength of Silver’s article is that he explains his method in statistical terms.
What’s unfortunate about Silver’s finding is that it will be used to discredit legitimate opinion research, replacing it with mere speculation, as in your link.
I hope that the pollsters Silver showed to be at high risk of polling misreporting, such as Emerson College, will either retract or justify their methodology. I want to withhold some judgment until then.
Assuming Silver’s findings are true and fair, I think the polling industry will learn from what seems to be a real scandal.
I repeat what I said before:
To be fair, this herding issue is something Silver has been saying virtually forever. And something similar happens in science: publication bias, a limitation on the usefulness of meta-analysis.
Silver, to his credit, is trying to suss out which houses are most prone to it and decrease the weights they get, just as he and other good aggregators decrease the weights of partisan shops.
I’m not sure that it is adequate. OTOH I’m not sure how much harm it does? Basically it results in a lean toward the median rather than the mean, by tossing away outlier findings before they even get published. The unpublished outlier results should be equally distributed in both directions if the house is otherwise nonpartisan.
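A minimal sketch of that file-drawer mechanic (all numbers are my illustrative assumptions, not anyone’s real model): simulate noisy polls around a true margin, have a herding house suppress anything far from the stale consensus, and compare the raw and published means. If the consensus matches the truth, suppression mostly just narrows variance; if the truth has moved, the published average gets dragged back toward the consensus.

```python
import random

random.seed(0)
TRUE_MARGIN = 1.0      # hypothetical true Harris margin, in points
CONSENSUS = 0.0        # where the herd currently sits
SUPPRESS_BEYOND = 2.0  # assumed file-drawer threshold, in points

# Each "poll" is the true margin plus sampling noise (~3 pt s.d.).
raw = [TRUE_MARGIN + random.gauss(0, 3.0) for _ in range(10_000)]

# A herding house only publishes results near the consensus.
published = [m for m in raw if abs(m - CONSENSUS) <= SUPPRESS_BEYOND]

mean = lambda xs: sum(xs) / len(xs)
print(f"raw mean:         {mean(raw):+.2f}")
print(f"published mean:   {mean(published):+.2f}")
print(f"share suppressed: {1 - len(published) / len(raw):.0%}")
```

With these made-up parameters the published mean lands much closer to the consensus than to the true margin, and roughly half the results get file-drawered, which is the sense in which herding is not harmless even for a nonpartisan house.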
I don’t think people appreciate the degree to which the smartphone revolution has changed society. In 2012, they were present, but still far less ubiquitous and capable than they are now, and in less than a generation we’ve gone from looking up directions on Mapquest and printing them off so we could check them in the car, to literally having the entirety of all known human knowledge, experience, and art accessible on demand at any time in almost any place.
IMO, it’s had an even larger impact than the advent of the Internet did in the '90s. You probably have to go back to the dawn of the telegraph to find an invention that produced so much change so fast.
Unless I am missing something, yesterday was the first time he listed pollsters by name and gave the probability of each having reported false results.
P.S. Publication bias would be plausible if we saw a pollster that normally releases results at a certain interval skipping a release with no explanation. Has that happened? I think not.
Either Politico is misunderstanding “weighting by recall” in polling or I am.
AIUI the way it works is you make sure the people who told you they voted Trump or Biden in 2020 match the actual 2020 results, with the goal of making sure you don’t undercount the group that voted Trump in 2020 (or hypothetically Biden, but practically this is probably driven by the perception of undercounting Trump). But the people being captured by that Blueprint survey are mostly regular voters who did vote for either Biden or Trump in 2020. Unless I’m misunderstanding something, someone who voted for Trump in 2020 but is now switching to Harris, or not voting for either, is exactly who you would capture with weighting by recall.
What you miss with weighting by recall is either people who don’t tell the truth about their past voting or changes among people who didn’t vote in 2020. You might also miss demographic changes over the four years (IDK if the pollsters think they have a way to control for that). I think there are a lot of good reasons to think that weighting by recall is dangerous: it might lead to undercounting Harris votes more than prior polling undercounted Trump votes, or might just lead to wild unknowns that could swing either way unpredictably. But I don’t think 2020 Trumpers who voted Haley and are now voting Harris would be the group you would miss.
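A toy sketch of the mechanics as I understand them (my assumed procedure and made-up numbers, not any pollster’s actual method): reweight respondents so their recalled 2020 vote matches the real 2020 shares, then tally current preference. The point is that a 2020 Trump voter now backing Harris is still fully counted; in fact their weight goes up if Trump recallers are underrepresented in the sample.

```python
# Hypothetical sample: (recalled 2020 vote, 2024 preference).
sample = (
    [("biden", "harris")] * 48 +
    [("trump", "trump")] * 38 +
    [("trump", "harris")] * 4 +   # 2020 Trump voters switching -- still counted
    [("biden", "trump")] * 2 +
    [("nonvoter", "harris")] * 4 +
    [("nonvoter", "trump")] * 4
)

# Illustrative target shares for recalled 2020 vote (assumed, not real data).
target = {"biden": 0.51, "trump": 0.47, "nonvoter": 0.02}

# Weight each recall group up or down to hit its target share.
counts = {k: sum(1 for recall, _ in sample if recall == k) for k in target}
weight = {k: target[k] * len(sample) / counts[k] for k in target}

# Tally current preference using the recall-based weights.
tally = {}
for recall, pref in sample:
    tally[pref] = tally.get(pref, 0.0) + weight[recall]

total = sum(tally.values())
for cand, w in sorted(tally.items()):
    print(f"{cand}: {100 * w / total:.1f}%")
```

Note that the switchers’ current preference flows straight into the tally; the groups this scheme genuinely can’t see are liars about past vote, 2020 nonvoters, and four years of demographic churn, exactly as above.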
Fifteen years ago, when Silver was a much fresher face:
Silver also has written about poll herding in the recent past. He usually writes something to the effect of “It happens, but it’s not that big a deal”. Yesterday, he put his foot down.
Precisely. In fact, that specific group does indeed turn up.
I like the term “herding”. It’s much more immediately obvious what the term means than “publication bias”. I hereby petition users of scientific English to use “herding” instead of “publication bias” in the future. As Marge Simpson said, “I just think it’s neat”.
While Silver says that herding could make polls look artificially too pro-Trump or pro-Harris, I don’t think undermeasuring pro-Trump support is as likely. It’s of course possible, if earlier polls were just in error due to pure randomness, and later polls regressed toward this bogus mean. For the polls to undermeasure pro-Harris support, that type of error is possible as well, but there is also the possibility that the polls do not recognize a genuine pro-Harris shift since too sudden of a shift would likely be thrown out. While a pro-Trump shift is certainly possible, I can’t think of anything that would logically create it in the past month or so.
Pollsters being terrified of underestimating Trump again is something that makes sense to me, while the Harris campaign might actually be perfectly OK with being shown as a small underdog. This might backfire, though: outside of very involved people, polls are only judged on whether they picked the right winner, even in a close election. So showing all these small Trump leads might make them look even worse if Harris wins.
Disaffected Republicans coming home. This is basically what drove the error in 2016, and some of it in 2020. Most of the “undecided” voters were Republicans that hate Trump but just can’t vote for a Democrat (much less a black, female Democrat).
It’s entirely possible that the same thing happens this year but pollsters are too hesitant to publish those pro-Trump polls.
The counter-argument is the pressure many of the independent pollsters feel to avoid missing in the same direction for three cycles in a row.
Is there evidence that the October Trump shift was herding? I thought there was just a lot of herding, both before and after the shift.
At any rate the race has remained near a coin flip by any metric you could use before and after the shift. I wouldn’t be shocked if late breaking due to the Trump rally fallout is bigger than any recent reported shifts.
Also yeah, if there is a real shift in Trump’s direction it’s probably just people coming home to Trump, not any new development outside of that. I think there are still a lot of voters who have some misgivings but ultimately were always going to fall in line with Trump.
If anything there’s been a shift in Harris’s direction the past few days, and we are back to a virtual tie. Not that 50/50 is much different from 55/45, really.
I don’t recall him ever before sharing a specific list of the worst offenders by name, or including it in his weighting. The general behavior of pollsters as a group, though, he has called out for as long as I can remember.
Well that and revising their turnout model until it is in range with the herd (“torturing” it) is exactly what he is presuming happens:
This is a clear-as-day example of what we call herding: the tendency of some polling firms to move with the flock by file-drawering (not publishing) results that don’t match the consensus or torturing their turnout models until they do.
But it isn’t quite the same thing in other contexts, at least most of the time. Publication bias is often just that a study that fails to support a hypothesis is less exciting and therefore less likely to get published; something outside the herd might even be more likely to get published. Thus the aggregation (meta-analysis) will possibly underrepresent the negative findings and be more likely to provide stronger hypothesis support. The reluctance to publish something inconsistent with past results, presuming or fearing that there must be something wrong with it, which is herding, is just one sort of publication bias. My understanding anyway.
A possible hypothesis is that the majority in the herd have heavily altered, possibly overcorrected, their models to avoid underestimating Trump support again. Early on, those who used less severe corrections may have published anyway, but near the end they don’t want to stand out and be the only ones who missed the same way again.
We’d need Silver to do a trend line on his analysis to know.
Adding to highlight this from Silver:
Polls in Kansas and the 2nd Congressional District of Nebraska — where herding is less likely because these races aren’t expected to be close and they don’t get much attention — have also shown conspicuously strong Harris data. If Harris approaches the numbers the polls show in these places, she’ll probably win demographically similar states like Michigan and Wisconsin comfortably.
To my ear this sounds like him trying to make sure that his bit about his gut telling him it was going to be a Trump win (with the actual point being that his gut was as worthless as everyone else’s) is appropriately hedged. He’s preemptively setting up the argument that the numbers said a solid Harris win all along, with this article to be pointed to as his “told ya so”.
Not sure I understand the implication of national vote polls moving more to Trump than the swing state polls. He does not offer his analysis of herding in the national dataset. More herding or less than in swing state polls?
Tonight’s Ann Selzer poll drop (for Iowa) should help clarify this. AFAIK, she is certainly no herder. And, Iowa has similarities to Kansas and Nebraska.
Silver mentions Selzer, her record of consistently publishing seeming “outlier” polls only to be proven right later, and her past results, in the same breath as Nebraska’s second district and Kansas.
So yeah even he thinks so.
How do you understand the implied significance of the national polls moving more to Trump in the herding context? I’d get it if he were providing evidence that herding happens even more in the national numbers, so the decline in the GOP’s EC advantage may be exaggerated. But I’m not sure that’s what he’s saying.
Good questions, but I really don’t know.
The aforementioned Carl Allen has a Harris EC win forecast at 66% today. The image below shows the movement of Allen’s forecast over time against FiveThirtyEight’s and Nate Silver’s.
From the Comments section of your linked article:
Thank you Nate for this, because I’ve gotten quite turned off by these polling averages and the models built on them, including yours (sorry!!) and you have explained here the reason. When I listen to David Plouffe he talks about them and the Trump campaign having extremely rich data sets on the electorate, which as an online marketer I understand. Meta has hundreds of thousands of data points on us, for example, and so can target micro audiences with specific messaging, a capability I use every day as I test various marketing messages. Plouffe said they also do lots of quantitative and qualitative research on various subgroups and the undecided voters every day which deepens their understanding of who they are.
He doesn’t share their data, but he says the undecideds (3-5%) look very much more like Harris voters than Trump voters, and that is one reason why they have growing confidence.
And I found myself wishing that your analyses accessed richer data like they are using because you are being severely limited by these polls, as you have explained in detail. Would that be possible in the future? (You might need some venture capital to do it!)