You might turn out to be correct, but this is a guess. There’s nothing special we know about the polls right now that would necessarily make them better than polling in recent special elections, or in Trump’s own GOP primaries, both of which (in general) turned out to significantly underestimate the anti-MAGA/GOP vote.
It’s fine to make a guess, but everyone should recognize that it is just a guess.
We won’t; it’s not hard to make “accurate” guesstimates when your margin of error is orders of magnitude larger than the margin a close election is likely to be decided by.
@kenobi_65 and @DSeid, I appreciate your patient responses to me, as I think you’ve helped me refine my position.
Polling is not unscientific, but at the levels it’s being conducted the margin of error is greater than the polling difference between the candidates. And that strikes me as practically worthless for purposes of prognostication.
You are very welcome! It’s the rare occasion where I can put my Dark Side powers of market research and advertising to good use.
Agreed. When the actual numbers (that is, what you’d see in a census of all voters) are very close, the margins of error in most polling can make it look like Trump is up by a couple of points in one poll, and Harris is up by a couple of points in the next poll.
Combine that with the issues in generating a representative sample that I described, and that it appears that a number of “pollsters” are fielding intentionally-skewed polls, and it becomes hard to have much faith in what the numbers are saying.
One aspect that’s given me some sense of optimism entering the final week is the “undercounting” of women who plan to vote for her, but can’t admit that to a pollster while their overbearing MAGAt husband or boyfriend is standing nearby glaring at them.
And also, re: the last two posts, is the consideration that in vote-by-mail states, inside a household with a (potentially abusive) spouse, there is not necessarily a truly secret ballot.
Which is something that could change not just poll results, but actual votes.
Phone-banking in 2020, I reached a couple who I could only describe as Frozone and his wife from The Incredibles (the “honey?? where’s my super-suit?” scene). I was targeting the man, who gave a kind of wishy-washy answer to me on not deciding/not being sure, and her voice comes from the background “I don’t know what he’s going on about he’s gonna get his ass right down there and vote Democrat.” In that case he laughed and said “whelp I guess I’m covered”. I had to agree there.
It has Harris up 50-47. Biden won 51.8-47 in 2020. So similar at this point.
You can find them by going to the 538 House race pages and checking out the various House polls and seeing if they included a Presidential preference question.
ETA: I should add a big problem is that most of the House-district polls are conducted by partisan pollsters or released by the campaigns themselves. So they probably can’t be trusted very much.
As far as turnout, the only reliable numbers I’ve seen come from Nevada, which currently show massive GOP turnout in early voting in the rural parts of the state. Whether this indicates an actual increase in GOP turnout or just a switch from election-day voting to in-person early voting remains to be seen, but it can’t really be seen as a good sign for the Harris campaign.
Also, the margins of error that are generally reported are based only on the technical variability due to sample size. They don’t take into account the modeling variability.
Estimating variability due to sample size is trivial. You just plug
p = proportion of voters in favor of one candidate
n = number of voters sampled in the poll
and your standard error is given by sqrt((p*(1-p))/n); what pollsters report as the margin of error is roughly twice that (1.96 standard errors, for a 95% confidence interval).
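For anyone who wants to plug in their own numbers, here’s that calculation in a few lines of Python (the 1,000-respondent, 50/50 poll is just an illustrative example, not any real survey):

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a sample proportion.
    p and n here are illustrative, not from any real poll."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return z * se  # half-width of a ~95% confidence interval

# A poll of 1,000 respondents with the race at 50/50:
print(f"+/- {margin_of_error(0.5, 1000):.1%}")  # about +/- 3.1%
```

Note how slowly this shrinks: you’d need to quadruple the sample to halve the margin of error, which is why most national polls sit around +/- 3 points.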
But as discussed above, there is a lot of manipulation of the data to try to make the responses representative of the voting public. This is a good thing; without it you could get what’s known as nonresponse bias. But this weighting relies on comparing a set of variables in the response data to an estimate of what those variables should look like in the voting public. Since the distributions of those variables in the voting public are themselves estimates, they carry uncertainty of their own. And whether the chosen variables are even sufficient to fix the bias is also unknown. These uncertainties are very hard to estimate, and any calculation would require assumptions of its own which may or may not be true. That is why pollsters don’t report them.
But they are definitely there, and they will always act to increase the error bars. On an NPR show I heard a person who studies polls discuss this issue and recommend, as a rule of thumb, that you should probably double the reported error bars.
That’s a very important point that I was considering putting into my post, but decided not to since it was getting long enough as it was.
While the technical sample-size variability is going to be independent from poll to poll, the modeling variability is potentially going to be correlated across multiple polls, so that all the polls may be skewed in the same direction. That adds much greater variability when you look across multiple polls or multiple states. This understanding was why Nate Silver only gave Clinton an 80% chance of winning, while other sites that made the classical assumption of independence gave her 99.9%.