I’d disagree. 322 EVs is the Blue Wall (excluding ME-2) + FL, NC, and NV, all of which are polling as very close calls, maybe even close enough that the better GOTV operation makes the difference (as it may already have in NV).
IOW, Hillary could win all three with something between negligible change in the polls, and none at all. And I can’t think of a reason why they’d correlate strongly with ME-2. If Latinos and college-educated women put Hillary over the top in those states, ME-2 doesn’t necessarily change at all.
Anyway, I think Silver v. Wang (and other aggregators) will be settled by the outcomes of individual states in another ~65 hours. For instance, take Hillary’s three next best states after the Blue Wall. Predictwise gives Hillary chances of 91%, 69%, and 68% of winning NV, FL, and NC. Silver gives Hillary between a 47.0% and a 49.7% chance of winning each of those same states, depending on which state and which model.
(Didn’t dig far enough into the PEC page to see if there were individual win probabilities attached to those states, but the map on the front page gives all three between a 60% and an 80% likelihood of Hillary winning each one.)
So somebody’s going to be right, and someone’s going to be wrong, about these three states. Hopefully we’ll know before going to bed Tuesday night.
(On election night in 2000, when my wife went to bed early, I told her I was staying up until we knew who won. That’s filed under ‘things you won’t ever hear me say again.’ :))
Each model’s math is based on a set of assumptions, which leads to a hypothesis. Silver’s assumes that undecided and third party voters are an unpredictable bunch, that the behavior of the electorate in 1996 and before is as predictive for today as how the electorate has behaved in the past 16 years, and creates a statistical model that “gives a greater likelihood of rare events.” He believes that the proper Bayesian prior is a close election based on econometric data, and has not been convinced that there is a long-term equilibrium of the polling that functions as the Bayesian prior (although he did raise the possibility once).
No one else agrees with those assumptions.
Other experts recognize that undecided and third party voters are in fact moderately predictable.
Wang is at the opposite extreme, making the opposite assumption, a polarized electorate, “a hardened divide,” his fundamental hypothesis. He made that hypothesis early on, somewhat tentatively, making predictions of how the election cycle would play out if the hypothesis was valid, and when those predictions came to pass he expressed his hypothesis more strongly. OTOH, when the cycle behaved according to the assumption that Silver expressed as “polls tend to revert to where they’d been previously,” Silver stuck with his same model, hesitant mainly “because of the unusual nature of Trump’s candidacy.”
In Wang’s view Trump is in fact mathematically and statistically not so unusual; the polling of this election has followed the same pattern it has since 1996, one of relatively low volatility.
Wang is vocal in pointing out that undecided and third party voters are not a complete wild card that we know nothing about; instead, their behavior is moderately predictable from past elections and from current polling of them.
So if Silver’s assumptions are all correct - we have no idea what undecided and third party voters will do; the consistent behavior of the polls from 1996 on is no more predictive than the behavior of the polls before that, and a statistical model that favors unlikely events is preferred; and long-term polling does not provide any Bayesian prior - then his high-uncertainty hypothesis could be the correct one. The math follows from them.
If we instead assume - that undecideds and third party voters will behave both as they have in past elections and as current polling of them says they will; that 1996 on is the better time frame to base current models on than what came before; and that unlikely events should not be statistically favored in the model - then he is way off.
Oh there are other differences too. Silver’s model assumes that actual voter behavior can change rapidly and that transient polling shifts reflect that. HuffPo uses a “stickier regression.” Elsewhere I have read explanations of how rapid news-cycle shifts may be caused by polling artifacts more than real opinion shifts (such as more willingness to participate in polls when your candidate is having a good news cycle). Silver believes that individual houses should be corrected for their past leans, while Wang keeps it simpler, believing that taking the median of recent polls to discount the outliers is preferred, since house leans will generally cancel each other out across the relatively large pool of state polls.
Drew Linzer, currently giving a 90% probability posting under the Daily Kos banner and historically more accurate than Silver, also believes that the electorate is not so volatile. You can read details of his methods here.
Bottom line is that deciding which assumptions should apply comes before the math. The opinions Silver is expressing about which assumptions should apply place him as an outlier.
Are there any results on Tuesday that would convince you that Silver has the better model? A Trump win or even a near miss? Silver calling more states correctly?
Normally I would say actual results don’t tell us much about probabilistic models, but when one guy is saying >99%, then it seems like a failed prediction should tell us a lot.
Several of the models including 538 and Upshot make state-by-state predictions of projected votes that would be the best way of checking their accuracy after the results.
I think it’s worth pointing out that 538 isn’t just making assumptions; on issues like the impact of undecideds on polling volatility and the correlation of polling errors, they are using empirical relationships they have found in the data.
My hunch is that 538 has the better model for polling data, but that it will prove to be less accurate because of intangibles like GOTV, which no one is modelling AFAIK and which probably favor Hillary. For next time they might think of developing metrics for GOTV and incorporating them, at least in the polls-plus forecast.
Does he project vote percentages though? I looked around a bit on his site and couldn’t find it. I think comparing vote percentage projections especially for swing states is the best way of comparing model performance because it’s much more fine-grained than simple win-loss predictions implied by the probabilities.
Yes. I have in fact tried several times now to get a discussion going on how to judge the models after the fact. The Brier score seems to be the best metric. Another big divergence is the final EV count: Silver’s no-toss-ups projection is significantly higher for Trump than most of the others’ … unless he suddenly flips them and joins the herd.
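For what it’s worth, the per-state Brier comparison is easy to sketch after the fact. The probabilities and outcomes below are made-up placeholders (not any model’s actual final numbers), just to show the mechanics:

```python
# Hypothetical sketch: scoring state-level win probabilities with the Brier score.
# Lower is better; a coin-flip forecaster scores 0.25 on every event.

def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Clinton-win probabilities for three swing states under two hypothetical
# models, plus a hypothetical outcome vector (1 = Clinton win, 0 = loss).
confident_model = [0.91, 0.69, 0.68]
uncertain_model = [0.49, 0.48, 0.47]
outcomes = [1, 1, 0]

print(brier_score(confident_model, outcomes))
print(brier_score(uncertain_model, outcomes))
```

With this (invented) outcome vector, the confident model scores better despite badly missing the third state, because it nailed the first two; run it with real state calls on Wednesday and the comparison is mechanical.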
When it comes to probabilities there’s no hard and fast right and wrong answer. Even Wang’s model allows for a Trump win.
I think the true test is the assumptions the models are going by. Silver believes undecideds won’t break evenly, or at least that they might not. So if they do break mostly for one candidate, that’s a win for him regardless of who wins the election. Huffpo in particular didn’t like Nate’s “unskewing”, so if the final results are closer to Nate’s projections than the actual polls that shows that Nate’s system is sound.
Obviously, if Trump wins, that’s a win for Nate. Although I’m sure he won’t enjoy it.
I’d also note that 538 has the Senate nearly dead even, whereas Wang has an 80% chance at Democratic control. The betting markets emphatically agree with Silver on that count.
If you’re comparing two probabilistic models, you need a decent sample size to be confident which is better. But as you accumulate the sample, for any single result the model that assigned a higher probability to the actual result is a data point in favor of that model; and the significance is higher when the predicted probabilities from the two models are very different.
If Silver is saying 65%/35% and Wang is saying 99%/1%:
A Clinton win is weak evidence in favor of Wang, since 65% and 99% are not that different;
A Trump win is strong evidence in favor of the Silver model, because 35% and 1% are dramatically different.
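That asymmetry is just the likelihood ratio between the two models. A minimal sketch using the 65/35 and 99/1 numbers above:

```python
# How much a single outcome shifts the odds between two probabilistic models:
# the ratio of the probabilities each model assigned to what actually happened.

p_silver = {"clinton": 0.65, "trump": 0.35}
p_wang = {"clinton": 0.99, "trump": 0.01}

def bayes_factor(outcome):
    """Evidence for Silver's model over Wang's from one outcome (>1 favors Silver)."""
    return p_silver[outcome] / p_wang[outcome]

print(bayes_factor("clinton"))  # ~0.66: mild evidence for Wang
print(bayes_factor("trump"))    # ~35: strong evidence for Silver
```

So a Clinton win nudges the odds toward Wang by a factor of about 1.5, while a Trump win swings them toward Silver by a factor of 35.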
Thanks. For some reason he has rounded off the margins, but it should still be sufficient to compare to the other models. Huffpo also has projected margins, so we should be able to compare all four main models.
Can’t find any projections on Predictwise. Incidentally that site is such garbage when it comes to visual design. They have a unique dataset so I wish they would partner with a bigger site to present it properly.
Maybe there are legitimate concerns about the models Silver is using, but that HuffPo article was an unwarranted hit job and I don’t blame Silver for getting pissed. The media seem to be turning on him because they don’t like Trump’s chances, not because they have a more in-depth understanding of polling statistics than Silver does.
Some speculative thoughts about modelling this election:
Trump is a unique candidate. He alienates and enthuses different parts of the electorate in unique ways, including people who don’t vote that often. This increases the possibility of polling errors and general uncertainty, which helps him since he is behind: increased uncertainty means both a higher probability of a narrow Trump win and of a Hillary landslide.
However, Trump’s uniqueness also interacts with a diverse set of swing states, and this reduces the correlation across states. For example, you can imagine that polls undercount some rural whites, which helps him in Iowa, but also undercount Hispanics, which hurts him in Nevada. Because of the nature of the map, this lower correlation helps Hillary, because she only needs to win 1-2 key states to win.
In a nutshell this election may see bigger polling errors which helps Trump but also less correlated errors which hurts him. Roughly the two effects may cancel each other at least in terms of the overall result.
Even if all models agree that the mean expected result is (say) a 4% lead for Hillary in the popular vote, they may still assign probabilities to the less likely outcome anywhere in the wide observed range of 35% to 1%, based solely on differences in variance under the models.
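That mean-versus-variance point is easy to illustrate with a toy normal model of the national margin. The sigma values below are my own illustrative picks, not anything the models have published:

```python
# Toy model: Clinton's popular-vote margin ~ Normal(mean_lead, sigma^2).
# Same 4-point mean lead throughout; only the spread (sigma) changes.
from math import erf, sqrt

def p_trump_win(mean_lead=4.0, sigma=2.0):
    """P(margin < 0), i.e. the normal CDF evaluated at zero."""
    return 0.5 * (1 + erf((0 - mean_lead) / (sigma * sqrt(2))))

for sigma in (1.7, 5.0, 10.5):
    print(f"sigma={sigma}: P(Trump win) ~ {p_trump_win(sigma=sigma):.3f}")
```

With these sigmas the Trump probability runs from roughly 1% up to roughly 35%, spanning the whole Wang-to-Silver range, with the mean held fixed at a 4-point Clinton lead the entire time.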
I find the extremely low variance in Wang’s model completely implausible. I don’t know enough about US politics to know if Silver’s much higher variance is correct, but I certainly hope he’s calibrated too high. The weight of money in the betting market is somewhere in between.
Again, in the non-probabilistic terms average people think in: if Clinton wins, Wang will look good and Silver not especially bad, as most people interpret a majority probability as ‘Silver says Clinton will win’ and Silver fans won’t argue to the contrary. If Trump wins, Wang will look bad and Silver much less so, as at that point Silver fans will emphasize that he gave Trump a pretty good chance.
Also, though I read it not directly from Wang but on this thread, he has apparently, verbally at least, equated 99% with ‘will happen’: Trump will not win. He won’t get far saying ‘it was a 1 in 100 chance; well, that was the 1!’
On evaluating the models, I’d reiterate that the probabilities can’t be strongly tested; there just aren’t anywhere near enough independent outcomes for anything but ‘weak evidence’. IMO we will just never know if a 35% chance for Trump is quite close to, or perhaps fairly far from, his real chance.
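One way to put a rough number on ‘not enough independent outcomes’: compute the expected log-likelihood ratio per election between the two forecasts, under some assumed ‘true’ win probability. The 80% below is a purely made-up truth, just for illustration:

```python
# Sketch: how many independent elections would it take, on average, to
# accumulate decisive evidence between a 65% forecaster and a 99% forecaster?
from math import log

def expected_log_lr(p_true, q_a, q_b):
    """Expected per-election log evidence for forecaster A over B,
    assuming the true win probability is p_true."""
    return p_true * log(q_a / q_b) + (1 - p_true) * log((1 - q_a) / (1 - q_b))

# Hypothetical: Clinton's "real" chance is 80%; Silver says 65%, Wang says 99%.
per_election = expected_log_lr(0.80, 0.65, 0.99)

# Elections needed, on average, to reach 20:1 odds either way.
print(abs(log(20) / per_election))
```

Under these made-up numbers it works out to roughly eight presidential elections, i.e. around three decades, before the evidence would be decisive; any single Tuesday is necessarily weak evidence.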
But the problem can be thought of as a mean-variance one. It’s relatively more testable whether Silver’s and others’ modifications of raw poll averages (like RCP’s*) are correct. There’s also some correlation there among states; stuff like Clinton’s ground game would carry across most states, contested ones anyway. But other things would be more state- or region-specific (like which particular unreliable polls in the RCP average get filtered or modified by analysts like Silver, or states with more or less demographic potential for a hidden non-college white vote).
But the variance part, which is a big part of calculating the probabilities, is very fuzzy IMO. Silver’s 35% chance for Trump seems closer to me, just seat of the pants, than 1%. For example, Silver puts NH at 61% likely for Clinton, but none of the polls still in the RCP average in NH have Clinton leading (two ties, three Trump leads). I’m not saying Silver’s wrong, but that to me is evidence of uncertainty, and he’s the uncertainty guy, generally.
*of course nothing is completely unfiltered since there are polls which don’t get included by RCP. But they don’t actively/dynamically filter as much or modify results. IMO it’s good to have sources which don’t as well as good to have ones which do.
Just to emphasize another aspect of the point that Lantern made:
If the 35%/1% difference between Silver & Wang is principally attributable to different variance rather than different mean, then Silver must also assign a higher probability to the outlier at the other end of the distribution, a Clinton landslide. But I can’t find numbers on that on either website, can anyone help?
If so, that would mean a Clinton landslide will be evidence in favor of Silver’s model. But most people will assume the opposite.