There are many differences between Nate Silver’s approach and Sam Wang’s. Silver applies weightings to polls based on the pollster’s historic accuracy and how sound he thinks their methodology is. He adjusts for pollsters’ “house effects”. He projects older polls forward in time based on a trendline adjustment. He combines each state’s polling average with regression estimates based on region and demographics, giving more weight to the regression in states with less polling. So far as I know, Wang does none of these things, pretty much taking the polls as is. (I don’t mean this as a criticism of Wang – there are certainly some reasons to favor simplicity.)
However, I’d like to mostly ignore those differences and focus on something else – how they estimate uncertainty. Wang starts with an estimate of where things stand today that has very little uncertainty, and then assumes random drift between now and election day, the magnitude of which is based on historical data. He also applies a Bayesian prior based on the long-term state of the race. Silver runs simulations with several random errors added in: one representing a systematic error in the polling nationwide, others representing systematic error in the polling of particular regions or of states with particular demographics, and others representing error in the polling of each individual state. The magnitude of these errors depends on the time until the election plus various other factors, like the number of undecided voters and the size of the state’s population.
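To make the distinction concrete, here is a rough toy simulation of why a shared (systematic) error term matters so much. This is not Silver's or Wang's actual code. The ten-state map, the 3-point leads, and the error sizes are all made-up numbers, and the function name is hypothetical. It only shows the general idea of adding one nationwide error draw on top of independent per-state errors.

```python
import random

def win_probability(state_margins, electoral_votes, national_sd, state_sd,
                    n_sims=20000, seed=0):
    """Share of simulations in which the poll leader wins a majority of electoral votes.

    All inputs are hypothetical. Margins and standard deviations are in
    percentage points. An exact tie in electoral votes is not counted as a win.
    """
    rng = random.Random(seed)
    total = sum(electoral_votes)
    wins = 0
    for _ in range(n_sims):
        # One systematic error shared by every state in this simulated election
        shared = rng.gauss(0.0, national_sd)
        votes = 0
        for margin, ev in zip(state_margins, electoral_votes):
            # Each state also gets its own independent error
            if margin + shared + rng.gauss(0.0, state_sd) > 0:
                votes += ev
        if votes * 2 > total:
            wins += 1
    return wins / n_sims

# Made-up toy map: ten equal-sized states, each polled at +3 for the leader
margins = [3.0] * 10
evs = [10] * 10

independent_only = win_probability(margins, evs, national_sd=0.0, state_sd=3.0)
with_shared = win_probability(margins, evs, national_sd=3.0, state_sd=3.0)
print(independent_only, with_shared)
```

With only independent state errors, the misses mostly cancel out across states and the leader almost always wins. Adding a shared error of the same size should pull the win probability down noticeably, because a single bad draw moves every state in the same direction at once.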
But even on election day, the error terms in Silver’s model can be far from zero. We can see this from his Now-cast, which sets the date of the election to today, and which currently gives Clinton only a 67% chance of victory despite showing her ahead nationally by 3.5 percentage points.
Wang’s snapshot also addresses “who would win an election held today”, and shows Clinton with something like a 99% chance of victory. Wang deemphasizes this percentage (he doesn’t state it explicitly, but notes you can get it by summing the histogram), because after all the election isn’t today and he presumably doesn’t want to give the false impression that he’s saying Trump has essentially no chance. Likewise, Silver doesn’t treat the Now-cast as a prediction.
That said, I think the difference is important, because it tells me something about how much I should trust the predictions they will make once the week of the election arrives. At that point, there’s almost no time left for drift, meaning Wang’s prediction and snapshot should converge, and likewise Silver’s Now-cast and Polls-Only model should converge. (His Polls-Plus model, which considers fundamentals, should also converge to the others, since the weight it places on fundamentals goes to zero as the election approaches.)
So who to trust? Is it really reasonable to think that (given no time left for drift) the polls give us 99% certainty of the outcome? On the other hand, is it really reasonable to think that if one candidate is down 3.5 points nationwide in the polling aggregate come election day, they still have a 1 in 3 chance of victory?
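One back-of-the-envelope way to frame the question: if the final error in the margin were normally distributed, how large would its standard deviation have to be for a 3.5-point lead to translate into each of those win probabilities? This is a deliberate oversimplification. It treats the national margin as deciding the winner and ignores the Electoral College, it is not how either Silver or Wang computes anything, and applying the same 3.5-point lead to the 99% figure is my own assumption for the sake of comparison. The function name is hypothetical.

```python
from statistics import NormalDist

def implied_error_sd(lead_points, win_probability):
    """Standard deviation of a normal error on the margin that would make
    a lead of `lead_points` correspond to `win_probability`.

    Solves P(lead + error > 0) = win_probability for the error's sd,
    i.e. sd = lead / z, where z is the standard normal quantile.
    """
    z = NormalDist().inv_cdf(win_probability)
    return lead_points / z

# A 3.5-point lead with a 67% chance of winning (the Now-cast figures above)
silver_sd = implied_error_sd(3.5, 0.67)

# The same lead if it carried a 99% chance of winning (assumed for comparison)
wang_sd = implied_error_sd(3.5, 0.99)

print(round(silver_sd, 1), round(wang_sd, 1))
```

Under these assumptions, the 67% figure implies an error spread of roughly 8 points on the margin, while the 99% figure implies about 1.5 points. That gap is really what the two questions above are asking about.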
(DSeid, I know we had some back and forth on this in a previous thread. I appreciated your take on things. Please don’t think I ignored it just because I’m bringing it up again for further discussion.)