Both have made different (reasonable) choices.
As a result, when a trend starts, Nate picks it up earlier than Sam does. At the same time, Nate’s prediction often wiggles back when the “trend” turns out to be just noise, while Sam’s stays constant throughout.
The preference is wholly based on whether you prefer your trends smoothed or not - statistically, how many false positives on trends are worth picking up the trend a week early? In particular instances (moving from aggregate statistics to individual events), e.g. comparing a single state like Iowa on a particular date, in a case where “Nate says trend, Sam says no movement yet”, all you can really do is wait and see.
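If it helps, here’s a toy sketch of that trade-off (neither model’s actual method; the shift size, noise level, and smoothing constants are all made up): a fast-reacting average catches a real shift sooner but also chases noise, while a heavily smoothed one lags but stays steady.

```python
import random

random.seed(0)

# Toy illustration of the smoothing trade-off (not either forecaster's real method):
# a noisy daily poll margin with a genuine 3-point shift on day 30, tracked by a
# fast-reacting and a heavily smoothed exponential moving average.
true_margin = [4.0] * 30 + [7.0] * 30
polls = [m + random.gauss(0, 2.0) for m in true_margin]   # polling noise, sd = 2 points

def ema(series, alpha):
    """Exponentially weighted average; higher alpha reacts faster but is noisier."""
    out, avg = [], series[0]
    for x in series:
        avg = alpha * x + (1 - alpha) * avg
        out.append(avg)
    return out

fast = ema(polls, 0.30)   # picks up the shift within days, but wiggles on pure noise
slow = ema(polls, 0.05)   # barely wiggles, but takes weeks to reflect the real shift

for day in (28, 33, 40, 55):
    print(f"day {day:2d}: fast {fast[day]:+.1f}  slow {slow[day]:+.1f}  true {true_margin[day]:+.1f}")
```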
And Nate Silver is a sports stats guy. I really doubt Wang would have chosen a simpler model out of convenience. He strikes me as someone who just wouldn’t bother doing it at all if the more complex approach he didn’t have time for was the only way to do it right.
Well, 538 is owned by ESPN and is most definitely still doing baseball and football predictions.
Regardless, most of the extra work is just at the beginning when you build/program your model. I think Wang’s success in prediction is pretty decent and I see no reason to think that he made some lazy man’s version of a prediction model.
538 isn’t just about elections, but as the name itself suggests, that is the focus, and you can bet that in an election year it is Nate’s overwhelming priority. Plus he has people working under him, which I doubt Wang has for his election model. I don’t think it’s unreasonable to assume that more effort has been put into the 538 model. Perhaps they have wasted some of that effort and put in variables that serve no purpose, but that isn’t the way I would bet.
That’s just not the right way of looking at it. He’s not making a model that is “wrong.” He’s making a simple model that is more accurate than it really has any right to be. He doesn’t correct for anything; he assumes the polls will balance themselves out. He scrapes the top polls, not even checking whether they are two-way or three-way.
It’s not lazy to choose a simpler system that’s nearly as good. Since he makes no money off of it and can’t devote his full time to it, it just makes sense to use a less involved model.
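In rough code, the kind of simple, correction-free approach being described might look something like this. It’s only a sketch of the idea, not Wang’s actual method; the normal approximation and the 3-point uncertainty floor are stand-ins I picked for illustration.

```python
import statistics
from math import erf, sqrt

# Sketch of a simple, no-corrections state aggregation (not Wang's actual code):
# take the latest polls at face value, use the median margin, and turn it into
# a win probability.
def state_win_prob(recent_margins):
    """recent_margins: Clinton-minus-Trump margins from the latest polls in one state."""
    med = statistics.median(recent_margins)
    # Uncertainty of the median, floored at 3 points so a tight cluster of polls
    # isn't over-trusted (the floor is an arbitrary choice for this sketch).
    sem = max(statistics.stdev(recent_margins) / sqrt(len(recent_margins)), 3.0)
    return 0.5 * (1 + erf(med / (sem * sqrt(2))))   # P(true margin > 0), normal approx

print(state_win_prob([+4, +6, +3, +7]))   # clear lean: ~0.95
print(state_win_prob([+1, -2, +2, -1]))   # toss-up: 0.5
```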
Having built a few of these sorts of models myself, I’d say the extra effort of adding and calibrating a few more terms in a predictive model is fairly trivial. The differences here almost certainly come from differing opinions between two data scientists (both have posted critiques of each other’s methods) rather than any imbalance of effort.
The time-consuming parts are data scraping/entry (both of them do this) and results delivery. Where you really see Nate’s budget and time is in his fancy interactive website, which lets you drill down into each state and has many cool visualization features, while Sam has a couple of minimal charts and offers a downloadable .csv file if you want his internals.
I suspect there is a difference in the quality of the data sets. I wouldn’t be surprised if 538 has a data set of every single presidential election poll ever conducted in the US. I don’t know if Sam would have more than the last few elections.
BTW have either of them discussed this specific issue about using national polls? It seems to me the kind of question where there is a lot of high quality data which should deliver an unambiguous answer. Either national polls have some predictive value for state polls or they don’t.
You know they aren’t sitting there working things out with slide rules, right? A more involved model is a couple of lines of code. At this point it’s low-volume data entry when a poll comes out. He chose his method because he thought it was as good, not “nearly as good”; it just makes less exciting graphs.
538’s added a little thing to the Updates section, which shows how each update changes the forecast. So here’s the last batch of updates:
NATIONAL Clinton +10
NATIONAL Clinton +9
LA TIMES TRACKER Tie (Clinton +5 w/ house effect)
NEW YORK Clinton +23
KANSAS Trump +11
ARIZONA Clinton +6 (Clinton +4 w/ house effect)
The result of these updates is… Clinton’s chances drop a tenth of a percent.
Honestly, I don’t get it. Nothing about these polls suggests Trump’s chances improved.
I would venture to guess that the Kansas and NY numbers are such that the model believes they are consistent with a lower national margin, and are also impacting states with similar demographics. So the Kansas result drags down Clinton’s chance in Missouri and Iowa slightly, or something, and Trump wins an extra one time out of a thousand when the model runs.
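If you want to see how that could shave a sliver off the headline number, here’s a toy version of the mechanism I’m guessing at (not 538’s actual code; every number here is invented):

```python
import random

random.seed(1)

# Guess at the mechanism, not 538's model: if part of each state's error is shared
# with demographically similar states, a weaker-than-expected Kansas result nudges
# the Missouri and Iowa means down a touch, and Clinton's simulated win probability
# ticks down slightly.
def clinton_win_prob(mo_mean, ia_mean, n=20000):
    wins = 0
    for _ in range(n):
        shared = random.gauss(0, 3)                    # error common to similar states
        mo = mo_mean + shared + random.gauss(0, 4)
        ia = ia_mean + shared + random.gauss(0, 4)
        wins += (mo > 0) or (ia > 0)                   # toy stand-in for a winning map
    return wins / n

print(clinton_win_prob(1.0, 2.0))              # before the Kansas poll
print(clinton_win_prob(1.0 - 0.3, 2.0 - 0.2))  # after nudging the similar states down
```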
And I don’t know that even Nate could describe the precise causality of a specific set of polls’ impact on the model. Just that before he entered those polls the model spat out one result, and after it spat out the other.
OTOH who says the underlying statistical process of this election is at all similar to those decades ago, or even recently?
As I’ve said, I think Silver adds value with things like tracking and rating polling firms’ recent errors, the ease of seeing various data directly on the webpages in intuitive form, and perhaps the ‘trend adjustments’ of stale state polls based on national polls, etc. I don’t think the actual % likelihoods spit out by the model are particularly valid, though. There’s just not enough data, again from what can reasonably be presumed a stationary underlying distribution over time, to test whether a ‘15% winning probability’ really means something close to 15%, as opposed to just ‘this person is fairly far behind at this point’, which the adjusted poll results can also tell you.
And again with regard to history, I think one is just as well off going by the seat of one’s pants to say a 6.9% margin in the RCP average presidential poll is bigger now than it would have been in a less diverse/polarized electorate decades ago, and bigger now than it would have been several months before the election. I’m just not sold that quantification beyond that is very meaningful, and I see too many cognitive errors people make in considering the supposed % likelihoods (e.g. the polls could have some Colombia-peace-vote-type error, missing an anti-establishment wave, but there’s no way to predict that; you just wait less than three weeks and see).
Who knows? As of right now, the two most recent updates were from Vermont and Virginia. The Virginia poll had Clinton up by 9, which 538 adjusted to up by 10. Either one is right in line with (actually, a hair above) the 8.7% lead that they’ve got Clinton at in Virginia.
But they adjusted Clinton’s win probability down 0.5% as a result. How can that make sense?
Then they upped Clinton’s win chances by 0.4% due to the Vermont poll. Sure, it shows her up by 28 in Vermont (adjusted to 29), but 538 shows Clinton up by 29.1% there overall.
They generate their percentages Monte Carlo style by doing a bunch of simulated elections, right? So even without new polling data they’d get a slightly different result by re-running the process.
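Right, and you can see the size of that re-run wobble with a toy version of the resampling (a single normal draw stands in for the simulated electoral-vote total; the 350/60 numbers are just picked to give roughly a 90% win chance, nothing from 538’s internals):

```python
import random

# Same inputs, simulated three times: each pass of 20,000 "elections" lands on a
# slightly different win probability, purely from sampling noise.
def run(n=20000):
    wins = sum(random.gauss(350, 60) > 270 for _ in range(n))   # ~90% chance of >270 EVs
    return wins / n

print(run(), run(), run())   # three passes, three slightly different answers
```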
Good point, borschevsky. There’s an inherent error in Monte Carlo computations. His 20,000 samples is not a large number. I’d rather see billions instead of thousands.
538 really needs to show an error range on the chance of winning, just like they do for electoral votes and popular votes. It’s likely they compute it but decide not to show it. Probably because it’s on the order of 20% and would undermine their presentation.
I just eyeballed the results of his Monte Carlo draws, and simulated 20,000 draws from a t-distribution (what he uses) centered around 350 with a probability of 90% of being more than 270 (about where things are right now).
Doing this many times gives me a standard deviation of 0.2% in win probability; his is likely a little noisier, as the actual distribution he draws from isn’t smooth.
So I’d say drop the tenths of a percent he’s reporting and you’re outside of the Monte Carlo noise range.
Which, er, is exactly the same as a 20,000-draw binomial sample centered around 90%; overcomplications-R-us over here… point being, those individual +/- 0.4% swings in each update can’t be distinguished from method noise.
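For the record, the binomial back-of-envelope behind that last line:

```python
from math import sqrt

# Run-to-run noise in a win probability estimated from n simulated elections is
# roughly sqrt(p*(1-p)/n).
p, n = 0.90, 20_000
print(sqrt(p * (1 - p) / n))   # ~0.0021, i.e. about 0.2 percentage points per run
```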