If there is anything off about the models this year, I’d think it’s the ‘likely voter’ model. This year’s election is so different, and with Covid and riots and whatnot going on, does anyone really have a good handle on how all this will change voter turnout?
Brier score, and sure it can.
Looking at the last series of polls that have come in, I think Biden’s chances have increased a little. Based on current polling, Biden seems slightly more likely to win Ohio than Trump at this point. I’ve got Biden currently standing at about 309 EVs, but it’s tight in a lot of states.
OK, I have a model which over 10,000 trials has an average Brier score of 0.25. Is that a good model or a bad model?
Answer:
It depends.
Good model:
I always report 0.5 as my prediction of the likelihood that a coin will come up heads.
Bad model:
I always report 0.5 as my prediction that it will snow tomorrow, regardless of the date.
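To put code to that (a minimal sketch with stand-in numbers of my own: a fair coin, and a town where it never snows), both models average a Brier score of exactly 0.25, even though the first is optimal and the second is easily beaten by just predicting zero:

    import random

    def brier(forecast, outcome):
        # Brier score for one binary event: squared gap between the stated
        # probability and what actually happened (1 or 0)
        return (forecast - outcome) ** 2

    trials = 10_000
    # "good" model: always say 0.5 about a fair coin
    coin = sum(brier(0.5, random.random() < 0.5) for _ in range(trials)) / trials
    # "bad" model: always say 0.5 about snow where it never snows
    snow = sum(brier(0.5, 0) for _ in range(trials)) / trials
    print(coin, snow)  # both print exactly 0.25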
In both cases your model has no more value than randomly picking, with equal weight, from two outcomes. Tautologically.
We can tell from 538’s Brier score that it is more valuable than guessing at random.
Better than picking at random is a pretty low bar. And in the first case, picking at random was the “right” answer with the optimal Brier score. You can have a much lower Brier score and still be picking at random. For example, I could guess the likelihood that it would snow as 0.4 in the summer and 0.6 in the winter. That will give a score lower than 0.25, but it will still be a pretty crappy predictor, and will in fact do worse than one that just estimates the likelihood of snow as zero throughout the year. You can’t know what is or is not a good Brier score unless you know how difficult the problem is.
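To put rough numbers on that example (my assumption, purely for illustration: it snows on 20% of winter days and never in summer):

    def expected_brier(forecast, p):
        # expected squared error when the event actually occurs with probability p
        return p * (forecast - 1) ** 2 + (1 - p) * forecast ** 2

    # half the year "summer" (never snows), half "winter" (snows 20% of days)
    seasonal = (expected_brier(0.4, 0.0) + expected_brier(0.6, 0.2)) / 2
    never_snows = (expected_brier(0.0, 0.0) + expected_brier(0.0, 0.2)) / 2
    print(seasonal, never_snows)  # 0.24 vs 0.10

The seasonal guesser beats 0.25, but the dead-simple “it never snows” model crushes it, which is the point: the score only means something relative to how hard the problem is.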
The big question regarding Nate’s predictions is how large his confidence bars are. I think they are probably about right: the ones he says he doesn’t know really are up to chance, and the ones he says he’s confident about he’s likely to get right. But in order to evaluate his performance, you really need to ask: better than what?
Now of course if you have a model that frequently picks ones and zeros and has an average Brier score of 0.01, you can assert that you have got an awesome model. But that is why I specified in my statement that you couldn’t necessarily determine the value of a model by looking at its score alone.
I doubt there are many truly stochastic aspects to an election - people don’t actually roll a die to decide how to vote. So assuming that the process is fully deterministic, the confidence intervals and probabilities in the models represent either data that is inaccessible or imperfections in the model. Does Nate Silver talk explicitly about what he thinks is the source of the uncertainty in his models?
But random events affect turnout. A great number of people are eligible to vote but don’t, and the numbers aren’t the same year to year. There are going to be a few who would have voted if their car hadn’t broken down, their kid (or they themselves!) hadn’t gotten sick, or they’d faced a mild day instead of hours standing outside in pouring rain. With mail-in voting you can eliminate those, but you add in all the postal questions.
Look, here’s the thing. We Teeming Millions — okay, Dwindling Dozens — are a sophisticated bunch, with our Brier scores and whatnot. We can get away with using words like “wrong” in this context, because we get the nuances.
But I’d rather we didn’t, because a LOT of people DON’T get it. They actually think “50%” is some magic threshold. If a 538 model predicts Candidate A has a 51% chance of winning, and he does, the model (to these uninformed folks) was “right” — and if he loses, it was “wrong.”
If a butterfly flapped its wings and the same candidate had a 49% chance, their reactions to the actual outcome would be reversed. Obviously, this is absurd, but they don’t get that.
Why should we care? Do I give a darn about protecting Nate Silver’s reputation? Of course not. It’s because this wrong (yes, truly wrong) understanding of probabilistic modeling might DISCOURAGE A FEW PEOPLE IN KEY STATES FROM BOTHERING TO VOTE.
Yes. In the podcasts, he says the big sources the model DOES account for are: 1. COVID (mainly how it affects actual voting), and 2. The high number of mail-in ballots (partly, in that people and places not used to this might mess things up a little, not intentionally).
The possible sources he explicitly DOESN’T try to account for are deliberate vote-suppressing or vote-destroying shenanigans.
Well, car breakdowns and kids getting sick are random, but surely extremely unlikely to be significant. Weather affecting overall turnout is something I hadn’t thought of - that’s definitely a significant random element this far out, although from about a week out it could be modeled.
I think the postal questions are not really random, just unknown.
I suspect that this is a big part of that. People who don’t understand statistics and modeling well (which is, honestly, nearly everyone) may interpret the “77%” number that 538 is currently giving Biden (which is, in full, “a 77% chance that he will win the election”) as “we predict that Biden will win 77% of the votes.”
At some level nothing (other than perhaps quantum effects) is stochastic, but it’s impossible to account for every variable in a model. For example, Nate’s model doesn’t account for the possibility that it rains on election day in Pittsburgh, so people don’t feel like going out. It doesn’t really matter exactly where the randomness comes from; in the end it all gets wrapped up into a Big Bundle of Uncertainty (BBU) that permeates the model. All that matters is accurately estimating how large the BBU is and what its structure looks like. A good model will have used all the available information well, such that this BBU is as small as possible, and will estimate its magnitude well. A poor model will either have a larger BBU than is necessary, or over/underestimate its magnitude. Mistakes in any of these directions will increase your Brier score.
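As a toy illustration of that last point (every number here is invented), a forecaster that misjudges the size of its own BBU in either direction scores worse than one that sizes it correctly:

    import random
    from statistics import NormalDist

    random.seed(0)
    TRUE_SIGMA = 3.0  # actual spread of the unmodeled election-day error, in points

    def avg_brier(assumed_sigma, trials=20_000):
        total = 0.0
        for _ in range(trials):
            lead = random.gauss(0.0, 5.0)          # the polling lead in some race
            shock = random.gauss(0.0, TRUE_SIGMA)  # the BBU: everything the model missed
            won = lead + shock > 0
            # forecaster's stated win probability, given its assumed BBU size
            p_win = NormalDist(0.0, assumed_sigma).cdf(lead)
            total += (p_win - won) ** 2
        return total / trials

    for s in (1.0, 3.0, 9.0):  # under-, correctly-, and over-estimated BBU
        print(s, round(avg_brier(s), 4))  # the middle (true) value scores lowest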
Part of the uncertainty relates to the accuracy of the polling data. There is of course uncertainty due to the fact that each poll is made up of a random sample and so has a confidence interval associated with it. But different polling operations also conduct their polls in different ways, which may result in different levels of bias and uncertainty. By comparing polling agencies over time, Nate tries to estimate this bias and uncertainty and build it into how his model uses the data. So Nate adjusts the data from the polls before they enter his model, based on how they trend relative to other polls.
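Mechanically, the adjustment step can be as simple as subtracting each firm’s estimated lean before averaging; the firms and numbers below are invented, and 538’s real method is considerably fancier:

    # (firm, Biden share in its poll, firm's estimated house lean toward Biden)
    polls = [("Firm A", 51.0, +1.5), ("Firm B", 48.0, -2.0), ("Firm C", 50.0, +0.5)]

    adjusted = [share - lean for _, share, lean in polls]
    print(sum(adjusted) / len(adjusted))  # house-effect-adjusted polling average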
The second bit of uncertainty is that the political landscape could change over time. If the polls were set in stone, such that the polls today were going to be identical to the polls on election day, then there would be no point in running a campaign. So there is time-dependent variability that makes the model more uncertain the further it is from the election. I’m not sure whether Nate’s estimation of this effect takes into account the unusual stability of Trump’s poll numbers relative to past presidents, but if not, then he may be overestimating this effect.
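One common toy way to model that time dependence (not necessarily what Nate does) is as a random walk, where the spread of plausible election-day margins grows with the square root of the days remaining:

    DRIFT = 0.15  # invented: points of random-walk drift per sqrt(day)

    for days_out in (1, 30, 90, 180):
        sigma = DRIFT * days_out ** 0.5
        print(days_out, round(sigma, 2))  # far-out forecasts carry much wider error bars

If Trump’s numbers really are unusually stable, the fix under this framing is just a smaller drift constant.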
Then there is just the overall BBU related to who actually turns out on election day and how accurate all of the other modeling assumptions were. As well as estimating the magnitude of this, you also need to estimate its correlation structure. Any unaccounted-for effect that shifts Ohio in Biden’s favor is also likely to shift Georgia in his favor. These correlated effects make the uncertainty of the overall system larger than the sum of its parts. So modeling that structure is important as well.
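Here’s a crude two-state simulation (leads, error sizes, and the correlation are all invented) of why that structure matters: the chance of the polls being wrong in both states at once is much higher when the errors share a common component:

    import random
    random.seed(1)

    def p_both_flip(rho, lead=1.0, sigma=3.0, trials=100_000):
        # chance that two states each leaning +1 point both break against the leader
        flips = 0
        for _ in range(trials):
            shared = random.gauss(0, 1)  # national-level surprise, felt everywhere
            e_oh = sigma * (rho * shared + (1 - rho**2) ** 0.5 * random.gauss(0, 1))
            e_ga = sigma * (rho * shared + (1 - rho**2) ** 0.5 * random.gauss(0, 1))
            if lead + e_oh < 0 and lead + e_ga < 0:
                flips += 1
        return flips / trials

    print(p_both_flip(0.0), p_both_flip(0.8))  # ~0.14 independent vs. ~0.25 correlated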
Finally, all of this is slapped into a Bayesian framework of “fundamentals”. If you are asked what the high temperature is going to be in Phoenix on July 4, 2021, none of your weather models are at all useful, and so your best guess would be to go with the historic average temperature on that date. However, as we get closer to the date, you can start paying attention to the local weather patterns, until on July 3, 2021 you entirely ignore the historical data and rely entirely on the recent weather patterns. Nate’s model does the same thing, using such things as incumbency and the state of the economy to develop a baseline model that doesn’t rely on polling data. In the early part of the campaign, when the polling BBU is large, more weight is given to the fundamentals model. But as we get more state polls and get closer to election day, the model relies more on the polling and less on the fundamentals.
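In sketch form, that hand-off is just a weight that slides from the fundamentals prior toward the polling average as election day approaches (the weighting function and constants below are mine, not 538’s):

    def blended_forecast(fundamentals, polls, days_out, k=0.02):
        # weight on the polls grows toward 1.0 as days_out shrinks toward 0
        w_polls = 1.0 / (1.0 + k * days_out)
        return w_polls * polls + (1.0 - w_polls) * fundamentals

    for d in (300, 120, 30, 0):
        print(d, round(blended_forecast(51.5, 49.0, d), 2))  # slides from ~51.1 to 49.0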
FiveThirtyEight has Biden at 81% likely to win the presidency, the highest he’s been since June 1. He’s been on a mostly steady upward slope since September 1. I think the movement is mostly due to the time-based error slowly decreasing as we get closer to the election: there’s less time for something to change the dynamics of the race.
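You can see that mechanic in a toy calculation (lead and error sizes invented): hold the polling lead fixed and just shrink the remaining-time uncertainty, and the win probability climbs on its own:

    from statistics import NormalDist

    LEAD = 7.0  # assume a fixed national polling lead, in points

    for days_out in (60, 30, 14, 1):
        sigma = 3.0 + 0.15 * days_out  # invented: more days left, more room to move
        p_win = NormalDist(0.0, sigma).cdf(LEAD)
        print(days_out, round(p_win, 3))  # rises steadily as election day nears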
The Economist’s model likewise shows a steady improvement for Biden.
I’m still nervous about the election. Trump staying president at 1 chance out of 5 is frighteningly likely. But my hope is feeling more rational now.
Yeah, from Nate’s update last week:
Furthermore, the mere passage of time helps Biden in our model, because every day that Trump doesn’t gain ground is a day when his fate becomes slightly more sealed. (Lots of people have already voted!) Case in point: In an election held today — Trump has no more time to make up ground — his chances would be 9 percent, not 21 percent, according to our forecast.
Me too. I’m confident he is likely to lose, but the cheating and suppression still scare me.
FiveThirtyEight has their House model out. Their best estimate is 237 (+2) seats to the Democrats. The Economist estimates 241 (+6).