Doubting the statistical analysis at 538

Looking at the list of swing states, the electoral votes in play, and the probabilities of the candidates to win each state – the 27% chance of a win they are giving Trump in the overall election seemed wildly optimistic to me. A 1-in-4 chance is not bad at all. So I decided to do my own quickie analysis in a spreadsheet and surprise surprise, my result is totally different.

I labeled these as the swing states: NV, AZ, IA, OH, PA, NH, NC, FL

I gave MO and GA to Trump, and CO and VA went to Hillary.

I used a random number generator to determine the winner of each swing state, comparing it to the chance of a Clinton win in that state, and then applied the electoral votes to the total.

I only ran 200 scenarios (538.com claims they run 10,000 and I’m sure they include all the states), but out of all my scenarios Trump only won twice, for a 1% chance to win overall. And looking at the numbers, this makes sense.

Clinton has 249 electoral votes in the bag. Trump would have to nearly sweep the swing states to win it, and with Clinton having well over 50% chance to win in some of the swing states, that’s extremely unlikely. 1% chance of a Trump win seems much more realistic than a 27% chance, given the electoral map and current swing states.
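For anyone who wants to try the spreadsheet approach described above, here is a minimal sketch of it in Python. The electoral vote counts are the 2016 apportionment. The FL, PA, and OH chances are taken from the 538 polls-only odds quoted later in this thread; the other per-state chances are placeholders, since the post does not list the numbers it used. The names `SWING_EV`, `P_CLINTON`, and `trump_win_rate` are made up for illustration, and the sketch assumes every electoral vote Clinton does not get goes to Trump.

```python
import random

# Electoral votes for the eight swing states named above (2016 apportionment).
SWING_EV = {"NV": 6, "AZ": 11, "IA": 6, "OH": 18,
            "PA": 20, "NH": 4, "NC": 15, "FL": 29}

# Chance of a Clinton win per state. FL, PA, OH are 1 minus the Trump odds
# quoted in the thread; the rest are HYPOTHETICAL placeholders.
P_CLINTON = {"NV": 0.60, "AZ": 0.40, "IA": 0.45, "OH": 0.63,
             "PA": 0.74, "NH": 0.70, "NC": 0.60, "FL": 0.64}

CLINTON_SAFE_EV = 249  # "in the bag" figure from the post
TOTAL_EV = 538
EV_TO_WIN = 270


def trump_win_rate(p_clinton, trials=200, seed=0):
    """Share of scenarios Trump wins, drawing every state independently."""
    rng = random.Random(seed)
    trump_wins = 0
    for _ in range(trials):
        clinton_ev = CLINTON_SAFE_EV
        for state, ev in SWING_EV.items():
            # One random number per state, compared to Clinton's chance there.
            if rng.random() < p_clinton[state]:
                clinton_ev += ev
        # Assumption: all remaining votes are Trump's. A 269-269 tie is
        # not counted as a Trump win.
        if TOTAL_EV - clinton_ev >= EV_TO_WIN:
            trump_wins += 1
    return trump_wins / trials


if __name__ == "__main__":
    print(trump_win_rate(P_CLINTON, trials=200))
    print(trump_win_rate(P_CLINTON, trials=10000))
```

With only 200 scenarios the estimate is noisy, so it is worth comparing a few seeds or raising `trials` before reading much into a single percentage.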

So who’s right? I’m sure including all the states adds a little randomness to the result, but is that even worth accounting for? And I doubt it could change the probability from 1% to 27%.

Their exact methodology is a black box even with all the descriptions on their web page about how they do it. I’m not sure how you could hope to reconcile your (or anybody else’s) method to theirs.

I also find it odd. But understandable.

Assume Trump wins all of Nevada, Iowa, South Carolina, Missouri, and Georgia (some of which he is not favored to win, and none of which are out of Clinton’s reach). On the other side, consider that Virginia is out of reach for him and assume a loss there. NH too. (Pretty much following UpShot’s graphic here.)

He then still needs to sweep Florida, Pennsylvania, and Ohio. Current 538 polls-only odds are 0.36, 0.26, and 0.37, which comes to a 3 to 4% chance of doing all three … if all behaved as independent events.
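Spelled out, that figure is just the product of the three quoted odds, and it holds only under the independence assumption:

```latex
P(\text{FL} \cap \text{PA} \cap \text{OH}) = 0.36 \times 0.26 \times 0.37 \approx 0.0346 \approx 3.5\%
```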

That’s the kicker though. They wouldn’t behave as independent events. Silver assumes a high probability of states moving together. As stated in the explanation page: “If a candidate beats his polls to win Ohio, there’s a good chance he’ll also do so in Pennsylvania.”

Hence, if circumstances put him on the winning side of his 0.37 odds in Ohio, there is a good chance he will also land on the winning side of his 0.26 odds in Pennsylvania, and a high likelihood of the same with his 0.36 odds in Florida.

Your simulation assumes that each state’s outcome is independent of other states. Nate Silver’s model assumes that they are correlated to some degree.
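To see how much the independence assumption matters, here is a rough sketch that keeps each state's quoted odds fixed but lets the three outcomes share a common factor. This is not 538's method (which is not public in detail); it is a generic one-factor model, and `rho`, the correlation between the states' latent polling errors, is a made-up knob for illustration.

```python
import random
from statistics import NormalDist

# Trump's chances as quoted in the thread (538 polls-only at the time).
P_TRUMP = {"FL": 0.36, "PA": 0.26, "OH": 0.37}


def sweep_probability(p_trump, rho, trials=20000, seed=0):
    """Estimate the chance Trump wins every listed state.

    rho is a HYPOTHETICAL correlation (0 to 1) between the states'
    latent scores; rho=0 reproduces fully independent states.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Trump takes a state when its latent score falls below this cutoff,
    # so each state's individual chance stays equal to the quoted odds.
    cutoffs = [std_normal.inv_cdf(p) for p in p_trump.values()]
    sweeps = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)  # one draw that moves all states together
        scores = [rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
                  for _ in cutoffs]
        if all(score < cut for score, cut in zip(scores, cutoffs)):
            sweeps += 1
    return sweeps / trials


if __name__ == "__main__":
    for rho in (0.0, 0.5, 1.0):
        print(rho, sweep_probability(P_TRUMP, rho))
```

At `rho=0` the estimate should sit near the 3 to 4% independent figure; at `rho=1` the states move in lockstep and the sweep chance approaches the smallest single-state odds, 0.26. Real correlations are somewhere in between, which is the whole argument.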

He’s probably right.

That seems understandable for a polls-plus analysis, but in polls-only, we are starting at a certain % chance for Hillary to win in each state. If you aggregate those probabilities, you get a 1% chance for Trump, not 27%. Mine is a simple stats and math analysis. I really think 538 is just adding a ton of uncertainty as a hedge against future changes. IMO it is misleading to do that and then call it a polls-only approach.

Even the polls-only analysis includes correlation between states. It’s not misleading because it’s more reflective of reality than assuming independence. It’s actually misleading to assume the way people in PA vote is completely independent of the way people in Ohio vote.

Polls-only is still a projection of who will win on Nov 8 (i.e., it’s not the now-cast). The same events that may push Ohio voters towards Trump would also push Pennsylvania voters towards Trump. It seems intuitively obvious that they’re correlated, and the 538 model quantifies the correlation based on historical comps.

Silver’s model also incorporates historical data.

Really, that’s how it works in polls-only too. I gave the exact quote. What is unclear is how tightly correlated the states should be considered and what factor Silver uses.

The significant difference in polls-plus is the addition of a Bayesian prior based on so-called fundamentals.

BTW, the phrase is “covariance” of states, and part of his completely opaque secret sauce is which states his model ties together and by how much.

FWIW Wang’s more transparent take on covariance.

Interesting, thanks!

I still feel like taking into account things like “covariance” is adding possibilities beyond what the polls tell us. You could put a covariance factor into the random numbers by applying a single random modifier to all of your other random results; a modifier of -15, for example, would effectively reduce her chance to win by 15 percentage points in each state. (That would be an extreme example.)
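The single-modifier idea above could look something like the sketch below. It is only an illustration: `shift_sd` (how big the shared modifier typically is, with 0.15 meaning 15 points) is a hypothetical parameter, the function name is made up, and the per-state chances and electoral votes are passed in as dicts keyed by state. It assumes every electoral vote Clinton does not get goes to Trump.

```python
import random


def shared_shift_trump_rate(p_clinton, ev, clinton_safe_ev, shift_sd,
                            trials=10000, seed=0, total_ev=538, ev_to_win=270):
    """Share of scenarios Trump wins when one random modifier hits every state."""
    rng = random.Random(seed)
    trump_wins = 0
    for _ in range(trials):
        # One draw per scenario, applied to all states at once.
        shift = rng.gauss(0.0, shift_sd)
        clinton_ev = clinton_safe_ev
        for state, p in p_clinton.items():
            adjusted = min(1.0, max(0.0, p + shift))  # keep it a valid probability
            if rng.random() < adjusted:
                clinton_ev += ev[state]
        # Assumption: all remaining votes are Trump's; a tie is not a Trump win.
        if total_ev - clinton_ev >= ev_to_win:
            trump_wins += 1
    return trump_wins / trials
```

Setting `shift_sd=0` gives back the plain independent simulation, so running it with 0 and then with something like 0.05 or 0.10 shows how much a shared swing changes the answer. Note that a shared shift like this also nudges each state's overall chance a little once the clamping at 0 and 1 kicks in.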

If you don’t take covariance into account, then you wind up having trials that are (often much) less likely to actually happen being weighted the same as trials that are (often much) more likely to happen.

“The polls” doesn’t just mean “the State polls.” It involves the national polls and demographic polls. A model that sees every state as separate is a less accurate model. In reality they are connected.

The models are not unanimous. The NYT’s model (at the Upshot) gives Clinton an 87% chance, while 538 clocks in at 72%. Here’s the full set of probabilistic aggregators:

NYT 87% Dem.

538 72% Dem.

Daily Kos 74% Dem.

PredictWise 77% Dem.

PEC (Wang) 94% Dem.

Cook Leaning Dem.

Roth.1 Leaning Dem.

Sabato Likely Dem.

The odds for Trump are too high.

Hillary Clinton stands today at 41.5% of the vote in a four-way race. Trump may be doing worse, but with that many undecideds and third-party voters there’s just no way you can put such high odds on Clinton. 75-25 is about right. There are still 10-20% of voters out there that can be persuaded. And Clinton seems to have stopped trying, instead hunkering down.

Perhaps none of you are aware that Clinton’s RCP average lead has been cut in half:

http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton_vs_johnson_vs_stein-5952.html

Clinton +3.2

Once more, with feeling: Not “cut in half,” but rather “returned to the long-term mean.”

Clinton 3 to 4 points up nationally is what one would expect, given everything we’ve seen over the past months, plus what we know about past elections (especially recent ones), plus a dash of what one would expect of the two parties’ relative status given the state of the economy.

It’s only “surprisingly close” if you start throwing things in like “given how much one candidate is so obviously more qualified than the other.”

Nevertheless, a 3 point lead with 10-20% committed to neither candidate is not a 90% chance of winning. Silver, as usual, has it right.

I think in this case, Wang is more likely right. The electoral map is too tough on Trump. And while it’s not proof of anything, a lead at this point has indicated the winner for the last 16 elections, iirc.

I agree that the folks in these polls who have expressed an intention to vote for Gary Johnson should be looked at very carefully. I’m a Hillary supporter, and this is just about the only thing that worries me somewhat.

Wang and Silver had it out in 2014 and Silver took him to school. Wang relies too heavily on the equivalent of a Nowcast. Silver projects into the future, which you kinda need to do 60 days out.