If Romney loses, the Republicans continue their march toward a smaller, more right wing party until they implode.
Then they’ll focus on the popular vote, which is sure to be a whole lot closer.
In fact . . . Imagine the cognitive dissonance among Pubs, if Obama wins the EV but Romney wins the popular vote!
Ok I don’t know if this example will clarify or not but I’ll give it a shot… I realize it might sound out of left field.
But let’s say we have a sample of 8 data points and all I need to do is develop a regression model to predict future data points. The historic data is: (6, 9, 7, 8, 15, 33, 7, 43).
I have developed a model based on the 8 historical data points, namely: 0.1486x^6 - 3.7394x^5 + 36.488x^4 - 174.55x^3 + 426.25x^2 - 495.5x + 217. My model fits the historic data extremely well. In fact, it accounts for 97.41% of the variation in the data.
Unfortunately it didn’t do a very good job predicting the 9th point, which was 17. My model predicted it would be about 600.
But like you said, we’ll just add the new historic data and come up with a better model. The new model is -0.0481x^6 + 1.4798x^5 - 17.671x^4 + 103.23x^3 - 304.38x^2 + 418.92x - 196 and it accounts for about 91% of the variation in the data. Not nearly as good but still a damn good fit.
But does it have any more predictive value or validity than the original model now that we added additional data to factor into the model? Of course not. All my data points were random. But it’s extremely easy to fit a model to it that very closely predicts the historic data, particularly when the training data set is small and you allow a large number of parameters as I did.
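If it helps to see that in code, here’s a quick sketch of the same exercise (I’m assuming Python with NumPy, and using numpy.polyfit as a stand-in for whatever tool actually produced those coefficients):

```python
import numpy as np

# The 8 historic data points from the example above
x = np.arange(1, 9)
y = np.array([6, 9, 7, 8, 15, 33, 7, 43])

# Fit a degree-6 polynomial: 7 parameters for 8 points, so a
# near-perfect in-sample fit is practically guaranteed.
coeffs = np.polyfit(x, y, deg=6)
fitted = np.polyval(coeffs, x)

# In-sample fit looks great...
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
print("in-sample R^2:", 1 - ss_res / ss_tot)

# ...but the extrapolation to the 9th point is nowhere near the actual 17.
print("prediction for point 9:", np.polyval(coeffs, 9))
```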
Don’t get confused by the fact that presidential elections aren’t random and economic data is likely to have an actual impact on it. It’s still trivially easy to over-fit your model to the data. One way to test if you’ve done that when you don’t have an explicit validation set available (i.e. the 2012 election) is with cross validation, but you can’t if you’ve used all your available data to build your model. And you certainly can’t call it accurate just because you managed to fit a model to your historic data each and every time.
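And here’s roughly what that cross-validation check looks like on the toy series above (again just a sketch, assuming Python/NumPy and leave-one-out splits): the in-sample fit keeps improving as the model gets bigger, but the held-out error blows up.

```python
import numpy as np

x = np.arange(1, 9)
y = np.array([6, 9, 7, 8, 15, 33, 7, 43])

# Leave-one-out cross-validation: for each candidate degree, fit on 7 of
# the 8 points, predict the held-out point, and average the squared errors.
for degree in (1, 2, 4, 6):
    errors = []
    for i in range(len(x)):
        train = np.arange(len(x)) != i
        coeffs = np.polyfit(x[train], y[train], deg=degree)
        pred = np.polyval(coeffs, x[i])
        errors.append((y[i] - pred) ** 2)
    print(f"degree {degree}: mean held-out squared error = {np.mean(errors):.3g}")
```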
Sigh.
Of course you have to test it. Otherwise there’s no point to it. Nobody said you don’t test it.
Of course you don’t pick random meaningless variables like vowels in a name. You test strong causal predictors, like the economy. You can do multivariate analysis to see how strong the correlations are and whether they are more likely to be causal.
But
That is my impression/hope, but who knows really. A fear I have is that the GOP will move really far to the right but still win elections.
Sure. But all that means is you need a better model. How else can you predict the next number in a series than by analyzing the trend, though?
Yep.
Are you going to use it to predict the next number?
But that’s exactly what it’s supposed to do - fit the data it is based on. A model is nothing more than a theory that explains data. If it fits, it’s good. And the next election is like the experiment that tests the theory.
This is vastly untrue, as I apparently unsuccessfully demonstrated. I mean, I demonstrated it clearly, but evidently the point was lost.
It’s trivially easy to build a model that fits your training set. It’s not ‘good’ in any way and doesn’t deserve to be treated as potentially valuable just because there’s no explicit validation set.
It’s complicated; much more complicated than trying to minimize in sample loss, which is not particularly useful. You have to select a model that is neither so small that you miss useful patterns nor so large that you mistake noise for pattern. You can penalize or constrain flexibility. Most of all you can cross validate, or at a minimum, not pretend your model has any validity before it’s been tested even one time.
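For the “penalize or constrain flexibility” part, here’s a minimal sketch (NumPy again, a plain ridge penalty on the same degree-6 polynomial; the penalty strength lam is just an illustrative number you would normally pick by cross-validation):

```python
import numpy as np

x = np.arange(1, 9)
y = np.array([6, 9, 7, 8, 15, 33, 7, 43], dtype=float)

# Degree-6 polynomial design matrix (columns 1, x, x^2, ..., x^6),
# scaled so the penalty treats each power comparably.
X = np.vander(x, N=7, increasing=True).astype(float)
scale = X.max(axis=0)
Xs = X / scale

# Ridge regression: minimize ||y - Xs w||^2 + lam * ||w||^2 (closed form).
lam = 1.0  # illustrative; in practice this, too, gets cross-validated
w = np.linalg.solve(Xs.T @ Xs + lam * np.eye(7), Xs.T @ y)

# Shrinking the coefficients trades a little in-sample fit for a tamer model.
x9 = np.array([9.0 ** p for p in range(7)]) / scale
ols = np.polyfit(x, y, deg=6)
print("unpenalized prediction for point 9:", np.polyval(ols, 9))
print("penalized prediction for point 9:  ", x9 @ w)
```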
I’m not really sure what you think that means but sure. My model predicted -47 and the next number was 15.
I understood. I shouldn’t have said it that way.
How about this: if it fits, and it uses logically connected variables, it’s probably as good as anything else you’ve got.
If it fits what though? The training set? How many variables do you allow? If you keep adding variables it’s going to fit, but that doesn’t make it as good as anything else, even granting that the variables have a logical connection.
I think you’re more mathematician than economist.
I note that your example isn’t a model like this one. It’s not a cause and effect predictor with two variables. It’s just a random string of numbers that you’re trying to put a trend on. I don’t think that’s the same thing. There are logical connections between things like the economy and election results, and there are good correlations between the two, and they are persistent. That’s a model that predicts something causing something else, in the real world. It’s not just random numbers.
For instance, I predict that ice cream sales will be higher whenever the temperature is higher, because there’s a logical connection between the two and it fits the past data, which is where I got it in the first place.
I mean would you rely on it to predict the next number after refining it?
If not, what would you use?
You try a bunch and see which are the strongest. Ask them how they picked.
It wouldn’t be the first time Obama lost the popular vote and won. Hillary Clinton got more votes than him.
He could be trotting out the same strategy: Focus on key races which are winnable and don’t worry about the popular vote. If he wins, he wins, that’s the rules we agreed upon. But if that happens, it doesn’t mean Republicans have to change course, it just means they didn’t play the game as well.
There are defeats and then there are defeats. Winning the popular vote and losing the electoral college is a technical defeat, not a popular repudiation.
I picked it deliberately because it’s a very simple way to illustrate pitfalls of selecting a model with an overly high capacity. Whether or not there is a logical connection doesn’t change the fact that you can over or under fit your model and that in any event, closely fitting the training set is no indication of predictive value in itself.
When you start throwing in logical relationships that intuitively make sense you forget the basic fact that if you keep adding parameters to your model it’s guaranteed to fit your data better.
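Here’s that guarantee in one tiny loop (same toy series, same Python/NumPy sketch): in-sample R^2 can only go up as you add terms, whether or not the extra terms “mean” anything.

```python
import numpy as np

x = np.arange(1, 9)
y = np.array([6, 9, 7, 8, 15, 33, 7, 43])

# In-sample fit only ever improves as parameters are added.
for degree in range(1, 8):
    coeffs = np.polyfit(x, y, deg=degree)
    resid = y - np.polyval(coeffs, x)
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    print(f"degree {degree}: in-sample R^2 = {r2:.3f}")
```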
They’re random numbers, there’s no point trying to model them. But for any given set it’s trivially easy to find a model that fits them well - it just won’t have any predictive validity. It doesn’t matter if I give you 10,000 random numbers and 10,000 parameters, the model won’t predict anything.
When you try a bunch of things and pick the best ones you guarantee that you will confuse useful pattern with noise.
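You can watch that happen with nothing but noise (another sketch, assuming Python/NumPy): generate a pile of purely random candidate predictors, keep whichever one correlates best with the outcome over the “historic” stretch, and it is still worthless on new data.

```python
import numpy as np

rng = np.random.default_rng(0)

# A purely random "outcome" plus 50 purely random candidate predictors.
n_hist, n_new, n_candidates = 12, 200, 50
outcome = rng.normal(size=n_hist + n_new)
candidates = rng.normal(size=(n_candidates, n_hist + n_new))

# "Try a bunch and see which are the strongest" on the historic stretch.
hist_corr = [abs(np.corrcoef(c[:n_hist], outcome[:n_hist])[0, 1])
             for c in candidates]
best = int(np.argmax(hist_corr))

# The winner was selected because it happened to line up with the noise;
# on new data it is just another random vector.
new_corr = abs(np.corrcoef(candidates[best, n_hist:], outcome[n_hist:])[0, 1])
print("best historic |correlation|:", round(hist_corr[best], 2))
print("same predictor on new data: ", round(new_corr, 2))
```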
GIGO: Reason #2 is worse: “2. The CU model depends on flawed assumptions.”
Unemployment affects the chances of Democratic incumbents being reelected but not Republicans? Republicans’ results are linked to per capita income while Democrats are not? Really? In statistics you need to start with assumptions that make sense and then test them, not start with data and then allow for as many crazy assumptions as you need to create a model that “correctly predicted” all of your data. I’m a little disappointed actually. If your assumptions are that squirrelly, you create the impression that you just ran a bunch of models before landing on a preset conclusion. And the fact that they truncated the dataset before 1980 doesn’t look good either.
Sigh. I guess there’s a market for this kind of thing.
Yeah, Eisenhower.
Seriously Voyager, Reagan was awful. He brought the Religious Right to Washington. He gave us Antonin Scalia. And he fostered the illusion that Big Guv could cut taxes and the deficit simultaneously: in other words he advanced magical thinking. Launching the largest peacetime defense buildup in US history was reversible. But the memes he established did lasting damage to the body politic.
ETA! That said, most of Reagan’s opponents in the 1980 Republican primary were fine or better.
Wrong! Clinton, '96.
To a great degree, it’s the testing that lets us figure out what the strong causal predictors are.