Oddly, I trust FiveThirtyEight more after hearing this. Do you?

I was just watching ESPN First Take, and they had a guy on from Baseball Prospectus who was apparently the only person in the known universe to predict that the Tampa Bay Rays could win the AL East. That person created a program to analyze and predict each individual player’s performance in order to come up with a prediction that even Vegas thought was preposterous. His name? Nate Silver.

At the end of the show, they gave him a generous plug of FiveThirtyEight.com. The whole time before that, I was thinking “Is it possible there could be more than one stats freak out there named Nate Silver?” So it is definitely the same guy.

Now, I haven’t looked too far into the methodology used on FiveThirtyEight, but if he could come up with something that predicted the Rays going this far, I have to think that the guy knows what he’s doing, more so than I thought before, when he was just some Democrat crunching numbers.

I’m not sure there’s a huge debate here, but do you think more of Nate Silver after hearing that? Or do you think that there couldn’t possibly be enough crossover between the two arenas (statistical analysis of sports vs. politics) for him to be equally precise at both?

You may want to check out his initial coming out post. He makes a good case that the essentials of evaluating which statistics matter are very similar.

I have looked into their methodology, at least as much as they publish, and that’s why I trust them over the other electoral projection sites.

As I said in another thread, few people know more about statistics than baseball statisticians. And the guys at Baseball Prospectus are geekier than most.

Now if we only knew Barack Obama’s BABIP as a rookie…

Absolutely, you should trust him more given his background. These guys are all about relegating subjective, unprovable assertions, and about using objective facts to come to accurate conclusions. More importantly, they understand the numbers and how to use them, and they have tons of experience developing statistical models and applying them in a practical, *testable* way.

It’s also worth noting that Nate Silver is an avid poker player. When a serious poker player and world-class sabermetrician tells you that the numbers say x is true with y degree of confidence, you can be sure he did his homework.

I trust him insofar as he takes a very analytical approach and has a history of working with stats and getting them right. I had known he was a respected baseball stats guy (though I don’t follow the sport), but the Rays thing is kind of neat.

I’m also aware that politics and baseball aren’t the same thing. So I guess his Rays prediction doesn’t make me certain that Obama will take the election but his look at the numbers certainly seems credible.

I actually play a statistician on TV. In the academic political (and often election-predicting) literature, scholars tend to make their datasets and codebooks available for replication and verification. Unfortunately, Silver does not. I don’t tend to be very trusting of findings that I cannot actually verify myself.

That said, his methodology is sound. Although he is explicit that he models some polling effects by assumption, his assumptions are reasonable. I am less comfortable in general with the nonparametric methods he uses, but it does not look, on the surface, like he is royally screwing anything up.

I know little of Silver the man. Does he pay the bills off of his baseball predictions? I wonder if there’s a commercial interest in there for keeping his methods close to his chest, particularly if his way of predicting election outcomes is based on his way of predicting baseball outcomes.

I understand why that’d be frustrating to folks who want a gander at his methods for academic purposes, but I would also understand why he wouldn’t want to start handing out his secret formulas.

No, he is very explicit about his methods. He uses a technique called local area regression, and every time he makes a modeling assumption (like the half-life effect), he tells us. There is nothing particularly complicated or unusual about his statistical techniques.

What is a big deal is the dogged effort he made to collect all of the data. He has an enormous amount of polling data at his disposal. To replicate his findings would take a massive effort. For him it is a labor of love. For me, well, even though I was trained as a political statistician, I would rather spend that kind of time developing my own project. I would just love to be able to download his data and play with it a bit myself.
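For anyone wondering what a half-life assumption looks like in practice, here is a rough sketch. I am reading "half-life" as meaning that a poll's weight is cut in half every so many days, so older polls count for less in the average. The 30-day half-life, the function names, and the poll numbers are all made up for illustration; none of this is taken from Silver's actual model.

```python
def poll_weight(age_days, half_life_days=30.0):
    """Weight of a poll that is age_days old; halves every half_life_days.

    The 30-day default is a hypothetical value, not a published parameter.
    """
    return 0.5 ** (age_days / half_life_days)


def weighted_margin(polls, half_life_days=30.0):
    """Recency-weighted average margin.

    polls is a list of (age_days, margin) pairs; margin is in points.
    """
    weights = [poll_weight(age, half_life_days) for age, _ in polls]
    total = sum(weights)
    return sum(w * margin for w, (_, margin) in zip(weights, polls)) / total


# Made-up example: a poll from today (+6) and one from 30 days ago (+2).
polls = [(0, 6.0), (30, 2.0)]
average = weighted_margin(polls)
```

With these made-up inputs the fresh poll gets weight 1 and the month-old poll gets weight 0.5, so the average sits closer to +6 than a plain mean of +4 would.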

Why don’t you present your bona fides and ask him if you can borrow his data? The data ain’t exactly private and you wouldn’t be using it to set up a competing system. I’ve found people–okay, geeks, but they’re the only people most of us know–love sharing their stuff as long as they know you won’t make any money off it.

Silver’s baseball work is what pays his bills. He works for Baseball Prospectus and he was responsible for developing PECOTA, which is one of the most accurate baseball statistical projections known. More here.

To briefly summarize, PECOTA takes a player’s relevant statistics over the course of his career and uses a nearest neighbor search to identify players who have had similar career paths from all players back to World War II. The model then makes predictions of future output with varying levels of confidence based on the career arcs of the players whose careers closely track the player in question. I find this whole thing to be quite genius, actually. His prediction of the Rays’ success is just another demonstration of the accuracy and predictive ability of the model. 538 does somewhat of the same thing as well–I believe it incorporates past behavior of states and uses demographic trends to identify what polls in one state may predict about the behavior of another. Genius as well, IMHO.
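To make the nearest neighbor idea concrete, here is a toy sketch. It is not PECOTA, whose internals I have not seen; it only shows the general technique of finding the most similar past players and using what they did next as the projection. The player records, the two-number stat vectors, and the function names are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length stat vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_comparables(target_stats, pool, k=3):
    """Return the k players in pool whose stats are closest to target_stats."""
    ranked = sorted(pool, key=lambda p: distance(target_stats, p["stats"]))
    return ranked[:k]


def project(target_stats, pool, k=3):
    """Project next-season output from the k nearest comparables.

    Returns (mean, spread): the average of what the comparables did next,
    and the gap between their best and worst outcome as a crude
    stand-in for uncertainty.
    """
    comps = nearest_comparables(target_stats, pool, k)
    outcomes = [p["next_season"] for p in comps]
    return sum(outcomes) / len(outcomes), max(outcomes) - min(outcomes)


# Made-up historical players: stats = [batting average, home runs].
# A real system would standardize each stat so that no single one
# dominates the distance; that step is skipped here for brevity.
pool = [
    {"name": "A", "stats": [0.300, 25.0], "next_season": 4.0},
    {"name": "B", "stats": [0.250, 10.0], "next_season": 1.0},
    {"name": "C", "stats": [0.310, 27.0], "next_season": 5.0},
    {"name": "D", "stats": [0.200, 5.0], "next_season": 0.0},
]
mean, spread = project([0.305, 26.0], pool, k=2)
```

Here the two closest matches are players A and C, so the projection is the average of their next seasons, and the spread between them gives a rough sense of how confident to be.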

Not a terrible idea. When things die down after the election, I might just do that.

I had a good enough handle on his methods to trust his figures before I saw that, but it did answer the question of “How did this guy get so much experience with statistics?”.

Well, to be fair, it is not exactly genius. I do much the same thing in my job, except instead of baseball players, I model the behavior of millions of people with credit cards. This sort of work is not as thrilling to the public as baseball, but it keeps an awful lot of stats geeks employed.

I am not familiar with the technical details of PECOTA since I am not very interested in baseball (aside from its statistical applications). What impresses me the most about Nate is his incredible commitment to compiling and managing the data and some of his creative applications. It takes an enormous amount of dedication and pre-work to do some of the stuff he has done. That is the really hard part.

Fair enough. I would consider his ability to identify the systems and relevant statistics in baseball genius, not the methods themselves. Clearly he did not invent statistics, but applying them in unique ways to systems (baseball and elections) that nobody else was looking at in that way, I would classify as a stroke of genius.

Any reason why he chose then instead of, say, 1921, the beginning of the Lively Ball Era? Does tossing out a generation of players filter out most of the guys who got their start in the Dead Ball Era? It’s not like we don’t have statistics going back more than 100 years.

I dug this up

The full interview

Thanks! He makes good points.

…consistently.

I certainly agree with that. To get back to the point of the OP, reading that interview gives me the highest confidence that Nate knows what he is doing with the election. That is one well-informed dude.