538 takes on another pollster

I agree. It is disturbing on some level.

What amazes me is that they can produce reliable results surveying a couple thousand people in a nation of 350 million. I have seen the math; I know it works (when done right), and I know that surveying 1 million people does not improve the reliability enough to be worth the trouble.
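The math in question is the standard margin-of-error formula: for a simple random sample, the error shrinks with the square root of the sample size, so going from a couple thousand respondents to a million buys surprisingly little. A quick sketch (the helper name and sample sizes here are just for illustration):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a simple random sample of size n,
    at the worst case p = 0.5."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1_000, 2_000, 1_000_000):
    print(f"n={n:>9,}: +/- {margin_of_error(n) * 100:.2f} points")
```

At n = 1,000 the margin is about ±3.1 points; at n = 1,000,000 it is about ±0.1 points, so a thousandfold increase in cost shaves off only three points of error.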

On the upside, Hari kept trying to tell people he was not an oracle, could not predict what would happen to them, and could only predict larger patterns. So we have that at least, and can pretend we are not subject to destiny as individuals (only as a group… try not to think on that one too much though :wink: ).

I agree with Sam Stone that much respect has to be given to Nate Silver. From reading his stuff I suspect he is a slightly left-leaning centrist, but he has the utmost respect for the numbers. Whether he likes the answers or not, he follows the math wherever it takes him and dutifully reports it. Even better, he calls bullshit where he sees it and holds people’s feet to the fire (as in the OP) when they mess with his beloved profession.

For that I find him to be one of the most trustworthy sources of info out on the net today.

Putting aside the counter-allegations about motivations and whatnot, how has the primary call of Shenanigans! been reviewed? Aside from the folks who discovered it and those at 538, have any Dopers checked their math? Has the wider statistical community re-checked their math? Not their arguments and conclusions, but the mathematical analysis itself.

It strikes me as beyond ballsy for R2K to take that defensive tack unless it’s somehow (even roundaboutly) justifiable. Are there any mathematicians out there defending them?

I cannot say for sure but my experience reading 538 comments is there are a lot of armchair mathematicians checking the work and putting their $0.02 in. Silver even responds to them sometimes and it seems some of them know their shit.

Dunno about support from the other side.

An interesting thing about this one is that Silver (with whom I corresponded a few times on statistics back in his Baseball Prospectus days) didn’t start this particular fight. It was three other guys, Grebner, Weissman, and Weissman, who threw the flag on this one. Silver and 538 got involved when the three shared their research with Silver and he began publishing about the issue and R2K’s relationship with Daily Kos.

So, to be clear, Silver’s just basically standing around, checking out and reviewing polls and stats (and he had suspicions about R2K prior), when suddenly he’s in the middle of another one of these things. It’s a complicated event for him but one that can only benefit from his being straightforward and upfront.

Oh, and ignoring that cease and desist he got.

True, not all numbers are going to be uniformly distributed, so a generator would have to be used intelligently by someone who knows something about statistics (one would hope that even a fake polling firm would at least hire a statistician). If I were going to make fake data, I would start with some other polling data and assume some model to simulate whatever trend I wanted. I’d use that to get the probability of a theoretical respondent responding in a particular way, and then simulate a set of respondents. Unless the model was artificially screwy, it would obey Benford’s law just like real data does, and it wouldn’t show a bimodal distribution of changes either. I would also have a raw data set that I could present to anyone who questioned my data. It would take me a couple of hours, but it would be much cheaper than actually running a poll.

The only thing that would be missing would be any hard evidence that any calls were actually made.

ETA: the other disadvantage of this vs making up numbers is you have to take the effort to analyze your fake data.
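A minimal sketch of the simulate-the-respondents approach described above. The candidate names, probabilities, and sample size are all made up for illustration:

```python
import random

def fake_poll(n_respondents, p_smith, p_jones, seed=None):
    """Simulate individual respondents instead of inventing topline numbers.

    Each "respondent" is drawn from the assumed underlying probabilities,
    so the toplines inherit realistic sampling noise, and the per-respondent
    records double as the raw data set to show anyone who asks."""
    rng = random.Random(seed)
    tally = {"Smith": 0, "Jones": 0, "Undecided": 0}
    for _ in range(n_respondents):
        r = rng.random()
        if r < p_smith:
            tally["Smith"] += 1
        elif r < p_smith + p_jones:
            tally["Jones"] += 1
        else:
            tally["Undecided"] += 1
    return tally

# One simulated poll of 600 "respondents" with Smith ahead 47.5/37.5
print(fake_poll(600, 0.475, 0.375, seed=1))
```

Because the toplines come out of simulated respondents rather than a human picking round numbers, they wobble around the targets the way real samples do.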

Obama/Silver 2012?

So here would be my uneducated attempt at creating a model.

I start with Smith against Jones for dogcatcher.

The story I want to tell is that Smith leads Jones by about ten points overall, and roughly fifteen percent of likely voters have no opinion.

So I write a little program:

Pick a random number between 1 and 1000.
If it’s 1-150, the “voter” surveyed had no opinion.
If it’s 151-525, the “voter” picked Jones.
If it’s 526-1000, the “voter” picked Smith.

Repeat 584 times, and, hey presto, I have 584 voters “surveyed.”
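The little program above, as a Python sketch (buckets chosen so they don’t overlap: 15% no opinion, 37.5% Jones, 47.5% Smith):

```python
import random

def survey_one(rng):
    """One simulated "voter", following the buckets above."""
    n = rng.randint(1, 1000)
    if n <= 150:
        return "No opinion"   # 150/1000 = 15%
    elif n <= 525:
        return "Jones"        # 375/1000 = 37.5%
    else:
        return "Smith"        # 475/1000 = 47.5%

rng = random.Random()
results = [survey_one(rng) for _ in range(584)]
print({c: results.count(c) for c in ("Smith", "Jones", "No opinion")})
```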

I’m sure that’s discoverable, but I don’t understand why.

I’m not a statistician, but my understanding is that it isn’t this one poll that will look suspicious, but the statistically identical distributions generated by all your polls.

Indeed. Per Benford’s Law, any set of returns that shows a uniform distribution of leading digits (the first digit in an answer) should throw a red flag. There are other tells too. It’s weird, but it’s there.

Bear in mind, I do quite a bit of statistical stuff (some for fun) but these guys do stuff that makes my head spin.
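For the curious: Benford’s Law says the leading digit d appears with probability log10(1 + 1/d), so about 30% of values lead with 1. It only really applies to data spanning several orders of magnitude (raw counts, say, rather than percentages capped at 100), which is part of why applying it to polls takes care. A toy check, with a geometric series standing in for Benford-like data:

```python
import math
from collections import Counter

def benford_expected(d):
    """Benford's Law: P(leading digit = d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / d)

def leading_digit(x):
    return int(str(abs(int(x)))[0])

def digit_frequencies(values):
    """Observed leading-digit frequencies for a set of values >= 1."""
    counts = Counter(leading_digit(v) for v in values if int(v) != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Data spanning many orders of magnitude hews close to Benford;
# a uniform leading-digit distribution would stick out immediately.
sample = [1.07 ** k for k in range(1, 400)]
freqs = digit_frequencies(sample)
for d in range(1, 10):
    print(d, round(freqs[d], 3), round(benford_expected(d), 3))
```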

That’s pretty much what I’m recommending, but the question is how you got the 10-point spread to begin with. If all your polls show a 10-point spread, whether it’s Smith vs. Jones next month or some other poll, it would start to look suspicious. The problem with the R2000 polls is that the probabilities shifted up and down each time the poll was run but never stayed the same. So I recommended an extra step in which you model the underlying probability, so that across multiple polls the results look convincingly random.

Now, a truly sophisticated statistician could look at those and note that the distribution of probabilities fits a beta distribution (or whatever I took as my underlying model) surprisingly well. But that would take so much data that I will probably retire before I am found out.

The main thing that would find me out is that my polls fail to match reality.

PS: just for the record I have a PhD in statistics so I may have a little extra appeal to authority.
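A sketch of that extra step, with a Beta distribution standing in for the underlying model. The parameters here are arbitrary for illustration: the mean α/(α+β) sets the average support level, and the overall magnitude of α+β sets the size of the poll-to-poll wobble:

```python
import random

def run_fake_tracking_polls(n_polls, n_respondents, a=47.5, b=52.5, seed=42):
    """Each poll first draws its "true" support level from Beta(a, b),
    then simulates respondents at that level, so repeated polls wander
    realistically instead of repeating an identical spread."""
    rng = random.Random(seed)
    polls = []
    for _ in range(n_polls):
        p = rng.betavariate(a, b)        # underlying support this week
        yes = sum(rng.random() < p for _ in range(n_respondents))
        polls.append(yes / n_respondents)
    return polls

# Eight weekly "polls" of 600 respondents, mean support a/(a+b) = 0.475
print([round(p, 3) for p in run_fake_tracking_polls(8, 600)])
```

The toplines now vary for two reasons, a drifting underlying probability plus ordinary sampling noise, which is exactly the two-layer structure real tracking polls show.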

It’s their only option. If any of this turns out to be even a little bit true, they’re done for. They could admit that they made some mistakes and pledge to work harder and be more open in the future, but since every poll they would put out in the foreseeable future would have to be looked at with a skeptical eye, no one is going to hire them.

The only chance they have (and it’s a slim one) is to convince everyone that it’s a 100% smear job. If they can make it into a personal thing with Kos and Silver they might be able to find some business later on in the right-wing media.

Of course, it’s possible that it’s all crap, and my knowledge of statistics is not nearly good enough to dig through this and say for sure. But I can’t imagine Kos coming out so strong with this if there were any real doubt.

Sounds too much like work. Can’t we just make up the numbers instead?

That’s the problem with cheating and other sorts of rulebreaking. If you make an effort you can avoid being caught, but the reason you’re cheating in the first place is that you don’t want to make an effort. Since cheating is a way of cutting corners, it’s no surprise that cheaters do a half-assed job of cheating.

Well, in cases like this you need to have raw data to show your employers. You don’t just have a paper that says “54% yes, 44% no, 2% undecided”, you get a spreadsheet (usually) of data that has the results of each individual poll.

Not having this raw data means you’re screwed if they ask you for your polling info.

I know, I know: crime never pays, cheaters never win, and winners never cheat. But I still find it surprising that they don’t at least try to make it look good. Here, as I see it, are the options.

Do it right:
Cost: Lots of $$, time, effort, and staff to actually call people, record their answers, and analyze the data.
Chance of getting caught: None; you did it right.

Use Buck Godot’s handy-dandy fake data generator:
Cost: About 3-4 hours spent by a single statistician to make up really good data and analyze it.
Chance of getting caught: Very low, unless your model is way off from reality or someone asks about your non-existent call centers.

Fudge it totally:
Cost: Say, half an hour to make up numbers that add up to 100%.
Chance of getting caught: High, because human beings are really lousy at coming up with random numbers, and Nate Silver is out there with sharp grinning teeth ready to eat you for lunch.

It seems to me that the extra 2-3 hours it takes to make up good numbers is well worth the price of not getting caught.

This article in Politico yesterday was really fascinating to me:

Even before the R2K scandal, I thought it was fascinating how the media had collectively decided there was a rampant anti-incumbent furor, and how roundly reality kicked them in the face when people started actually voting. Evidently there was a dearth of polling in the Arkansas Democratic Senate runoff, and R2K’s potentially fraudulent poll was all the media had to go on. This majorly impacted the perception of that race, and people thought Halter had more of a chance than he did. It also fed the broader media storyline about anti-incumbent furor, which has turned out to be much weaker than they wanted it to be.

The problem here is that polling firms are a dime a dozen; the advantage R2K had was that it gave the results its client wanted. That put them squarely against your condition “unless your model is way off from reality.”

Reality sucks because it’s utterly unforgiving.

We’re probably seeing an instance of the Dunning-Kruger effect here.

I grade college classes. I’m no longer surprised by anything cheaters think they can get away with. “You didn’t show up to lab that day, and your report doesn’t even match the format of the experiment as we did it this semester. And last week, your lab report was word-for-word identical, aside from some amusing misspellings, with another student.” “But I didn’t cheat!”

I suspect that their demonstrated willingness to fudge data will make them VERY attractive to some right wing outfits.