It’s hard to debunk this because there is so little detail. The only actual reference I could find was to the Stanford paper, which was written by two graduate students with no indication of what department they come from. Hopefully, for the sake of their future careers, they aren’t from the statistics department.
There are a couple of things that look fishy. First, they only looked at a subset of states (which didn’t include Wisconsin). Second, for some reason they chose to analyze the number of delegates assigned rather than the number of votes.
This makes no sense: since what they are claiming is evidence of vote-count fraud, the potentially corrupted source (the vote totals) is exactly what they would want to analyze. Also, using delegates is going to add additional biases due to differences in the methods used to assign delegates (fortunately it looks like they didn’t count superdelegates, which would entirely invalidate the analysis). The only reason I can think of for not using vote totals is that the analysis using actual vote totals didn’t give the results they wanted. Even then, all they show is that states with paper trails are more likely to go for Sanders. But it may be that states with more of a radical-left bent are more likely to elect legislators who feel the need for paper trails.
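To make the confounding point concrete, here is a rough simulation sketch. Everything in it is made up for illustration: the `left_lean` score, the probabilities, and the number of simulated states are my assumptions, not anything taken from the paper or from real election data. There is no fraud anywhere in the model, yet paper-trail states still come out more pro-Sanders, simply because the same underlying leaning drives both.

```python
import random
from statistics import mean

def simulate_states(n_states=5000, seed=0):
    """Simulate hypothetical states with NO fraud in the model.

    A single latent 'left_lean' score (an assumption for illustration)
    drives both the chance of having a paper trail and the Sanders share.
    """
    rng = random.Random(seed)
    states = []
    for _ in range(n_states):
        left_lean = rng.random()                    # latent leaning in [0, 1)
        has_paper_trail = rng.random() < left_lean  # left-leaning states adopt paper trails more often
        sanders_share = 0.35 + 0.30 * left_lean + rng.gauss(0, 0.05)
        states.append((has_paper_trail, sanders_share))
    return states

def share_gap(states):
    """Mean Sanders share in paper-trail states minus the mean in the rest."""
    with_trail = [share for trail, share in states if trail]
    without_trail = [share for trail, share in states if not trail]
    return mean(with_trail) - mean(without_trail)

gap = share_gap(simulate_states())
print(f"Sanders share gap (paper trail minus none): {gap:.3f}")
```

The gap is clearly positive even though every simulated vote is counted honestly, so a correlation like this cannot by itself distinguish fraud from a plain common cause.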
Their next argument is that since caucuses are more public and more tamper-proof, the fact that Sanders won more of those is further evidence of fraud. This is clearly spurious, since it is well known that the population that votes in a primary is very different from the population that votes in a caucus.
They also make some statements regarding the differences between exit polls and final results. The data suggest a fairly consistent bias of about 3%, with one big exception in Arizona (a paper trail state ;)) in which there was a 20 point bias. Overall it’s not surprising that there would be a small consistent bias, for the reasons Ulf outlined.
Finally they say that the results they got didn’t happen in the 2008 primary, so there must be something weird about this election specifically. This isn’t exactly surprising, since that was an entirely different election with an entirely different set of issues that could create sampling biases.
The plots regarding the black voting patterns are shown without any context or statistical analysis at all, so I don’t know what to make of them. If anything, I would think that they actually work against the claims made, since the only way to find out the % of black voters who voted for Clinton would be through exit polls, which wouldn’t be subject to vote fraud. That would indicate that the difference they saw between such states was not related to fraud.
The next set of “evidence” is that Sanders has higher approval ratings than Clinton, especially among Millennials. Well, so what, we knew that. Lots of voters don’t particularly like Clinton as a person all that much but think she is a better candidate. And lots of Millennials say they are going to vote but don’t actually show up.
Finally, they have some plots relating the time at which vote totals came in to the size of the precinct, in some cumulative way that seems like a very odd way to do the analysis. Again, I assume that they didn’t choose a more straightforward method because it didn’t give them the results they wanted. I guess it shows that for a selected group of states, Clinton support tended to come from the large precincts that may have taken a while to tally, but these may also be heavily urban precincts with large minority populations, so it’s not surprising.
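Here is a rough sketch of why a cumulative plot like this proves very little. I am guessing at the general shape of their method (precincts ordered by size, with a running Clinton share), and every number below is invented for illustration. The only assumption built in is that larger precincts lean somewhat more toward Clinton, for example because they are more urban. No votes are altered anywhere.

```python
import random

def simulate_precincts(n_precincts=2000, seed=0):
    """Hypothetical precincts as (size, clinton_votes), with no fraud modeled.

    Assumption for illustration: Clinton support rises from about 40%
    in the smallest precincts to about 60% in the largest ones.
    """
    rng = random.Random(seed)
    precincts = []
    for _ in range(n_precincts):
        size = rng.randint(50, 5000)
        support = 0.40 + 0.20 * (size - 50) / 4950 + rng.gauss(0, 0.03)
        support = min(max(support, 0.0), 1.0)  # keep the share within [0, 1]
        precincts.append((size, round(size * support)))
    return precincts

def cumulative_share(precincts):
    """Running Clinton share as precincts are added from smallest to largest."""
    total_votes = 0
    clinton_votes = 0
    curve = []
    for size, clinton in sorted(precincts):
        total_votes += size
        clinton_votes += clinton
        curve.append(clinton_votes / total_votes)
    return curve

curve = cumulative_share(simulate_precincts())
early = curve[len(curve) // 10]  # share after the smallest 10% of precincts
final = curve[-1]                # share after all precincts are counted
print(f"early cumulative share: {early:.3f}, final: {final:.3f}")
```

The running share climbs steadily as the big precincts come in, which is the kind of drift that gets presented as suspicious, even though here it comes entirely from demographics.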