This article claims statistical evidence of vote flipping in favor of Romney in the Republican primaries.
I’m extremely skeptical of this analysis. I don’t like computerized voting and I’m uneasy about the possibility of undetectable manipulation, but the idea that this kind of manipulation could occur over a large number of precincts, in a number of different states, using a number of different voting systems, strains my credulity.
So, what’s the flaw in the analysis? Why would Romney’s percentage increase with the size of the precincts, except for those with paper ballots? Is it even true that that’s the case? Is this just a case of being selective about the data?
It seems to lack a little rigor. One would expect larger precincts, presumably in more densely settled neighborhoods, to have different voting patterns. It’ll be interesting to hear if anything comes of it.
I’d like to see a Federal statute calling for the death penalty for people who mess with voting or registration. It’s tantamount to treason.
ETA: This “They argue that the probability of this happening by chance alone is so small it exceeds the capability of statistical packages to handle” seems like total bullshit.
Well, I don’t think it is “bullshit” in a strict sense; the probability claim is almost surely true. However, that doesn’t mean the deviation from pure randomness is due to some malicious factor. There could be perfectly valid, non-malicious reasons why larger precincts would tend to be more pro-Romney than smaller ones (for example, larger precincts are generally less rural, and Romney was less popular in socially conservative rural districts).
What it shows is that there are statistically significant deviations from a model in which votes in precincts of different sizes are completely random, i.e., in which there is no correlation between precinct size and voting pattern. But I imagine you would find that to be true in lots of elections, for the sort of reasons I mentioned. I wonder whether the original paper tried to control for such things.
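Just to be concrete about what such a test involves, here is a rough sketch (mine, not the authors’ method) of checking whether a candidate’s vote share is correlated with precinct size, assuming a hypothetical CSV with per-precinct columns total_votes and candidate_votes:

import csv
import random

def load_precincts(path):
    # hypothetical format: one row per precinct with total_votes and candidate_votes
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    sizes = [int(r["total_votes"]) for r in rows]
    shares = [int(r["candidate_votes"]) / int(r["total_votes"]) for r in rows]
    return sizes, shares

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def permutation_p_value(sizes, shares, trials=10000):
    # How often does shuffling the shares give a correlation at least as strong
    # as the observed one? A tiny p-value only rules out "no relationship at all";
    # it says nothing about WHY the relationship exists (fraud vs. demographics).
    observed = abs(pearson(sizes, shares))
    shuffled = shares[:]
    hits = 0
    for _ in range(trials):
        random.shuffle(shuffled)
        if abs(pearson(sizes, shuffled)) >= observed:
            hits += 1
    return hits / trials

The point is that a tiny p-value here is exactly what an ordinary urban/rural split would produce too, which is why controlling for demographics is the real question.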
What they question is why the size of the precinct seems to directly correlate to the pro-Romney effect. Buried within the links, there was a study that they did to test the theory of rural vs less rural areas that seems to disprove at least that hypothesis. In the material that I read, they ask for other theories and provide contact links.
The statement about a statistical probability of this happening by chance being so small that the statistical package (Excel) cannot handle it was about the direct correlation between precinct size and pro-Romney effect.
No other candidate in the primaries had this same effect in any of the states and precincts that they tested. Only Romney benefited from this anomaly.
The pro-Romney effect as the total number of votes counted rose did not appear in any primary precinct that was manually counted and tabulated. Those precincts showed the same flat-line effect seen in the historical elections they provided for comparison. Again, buried in the links is an explanation of why the flat line is the expected trajectory as the number of votes counted increases. They also provided an example from a non-US election to show that the flat-line effect is the expected norm in other countries as well. Buried within the links is a discussion of how news organizations use this flat-line effect to predict vote outcomes in advance of the final count, and how a high number of prediction errors caused some researchers to look into the cause of those errors.
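To make the “flat line” concrete: the charts plot a candidate’s running share of the vote as precincts are added from smallest to largest. Here is roughly that calculation (my own sketch with made-up numbers, not their code):

def cumulative_share(precincts):
    # precincts: list of (total_votes, candidate_votes) pairs
    precincts = sorted(precincts, key=lambda p: p[0])  # smallest precincts first
    running_total = running_candidate = 0
    curve = []
    for total, candidate in precincts:
        running_total += total
        running_candidate += candidate
        curve.append(running_candidate / running_total)
    return curve

# Made-up numbers: the running share settles near 0.40 and stays there, which is
# the expected "flat line". A share that keeps climbing as the bigger precincts
# come in is the anomaly being described.
print(cumulative_share([(50, 20), (120, 47), (400, 161), (900, 360), (2500, 1001)]))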
So, the symptoms they show are:
High (direct?) correlation between precinct size and the pro-Romney vote.
Lack of the classic flat line as the total number of votes counted increases.
No candidate besides Romney experienced a positive effect from this anomaly.
This (the Romney upswing as vote counts increase) is a new effect, not previously seen in polling data.
These effects correlate strongly with the use of automated voting equipment and have not been observed in the manually counted precincts they’ve looked at.
From a purely technical (technology) perspective, a vote-flipping ‘device’ would be trivial to design and manufacture. It could be a modified camera, an Android device… there are lots of possibilities.
The most complex part would be the statistics involved in designing a central controller to keep the effect low-level across a wide group of states and precincts, because they would need to ‘tweak’ the counts enough to achieve the desired result while not taking it so far as to set off alarms.
With a good design, the ‘tweaking’ devices could be updated with new data targets over the internet throughout the day, so the central controller could actually respond in real time to vote tallies in non-target precincts.
DavidM - The data from these machines are stored on either SD memory cards or thumb drives (depending on the manufacturer). All you need is a device that can read and re-write the data on that media. With a properly designed “tweaker” device, it would take only seconds, maybe a minute, to alter the data. I suggest a camera only because it is a small device, easily carried; the ‘guts’ could be ripped out and replaced with a simple processor and multiple storage readers (e.g. thumb drive, SD card, etc.). The processor could be triggered by a button and provide an LED readout of results (e.g. process complete).
Pseudo-code-ish example:
The tweaker is deployed to Precinct X. They connect to the central controller to download the expected machine type and the target data levels (e.g. Candidate X gets 43% of the vote).
JoeBloe, carrying the tweaker, is working at the precinct and volunteers to log in the data cards as they arrive from the polling sites. For all intents and purposes, he’s supposed to open an envelope, verify a serial number, write some stuff down, then hand the data card to someone who processes it. But he is also slipping it into his tweaker, pressing a button, and waiting for a green light before he passes it on.
Pseudo-ish processing description:
if Candidate X has 43% or more of the votes: no action
if Candidate X has less than 43% {
    determine the number of votes (N) needed to put Candidate X at 43%
    for 2/3 of N {
        find a vote for Candidate Y
        flip it to Candidate X
    }
    for 1/3 of N {
        find a vote for Candidate U
        flip it to Candidate X
    }
    update the totals on the storage card to reflect the new counts
}
A simple Perl script could process thousands of records in seconds.
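Something like this is what I have in mind, as a sketch in Python rather than Perl (the ballot record format here is entirely made up, since the real storage layouts are proprietary):

import math

def flip_votes(ballots, target="X", major_rival="Y", minor_rival="U", target_share=0.43):
    # ballots: a list with one candidate ID per ballot record; returns a modified copy
    ballots = ballots[:]
    needed = math.ceil(target_share * len(ballots)) - ballots.count(target)
    if needed <= 0:
        return ballots  # already at or above the target share: no action
    # take roughly 2/3 of the flips from the major rival and 1/3 from a minor one
    quota = {major_rival: round(needed * 2 / 3)}
    quota[minor_rival] = needed - quota[major_rival]
    for i, vote in enumerate(ballots):
        if quota.get(vote, 0) > 0:
            ballots[i] = target
            quota[vote] -= 1
    return ballots  # per-candidate totals would then be recomputed and rewritten to the card

Even on a card holding tens of thousands of ballot records, that loop would finish in well under a second.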
The key in this scenario is that
(a) They need access to the storage devices at some point after the voting but before the ‘official tabulation’.
(b) They need access to the voting machine design specs to create the tweeker code.
IF they didn’t change the data on the voting machine storage media, that would leave them open to easy discovery of the manipulation. So I think that they would have to alter the data on the storage device itself.
As pointed out, vote flipping has the potential to result in a negative vote count, so very small precincts are risky. Some of the polling stations they talked about had only 10 voters. It would be easy for that community to discover at the next picnic that no one actually did vote for Candidate X.
That is why the direct correlation is presented as a symptom of automated manipulation. Theoretically, the larger the number of voters, the more they can tweak, because a larger anomaly can be lost in the size of the voting group.
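A back-of-the-envelope illustration of why precinct size matters (my numbers, not theirs):

def share_gain(total_votes, flips):
    # each ballot flipped to the beneficiary raises his share by 1/total_votes
    return 100.0 * flips / total_votes  # percentage points gained

for total in (10, 100, 1000, 10000):
    print(f"{total:>6} voters, 5 flips -> +{share_gain(total, 5):.2f} points")
# 10 voters: +50 points (absurd, instantly noticed); 10,000 voters: +0.05 points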
If the assertion is correct, there have been oddities in the past that may have been trial runs. Back in 2004 (I think) there were machines that counted more votes than actual voters. It was written off to random error or chance at the time. But it would make sense that if you’re going to run an operation like this, you would start with feasibility tests, then a slightly larger test run. The only problem is that you get only so many ‘live voting’ windows to test in. The results being highlighted in the articles are from the primaries. I would expect them to update the central controller to add more randomness so that the direct-correlation effect does not appear in the general election in November.
Well sure, if they have someone with physical access to the equipment. But how many people would that involve?
I could accept manipulation at a county or two here and there. I could even accept machines intentionally programmed for cheating at the factory (1 person replacing the standard drive image with altered code).
But being able to train and place that many completely trustworthy people in the right positions all over the country, or being able to sabotage a number of different companies’ software at the factory starts to sound like tinfoil hat conspiracy theory.
Some things I’d like to know:
How many different brands of tabulators were in use in the counties that showed this pattern?
Were updates applied remotely? Were these tabulators internet accessible, or accessible via dialup? Via WiFi?
Beyond the points others have made here, it’s worthwhile to consider the old saying about statistics: “if you torture the data enough, it will confess to anything”.
There is a virtually unlimited number of ways you can cut and slice the data and analyze it for anomalies, and with the power of modern computers you can find them. Among those ways will be some which are extremely unlikely to have happened by chance alone.
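A toy demonstration of that saying (my own, purely illustrative): run enough tests on purely random data and some of them will come out looking “significant”.

import random

random.seed(1)
trials = 1000      # stand-in for 1000 different ways of slicing the same election data
flagged = 0
for _ in range(trials):
    heads = sum(random.random() < 0.5 for _ in range(200))  # 200 fair coin flips
    if abs(heads - 100) > 14:  # roughly a two-standard-deviation departure
        flagged += 1
print(f"{flagged} of {trials} purely random slices looked 'significant'")
# expect a few dozen hits, none of which mean anything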
Here is a copy of the 2007 audit on Ohio machines. There were 3 vendors in use in Ohio. I’m not sure that there are very many more vendors out there.
The PDF file gives complete descriptions of the machines (including pictures) how they’re set up and their weaknesses. This document has been publicly available for some time… so everyone has had access to it if they were interested in the topic.
They also claim that John McCain seemed to have benefited from the anomaly both in the primaries and against Obama. During the 2012 primaries, the effect was seen in every state except Utah and Puerto Rico.
They claim the effect was observed in South Carolina, and subsequently re-appeared virtually everywhere else in a systematic manner. I don’t think this is a case of data mining.
It could be a case of fabrication or bad faith (or fooling oneself, by seeing patterns where they don’t exist and reporting the cases where the patterns are clear). Given the allegations, the matter deserves review by a statistician.
The authors maintain: “This is not a large conspiracy involving a complex network of perpetrators. Such an alleged election fraud could be accomplished by only a single, highly clever computer programmer with access to voting machine software updates.”
He would have to have access to the updates for machines from a number of different manufacturers. Unless this effect is only seen in counties using one or a small number of brands of machines.
I haven’t had a chance to look at Enkel’s PDF yet, but he says that it discusses machines in Ohio. The authors are claiming that this effect occurs in a number of states. Are those same three manufacturers in use in all of the states that show this pattern?
I’m not trying to dismiss this out of hand. It’s extremely troubling. I work in IT and I DO NOT think that we should be voting in this manner. But the first step in investigating is determining whether or not what we’re seeing is real.
I do not want to believe that these findings are valid. I have been scouring the web for evidence that it is simply a tin hat conspiracy. But so far, no one has effectively discounted this disturbing information or the concomitant conclusion that there is some kind of tampering going on.
Many have suggested alternative explanations that the authors have already ruled out:
-they have already shown that the pattern is not due to differing demographics for big vs. small precincts
-they have already shown that precinct size is unrelated to “republican-ness”
-they have already shown that the pattern does not emerge in other elections–it is not some rare but still possible “glitch”–it only occurs when a central tabulator is used and never in elections involving only Democrats.
It seems clear that something weird must be happening, if the “N” remains the same for a precinct and just the distribution of the votes is changing systematically within that N.
I am desperately seeking some very smart person to point out the logical flaw in the conclusion that this is the result of intentional malfeasance. Even some benign alternative explanation for that weird effect would be welcome (but one that has not already been ruled out).
Essentially, my question is, if we assume that the data are real and the analyses sound (these might be big IF’s but…) what could explain this effect, other than vote flipping? Are there persuasive logical explanations that do not require intentional tampering?
It could be a combination of underestimating the chance of it happening by chance, along with the hundreds, if not thousands, of chances for anomalies like this across America every year. If there is a 0.1% chance of something happening by chance, then each election season there will probably be an anomaly somewhere in America that looks awfully suspicious but actually happened by chance.
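To put rough numbers on that (mine, purely illustrative):

p_single = 0.001       # chance that any one contest/slice produces a scary-looking fluke
opportunities = 2000   # hypothetical number of independent contests or slices per cycle
p_at_least_one = 1 - (1 - p_single) ** opportunities
print(f"Chance of at least one fluke somewhere: {p_at_least_one:.0%}")  # about 86%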
Then again, we could eliminate even the appearance of this by implementing more secure systems. It would increase the electorate’s confidence in the outcome of the election.
The problem I have with that explanation is that you’re essentially saying that, in cases involving large numbers of data points, statistically unlikely patterns aren’t indicative of anything.
I can accept that, but then how do we tell a real anomaly from a statistical fluke? What are the hallmarks, if any, of a real anomaly? What needs to be present before we decide that something is worth investigating? These are sincere questions, not attempts at rhetorical points.