Vote-counting and statistics: When are preliminary results accurate enough?

(Inspired by the California GMO foods proposition vs uncounted ballots issue)

It’s been a while since my last stats class, but I recall hearing that you can estimate the final vote count of an election, with high confidence, once enough of the votes have been counted. Is that right?

In this case, 5.3 million people (52.9%) voted no, 4.7 million (47.1%) voted yes, and (as of the article’s writing) 3.3 million votes have not yet been counted.

The article says that 58.5% of those 3.3 million uncounted votes would have to be yes votes for the initial prediction to be overturned.

Statistically, is there a way to predict how likely that is? And how do you know the votes counted thus far are representative and random enough?

They almost certainly have to count until it’s impossible for the losing side to win. That is, until the uncounted votes are fewer than the current margin, or fewer than the margin plus the automatic-recount threshold.
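That stopping rule can be sketched in a few lines, using the rounded totals from the question. The function names and the `recount_margin` parameter (expressed in votes) are illustrative assumptions, not taken from any election code.

```python
def is_decided(leader, trailer, uncounted, recount_margin=0):
    """True once the trailing side cannot catch up even by winning every uncounted ballot."""
    return uncounted < (leader - trailer) + recount_margin


def share_needed(leader, trailer, uncounted):
    """Fraction of the uncounted ballots the trailing side needs in order to pull level."""
    # trailer + x = leader + (uncounted - x)  =>  x = (leader - trailer + uncounted) / 2
    return (leader - trailer + uncounted) / (2 * uncounted)


# Rounded figures from the question (assumed exact here for illustration)
no_votes, yes_votes, uncounted = 5_300_000, 4_700_000, 3_300_000

print(is_decided(no_votes, yes_votes, uncounted))
print(round(share_needed(no_votes, yes_votes, uncounted), 3))
```

With these rounded totals the race is not yet mathematically decided, and the yes side would need about 59% of the remaining ballots. That is close to the article’s 58.5%, which was presumably computed from unrounded counts.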

Statistically, you’d use a binomial test (or a normal or t approximation to it), but you’d have to assume that the uncounted votes are statistically similar to the counted ones and independent of each other. These may or may not be good assumptions. If the uncounted votes come from a few precincts, are absentee ballots, or are provisional ballots, they may be correlated with each other because similar people cast them.

If you can make those assumptions, then the number of yes votes among the remaining ballots should be approximately normally distributed, with an expected value of (1 - 0.529) × 3.3 million = 1.55 million “yeses” and a standard deviation of sqrt(0.529 × (1 - 0.529) × 3.3 million) = 907. The needed number of yeses is 0.585 × 3.3 million = 1.93 million. This is about 376,000 above the expected value of 1.55 million, or 376,000/907 ≈ 415 standard deviations.
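The arithmetic above can be reproduced with a short script. This is only a sketch under the same assumptions (independent ballots, same yes share as the counted votes); the variable names are illustrative.

```python
import math

p_yes = 1 - 0.529       # yes share among the ballots counted so far
n = 3_300_000           # uncounted ballots
needed_share = 0.585    # yes share required, as quoted from the article

expected_yes = n * p_yes                       # mean of Binomial(n, p_yes)
sd_yes = math.sqrt(n * p_yes * (1 - p_yes))    # standard deviation of Binomial(n, p_yes)
needed_yes = n * needed_share
z = (needed_yes - expected_yes) / sd_yes       # shortfall in standard deviations

print(f"expected yes: {expected_yes:,.0f}")
print(f"std dev:      {sd_yes:,.0f}")
print(f"needed yes:   {needed_yes:,.0f}")
print(f"z-score:      {z:.0f}")
```

The standard deviation comes out near 907 and the z-score near 415, matching the figures in the text.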

The probability of a 415-sigma event is something like exp(-415^2/2) = exp(-86,000), or about 10^(-37,000): a decimal point followed by roughly 37,000 zeros before the first nonzero digit. That is so unlikely as to essentially never happen.
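A probability this small underflows ordinary floating point, so a quick check has to work in logarithms. The sketch below uses the standard large-z approximation P(Z > z) ≈ pdf(z)/z for a standard normal; the helper name `log10_normal_tail` is made up for illustration, and the normal approximation itself is only rough this far out in the tail.

```python
import math


def log10_normal_tail(z):
    """Approximate log10 of P(Z > z) for large z, using P(Z > z) ~ pdf(z) / z."""
    # ln pdf(z) = -z^2/2 - ln(sqrt(2*pi)); dividing by z subtracts ln(z)
    ln_tail = -0.5 * z * z - math.log(z * math.sqrt(2 * math.pi))
    return ln_tail / math.log(10)


print(round(log10_normal_tail(415)))
```

The exponent is dominated by the -z^2/2 term, so for z = 415 the result is on the order of 10^(-37,000).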