Statistical Analysis question re: online polls/voting (non-political)

rexnervous · January 24, 2005, 10:39pm

There are plenty of sites out there that let people vote on things by rating them, then rank the subjects by average rating. Many of these play out ok over time because they’re open-ended and every entry will eventually get a large number of votes (I guess). But what about closed-ended systems, like a contest, where entries aren’t required to receive the same number of votes and the voting ends after a short period of time?

Basically, what is the statistical method to ensure that an entry that gets one vote of “10” (for a 10.00 average) doesn’t win over an entry that has a 9.8 average but had 1,000 people vote for it?

Do you only allow entries to be ranked that receive a minimum number of votes?

Do you pre-seed all entries with “X” # of average scores, so one vote won’t skew it overly (every entry automatically gets assigned 20 votes of 5.0, for instance)?

Or is there some more sophisticated method, based upon total aggregate votes or the average # of votes per entry (not ratings, but raw votes), that determines the minimum votes needed to be eligible, or applies a bonus or penalty to low-raw-vote entries?

This math stuff be way over my head

ultrafilter · January 24, 2005, 11:29pm

Could you give an example?

rexnervous · January 24, 2005, 11:57pm

Sure, for the web project I’m working on (note that this isn’t exactly it, but a close enough facsimile).

Contest entrants create a song, and upload it. Site users can listen to the songs. There’s a one-month period for people to enter contest, followed by a two-week voting period. Users can (in theory) vote only once per song, but they can vote or not vote for any of the contest entries. They vote on a one-to-five scale.

Or if you want a real-life example, think Am I Hot Or Not. Except there you are sort of forced to vote, but with my project you won’t be forced to.

ultrafilter · January 25, 2005, 12:50am

I see two ways. If r is the average ranking of a song, and n the number of votes it received, sort by the quantity max(1, r - 5/sqrt(n)). Or, sort by the sum of the votes per song.
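A minimal sketch of the first suggestion, assuming the one-to-five scale described above. The function name and the sample votes are made up for illustration; only the formula max(1, r - 5/sqrt(n)) comes from the post.

```python
import math

def penalized_score(ratings, scale_max=5, floor=1):
    """Average rating minus scale_max / sqrt(n), never below the lowest possible rating."""
    n = len(ratings)
    if n == 0:
        return floor  # no votes: treat as the lowest possible score
    r = sum(ratings) / n
    return max(floor, r - scale_max / math.sqrt(n))

# Hypothetical data: one entry with a single perfect vote, one with many good votes.
entries = {
    "one perfect vote": [5],
    "many good votes": [5, 4] * 200,  # 400 votes, average 4.5
}

# Sort entries from best to worst by the penalized score.
ranked = sorted(entries, key=lambda name: penalized_score(entries[name]), reverse=True)
print(ranked)
```

The penalty shrinks as votes accumulate, so a lone top rating is pushed down to the floor while a well-voted entry loses only a fraction of a point.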

rexnervous · January 25, 2005, 5:22am

Ah, I see, sort of. Not sure about the basic sum of the votes, since then you may have the opposite of what I first wrote about happening: a song that is bad, but because of certain circumstances gets a lot of votes (even if they are ratings of “1”), so that a bad song with an average rating of 2 but with 1,000 votes beats out a good song with an average rating of 4.5 but only 300 votes.

Problem is I can’t guarantee pure randomness in how things appear on site. For instance, the latest entries will be published on home page. If they get published on the weekend, then those songs may get more viewers (who knows) and thus more votes.

Trying to walk a thin line here, I know.

js_africanus · January 25, 2005, 2:08pm

Personally, I’m going to say that you have to switch your paradigm. (Heh heh, I used a buzz-word.)

You are never going to be able to claim statistical validity from such a survey, so drop that line of thought completely. What you are looking for are vote aggregation schemes, although that isn’t the technical term (if there is one), and I can assure you, without fear of valid contradiction, that the only guaranteed problem-free method of doing this is to have one person do the ranking. Google for Arrow’s Impossibility Theorem or Arrow’s Paradox for the reasons why.

I suggest that what you want is the method of aggregation known as approval voting. In approval voting, each voter is allowed a number of votes up to the number of candidates. The voter then gives one vote for every candidate for which she approves. If she votes for all candidates or no candidates, then her vote is essentially null, because she has failed to differentiate between any of the options. The winner of the election is the one with the most votes of approval.

For your program, you define approval as being a score of X or higher, where X is the number on the scale from 1 to 10 that indicates, for example, that a person would go out and buy the song or album. How you choose X will ultimately be up to you; since you don’t have a market research department, X is going to be fairly arbitrary.

Suppose you decide that a vote of 8 or higher qualifies as approval. (Or you could go with 6 or higher since it is above the midpoint and, I suppose, that would suggest that the song is good rather than bad.) The song with the most votes of 8 or higher is the song that wins the competition.
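As a rough sketch of that rule, assuming a 1-to-10 scale and an approval threshold of 8 (the song names and votes below are invented for illustration):

```python
def approval_count(ratings, threshold):
    """Number of ratings at or above the approval threshold."""
    return sum(1 for r in ratings if r >= threshold)

# Hypothetical votes on a 1-10 scale.
songs = {
    "Song A": [9, 8, 3, 10, 2],
    "Song B": [7, 7, 7, 7, 7, 7],
    "Song C": [8, 8, 9, 10],
}

THRESHOLD = 8  # the arbitrary X from the post
approvals = {name: approval_count(r, THRESHOLD) for name, r in songs.items()}

# The winner is the song with the most approvals (ties are not handled here).
winner = max(approvals, key=approvals.get)
print(approvals, winner)
```

Note that Song B gets no approvals at all despite a respectable average of 7, which shows how much the outcome depends on where X is set.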

This method obviously has flaws; however, a flawless system is well beyond your grasp. With the notion of approval voting, it may be better to just ask “Would you buy this?” and a yes indicates approval. Alternatively, you could ask “Did you enjoy this song?”, “Would you like to hear this song on the radio?”, or something like that.

That’s my suggestion.

Did you enjoy this post?

andymurph64 · January 25, 2005, 2:21pm

If I’m understanding you correctly, it sounds like a Bradley-Terry analysis might fit your bill with people not ranking a song as the lowest ranking.

I am unable to explain it right now but it gives you a name of an analysis to research.

don_t_ask · January 25, 2005, 2:26pm

Check out the bottom of this page at IMDB, who use Bayesian estimates to overcome such problems. I’m sure there is a better explanation on the site but I can’t find it.

anson2995 · January 25, 2005, 3:18pm

The formula they use is (according to the same page):

weighted rank (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C where:
R = average for the movie (mean) = (Rating)
v = number of votes for the movie = (votes)
m = minimum votes required to be listed in the Top 250 (currently 1250)
C = the mean vote across the whole report (currently 6.8)

This eliminates the problem the OP seems concerned with, which is different candidates getting widely different numbers of votes. Seven Samurai got fewer than 30,000 votes while Shawshank Redemption received over 135,000.
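The formula quoted above can be sketched in a few lines. The values m = 1250 and C = 6.8 are the ones quoted from the page; the two example movies (their ratings and vote counts) are hypothetical. Algebraically the formula equals (v × R + m × C) ÷ (v + m), which is the pre-seeding idea from the first post: every entry starts with m votes of value C.

```python
def weighted_rank(R, v, m, C):
    """Weighted rank: pulls the entry's mean R toward the global mean C.

    R: mean rating of the entry, v: number of votes for the entry,
    m: number of "prior" votes, C: mean rating across all entries.
    """
    return (v / (v + m)) * R + (m / (v + m)) * C

M = 1250   # minimum votes, as quoted above
C = 6.8    # mean vote across the whole report, as quoted above

# Hypothetical entries: one heavily voted, one with a single perfect vote.
popular = weighted_rank(R=9.0, v=30000, m=M, C=C)
single_vote = weighted_rank(R=10.0, v=1, m=M, C=C)
print(round(popular, 3), round(single_vote, 3))
```

The heavily voted entry keeps almost all of its own average, while the single perfect vote barely moves away from the global mean.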

rexnervous · January 25, 2005, 6:56pm

Wow, thanks for all the helpful replies today.

My first take on the true Bayesian estimate was that that would be exactly what I was looking for, but I ran some sample numbers (granted, on very low vote totals) and it really seems to skew towards those that receive lots of votes, regardless of what their true average is.


Entry     Tr Avg.   Bayes   # votes   Cum. score
Entry A   3.833     4.607   12        46
Entry C   3.000     4.138   11        33
Entry E   2.545     3.939   11        28
Entry D   3.500     3.909   6         21
Entry F   4.500     3.846   4         18
Entry B   5.000     3.174   2         10

Overall   3.391             46        156

What’s weird is that the entry with the lowest True Average (Entry E) actually comes in 3rd place with Bayesian, which I don’t follow.
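One thing that may explain the oddity: the weighted rank formula quoted earlier is a weighted average of R and C, so its result always lies between the entry's own average and the overall mean. Several values in the Bayes column fall outside that range (Entry A's 4.607 is above both its 3.833 average and the 3.391 overall mean), which suggests a slip in the calculation rather than a property of the method. Below is a sketch that recomputes the column from the vote counts and cumulative scores in the table. The value m = 5 is purely an assumption for illustration; the thread does not say which m was used.

```python
# (number of votes, cumulative score) per entry, copied from the table above
entries = {
    "Entry A": (12, 46),
    "Entry C": (11, 33),
    "Entry E": (11, 28),
    "Entry D": (6, 21),
    "Entry F": (4, 18),
    "Entry B": (2, 10),
}

total_votes = sum(v for v, _ in entries.values())
total_score = sum(s for _, s in entries.values())
C = total_score / total_votes  # overall mean vote, about 3.391

M = 5  # ASSUMPTION: hypothetical "prior" vote count, not taken from the thread

def weighted_rank(R, v, m, C):
    """Weighted rank as quoted earlier in the thread."""
    return (v / (v + m)) * R + (m / (v + m)) * C

results = {
    name: weighted_rank(score / votes, votes, M, C)
    for name, (votes, score) in entries.items()
}

# Entries from highest to lowest weighted rank.
ranking = sorted(results, key=results.get, reverse=True)
for name in ranking:
    print(f"{name}  {results[name]:.3f}")
```

Under this assumed m, every weighted rank sits between the entry's true average and the overall mean, and Entry E, with the lowest true average, ranks last rather than third.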

So I’m a little hesitant to use that, esp. since I really don’t know how many votes these entries will be receiving. Very short window of voting, small target audience.

Leaning now towards just a plain True Average with a minimum # of votes to be eligible.

Guess there is no perfect system.

Chronos · January 25, 2005, 8:45pm

I would assume that people rank the songs which they think are particularly good or particularly bad, and don’t bother with the ones they think are just mediocre. So if a person doesn’t vote at all for a song, that could be interpreted as a vote of 5.5. So you could pad each song with enough votes of 5.5 that they all have the same number of votes as the most-voted song.
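A small sketch of that padding idea, assuming a 1-to-10 scale with 5.5 as the neutral value. The function name and the sample votes are invented for illustration.

```python
def padded_average(ratings, target_count, neutral=5.5):
    """Average after padding with neutral votes up to target_count votes."""
    missing = max(0, target_count - len(ratings))
    total = len(ratings) + missing
    if total == 0:
        return neutral  # nothing to average: fall back to the neutral value
    return (sum(ratings) + missing * neutral) / total

# Hypothetical votes on a 1-10 scale.
songs = {
    "heard by many": [9] * 40 + [10] * 10,  # 50 votes, average 9.2
    "heard by one": [10],                   # a single perfect vote
}

# Pad every song up to the vote count of the most-voted song.
most_votes = max(len(r) for r in songs.values())
padded = {name: padded_average(r, most_votes) for name, r in songs.items()}
print(padded)
```

The most-voted song is unchanged, while the single "10" is diluted by 49 neutral votes.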

To get a little more sophisticated, it might be that people are more likely to vote for a good song than for a bad song, or vice versa. In this case, you could pad with an amount that would bring the overall average to 5.5. That is to say, if you have a lot of votes of 9 and 10, but few of 1 or 2, then that indicates that most of the un-votes indicate that a song is pretty bad. So the un-vote padding should be a fairly low vote.

Both of these assume that any given song is equally likely to be considered. A song that just isn’t heard much (because it was entered into the contest just before deadline, say, or it was on the last page of the list) would then tend to look more average than it really is (even if it’s actually unusually good or unusually bad). In this case, it would probably be better to require some minimum number of votes for a ranking to count. If one person votes 10 on something, that might just mean that that was the guy who wrote the song, or something. But if, say, 100 people all vote 9 or 10 for something, and nobody votes lower, then you could safely say that some unbiased folks are listening to it and liking it. I’m not sure where to set the cutoff, but it would probably depend on the number of people you expect to be involved (if only 50 people are going to see the website at all, then obviously a cutoff of 100 won’t work).
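The minimum-vote approach could look something like this. It is only a sketch: the cutoff of 3 and the sample votes are arbitrary illustrations, and the right cutoff depends on the expected audience size, as noted above.

```python
def rank_with_cutoff(entries, min_votes):
    """Rank entries by plain average, ignoring those with too few votes."""
    needed = max(1, min_votes)  # an entry with no votes has no average
    eligible = {
        name: sum(ratings) / len(ratings)
        for name, ratings in entries.items()
        if len(ratings) >= needed
    }
    # Highest average first; returns a list of (name, average) pairs.
    return sorted(eligible.items(), key=lambda item: item[1], reverse=True)

# Hypothetical votes on a 1-5 scale.
entries = {
    "A": [5],           # one perfect vote, below the cutoff
    "B": [4, 5, 4, 5],
    "C": [3, 3, 4],
}

ranking = rank_with_cutoff(entries, min_votes=3)
print(ranking)
```

Entry A drops out entirely despite its perfect average, which is the intended effect but also the main cost of a hard cutoff.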

I would also advise that, whatever ordering system you use, you give later viewers as much information as feasible. So, for instance, your number 1 pick might say “<name of song 1> 9.82 (736 votes)”, followed by “<name of song 2> 9.76 (3644 votes)”, etc. If you wanted to get really fancy, you could include a little histogram image with each song, so you could tell which “average” songs everyone thought were average, and which ones some folks loved and others hated.
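For what it's worth, a rough sketch of that kind of display, using a text histogram in place of an image. The function names and sample votes are made up, and a 1-to-10 scale is assumed.

```python
from collections import Counter

def summary_line(name, ratings):
    """One-line summary in the style '<name> 9.82 (736 votes)'."""
    avg = sum(ratings) / len(ratings)
    return f"{name} {avg:.2f} ({len(ratings)} votes)"

def text_histogram(ratings, low=1, high=10):
    """One row per possible score, with one '#' per vote at that score."""
    counts = Counter(ratings)
    return "\n".join(
        f"{score:>2} | {'#' * counts[score]}" for score in range(low, high + 1)
    )

# Hypothetical song that some voters loved and others hated.
votes = [10, 10, 10, 1, 1, 9]
print(summary_line("Song X", votes))
print(text_histogram(votes))
```

Two songs with the same average can produce very different histograms, which is the distinction the histogram is meant to expose.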

js_africanus · January 25, 2005, 10:00pm

I would strongly advise the exact opposite if the desire is to get an accurate opinion. Giving info about how previous voters had voted can bias the results.

rexnervous · January 26, 2005, 1:28pm

Yes, we had already decided that we wouldn’t show ratings on a piece until after a person had voted on that piece (in other words, only s/he would see the score, and only after s/he had voted).
