Say you’ve got three Best Movies lists: the IMDB Top 100, the AFI 100, and the recent EW Top 100. How would you have to weight the three lists to make up for the fact that they each have different cut-off dates? What about movies that appear on one list, but not the others? Are there other factors I’m forgetting?
Please explain to me like I’m five, I don’t remember much from my college statistics class.
I don’t think there is a standard way to analyze this data, and certainly nothing that you forgot from your college statistics class. In fact, there is a theorem (Arrow’s impossibility theorem) which shows that it is impossible to combine ranked lists in a way that satisfies even a small set of seemingly reasonable fairness conditions all at once.
So the best we can hope for is some ad hoc algorithm.
An easy method would be the following:
1. If a movie is eligible to appear on a list (i.e. it wasn’t produced after that list’s cut-off date) but fails to appear, define its rank for that list to be (length of the list) + 1.
2. Give each movie a score equal to the average of its ranks over all the lists for which it is eligible.
3. Rank the movies according to this score (a rough code sketch of this is below).
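Here is a minimal Python sketch of that averaging method, assuming each list is just an ordered list of titles and eligibility is decided by a cut-off year. The list names, titles, cut-offs, and release years are made-up placeholders, not the real lists:

```python
# Minimal sketch of the "average rank over eligible lists" method.
# All data below is hypothetical, purely for illustration.

lists = {
    "List A": ["Movie X", "Movie Y", "Movie Z"],   # rank 1 first
    "List B": ["Movie Y", "Movie W"],
}
cutoffs = {"List A": 1998, "List B": 2013}          # latest eligible release year per list
release_year = {"Movie X": 1960, "Movie Y": 1975, "Movie Z": 1990, "Movie W": 2005}

def combined_ranking(lists, cutoffs, release_year):
    # Every movie that appears on at least one list is a candidate.
    movies = {m for ranked in lists.values() for m in ranked}
    scores = {}
    for movie in movies:
        ranks = []
        for name, ranked in lists.items():
            if release_year[movie] > cutoffs[name]:
                continue                              # not eligible for this list
            if movie in ranked:
                ranks.append(ranked.index(movie) + 1) # its actual rank (1-based)
            else:
                ranks.append(len(ranked) + 1)         # eligible but didn't make the list
        scores[movie] = sum(ranks) / len(ranks)       # average over eligible lists only
    return sorted(movies, key=lambda m: scores[m])

print(combined_ranking(lists, cutoffs, release_year))
```

With these placeholder numbers, "Movie Y" wins because it ranks well on both lists it is eligible for, while a movie that misses a list it was eligible for is charged the worst-possible rank plus one on that list.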
This method won’t be perfect and may have a bias in favor of movies that are eligible for the most lists, but correcting this bias would be a lot of work with little return.
Thanks! This is exactly what I thought would be the answer, but it’s nice to have someone else back it up.
One question, though: what about a bias in the reverse direction? Wouldn’t movies that are eligible for the fewest lists be overrepresented on the master list if they appear on all of those lists?
I guess it depends on whether you consider it a bias or a reasonable way to rank them. If a movie is only eligible for 2 out of 5 lists, but it is #1 on both lists, it’s not unreasonable to think that it would have been #1 on the other 3 lists if it could have been.
To some extent, the reasonableness may depend on how much the lists have in common. If one list is the AFI 100, the second is “Movies released in June 1972” and the third list is “Things Honey Boo-Boo likes” then the three are probably so different that no weighting algorithm will produce reasonable results.
I think you’d have to consider the bias in the lists as well. For example, the recent EW list does not include any Westerns. It’s very genre-biased and I don’t know how you correct for that.
:o While I don’t know The Wild Bunch, I cannot believe I didn’t see The Searchers! Wow. I think once I saw that none of the Eastwood movies or even the more famous JW movies were on it, I quit looking. I still contend the list is seriously flawed. Some of their rules were a bit bizarre and hard to interpret - which goes to the OP’s question. For example, including The Godfather apparently made TGII ineligible - which is bullshit no matter how you look at it.
I was very disappointed in EW. Their music lists were better (although ironically a bit too modern-weighted).
My reasoning was as follows. A movie is more likely to be ranked high on a list with a small number of eligible movies than on a list with a large number of eligible movies, since it has less competition. So movies on the most restrictive lists should do better. Assuming list restrictiveness is monotone (i.e. the movies eligible for a more restrictive list are a subset of those eligible for a less restrictive one, as with staggered cut-off dates), these will also be the movies that are eligible for the most lists.
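To make the monotonicity point concrete, here is a tiny illustration with hypothetical cut-off years (not the actual lists’ cut-offs): the most restrictive list is the one with the earliest cut-off, and a movie old enough to qualify for it automatically qualifies for the others too.

```python
# Hypothetical cut-off years, just to illustrate the nesting argument above.
cutoffs = {"List A": 1998, "List B": 2007, "List C": 2013}   # List A has the smallest eligible pool

def eligible_lists(year):
    """Return the lists whose cut-off a movie released in `year` makes."""
    return [name for name, cutoff in cutoffs.items() if year <= cutoff]

print(eligible_lists(1972))   # ['List A', 'List B', 'List C'] -> faces the smallest pools, most lists
print(eligible_lists(2012))   # ['List C']                     -> only the biggest pool, fewest lists
```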