What principle do you use when ranking items?

No, there are lots of “natural” rating systems, and most of us are arguing to use one as the first pass, and then slot those ratings into the artificial 1-5.

is a natural rating system.

is a natural rating system. So is

All of those give strictly more information than your quintile system does, because the numbers actually tell you something.

One of the things I like about Swappa, a site to buy and sell smartphones and some laptops and tablets, is that they clearly state the natural rating system they want you to use:

FAQ: What are the requirements for product condition categories on Swappa? - Swappa

They have words (New, Mint, Good, and Fair) to describe the conditions, not numbers. But it’s 4 groups, and it’s four groups that are really useful. (“Fair” is the lowest, because they don’t allow gadgets rated less than “fair” to be sold on their site.)

No. You are really not understanding the fundamental difference here. Take @puzzlegal’s example of eBay’s rating system. If I have only had a bad experience with 1 seller, and every other item I bought has had no issues, all of my ratings will be 4s or 5s. You would say that 20% of good sellers should be given a 1 out of 5.

You are misunderstanding - nobody is saying that there is nothing arbitrary about their rating system. But being arbitrary about the criteria you use to judge restaurants isn’t the issue people have with your rating system. It’s the arbitrary division into percentiles or artificially fitting scores into a bell curve that people have a problem with. Your system is like having an employee evaluation system with five categories and putting 20% of the employees in each category, so that 20% are “unsatisfactory” and 20% are “needs improvement” even if half of the “needs improvement” group is indistinguishable from half of the “unsatisfactory” group. When grades are artificially fit into a bell curve, it gives you some information (how well a student did relative to other students), doesn’t tell you other information (whether a student has mastered the material) and hides still other information (you won’t know that the average score was 35% if the bell curve has turned that into a C - and that information might be important).

Of course, this system is somewhat useful if you’re required, for some reason, to apply some function to a large group of people or objects. With the example of employees, let’s say we need to fire 20% of the workforce. Then it makes some sense to group them into 5 categories of equal size, and then fire the 20% who got 1 star ratings. Sure, there’s going to be an arbitrary cut-off, where Alice with a score of 80% keeps her job, while Bob with a score of 79.9% is out of work, but we have to establish some kind of cut-off, so Bob’s out of luck.

But while this is easy, and is just objective enough to deflect most criticism, it also might obscure some information. Maybe our worst 20% is better than the industry average, and the only reason we’re firing anyone is because our new CEO has stupid ideas about how to manage a company.

Another way to describe it is like this:

Let’s say you’re a professor and you give a test out to your students. Due to some quirk of natural ability, studiousness, and sheer luck, it turns out that ALL your students score between 80 and 100 on that test.

Is the OP really saying that they’ll apply a curve to that and fail the bottom 40%, even though they got 80 or higher in an absolute score?

That’s what the OP is implying in effect.
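To make the professor example concrete, here is a minimal sketch in Python. The ten scores, the function names, and the fixed cut-offs of 60/70/80/90 are all made up for illustration; nobody in this thread has specified them.

```python
def quintile_grades(scores):
    """Grade 1-5 purely by rank, forcing five groups of equal size."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # indices, worst to best
    grades = [0] * n
    for rank, i in enumerate(order):
        grades[i] = rank * 5 // n + 1
    return grades


def absolute_grades(scores, cutoffs=(60, 70, 80, 90)):
    """Grade 1-5 by fixed cut-offs (assumed values), ignoring other students."""
    return [1 + sum(s >= c for c in cutoffs) for s in scores]


# Hypothetical class where everyone scored between 80 and 100.
scores = [80, 82, 84, 85, 87, 90, 92, 95, 97, 100]

print(quintile_grades(scores))  # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
print(absolute_grades(scores))  # [4, 4, 4, 4, 4, 5, 5, 5, 5, 5]
```

Under the forced quintiles, the bottom 40% of the class lands in the two lowest grades despite scoring 80 or better; under the fixed cut-offs nobody does.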

Or maybe Carl is a lousy worker, but he’s the only one who knows how to do what he does, so you have to keep him; or Dave and Emily are both very good employees, but they both do basically the same thing and you don’t really need both of them.

A single rating system is useless. I might like the food better in one restaurant, and the ambience better in another. How can I rate them against each other? Sometimes I want better food, and sometimes I want to go to a place that looks nice. So now I have 2 rating systems: one for food quality, and one for ambience. But what about price? Now I have 3 rating systems.

Another problem with arbitrary ratings is that they don’t give you a sense of the distance between groups.

If I rate restaurants from 1-5, are the restaurants that scored a 5 hugely better than the restaurants that scored a 1, or just slightly better? If everything is clustered around high quality, then the distance between 1-5 may not be that great, and the ranking system is not just useless, but misleading. Is the difference between a 5-star and 4-star restaurant big enough to make a difference to the average customer?

You see this in car reviews ranking the ‘Ten Best’ cars, but you find out that the car in 10th place is only marginally different than the one in first place. All the ‘ten best’ cars are probably good enough that your buying decision will come down to personal preference, yet the ranking system implies that one is objectively better than another.

So the first thing I’d do in a ranking system is determine the absolute distance between the first and last item in the list for whatever criteria I’m ranking. If it’s not very big, perhaps trying to individually rank the items is a waste of time, and you should look at larger groupings (“These are excellent, these are very good, and these are good”) or not rank them at all.
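As a rough sketch of that first step in Python: measure the spread, then decide whether individual ranks mean anything. The scores, the names, and the threshold of 1.0 are invented for illustration; what counts as “not very big” is a judgment call.

```python
def spread(scores):
    """Absolute distance between the best and worst score in the list."""
    return max(scores.values()) - min(scores.values())


def ranking_advice(scores, min_spread=1.0):
    """Suggest individual ranking only when the spread reaches min_spread.

    min_spread is an assumed threshold, not something given in the thread.
    """
    if spread(scores) < min_spread:
        return "group or skip ranking"
    return "rank individually"


# Hypothetical scores on a 0-10 scale.
ten_best_cars = {"car_a": 9.1, "car_b": 9.0, "car_c": 8.8}
mixed_restaurants = {"diner": 2.0, "bistro": 6.5, "tasting_menu": 9.5}

print(ranking_advice(ten_best_cars))      # group or skip ranking
print(ranking_advice(mixed_restaurants))  # rank individually
```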

Ranking works best on items with more of a Pareto-like distribution where there is a huge difference between the best, the okay, and the bad. Even then you might find big clusters where you can’t really discern a difference between many individual items and ranking them individually just creates false information.

I wouldn’t do that.

When I wrote the test, I’d be establishing the criteria for the results. And presumably there would be some objective basis for my decisions about what the right answers were. For example, if I was writing a history test and wrote the question “Who was Abraham Lincoln’s first Vice President?” there would be only one correct answer to that question. And any student who answered Hannibal Hamlin would have answered the question correctly. After the students took the test and I checked all their answers I would have objective data to assign grades by.

But if you think a decision to rate a restaurant a three is the same type of objective data as answering that Hannibal Hamlin was Lincoln’s first Vice President, that’s the cognitive dissonance I was talking about. One’s a fact and the other is an opinion.

If you disagree, tell me where the data on the “absolute score” for restaurant ratings comes from. What produces those objective ratings?

If you concede that you’re making up the numbers, explain why you’re making up numbers the right way and I’m making up numbers the wrong way.

Of course not; which is why I’ve already said in this thread that I dislike and avoid using systems which require assigning each item a specific number. This may be possible for certain limited situations, but I can’t make it work at all for something like restaurants (or for that matter employees), for reasons that I’m not going to spell out all over again, especially as several people have already given most of them again in posts following the one I’m replying to.

But I don’t see how your system works at all without applying such precise numbers, and a different number to each item in the list. Otherwise how do you tell that something’s number 20 and belongs in your first category, instead of being number 21 and belonging in your second?

ETA: What I presumed that @Eyebrows_0f_Doom meant by “should have been rated a 3” wasn’t “should have been rated #3” but “should have been in the 3rd category” – that is, for instance, in a category of “restaurants that provide a decent but not unusually good meal” instead of in a category of “restaurants you really don’t want to eat at, including those which may give you food poisoning”.

Okay. And now take it a step further.

What is the objective basis for saying that the third category is “restaurants that provide a decent but not unusually good meal” and not “restaurants you feel belong in the middle twenty percent of all restaurants in terms of overall quality”?

As has been previously said in this thread, though possibly not by me: because there are almost certainly more than twenty percent of restaurants that provide a decent but not unusually good meal (as well as far fewer than twenty percent that provide a serious risk of food poisoning.)

I’m not at all sure what you are saying here - whoever is ranking/rating the restaurants will define the categories they use and nobody is saying that one category is objectively better than the other. What people are saying is that they find a category of “restaurants you feel belong in the middle twenty percent of all restaurants in terms of overall quality” less useful than “restaurants that provide a decent but not unusually good meal”. You don’t agree with that - apparently you are more interested in a restaurant’s relative ranking and would be fine giving the highest rating to the 20% of restaurants that were the best, even if 95% of the restaurants had terrible food and worse sanitation.

I’ve said all along the five imaginary groups are approximately equal in size. It’s not like I perform precise mathematical calculations and say things like “This meal is pretty good. I’m giving this restaurant a rating of four which bumps that pizza place I ate at nine years ago down to a three.”

Instead, I’ll use a process more like “This meal is pretty good. Do I feel like it’s one of the best meals I ever ate? No. Was it really close to being one of the best meals I ever ate? No, not really. Was it an average meal? No, it was definitely better than just average. So based on this meal, I’d rate this restaurant above being just average but below being one of the best. Which is a four.”

Now you can say, “Nemo, you didn’t do any math! You just decided how good the meal was and gave it a rating!”

And I would respond, “No, I did do some math. It just wasn’t obvious.” I decided this wasn’t one of the best meals I ever ate. And that required me to have a definition of how many meals belong in that category. I could decide that half of the meals I’ve ever eaten in my lifetime were one of the best meals I ever ate and the other half of the meals I’ve ever eaten were one of the worst meals I ever ate. Or I could decide that the two best meals I have eaten were the best meals of my lifetime and all other meals were among the worst. Both definitions are literally correct. I’ve just chosen to say that any meal that’s among the best fifth of all meals is one of the best meals I’ve ever eaten. And then I decide if this particular meal fits in that top fifth or not.

Yes, this, exactly, and this is why I mentioned above that the rating system needs to be transparent to have any utility. If I know why you gave it the score you did, I at least have a shot at translating that score into something I personally find useful.

It’s like a local movie reviewer we had back in the 80s and 90s. He was pretty consistent - if he didn’t like a movie, or complained that it was “confusing”, I’d almost always like it, and find it “interesting”. His opinions were utterly different from mine, but they were different in a sufficiently consistent way so as to still be useful - so long as I didn’t take his ratings at face value.

With the “even 20s” ratings, I’d probably figure anything 3 or above was a good bet to at least have a decent meal. I’d probably avoid anything that’s 1 or 2, because I’d know the reviewer was possibly including a few terrible restaurants just to keep the numbers straight. And I’d likely miss out on a few decent restaurants out of fear of hitting a terrible one, but so long as there’s enough restaurants to keep me occupied, that’s not much of a loss.

And that bumps some other meal out of that top fifth? Which it has to, if it’s going to remain a fifth. How do you decide which restaurant gets bumped?

And I and others in this thread are saying that our imaginary groups of restaurants are not approximately equal in size.

Do you really think that restaurants that might give you food poisoning, or for that matter restaurants with extraordinary food, service, and ambiance, exist in equal numbers with restaurants that provide a decent but not unusually good meal?

Or is it just that those distinctions somehow don’t matter to you, so you don’t think they provide any useful information to anybody?

I could understand if those distinctions don’t matter to you; especially if you always have lots of restaurants easily and affordably available to you, and so can easily afford to write off everything you think is in the bottom X percent without caring whether the reason it’s low rated is because there was rat shit on your plate or because the salad was flavorless tomato on iceberg lettuce or just because you didn’t like the salad dressing. What I don’t understand is the insistence, not that you find your method gives the information that you in particular want, but that your method gives by its essential nature more information than any method that results in different sized groups.

Do you understand that other people aren’t aware of what distinctions you are making? (Or what distinctions I am making.)

You might feel it’s vital to distinguish between restaurants that will give you food poisoning and restaurants that will merely serve you food that tastes bad. So you give those restaurants two different ratings with different numbers to symbolize those distinctions.

But the person who knows nothing about your system other than the numbers that they read isn’t going to know this. So they may go to a restaurant that you rated a two and eat the worst meal they’ve ever had in their life. And they’re wondering “How the hell did thorny locust give this place two stars? This place should be down at the bottom of any list.”

Then, by an amazing coincidence, they encounter you the next day. And they complain about their horrible dining experience. And you respond “Yes, I fully agree. That place also served me the worst meal I ever ate. But it didn’t put either one of us in the hospital.”

Here’s a fact; I have eaten in hundreds, maybe even thousands, of restaurants in my lifetime. And I have never required hospitalization after any of them. So I feel reserving a special rating that applies only to restaurants that give you food poisoning is not necessary.

I use my lowest rating for “restaurants you shouldn’t eat at because the food tastes bad.” If I ever encounter a restaurant that you shouldn’t eat at because it will poison you, I’ll just add it to that lowest group.

But it won’t be at the bottom of your list, if you happen to be in an area with 20 restaurants that are worse.

Yes, it won’t be the worst meal I’ve ever eaten in a restaurant if I’ve eaten twenty other meals in restaurants that were worse.

If I was planning on rating a restaurant as one-star and then stopped and reflected on the fact that I’ve eaten at twenty other restaurants that served worse food, maybe it’s a sign that I should reconsider my standards. Maybe I should acknowledge that this restaurant isn’t as bad as those twenty other restaurants and maybe it deserves a two-star rating instead of a one star rating.

Guess what? That’s the same process that leads to me deciding whether a restaurant is in the bottom fifth of all restaurants or the second to the bottom fifth. I decide how good or bad a restaurant is by comparing it to other restaurants.

And not everyone does. I was recently on a cruise and had at least one meal at ten different restaurants. None of them were bad; they were all good. Two of them were less good than the other eight but that was as far as I could go. And even the two that I would rate lower than the other eight would have been based on the ambiance and the menu items: those two were the buffet and the pub. I could not possibly say that one of them was better than the other nor could I have ranked the other eight. But if I had to rank them and divide them into five equal groups, then two restaurants with good food and good service would have been in the lowest group.
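A small Python sketch of the cruise example, assuming made-up scores of 8 for the buffet and the pub and 9 for each of the other eight (the numbers and names are only placeholders for “slightly less good” and “good”):

```python
def forced_quintiles(scores):
    """Split items into five equal-sized groups by score.

    Ties are broken by insertion order, which is arbitrary with respect to quality.
    """
    names = sorted(scores, key=scores.get)  # stable sort, worst first
    n = len(names)
    return {name: rank * 5 // n + 1 for rank, name in enumerate(names)}


# Hypothetical scores: two slightly lower, eight indistinguishable.
scores = {"buffet": 8, "pub": 8}
scores.update({f"restaurant_{k}": 9 for k in range(1, 9)})

groups = forced_quintiles(scores)
ratings_for_score_9 = {groups[name] for name, s in scores.items() if s == 9}

print(groups["buffet"], groups["pub"])  # 1 1
print(sorted(ratings_for_score_9))      # [2, 3, 4, 5]
```

The buffet and the pub land in the lowest group even though both were good, and the eight restaurants that could not be told apart get spread across four different ratings.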

You say you’re not comparing restaurants. But you are. You said that two of the restaurants were less good than the other eight.