# Ordering a list of variables making only direct comparisons - minimum needed?

The original context for this question is in making a survey, but I think it might be more helpful to think of it in purely math terms.

Say you have a list of n variables, and can compare any two at a time to find out which is the greater value. What’s the minimum number of evaluations needed to get the entire list in the correct order? It has to be less than the total number of pairs, because you can apply the transitive property: if you know A > B and B > C, then you also know A > C and don’t need to compare those 2 directly. But in a list of 3, if it turns out A < B and B > C, then it could be either BAC or BCA, and you do need to compare A with C to get the correct order. So for 3 items you need either 2 or 3 comparisons depending on the answers. For a list of 4 items there are 6 possible pairs to compare, but I think the minimum would be 4 comparisons to get them in order.

It’s late and I’m having trouble visualizing this on a larger scale, such as for a list of 10 or 20 variables, but that’s what I ultimately want to determine.

Past the edit window but I realized in the example of 3 variables if it turns out A < B then you can compare A to C instead of B to C and end up using the transitive property again, making 2 the minimum number of comparisons. So is the answer simply n-1 for a list of any size?

The best comparison sort algorithms require O(n*log(n)) comparisons in the worst case, and that is provably optimal.

To see that, note that there are n! possible orderings of n objects, which means an ordering carries log_2(n!) bits of information. Each comparison yields at most one bit of information, so you need at least log_2(n!) comparisons to do the sort. As it happens, log(n!) ~= n*log(n) for large enough n.

Nevermind this- still can’t use transitive property if the evaluations don’t correspond, i.e. if A < B and A < C then the 3rd evaluation is still needed. Too tired to think right now!

It’s been a while since I’ve had to use logarithms so the reasons for this aren’t immediately apparent to me. I’ll have to brush up on them when I come back to this thread after getting some rest. But how can you demonstrate that’s the minimum? (and what is the “O” in the first formula?)

The “O” means “of order”. That is to say, the actual number of operations needed is going to be some constant times n*log(n). What constant? That depends on the precise method, and on the implementation of the method. But in a sense, it doesn’t matter: If one method is O(n*log(n)), and another method is O(n^2) (like a bunch of naively-designed sorting routines), then no matter the constants, there’s always going to be some sufficiently-large data set for which the first method is quicker.
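That crossover point can be made concrete with a short sketch. The cost functions and the constants (100 vs. 1) here are made up purely for illustration; the point is only that the constants don’t change which method wins in the end:

```python
import math

# Hypothetical cost models: an n*log(n) method saddled with a big
# constant (100) versus an n^2 method with a small constant (1).
def cost_nlogn(n):
    return 100 * n * math.log2(n)

def cost_n2(n):
    return n * n

# For small n the "worse" O(n^2) method is actually cheaper...
print(cost_n2(10) < cost_nlogn(10))        # True: n^2 wins on a short list
# ...but past some size, the O(n*log(n)) method wins forever after.
print(cost_nlogn(10_000) < cost_n2(10_000))  # True
```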

This discussion might be a bit easier with a concrete example of an n*log(n) method. My favorite is mergesort: It’s not the quickest or most efficient, but it is the easiest to understand intuitively. Start by splitting your list in half. Then, sort each of the halves. After you’ve done that, look at the start of each half-list and compare those two. Whichever one is smaller, put into the first position of the final list, and remove it from the half-list. Then compare the new starts of the half-lists, and so on. When you run out of elements on one half-list, just stick in the remainder of the elements on the other half-list.

But you’ll notice there that the second step of that algorithm was “sort each of the halves”. How do you do that? Well, if each half is really short, then it doesn’t much matter what method you use, since any method works about equally for a really short list. In the extreme, sorting a two-element list just means comparing the two elements and maybe swapping them, and a one-element list is already sorted. But what if the halves are longer? Then you apply Mergesort again on them recursively until the halves are short enough to use some other method.

So, how many steps overall does this take? Well, in each level of merging, you might need as many as n comparisons (actually n-1, but that’s a detail we needn’t worry about). It might be less, if one sublist runs out more quickly than the other, but it can’t be worse than that. And the number of levels of merging you need is something like log_2(n) (when you encounter logs in computer science, they’re usually base 2). So the total number of steps is going to be n*log(n).
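The mergesort just described can be sketched in a few lines, with a counter added so you can watch the comparison count stay under n*log(n). The function and variable names here are my own, not from any particular library:

```python
def merge_sort(items, counter):
    """Sort `items`; counter[0] is incremented once per comparison made."""
    if len(items) <= 1:
        return items                      # a one-element list is already sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)    # recursively sort each half
    right = merge_sort(items[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1                   # one comparison per merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    # one half ran out: append the remainder of the other, no comparisons needed
    return merged + left[i:] + right[j:]

counter = [0]
result = merge_sort([7, 3, 9, 1, 5, 8, 2, 6, 4, 0], counter)
print(result)       # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(counter[0])   # comparison count; at most n*ceil(log2(n)) = 40 for n = 10
```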

There are n! possible orderings, and just 2[SUP]k[/SUP] possible results from k yes/no comparisons. So you can’t possibly handle all the different orderings unless
2[SUP]k[/SUP] ≥ n!
To solve for k, take the logarithm of both sides, using Stirling’s approximation for n!. You’ll get
k ≥ n log[sub]2[/sub] n - 1.443 n + …
where I’ve written “…” to avoid the tedium of small terms, or
k ≈ n log[sub]2[/sub] n
The expression “O(n log[sub]2[/sub] n)”, which still applies when there is a constant multiplier, just formalizes that we’re not bothering with terms that become unimportant for large n.

So, for example, since 5! = 120 and 64 = 2[SUP]6[/SUP] < 120 < 2[SUP]7[/SUP] = 128, we know that sorting 5 elements cannot be guaranteed with 6 yes/no compares. Seven comparisons is a very tight squeeze, but it can in fact be achieved as a guarantee: merge insertion (the Ford–Johnson algorithm) sorts 5 elements in at most 7 compares.
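The lower bound 2[SUP]k[/SUP] ≥ n! can be tabulated directly for the list sizes the OP asked about. A small sketch (the function name is mine):

```python
import math

def comparison_lower_bound(n):
    """Smallest k with 2**k >= n!: the information-theoretic floor on
    worst-case comparisons for sorting n items."""
    return math.ceil(math.log2(math.factorial(n)))

for n in (3, 4, 5, 10, 20):
    print(n, comparison_lower_bound(n))
# 3 -> 3, 4 -> 5, 5 -> 7, 10 -> 22, 20 -> 62
```

Note the n = 5 row reproduces the 5! = 120 example: 2^6 = 64 < 120, so 7 is the floor.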

By the way, Oblivious sorts, often used in hard-wired systems (and fun to play with!), impose further restrictions. It turns out the minimum oblivious sort when n=5 requires 9 compares.
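For the curious, here is what a minimum 9-compare oblivious sort for n = 5 looks like as code. The comparator sequence below is one standard arrangement of an optimal 5-input sorting network; the names are mine:

```python
from itertools import product

# One 9-comparator oblivious sorting network for 5 inputs: the sequence
# of compare-exchange pairs is fixed in advance and never depends on the data.
NETWORK_5 = [(0, 1), (3, 4), (2, 4), (2, 3), (1, 4),
             (0, 3), (0, 2), (1, 3), (1, 2)]

def network_sort5(values):
    v = list(values)
    for i, j in NETWORK_5:
        if v[i] > v[j]:              # compare-exchange: same 9 steps, always
            v[i], v[j] = v[j], v[i]
    return v

# By the zero-one principle, a network sorts all inputs iff it sorts every
# sequence of 0s and 1s, so 2^5 = 32 cases verify it completely.
assert all(network_sort5(bits) == sorted(bits) for bits in product((0, 1), repeat=5))
print(network_sort5([3, 1, 4, 1, 5]))   # [1, 1, 3, 4, 5]
```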

The reasons aren’t immediately apparent to anyone. Comparison sorts (i.e. the kind of sort where the basic operation is “look at two elements and determine if one is bigger than the other”) are a well-studied problem in Computer Science.

The O(_) notation means, “of order”. Basically, the number of comparisons is less than k*n*log(n) for n > N, for some constants k and N. The important fact is that if you compared every object against every other object, you would use about n^2 comparisons (i.e. it would be O(n^2)), but if you got away with using only as many comparisons as objects, it would be O(n).

The actual number of required comparisons is more than O(n), but less than O(n^2), and basically means, “you don’t have to compare every thing with every other thing, but you still have to make more comparisons than objects by a factor that grows (slowly) with n”.

As Chronos notes, there are algorithms that achieve this behaviour (e.g. mergesort), but proving that there is no tricksy algorithm that exceeds it involves some subtlety. Here’s a blog post on the subject.

It should also be mentioned that if you’re allowed to do more than just a simple greater-than comparison at each step, you might be able to do better (at least, if you have some idea of the distribution of your items). If, for instance, I have a stack of papers with names on them, and I want to sort them alphabetically, if I come across the name “Zane”, I’m going to send it straight to the bottom of the stack and forget about it (at least until I come across another Z name, like Zimmerman, but that’s unlikely). In more detail, this leads to methods like “library sort”.

The minimum number of evaluations would be n-1. I think the maximum is what you are after.

Well, the maximum is infinite – just compare the first and second object over and over again and never finish the sort.

It is a subtle point though, we are talking about the minimum number of evaluations required to sort every possible list of length n. Certainly there are some lists that can be sorted with fewer comparisons (e.g. an already-sorted list, or even a list that is already sorted except for one element, provided we know already where that element is).

As an aside: The method with the best average run-time is something called Quicksort, but its worst-case run-time is O(n^2). Ordinarily, this wouldn’t be such a big deal, since it’s the average that’s important… except that in the simplest implementations, that worst-case scenario happens precisely when the input list is already sorted, which is in practice not all that rare. Of course, there are a number of ways to fix this, of which the simplest is just to check whether the list is already sorted before you start in on the quicksort algorithm itself.
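The worst-case behaviour described above is easy to demonstrate. A sketch of the naive first-element-pivot Quicksort (counting one conceptual comparison per element examined against the pivot; the names are mine):

```python
def quicksort_naive(items, counter):
    """Textbook quicksort using the first element as pivot.
    counter[0] tallies pivot comparisons (one per non-pivot element per call)."""
    if len(items) <= 1:
        return items
    pivot, rest = items[0], items[1:]
    counter[0] += len(rest)              # every remaining element meets the pivot
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort_naive(smaller, counter) + [pivot] + quicksort_naive(larger, counter)

n = 10
counter = [0]
result = quicksort_naive(list(range(n)), counter)   # already-sorted input
print(counter[0])   # 45 = n*(n-1)/2: every partition is maximally lopsided
```

On the sorted input, each pivot is the minimum of its sublist, so one side of every partition is empty and the recursion degrades to n-1, n-2, … comparisons, i.e. the O(n^2) worst case.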

What OP seeks is called minimax – minimizing worst-case.

There’s also radix sort. For example, old IBM punched-card sorters divided a stack of cards into ten piles; the operator would pile these cards together and pass them again through the card sorter; and so on.

These might seem to lead to, using radix 10 for example,
10[SUP]k[/SUP] ≥ n!
instead of
2[SUP]k[/SUP] ≥ n!
However radix sorts are constrained by the length of the key, and for large keys I don’t think there’s any way to avoid the worst-case behavior given previously.
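The punched-card procedure translates almost directly into code. A least-significant-digit radix sort sketch (the function name is mine), which makes no pairwise comparisons at all; its work is proportional to n times the number of digits in the key:

```python
def radix_sort(nums, base=10):
    """LSD radix sort for non-negative integers, one pass per digit."""
    if not nums:
        return nums
    digits = len(str(max(nums)))             # key length bounds the passes
    for d in range(digits):
        piles = [[] for _ in range(base)]    # the ten card-sorter piles
        for x in nums:
            piles[(x // base ** d) % base].append(x)
        # restack the piles in order, preserving order within each pile
        nums = [x for pile in piles for x in pile]
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```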

I’m going to drop this down a notch on the assumption the OP isn’t a CS guy, but rather just a math-curious guy.
Are you actually trying to count comparisons out of math curiosity, or are you trying to develop / program a sorting algorithm? IOW are you doing physics or engineering?

The reason I ask is in the OP you talk about applying the transitive property. It’s a true fact, but it isn’t computationally free.

i.e. you said: “…because you can apply the transitive property if you know A > B and B > C, then you also know A > C and don’t need to compare those 2 directly …”

But you can only do that if you store the result of both comparisons and have a mechanism to search your stored comparisons to match them up & apply transitivity. In almost all cases doing all that involves a lot more comparisons and other costly operations than does simply comparing A to C to discover that A > C.

(Chronos’ mergesort example is a way to store the prior comparison results implicitly in the ordering of the sub-lists. The process of merging the sub-lists makes explicit the extra work of locating and comparing the pre-existing comparisons.)
Since you were talking about small lists, e.g. 10 to 20 members, O(f(n)) notation falls down a bit, since the n isn’t big enough to bury the details.

e.g. For an ultra-naïve algorithm like bubble sort, if by luck the list is already sorted it requires n-1 list member comparisons to sort the already-sorted list. OTOH, if the list is ordered exactly backwards versus sorted order, it will take (n-1)! comparisons to sort the list. For a 10 element list that’s either 9 comparisons, or 362,880 comparisons.

And of course there are other overhead operations required: comparisons to keep track of where we are, ops to move the members around, increment and decrement counters / pointers, etc.
(What’s interesting to me is that the best, smartest, least naïve sort out there, Quicksort, has the opposite behavior to crappy, dumbest, most naïve bubble sort. Quicksort performs *worst* when the list is already sorted, whereas bubble sort performs best. As **Chronos** said just above.)

There was a time when the saying was “90% of computing is sorting”. That time has long passed; it is now “90% of computing is not sorting”, i.e. avoiding sorting at all costs. For lists that are sufficiently short, sorting efficiency is not a real concern; for larger lists the process is not sorting in the conventional sense but building indices that allow lists to be built in order to start with. Even in ye olde days of the horse-drawn computers, large lists were maintained in sort-merge operations where smaller sets were sorted and merged into a larger one.

Check your math. The worst-case scenario for bubble-sort is exactly n*(n-1)/2.

And while bubble sort might be the most naive (or at least, tied with insertion sort), it’s not the dumbest or crappiest. That would be bogosort (which is also quite efficient on an already-sorted list).

D’oh! I was thinking Product(1 to n-1) instead of Sum(1 to n-1). :smack: Thanks.

So for n=10 the worst case bubble sort needs 45 comparisons. I should have immediately recognized that my 362K number was unrealistic. I’ve been away from this stuff too long.
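Both figures check out in a quick sketch of the early-exit bubble sort variant (names are mine):

```python
def bubble_sort(items):
    """Bubble sort with the early-exit check; returns (sorted list, comparisons)."""
    a, comparisons = list(items), 0
    for i in range(len(a) - 1):
        swapped = False
        for j in range(len(a) - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:          # a full pass with no swaps: already sorted
            break
    return a, comparisons

print(bubble_sort(list(range(10)))[1])         # 9: best case, n-1 comparisons
print(bubble_sort(list(range(9, -1, -1)))[1])  # 45: worst case, n*(n-1)/2
```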

I thought of bogosort, which you’ve pointed out in previous sorting threads. Given my assumption that the OP isn’t a computer guy I was trying to keep it simple & straightforward.

Nitpick. While this may be true for the most naive implementations of Quicksort, practical implementations avoid this by using a little more sophistication when selecting a “pivot.” Even so, there may be other, more complicated data patterns that degrade Quicksort to O(N[sup]2[/sup]) performance. I recall a paper that investigated such patterns, which can be used for Denial Of Service attacks! :eek:

Yeah. I was referring to the canonical textbook v1.0 QS. Pretty much any real-world commercial quality implementation of any ACM algorithm is going to have corner-case optimizing tricks, pre- and post-processing steps, etc. added.

I’d be curious to see the paper on DOS by Quicksort poisoning. Have a ref? Not that I doubt you; I’m just curious to read more. ISTM for small n even O(n^2) isn’t very painful. And building a public web-exposed sorting service that accepts workloads with large n seems like asking for it.

Quicksort seems to me to be not quite as pathological a case as the DOS by regex attack. Wherein some useful but naïvely written server-side regexes can exhibit O(2^n) :eek: or worse :eek: :eek: behavior on pathological inputs. In that case carefully crafted inputs of completely ordinary length make a dandy DOS vector. https://en.wikipedia.org/wiki/ReDoS
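The catastrophic backtracking behind ReDoS is easy to trigger deliberately. A sketch (the pattern is a standard textbook example of nested quantifiers, not from any particular server):

```python
import re

# Nested quantifiers: (a+)+ can split a run of "a"s into exponentially many
# groupings, and on a *failing* match the backtracking engine tries them all.
evil = re.compile(r'^(a+)+b$')

print(evil.match('a' * 5 + 'b'))        # matches: the happy path is cheap
# 18 "a"s and no "b": the match fails, but only after ~2^17 backtracking paths.
# Lengthen the run of "a"s and the time roughly doubles per extra character.
print(evil.match('a' * 18 + 'c'))       # None, after exponentially much work
```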

The paper was just about sophisticated Quicksort processing to avoid other pathologies, and gave little or no information on any real-world D.O.S. attacks.

You may be right that pathologically-ordered sort input is a poor choice for D.O.S., but your comment does seem to ignore the possibility of medium-sized N.