Two questions about the Merge Sort Algorithm

In Beyond Numeracy, Paulos discusses the “merge sort algorithm” for sorting lots of things. He writes that to do this, one breaks the big pile into a bunch of little ones, sorts the little ones, then combines them, two-by-two, by comparing the top elements, then the second, and so on.

I think I’ve applied this to things like alphabetizing large stacks of accident reports when I worked for a city engineer. I’d quickly sort as many off the top as I could in one hand; sometimes five, sometimes ten or fifteen if the order worked out right. Then I’d compare two stacks at a time, creating a third stack by taking whichever of the two showing items (one at the top of each stack) came first.

Although I haven’t tested it empirically, it does seem that the job goes much more quickly when I do it this way.

Question 1: Am I doing it approximately correctly?

Question 2: Is there an optimal size for the very small stacks that one starts sorting with?

Can you explain it in a step-by-step fashion? I’m not visualizing it.

Don’t know, but here’s how you would alphabetize accident reports using merge sort:
A) Split the pile into two halves.
B1) If one or both of the halves are of size X or smaller, sort them by hand.
B2) Otherwise, sort that half by going back to step A (i.e., splitting it in half again).
C) Merge the two halves by comparing top elements and putting them into a third pile until both halves are exhausted and you are left with a sorted pile.

A reasonable X would probably be around 5, since sorting 5 things in your head is very, very fast.

As for the optimal size of X, that depends on how well you sort by hand. For computers, the optimal size would most likely be 2 or 3, since anything larger means employing a second sorting algorithm.
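If it helps to see the same recipe written out, here’s a minimal sketch in Python of the A/B1/B2/C scheme above. The names and the default X=5 are just illustrative, and the built-in sorted() stands in for sorting a small pile by hand:

```python
def merge(left, right):
    """Step C: repeatedly take the smaller of the two 'top' items."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # one pile ran out; the rest of the other
    merged.extend(right[j:])   # is already sorted, so it goes on as-is
    return merged


def merge_sort(pile, X=5):
    if len(pile) <= X:                    # step B1: small pile, "sort it by hand"
        return sorted(pile)
    mid = len(pile) // 2                  # step A: split the pile into two halves
    left = merge_sort(pile[:mid], X)      # step B2: same procedure on each half
    right = merge_sort(pile[mid:], X)
    return merge(left, right)             # step C: combine the sorted halves
```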

Sure. But first let me use some prose: the method you detail has me splitting into halves until I reach some pre-determined number, X, of items. Instead, imagine that I can eyeball the stack and know that it contains well above X items, so I just take X items off the top (so that I don’t have to count each half). Since X is a variable number, I don’t have to go for a standardized average.

  1. Using my left hand, I sort items off the top of the pile, and into my right hand, until my right hand is too full to continue quickly.
  2. Set that short stack aside.
  3. Repeat 1 & 2 until the whole pile is separated into short, sorted stacks.
  4. Set two short stacks side-by-side and combine them into a bigger short stack.
  5. Repeat 4 until the whole thing is sorted.

I’m not used to writing things out like that, so I’m sure I made it confusing somehow; however, it looks like I’m doing it correctly, with X being a variable dependent on the chance order of the original stack.

Does that sound like I’m on the right track?
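For what it’s worth, here’s a rough Python sketch of that run-then-merge procedure. It’s a simplified reading of it: the short stacks here are just whatever items happen to come off the top already in order, and the standard library’s heapq.merge stands in for combining two sorted stacks by comparing the showing items:

```python
import heapq

def run_merge_sort(pile):
    # Steps 1-3 (simplified): peel "short, sorted stacks" off the pile.
    # In practice you'd also be sorting each handful as you pick it up,
    # so your stacks would tend to be a bit longer than these.
    stacks, current = [], []
    for item in pile:
        if current and item < current[-1]:
            stacks.append(current)        # this handful is done; set it aside
            current = []
        current.append(item)
    if current:
        stacks.append(current)

    # Steps 4-5: combine stacks two at a time until only one remains.
    while len(stacks) > 1:
        merged = [list(heapq.merge(stacks[k], stacks[k + 1]))
                  for k in range(0, len(stacks) - 1, 2)]
        if len(stacks) % 2:               # odd stack out just carries over
            merged.append(stacks[-1])
        stacks = merged
    return stacks[0] if stacks else []
```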

Well, as far as I understand it, that would be a variant of mergesort, but not the pure, unadulterated version as Knuth would present it.

I mean, shouldn’t you just use what fits best? I’m curious what situation might motivate you to use merge sort specifically, rather than whatever way feels faster and better.

Indeed! IIRC, that’s how step 1 of my method evolved.

I guess I’ll say two things. First, I wasn’t even sure I was quite doing it right, even after reading the Wikipedia article on it. (Well, skimming the article.) Second, the method itself never occurred to me intuitively, so I figure if there is a better way, I may not hit on it myself.

Also consider QuickSort. For some people it’s a more intuitive way to sort by hand.

Can you dumb it down? Wikipedia’s explanation made no sense to me at all.

Ok, you have a pile of accident reports you need to alphabetize.

You clear off some desk space and pull one report out of the middle of the stack. Say it’s “Klibo”. Now you go through the rest of the stack, dividing it into two piles, A and B, making sure that pile A has only things before “Klibo” and pile B has only things after “Klibo”. Now “Klibo” is in the right place, i.e. between the two piles A and B. Now you repeat the same thing for A and B. (Pull one out of the middle, divide into two halves A1 and A2, B1 and B2, etc.) Eventually you are going to have a left-to-right row of little stacks that can be picked up left-to-right to produce a sorted stack.
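In code, that pile-splitting looks roughly like this (a sketch of the idea just described, not the in-place version a computer would normally use):

```python
def quicksort(pile):
    # A pile of 0 or 1 reports is already in order.
    if len(pile) <= 1:
        return pile
    pivot = pile[len(pile) // 2]              # "pull one out of the middle", say "Klibo"
    pile_a = [x for x in pile if x < pivot]   # everything that comes before "Klibo"
    equal  = [x for x in pile if x == pivot]  # "Klibo" itself (and any duplicates)
    pile_b = [x for x in pile if x > pivot]   # everything that comes after "Klibo"
    # Repeat the same thing for A and B, then read the row off left to right.
    return quicksort(pile_a) + equal + quicksort(pile_b)
```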

Whoa?! That sounds way, way unpleasant to go through. Is it faster than merge sort?

There’s no need for step B1 there. A list of length 1 is automatically sorted, so just do nothing in that case.

Automation needs these algorithms because it is expensive to walk through the entire list over and over, looking for the right place to put an item. In my experience of sorting things by hand, this is not such an issue for the human brain. If you have a sorted file and an item you need to place in it, this can be done close to instantaneously, excluding the physical labor.

So I would typically just start by putting each item, as I found it, into the right place in the sorted stack. I guess I don’t see the time saver here. Maybe I just haven’t done this enough …

If you have a large list of things to sort, merge sort is much better than naive sorting algorithms.

Yes, on a computer, because the mechanics of Quicksort are easier for computers to handle than those of Mergesort. Merge requires the creation of temporary arrays and is less efficient in its recursion. In short, Quicksort is fine if you’re a computer. Us humans work better with Merge.

Good question; I may have been unclear in the OP.

Sticking w/ my example: I’d get a stack of accident reports that were in no order, and they’d have to go into a large drawer with a file for each street, alphabetized. Rather than jump back and forth with each item that comes to the top of the pile, I can sort in advance and save time by going in order. Thus I am already quite close to the next item, or I’m already there. (Bonus: I only have one direction to search for the next file.) I cannot intuitively sort into the stack because it is completely unsorted, and to file the stack it makes sense to sort it out beforehand.

Sloth may be a deadly sin; however, it is a powerful engine of efficiency. :smiley:

Do you continuously merge a small stack into a large stack that consists of all the merges you’ve done up to that point? If so, then you’re doing a slightly optimized list sort/insertion sort rather than a merge sort. To see why, consider the limit where your small stacks have exactly one folder each. To get the logarithmic efficiency, you need to keep the ratio of the sizes of the stacks you’re merging from getting larger than some value, such as 2:1 or 3:1.

Your method is almost a pure Mergesort, provided that you make one small modification (which you may already be doing; I can’t tell from your description). When you combine substacks into the final big stack, make sure to combine substacks of close to the same size. That is to say: Suppose you have 20 items total, so you have four small stacks of 5 items each. Ideally, you should combine stack A with stack B, and combine stack C with stack D, then combine stack AB with CD. This would in general be more efficient than combining A with B, then combining AB with C, then ABC with D.
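To put rough numbers on that 20-item example, here is a little worst-case count, using the standard fact that merging piles of sizes m and n takes at most m + n - 1 comparisons:

```python
A = B = C = D = 5    # four small stacks of 5 items each

# Balanced order: A+B, C+D, then AB+CD.
balanced = (A + B - 1) + (C + D - 1) + ((A + B) + (C + D) - 1)

# Lopsided order: A+B, then AB+C, then ABC+D.
lopsided = (A + B - 1) + ((A + B) + C - 1) + ((A + B + C) + D - 1)

print(balanced, lopsided)    # 37 vs. 42 worst-case comparisons
```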

Quicksort is, in general, faster than Mergesort, but Mergesort is still one of the three good sorting algorithms (the third being Heapsort), and is only a little slower than Quicksort. Meanwhile, all three are far more efficient than Insertion-sort or Bubble-sort, the other commonly encountered sort algorithms, and Mergesort is probably the most conceptually simple of the three good ones, so I would probably just stick to that for your hand-sorting.

Don’t forget that the naive quicksort is potentially quadratic (e.g., on an already-sorted list with a first-element pivot).

As Larry Wall once put it: the three attributes of a great programmer are laziness, impatience, and hubris.

Actually, Mergesort can be done in-place.

Another point about Mergesort: It is in no way intrinsically recursive. It is quite easy to code up iteratively, no stack required at all.
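For example, here’s a bottom-up version written as a plain loop in Python (a sketch for illustration; it copies between lists rather than working in place):

```python
def mergesort_iterative(items):
    items = list(items)
    n = len(items)
    width = 1                             # merge runs of width 1, then 2, then 4, ...
    while width < n:
        merged = []
        for start in range(0, n, 2 * width):
            left = items[start:start + width]
            right = items[start + width:start + 2 * width]
            i = j = 0
            # standard two-pile merge of the adjacent runs
            while i < len(left) and j < len(right):
                if left[i] <= right[j]:
                    merged.append(left[i])
                    i += 1
                else:
                    merged.append(right[j])
                    j += 1
            merged.extend(left[i:])
            merged.extend(right[j:])
        items = merged
        width *= 2
    return items
```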

Sorting by hand opens up all sorts of operations that are unavailable on computers. Take Insertionsort. It can be done with O(n log n) comparisons using binary search, but the actual insertion part adds up to O(n²) due to the time to shift things over. (You can use a balanced tree though to make it O(n log n) overall.) However, with handheld items, the insertion is quite simple. So it actually works well in this situation. Also note that you can do “better” than binary search by estimating where the item might go, similar to Andy Yao’s O(log log n) insertion algorithm.
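As a rough illustration of the binary-insertion idea, using the standard library’s bisect module: finding the slot costs O(log n) comparisons, while list.insert is the shifting step that costs O(n) on a computer and almost nothing when you’re holding paper.

```python
import bisect

def binary_insertion_sort(items):
    in_order = []
    for item in items:
        slot = bisect.bisect_right(in_order, item)   # O(log n) comparisons to find the slot
        in_order.insert(slot, item)                  # the O(n) shift that's cheap with paper
    return in_order
```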

A lot depends on the key type and distribution. For example, if the keys are from a small range, then radix-type sorts work well. E.g., keys from 000 to 999: three passes using 10 piles (starting with the last digit, of course) and it’s a breeze.
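A sketch of that three-pass, ten-pile scheme, assuming the keys are plain integers from 0 to 999:

```python
def radix_sort_000_to_999(keys):
    for digit in range(3):                # pass 1: ones, pass 2: tens, pass 3: hundreds
        piles = [[] for _ in range(10)]
        for key in keys:
            piles[(key // 10 ** digit) % 10].append(key)   # deal into piles 0..9 by this digit
        keys = [key for pile in piles for key in pile]     # restack the piles in order 0..9
    return keys
```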

I am a big fan of the beauty of Heapsort, but sadly it is not the best in most situations. However, I can do Heapsort by hand. (Years of teaching sorting algorithms.) So that is technically an option.

Note that Quicksort is the “Greenland” of sorting methods. Sure, Greenland actually is green in a few places, but generally it isn’t.