In *Beyond Numeracy* Paulos discusses the “merge sort algorithm” for sorting lots of things. He writes that to do this one breaks the big pile into a bunch of little ones, sort the little ones, then combine them, two-by-two, by comapring the top elements, then second, and so on.

I think I’ve applied this to things like alphabetizing large stacks of accident reports when I worked for a city engineer. I’d quickly sort as many off the top as I could in one hand; sometimes five, sometimes ten or fifteen if the order worked out right. Then I’d compare two stacks at a time, creating a third stack by taking the first item of the two items showing. (One at the top of each stack.)

Although I haven’t tested it empirically, it does seem that the job goes much more quickly when I do it this way.

**Question 1** Am I doing it approximately correctly?

**Question 2** Is there an optimal size for the very small stacks that one starts sorting with?