Can you describe my (new, original) sorting algorithm mathematically?

I’ve been toying with sorting algorithms for fun over the last month, and I think I’ve come up with an honest-to-god new algorithm that is wholly original, and all mine. My general questions are at the end of this very long post. Skimmers can just read the bolded paragraph to see how it works, which I’ve also set to be displayed in mouseover preview.

In my toying with algorithms, I wrote a benchmark program that compares pure speed with no extra overhead, and also a second screen which does graphical head-to-head races that look pretty much exactly like this java page.

The algorithms generally perform in this order, best to worst:

Quick sort (scales well to very large arrays)
Shell sort
Merge sort (scales well to very large arrays)
Comb sort
Shaker sort (optimized by a better programmer than me)
Heap sort
Jump sort (original to a programmer on another messageboard)
Insertion sort
Selection sort
Gnome sort
Cocktail sort
Bubble sort

The algorithm I came up with is as fast or faster than any other algorithm, including quick sort, and it scales up as well as quicksort excluding memory constraints. It is in the merge sort family, and it is unstable, out-of-place, offline, and non-recursive. I call it snake sort due to its similarities to fantasy football snake drafts.

The idea is simple: A random ordering will result in very small contiguous ordered blocks in either direction. Snake sort begins by identifying all those blocks, and then merges them together. Each merge pass will halve the remaining number of blocks, so it very quickly resolves to a sorted state.

It uses quite a bit of memory: a full copy of the original array, plus an index array (to remember the block cutoffs) half the size of the original array.

Consider the 10 character string SJDFGASLKD. The first three letters, SJD, are already in descending order, so they are the first block. FG are in ascending order, so that’s the second block. AS becomes the third block, and LKD (descending order) rounds us out with the fourth and final block.
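
Here is a tiny sketch of that first pass, just to make the example concrete. (This is my own illustration, not part of the benchmark program; it treats ties as part of a descending run for simplicity, whereas the full routine further down handles runs of equal values separately.)

Public Sub ShowBlocks()
    Dim s As String
    Dim i As Long
    Dim lngStart As Long
    Dim lngDirection As Long    ' 1 = ascending run, -1 = descending run, 0 = undecided
    
    s = "SJDFGASLKD"
    lngStart = 1
    For i = 1 To Len(s) - 1
        If Mid$(s, i + 1, 1) > Mid$(s, i, 1) Then
            If lngDirection = -1 Then
                ' A descending run ends at position i
                Debug.Print Mid$(s, lngStart, i - lngStart + 1) & " (descending)"
                lngStart = i + 1
                lngDirection = 0
            Else
                lngDirection = 1
            End If
        Else
            If lngDirection = 1 Then
                ' An ascending run ends at position i
                Debug.Print Mid$(s, lngStart, i - lngStart + 1) & " (ascending)"
                lngStart = i + 1
                lngDirection = 0
            Else
                lngDirection = -1
            End If
        End If
    Next
    Debug.Print Mid$(s, lngStart) & IIf(lngDirection = -1, " (descending)", " (ascending)")
End Sub

Run against SJDFGASLKD, it prints SJD (descending), FG (ascending), AS (ascending), and LKD (descending) in the Immediate window: the same four blocks described above.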

One key feature is to bounce the array contents back and forth between the original array and the mirror array instead of merging to the mirror and then copying back to the original each step as with a standard merge sort. This means that if the last step leaves the contents in the mirror, a final pass must be run to copy that back over the original.

Here’s the debug info I generated in testing for a shuffled array of 50 elements. Descending blocks are denoted by negative numbers:


----------------------
|   Copy to Mirror   |
----------------------
(0)=73...Block(0)=0
(1)=69
(2)=49
(3)=15...Block(1)=-3
(4)=91
(5)=1...Block(2)=-5
(6)=47
(7)=23...Block(3)=-7
(8)=53
(9)=13...Block(4)=-9
(10)=81
(11)=79
(12)=55...Block(5)=-12
(13)=57
(14)=9...Block(6)=-14
(15)=31
(16)=39...Block(7)=16
(17)=7
(18)=33
(19)=63
(20)=97...Block(8)=20
(21)=83
(22)=11...Block(9)=-22
(23)=25
(24)=59
(25)=89...Block(10)=25
(26)=35
(27)=37
(28)=67...Block(11)=28
(29)=3
(30)=21...Block(12)=30
(31)=17
(32)=75...Block(13)=32
(33)=61
(34)=51...Block(14)=-34
(35)=71
(36)=29
(37)=27
(38)=5...Block(15)=-38
(39)=85
(40)=87...Block(16)=40
(41)=41
(42)=95...Block(17)=42
(43)=77
(44)=43...Block(18)=-44
(45)=45
(46)=99...Block(19)=46
(47)=19
(48)=93...Block(20)=48
(49)=65...Block(21)=49
----------------------
|  Copy to Original  |
----------------------
(0)=1...Block(0)=0
(1)=15
(2)=49
(3)=69
(4)=73
(5)=91...Block(1)=5
(6)=13
(7)=23
(8)=47
(9)=53...Block(2)=9
(10)=9
(11)=55
(12)=57
(13)=79
(14)=81...Block(3)=14
(15)=7
(16)=31
(17)=33
(18)=39
(19)=63
(20)=97...Block(4)=20
(21)=11
(22)=25
(23)=59
(24)=83
(25)=89...Block(5)=25
(26)=3
(27)=21
(28)=35
(29)=37
(30)=67...Block(6)=30
(31)=17
(32)=51
(33)=61
(34)=75...Block(7)=34
(35)=5
(36)=27
(37)=29
(38)=71
(39)=85
(40)=87...Block(8)=40
(41)=41
(42)=43
(43)=77
(44)=95...Block(9)=44
(45)=19
(46)=45
(47)=93
(48)=99...Block(10)=48
(49)=65...Block(11)=49
----------------------
|   Copy to Mirror   |
----------------------
(0)=1...Block(0)=0
(1)=13
(2)=15
(3)=23
(4)=47
(5)=49
(6)=53
(7)=69
(8)=73
(9)=91...Block(1)=9
(10)=7
(11)=9
(12)=31
(13)=33
(14)=39
(15)=55
(16)=57
(17)=63
(18)=79
(19)=81
(20)=97...Block(2)=20
(21)=3
(22)=11
(23)=21
(24)=25
(25)=35
(26)=37
(27)=59
(28)=67
(29)=83
(30)=89...Block(3)=30
(31)=5
(32)=17
(33)=27
(34)=29
(35)=51
(36)=61
(37)=71
(38)=75
(39)=85
(40)=87...Block(4)=40
(41)=19
(42)=41
(43)=43
(44)=45
(45)=77
(46)=93
(47)=95
(48)=99...Block(5)=48
(49)=65...Block(6)=49
----------------------
|  Copy to Original  |
----------------------
(0)=1...Block(0)=0
(1)=7
(2)=9
(3)=13
(4)=15
(5)=23
(6)=31
(7)=33
(8)=39
(9)=47
(10)=49
(11)=53
(12)=55
(13)=57
(14)=63
(15)=69
(16)=73
(17)=79
(18)=81
(19)=91
(20)=97...Block(1)=20
(21)=3
(22)=5
(23)=11
(24)=17
(25)=21
(26)=25
(27)=27
(28)=29
(29)=35
(30)=37
(31)=51
(32)=59
(33)=61
(34)=67
(35)=71
(36)=75
(37)=83
(38)=85
(39)=87
(40)=89...Block(2)=40
(41)=19
(42)=41
(43)=43
(44)=45
(45)=65
(46)=77
(47)=93
(48)=95
(49)=99...Block(3)=49
----------------------
|   Copy to Mirror   |
----------------------
(0)=1...Block(0)=0
(1)=3
(2)=5
(3)=7
(4)=9
(5)=11
(6)=13
(7)=15
(8)=17
(9)=21
(10)=23
(11)=25
(12)=27
(13)=29
(14)=31
(15)=33
(16)=35
(17)=37
(18)=39
(19)=47
(20)=49
(21)=51
(22)=53
(23)=55
(24)=57
(25)=59
(26)=61
(27)=63
(28)=67
(29)=69
(30)=71
(31)=73
(32)=75
(33)=79
(34)=81
(35)=83
(36)=85
(37)=87
(38)=89
(39)=91
(40)=97...Block(1)=40
(41)=19
(42)=41
(43)=43
(44)=45
(45)=65
(46)=77
(47)=93
(48)=95
(49)=99...Block(2)=49
----------------------
|  Copy to Original  |
----------------------
(0)=1
(1)=3
(2)=5
(3)=7
(4)=9
(5)=11
(6)=13
(7)=15
(8)=17
(9)=19
(10)=21
(11)=23
(12)=25
(13)=27
(14)=29
(15)=31
(16)=33
(17)=35
(18)=37
(19)=39
(20)=41
(21)=43
(22)=45
(23)=47
(24)=49
(25)=51
(26)=53
(27)=55
(28)=57
(29)=59
(30)=61
(31)=63
(32)=65
(33)=67
(34)=69
(35)=71
(36)=73
(37)=75
(38)=77
(39)=79
(40)=81
(41)=83
(42)=85
(43)=87
(44)=89
(45)=91
(46)=93
(47)=95
(48)=97
(49)=99
----------------------
|   Sort Complete    |
----------------------

Once the initial pass identifies the original blocks, no basic comparisons are needed ever again, since by definition any elements inside a block are already in a sorted order. (Comparisons are still needed in the merging operations, of course.) Each subsequent pass merges two existing blocks together, so the new block cutoffs are already known and do not need to be identified via comparisons. What distinguishes snake sort from merge sort is that snake sort bases its block boundaries on the preexisting natural state of the array, as opposed to choosing arbitrary cutoff points. (Also, snake sort supports descending blocks, which merge sort does not. And it’s not recursive.)

The most interesting feature of snake sort is that the more ordered the array is initially, the faster it runs. Each algorithm has its own unique worst case scenario. Quick sort’s worst case appears to be a camel hump, such as:

1, 5, 9, 7, 3

This puts the first pivot in the worst possible place, and it goes downhill from there. It would be logical to conclude that snake sort’s worst case would be alternating blocks of two:

1, 7, 3, 9, 5

Snakesort is extremely efficient on already ordered lists; virtually equal to bubblesort. It really shines on descending lists, where it can transform the array by looping only halfway through it and swapping the ends.
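
For anyone who wants to reproduce these shapes in a benchmark, here is one way to generate them (my own construction based on the descriptions above; the OP’s test data may be built differently). Both fill a Variant array of the kind the sort routines in this thread expect.

Public Sub FillCamelHump(ByRef pvarArray As Variant)
    ' Values rise through the first half and fall through the second, e.g. 1, 3, 5, 7, 9, 10, 8, 6, 4, 2
    Dim i As Long
    Dim n As Long
    
    n = UBound(pvarArray) - LBound(pvarArray) + 1
    For i = 0 To n - 1
        If i < (n + 1) \ 2 Then
            pvarArray(LBound(pvarArray) + i) = 2 * i + 1
        Else
            pvarArray(LBound(pvarArray) + i) = 2 * (n - i)
        End If
    Next
End Sub

Public Sub FillThatched(ByRef pvarArray As Variant)
    ' Low and high values alternate, so every natural block is only two elements long, e.g. 1, 6, 2, 7, 3, 8, 4, 9, 5, 10
    Dim i As Long
    Dim n As Long
    
    n = UBound(pvarArray) - LBound(pvarArray) + 1
    For i = 0 To n - 1
        If i Mod 2 = 0 Then
            pvarArray(LBound(pvarArray) + i) = i \ 2 + 1
        Else
            pvarArray(LBound(pvarArray) + i) = (n + 1) \ 2 + i \ 2 + 1
        End If
    Next
End Sub

Usage would be something like: Dim varTest As Variant: ReDim varTest(0 To 9999): FillThatched varTest: SnakeSort1 varTest.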

Most importantly, its absolute worst case scenario is light years faster than quicksort’s worst case. (Both scenarios are, IMO, unlikely in the extreme.) A rough approximation in analyzing sorting efficiency is to assign each comparison a time value of 1, each exchange (swapping two elements with each other) a time value of 2, and each overwrite (pretty much the only kind of swap merge and snake sort do) a time value of 1. This is only a rough approximation, but it predicts real performance fairly accurately. Here are some time numbers comparing snake sort with quick sort sorting a 50-element array:

Best case (ascending):
Quicksort: 317
Snakesort: 50 (absolute minimum + 1)

Descending:
Quicksort: 368
Snakesort: 100 (absolute minimum + 1)

Quicksort’s worst case: (camel hump)
Quicksort: 884
Snakesort: 199

Snakesort’s worst case: (thatched)
Quicksort: 408
Snakesort: 510

Random shuffle:
Quicksort: (roughly) 500
Snakesort: (roughly) 560

As you can see, snake sort approaches quicksort efficiency on randomly ordered lists, but gets much faster the moment any order presents itself, unlike quicksort. Oddly, it manages to sort its theoretical worst case (thatched) faster than a generic random set. This leads me to conclude that it has no worst case scenario; only best case – where it approaches perfect efficiency – and average case, where it approaches quick sort.
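
For anyone who wants to reproduce totals like these, here is a minimal sketch of the rough cost model described above (the helper names are my own; the benchmark program presumably instruments its sorts differently). The idea is just to route every comparison, exchange, and overwrite in a sort routine through counting helpers, and read off the total afterwards.

Private mlngCost As Long

Public Sub ResetCost()
    mlngCost = 0
End Sub

Public Function TotalCost() As Long
    TotalCost = mlngCost
End Function

Private Function LessOrEqual(varA As Variant, varB As Variant) As Boolean
    mlngCost = mlngCost + 1                 ' each comparison costs 1
    LessOrEqual = (varA <= varB)
End Function

Private Sub Exchange(ByRef pvarArray As Variant, ByVal i As Long, ByVal j As Long)
    Dim varSwap As Variant
    mlngCost = mlngCost + 2                 ' each exchange (two elements swapped) costs 2
    varSwap = pvarArray(i)
    pvarArray(i) = pvarArray(j)
    pvarArray(j) = varSwap
End Sub

Private Sub Overwrite(ByRef pvarDest As Variant, ByVal i As Long, varValue As Variant)
    mlngCost = mlngCost + 1                 ' each overwrite costs 1
    pvarDest(i) = varValue
End Sub

Call ResetCost before a run, then compare TotalCost between algorithms afterwards.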

So given the strong performance, I’m pretty happy with it. The only questions that remain are whether or not it already exists, and what formulas describe its efficiency. Note that just because it uses merging as its operation doesn’t mean it is a merge sort. Snake sort is as different from merge sort as shell sort is from insertion sort, or comb sort is from bubble sort. Here’s a fully functional Visual Basic (not .NET) implementation:


Public Sub SnakeSort1(ByRef pvarArray As Variant)
    Dim i As Long
    Dim iMin As Long
    Dim iMax As Long
    Dim lngIndex() As Long
    Dim lngLevel As Long
    Dim lngOldLevel As Long
    Dim lngNewLevel As Long
    Dim varMirror As Variant
    Dim lngDirection As Long
    Dim blnMirror As Boolean
    Dim varSwap As Variant
    
    iMin = LBound(pvarArray)
    iMax = UBound(pvarArray)
    ReDim lngIndex((iMax - iMin + 3) \ 2)
    lngIndex(0) = iMin
    i = iMin
    ' Initial loop: locate cutoffs for each ordered section
    Do Until i >= iMax
        Select Case lngDirection
            Case 1
                Do Until i = iMax
                    If pvarArray(i) > pvarArray(i + 1) Then Exit Do
                    i = i + 1
                Loop
            Case -1
                Do Until i = iMax
                    If pvarArray(i) < pvarArray(i + 1) Then Exit Do
                    i = i + 1
                Loop
            Case Else
                Do Until i = iMax
                    If pvarArray(i) <> pvarArray(i + 1) Then Exit Do
                    i = i + 1
                Loop
                If i = iMax Then lngDirection = 1
        End Select
        If lngDirection = 0 Then
            If pvarArray(i) > pvarArray(i + 1) Then
                lngDirection = -1
            Else
                lngDirection = 1
            End If
        Else
            lngLevel = lngLevel + 1
            lngIndex(lngLevel) = i * lngDirection
            lngDirection = 0
        End If
        i = i + 1
    Loop
    ' Note final block
    If Abs(lngIndex(lngLevel)) < iMax Then
        If lngDirection = 0 Then lngDirection = 1
        lngLevel = lngLevel + 1
        lngIndex(lngLevel) = i * lngDirection
    End If
    ' If the list is already sorted, exit
    If lngLevel <= 1 Then
        ' If sorted descending, reverse before exiting
        If lngIndex(lngLevel) < 0 Then
            For i = 0 To (iMax - iMin) \ 2
                varSwap = pvarArray(iMin + i)
                pvarArray(iMin + i) = pvarArray(iMax - i)
                pvarArray(iMax - i) = varSwap
            Next
        End If
        Exit Sub
    End If
    ' Main loop - merge section pairs together until only one section left
    ReDim varMirror(iMin To iMax)
    Do Until lngLevel = 1
        lngOldLevel = lngLevel
        For lngLevel = 1 To lngLevel - 1 Step 2
            If blnMirror Then
                SnakeSortMerge varMirror, lngIndex(lngLevel - 1), lngIndex(lngLevel), lngIndex(lngLevel + 1), pvarArray
            Else
                SnakeSortMerge pvarArray, lngIndex(lngLevel - 1), lngIndex(lngLevel), lngIndex(lngLevel + 1), varMirror
            End If
            lngNewLevel = lngNewLevel + 1
            lngIndex(lngNewLevel) = Abs(lngIndex(lngLevel + 1))
        Next
        If lngOldLevel Mod 2 = 1 Then
            If blnMirror Then
                For i = lngIndex(lngNewLevel) + 1 To iMax
                    pvarArray(i) = varMirror(i)
                Next
            Else
                For i = lngIndex(lngNewLevel) + 1 To iMax
                    varMirror(i) = pvarArray(i)
                Next
            End If
            lngNewLevel = lngNewLevel + 1
            lngIndex(lngNewLevel) = lngIndex(lngOldLevel)
        End If
        lngLevel = lngNewLevel
        lngNewLevel = 0
        blnMirror = Not blnMirror
    Loop
    ' Copy back to main array if necessary
    If blnMirror Then
        For i = iMin To iMax
            pvarArray(i) = varMirror(i)
        Next
    End If
End Sub

Private Sub SnakeSortMerge(pvarSource As Variant, plngLeft As Long, plngMid As Long, plngRight As Long, pvarDest As Variant)
    Dim L As Long
    Dim LMin As Long
    Dim LMax As Long
    Dim LStep As Long
    Dim R As Long
    Dim RMin As Long
    Dim RMax As Long
    Dim RStep As Long
    Dim O As Long
    
    ' Negative block boundaries denote descending blocks; those are read backwards so
    ' both inputs are consumed in ascending order.  O is the next write position.
    If plngLeft <> 0 Then O = Abs(plngLeft) + 1
    If plngMid > 0 Then
        LMin = O
        LMax = Abs(plngMid)
        LStep = 1
    Else
        LMin = Abs(plngMid)
        LMax = O
        LStep = -1
    End If
    If plngRight > 0 Then
        RMin = Abs(plngMid) + 1
        RMax = Abs(plngRight)
        RStep = 1
    Else
        RMin = Abs(plngRight)
        RMax = Abs(plngMid) + 1
        RStep = -1
    End If
    L = LMin
    R = RMin
    ' Standard two-way merge: copy the smaller head element each pass, and flush
    ' whichever side remains when the other runs out.
    Do
        If pvarSource(L) <= pvarSource(R) Then
            pvarDest(O) = pvarSource(L)
            If L = LMax Then
                For R = R To RMax Step RStep
                    O = O + 1
                    pvarDest(O) = pvarSource(R)
                Next
                Exit Do
            End If
            L = L + LStep
        Else
            pvarDest(O) = pvarSource(R)
            If R = RMax Then
                For L = L To LMax Step LStep
                    O = O + 1
                    pvarDest(O) = pvarSource(L)
                Next
                Exit Do
            End If
            R = R + RStep
        End If
        O = O + 1
    Loop
End Sub

So, the general questions:[ul][li]Have you seen anything like this before? Links would be great, but any cite will do.[/li][*]Can you math guys quantify this algorithm? As in, the same type of best/worst/average descriptors given for various algorithms in the table on this wikipedia page?[/ul]

[QUOTE=Ellis Dee]
So, the general questions:[ul][li]Have you seen anything like this before? Links would be great, but any cite will do.[/li][li]Can you math guys quantify this algorithm? As in, the same type of best/worst/average descriptors given for various algorithms in the table on this wikipedia page?[/ul][/li][/QUOTE]

Your explanation is a little confusing but I’ll give it a try:

  1. Yes. I’ve seen something like this. If I understand it correctly, you are essentially building an out-of-place heap and then merging the subtrees. This is very much like smoothsort or even just heapsort, only it’s out of place so it uses more storage.
  2. Basic analysis (note, I have not had coffee, I’ll take a look at it later if I have time to see if I missed something)

Memory usage - O(n)
Since it uses as much extra memory as the array size

Computational:
a) Identify blocks (once) - linear pass with the worst case number of resulting blocks n/2 and best case 1.
b) Scan to find blocks to merge (1 to log(n) times) - linear pass
c) Merge two blocks - linear with the size of the block I’m assuming

Best case: O(n)
Worst case: O(n log n)
Avg case: Most likely O(n log n)
Stability: No (Unless you add this to your algorithm)

But I’m not sure what you were expecting here. You theoretically can’t beat O(n log n) worst case for comparison sorting.

Not quite. It’s not building any heaps, but rather identifying already sorted blocks to merge together. Heap sort is very different in practice, and is mostly unaffected by the initial state:


Public Sub HeapSort(ByRef pvarArray As Variant)
    Dim i As Long
    Dim iMin As Long
    Dim iMax As Long
    Dim varSwap As Variant
    
    iMin = LBound(pvarArray)
    iMax = UBound(pvarArray)
    ' Build the initial heap
    For i = (iMax + iMin) \ 2 To iMin Step -1
        Heap pvarArray, i, iMin, iMax
    Next i
    ' Repeatedly swap the root (largest remaining element) to the end and re-heap the rest
    For i = iMax To iMin + 1 Step -1
        varSwap = pvarArray(i)
        pvarArray(i) = pvarArray(iMin)
        pvarArray(iMin) = varSwap
        Heap pvarArray, iMin, iMin, i - 1
    Next i
End Sub

Private Sub Heap(ByRef pvarArray As Variant, ByVal i As Long, iMin As Long, iMax As Long)
    Dim lngLeaf As Long
    Dim varSwap As Variant
    
    Do
        lngLeaf = i + i - (iMin - 1)
        Select Case lngLeaf
            Case Is > iMax: Exit Do
            Case Is < iMax: If pvarArray(lngLeaf + 1) > pvarArray(lngLeaf) Then lngLeaf = lngLeaf + 1
        End Select
        If pvarArray(i) > pvarArray(lngLeaf) Then Exit Do
        varSwap = pvarArray(i)
        pvarArray(i) = pvarArray(lngLeaf)
        pvarArray(lngLeaf) = varSwap
        i = lngLeaf
    Loop
End Sub

Smooth sort might be a possibility. I was never able to find any code for it in any language, and wikipedia has only this to say about it:

This has many things in common with what I’m calling snake sort. Faster if the input is sorted to some degree, plus lots of complexity.

The main difference would be that snake sort does not use any kind of heap structure at all. Still, I’m definitely interested in implementing a smooth sort simply for the sake of having it. If anyone knows of any code for it (in any language) or even just a detailed narrative description of how it works, that’d be great.

Also of interest to me is library sort, which appears to have never been actually implemented by anyone in any language, ever.

I just ran some true timing benchmarks comparing snake sort and quick sort. Here’s the results in seconds, with three trials each and both algorithms working on the same initial array in each trial:

Random Array - 99 elements
Snake sort: 0.00074, 0.00076, 0.00076 (Average: 0.00075)
Quick sort: 0.00078, 0.00077, 0.00080 (Average: 0.00078) -4%

Random Array - 999 elements
Snake sort: 0.0117, 0.0115, 0.0114 (Average: 0.0115)
Quick sort: 0.0142, 0.0128, 0.0125 (Average: 0.0132) -15%

Random Array - 9,999 elements
Snake sort: 0.172, 0.173, 0.154 (Average: 0.166)
Quick sort: 0.166, 0.175, 0.165 (Average: 0.169) -2%

Random Array - 99,999 elements
Snake sort: 2.122, 2.129, 2.125 (Average: 2.125)
Quick sort: 2.136, 2.123, 2.131 (Average: 2.130) -2%

Random Array - 999,999 elements
Quick sort: 25.189, 25.628, 25.113 (Average: 25.310)
Snake sort: 26.635, 26.230, 26.322 (Average: 26.396) -4%

Ascending order - 10,000 elements
Snake sort: 0.033
Quick sort: 0.096 -191%

Descending order - 10,000 elements
Snake sort: 0.011
Quick sort: 0.102 -827%

5% unsorted - 10,000 elements
Snake sort: 0.122
Quick sort: 0.130 -7%

Note that these numbers are remarkably slow because 1) it’s in visual basic 6, and 2) because my machine is very old and very slow. But since both algorithms face the same handicaps, the relative performance is meaningful.
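
For what it’s worth, here is one way to take wall-clock timings like these (an assumption on my part; the OP’s benchmark may measure differently). VB6’s Timer function only resolves to a few milliseconds, so the sub-millisecond runs above would need something like QueryPerformanceCounter:

Private Declare Function QueryPerformanceCounter Lib "kernel32" (lpPerformanceCount As Currency) As Long
Private Declare Function QueryPerformanceFrequency Lib "kernel32" (lpFrequency As Currency) As Long

' Times a single sort call in seconds; assumes SnakeSort1 (or QuickSort) from this
' thread is available in the same project.
Public Function TimeSnakeSort(ByRef pvarArray As Variant) As Double
    Dim curStart As Currency
    Dim curFinish As Currency
    Dim curFrequency As Currency
    
    QueryPerformanceFrequency curFrequency
    QueryPerformanceCounter curStart
    SnakeSort1 pvarArray
    QueryPerformanceCounter curFinish
    TimeSnakeSort = (curFinish - curStart) / curFrequency
End Function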

In the interest of full disclosure, here’s the quicksort implementation:


' Omit plngLeft & plngRight; they are used internally during recursion
Public Sub QuickSort(ByRef pvarArray As Variant, Optional ByVal plngLeft As Long, Optional ByVal plngRight As Long)
    Dim lngFirst As Long
    Dim lngLast As Long
    Dim varMid As Variant
    Dim varSwap As Variant
    
    If plngRight = 0 Then
        plngLeft = LBound(pvarArray)
        plngRight = UBound(pvarArray)
    End If
    lngFirst = plngLeft
    lngLast = plngRight
    varMid = pvarArray((plngLeft + plngRight) \ 2)
    Do
        Do While pvarArray(lngFirst) < varMid And lngFirst < plngRight
            lngFirst = lngFirst + 1
        Loop
        Do While varMid < pvarArray(lngLast) And lngLast > plngLeft
            lngLast = lngLast - 1
        Loop
        If lngFirst <= lngLast Then
            varSwap = pvarArray(lngFirst)
            pvarArray(lngFirst) = pvarArray(lngLast)
            pvarArray(lngLast) = varSwap
            lngFirst = lngFirst + 1
            lngLast = lngLast - 1
        End If
    Loop Until lngFirst > lngLast
    If plngLeft < lngLast Then QuickSort pvarArray, plngLeft, lngLast
    If lngFirst < plngRight Then QuickSort pvarArray, lngFirst, plngRight
End Sub

Well, there’s an implementation of Bogosort which has linear performance on all inputs. Unfortunately, it has very high overhead, and depends upon the Many Worlds interpretation of quantum mechanics to be true.

Step 1: Randomize the array using a quantum true RNG (linear time).
Step 2: Check to see whether the array is sorted (linear time).
Step 3a: If the array is sorted, return the sorted array (linear time).
Step 3b: If the array isn’t sorted, destroy the universe (constant time).

In every universe which continues to exist, the array is now sorted.

Sounds a lot like those algorithms they use(d) on old-style mainframes to merge-sort tapes.

Have you tried here or here?

I’m not saying the code is there (although I did get some promising looking hits), I just wondered if you were aware of Google Code Search.

Here’s Dijkstra’s original paper on smoothsort.

Generating an index of blocks (which makes your storage requirement O(n log n) in the worst case, now that I think about it) is not that different from building a heap – it’s not equivalent but the idea is almost the same. Sorting algorithms are compared by analysis rather than actual performance. Actual performance is something you measure for a particular implementation and it does not tell you that much about the algorithm. I’d venture a guess that using your example sizes/sets the implementations of mergesort and quicksort can be optimized.

Sweet, thank you. I was not aware of google code search. I managed to find this implementation of smooth sort:


/*	An algorithm developed by Edsger Dijkstra based on the same principle as
 *	heapsort.  By using a postorder traversal of a heap-like structure, O(n)
 *	time complexity is achieved for already-sorted data.
 *	Time complexity:
 *		O(n log n) worst case
 *		O(n) best case
 *	Working memory:
 *		O(1) all cases
 */
void smoothSort(uint[] data)
out {
	debug (100) printList(data);
	checkSorted(data);
}
body {
	void up(inout int b, inout int c) {
		int temp = b + c + 1;
		c = b;
		b = temp;
	}

	void down(inout int b, inout int c) {
		int temp = b - c - 1;
		b = c;
		c = temp;
	}

	void sift(int r, int b, int c) {
		while (b >= 3) {
			int r2 = r - b + c;

			if (data[r2] < data[r - 1]) {
				r2 = r - 1;
				down(b, c);
			}

			if (data[r] >= data[r2]) {
				b = 1;
			} else {
				swap(data[r], data[r2]);
				r = r2;
				down(b, c);
			}
		}
	}

	void trinkle(int r, int p, int b, int c) {
		while (p > 0) {
			int r3;

			while (p % 2 == 0) {
				p /= 2;
				up(b, c);
			}
			r3 = r - b;

			if (p == 1 || data[r3] <= data[r]) {
				p = 0;
			} else {
				p--;
				if (b == 1) {
					swap(data[r], data[r3]);
					r = r3;
				} else if (b >= 3) {
					int r2 = r - b + c;
					if (data[r2] < data[r - 1]) {
						r2 = r - 1;
						down(b, c);
						p *= 2;
					}

					if (data[r3] >= data[r2]) {
						swap(data[r], data[r3]);
						r = r3;
					} else {
						swap(data[r], data[r2]);
						r = r2;
						down(b, c);
						p = 0;
					}
				}
			}
		}
		sift(r, b, c);
	}

	void semitrinkle(int r, int p, int b, int c) {
		int r1 = r - c;
		if (data[r1] > data[r]) {
			swap(data[r], data[r1]);
			trinkle(r1, p, b, c);
		}
	}

	void swap(inout int x, inout int y) {
		int temp = x;
		x = y;
		y = temp;
	}

	int q = 1, r, p = 1, b = 1, c = 1;

	while (q != data.length) {
		if (p % 8 == 3) {
			sift(r, b, c);
			p = (p+1)/4;
			up(b, c); up(b, c);
		} else if (p % 4 == 1) {
			if (q + c < data.length) {
				sift(r, b, c);
			} else {
				trinkle(r, p, b, c);
			}
			do {
				down(b, c);
				p *= 2;
			} while (b != 1);
			p++;
		}
		q++; r++;
	}
	trinkle(r, p, b, c);

	while (q != 1) {
		q--;
		if (b == 1) {
			r--; p--;
			while (p % 2 == 0) {
				p /= 2;
				up(b, c);
			}
		} else if (b >= 3) {
			p--;
			r += c - b;
			if (p != 0) semitrinkle(r, p, b, c);
			down(b, c);
			p = 2*p + 1;
			r += c;
			semitrinkle(r, p, b, c);
			down(b, c);
			p = 2*p + 1;
		}
	}
}

The problem with library sort is that its name is almost completely search-proof. For example, scanning through the hits in the link you provided, here’s the typical implementation:


import java.io.FileReader;
import java.io.BufferedReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;

public class LibrarySort {
  public static void main(String[] args) throws Exception {
    if (args.length != 1) {
      System.err.println(
        "Usage:
	java LibrarySort (file with numbers to sort)");
      System.exit(-1);
    }

    ArrayList<Float> items = new ArrayList<Float>();

    BufferedReader breader = new BufferedReader(new FileReader(args[0]));

    // read in the items
    for (String inLine; (inLine = breader.readLine()) != null; )
      items.add(Float.valueOf(inLine));

    // sort them
    Collections.sort(items);

    // display the results
    for (Iterator<Float> iter = items.iterator(); iter.hasNext(); ) {
       Float f = iter.next();
       System.out.println(f);
    }
  }
}

heh, not much of an algorithm, but rather using the built-in library’s sort function. Also, library sort’s design makes it unappealing to sit down and actually code, since the programmer is left with arbitrary decisions to make that will have a huge impact on performance. It’s much preferred when an optimal conceptual design is easy to picture.

Would you be comfortable with the factual accuracy of me naming my algorithm smooth sort? I read every word in the original paper you linked, and apart from a couple sentences in the beginning describing the concept, I couldn’t find any similarities. And the technical details are wildly different. Smooth sort is in-place and uses a heap structure to make exchanges. Snake sort is out-of-place and uses an index list to merge sections. They don’t even share a similar exchange method, which is a pretty fundamental difference. The comparisons are equally dissimilar in how and when they are needed.

I would come back to shell sort, which is a simple insertion sort using descending gap values, and comb sort, which is a bubble sort using descending gap values. When I say “is a (whatever) sort”, I mean that literally. Looking at the code for shell sort, you can see that it contains the actual code for insertion sort, line for line. Same with comb and bubble. Normalizing for the different languages, I can’t detect a single similarity in the code between smooth sort and snake sort. (Though it’ll take me a bit to get around to converting that smoothsort function over.)

“Generating an index of blocks (which makes your storage requirement O(n log n) in the worst case, now that I think about it)”

I’m not sure I follow. The storage required by snake sort is 2.5n + 1. The mirror array requires the same space as the original, and the index array (by definition) cannot exceed half the items + 1, because the minimum number of elements in a block is two. (An additional initial border boundary of 0 is used to simplify the implementation, thus the +1.)

Of course, truth be told, I’m so ignorant of math that I don’t actually know what O(n log n) even means. What number does that produce for 50 elements? (Actual storage required for sorting 50 elements: 126)

I apologize if I’m coming across as argumentative; I don’t mean to. Feedback like yours is exactly what I was looking for in opening this thread.

Let the time needed for an algorithm be denoted as T(n), where n is an appropriate measure of the input size (the number of elements in the array to be sorted here). Saying that T(n) is O(n log(n)) means that, as n gets large, T(n) becomes approximately equal to c*n*log(n) for some constant c. In other words, you can use c*n*log(n) as an approximation for T(n) and it works pretty well unless n is small.

Not sure how I missed it before, but my snake sort algorithm most closely resembles the Patience sorting algorithm, though there are several key differences.

The rules of snake sort dictate that you can only put a card on either the right-most pile or start a new pile. Also of note is that the piles can be either ascending or descending, unlike patience sort’s requirement that they all be descending.

Patience sorting (like snake sort) positively screams out for a merge-sort resolution once all the initial piles are set up, yet I don’t see any mention of this in that article. Instead, it discusses binary searching the top row each time you want to add an element to the final sorted array. That seems crazy-inefficient to me.

After further testing, I am confident in stating that snake sort has all the speed strengths of quicksort, but none of the weaknesses. Also of note is that smoothsort kicks all kinds of ass. It’s really a shame that both snake sort and smoothsort are so complicated to implement, since quicksort is vulnerable to a worst-case O(n[sup]2[/sup]) and heapsort is painfully slow on nearly-sorted lists.

Nothing earth-shattering to add mathematically, but here’s a link to Big O Notation so you can read up on O(n log n) and so forth. If I remember from my college days, the function after O defines the rate at which the solution’s difficulty grows with respect to the rate that n (the size of the problem) grows. If your algorithm is n log n, then at 50 elements, your solution time increases by about 2.5% when you add the 51st element.

Given the fact that smoothsort and snake sort share some similarity of purpose (find the ordered bits and join them together) but accomplish those goals differently, maybe you’d consider calling it “shake” sort instead. Not because it’s like shaking the array, but by analogy to a smoothie and a shake, which each aim to produce a sweet frozen drink, but go about it in two very different ways.

There is already a Shaker sort. Actually, there are a whole bunch of them. The three most common forms of Shaker sort are:[ul][li]A two-pass bubble sort going both directions each pass. This is more correctly called a cocktail sort, but it’s not like there’s a naming committee for sorting algorithms.[/li][li]A two-pass selection sort going both directions each pass. Not only is this another example of misusing the name shaker sort, but it is a completely pointless variation. Unlike bubble sort – which gains a marginal performance boost in the two-pass version by taking care of turtles – selection sort by definition makes a fixed number of comparisons and exchanges; those totals remain unchanged by adding a secondary reverse pass.[/li][li]The actual shaker sort is a descending-gap bubble sort similar to comb sort. The main difference is that once the gap reaches 1, it expands out again and then shrinks down to 1, expands out, shrinks, etc… This “shaking” action is where it gets its name.[/li][/ul]Thanks much for the math help. I’m beating my head against a wall trying to write code to predict how many comparisons/exchanges any given benchmark will likely take in order to warn the user of a potential time sink.
Coincidentally, I read that article on Big O notation just this morning, but it didn’t help. I still don’t know what the actual numeric prediction would be for a list of, say, 1000 items in an “n log n” algorithm. I’ve narrowed it down to either 6908 or 3000. Is either of those the correct answer?

No. See the explanation I gave in post #9.

Big-O notation isn’t for calculating a particular run time, but rather for predicting how run-time increases with input size.

For an O(n) algorithm, doubling the size of the input (from say 1000 to 2000) will double the run time (2 n / n = 2). An O(n[sup]2[/sup]) algorithm will quadruple ( (2 n)[sup]2[/sup] / (n)[sup]2[/sup] = 4). An O(n log n) algorithm will increase by a factor of about 2.1 to 2.3, depending slightly on n.

Order notation lets you see how the algorithm scales, which can be critically important when code moves from the experimental phase to the production phase.
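
A quick numeric check of that doubling claim (my own snippet; VB’s Log is the natural log, and the ratio comes out the same in any base):

Public Sub ShowDoublingRatios()
    Dim varN As Variant
    
    ' Ratio of (2n) log(2n) to n log(n) for a few sizes
    For Each varN In Array(1000#, 10000#, 100000#)
        Debug.Print varN, (2 * varN * Log(2 * varN)) / (varN * Log(varN))
    Next
End Sub

It prints roughly 2.20, 2.15, and 2.12 for those three sizes.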

After implementing smoothsort in my benchmark program, I’ve come to realize that smoothsort and snakesort have almost nothing in common. I don’t know how to describe it very well, but the difference is fundamental. Snake sort takes advantage of items starting out in sorted order compared to their immediate neighbors, whereas smoothsort takes advantage of items starting out near their final sorted position.

The following initial ordering will perform poorly in smooth sort, but extremely efficiently in snake sort. Snake sort is exactly as efficient on this list (of any size) as counting sort, which isn’t even a comparison sort. No other comparison sort is even remotely close. Of note is that this is the dreaded O(n^2) case for quicksort.

In contrast, the following initial ordering performs poorly in snake sort, but is extremely efficient for smooth sort. Note that it is almost completely unordered, but all elements start out very close to their final sorted position.

In both cases, the algorithm that performs poorly still does slightly better than its quite respectable n log n worst case.

That post is unintelligible to me.

I probably shouldn’t have dropped out of high school.

My program knows how many elements are in the list and what algorithm is being run. I need it to make a ball-park guess as to how many comparisons and exchanges would be made in the average case for a list of n elements. If they total more than some arbitrary number (which will be calculated based on how many iterations of a testing loop fire off in a half-second, scaled up to some arbitrary number of seconds, e.g. 20), I want to warn the user that it could be a time sink.

What math can I use to make a rough guess as to the number of comparisons and exchanges needed? I really don’t want to have to set it to run each algorithm 10 times each on 20 different list sizes and then come up with a formula out of thin air based on those numbers. Surely somebody somewhere has already come up with math to handle this, no?

The problem with predicting the run time is that it depends on a lot of difficult-to-know variables that affect the scaling coefficient. But the scaling order will be predictable.

My suggestion: run the algorithms on lists of 1000, 10,000 and 100,000. Maybe 5 different lists of each length; 5 is enough to get a rough average. And three list sizes should be enough to check the scaling order (whether it’s really n log n or something worse) and give you an idea of the coefficient.

For example, let’s say one algorithm (that you expect is O[n log n]) has these average run times:
n=1000, t=50
n=10000, t=700
n=100000, t=9000

Using the formula t = c * n * log(n) (natural log), we get these values for the coefficient c:
c[sub]1000[/sub] = 50 / (1000 * log(1000)) = 0.0072
c[sub]10000[/sub] = 700 / (10000 * log(10000)) = 0.0076
c[sub]100000[/sub] = 9000 / (100000 * log(100000)) = 0.0078

Since the coefficients are about the same, then the hypothesis that the algorithm is O(n log n) was good. For a rough guess of future run times (on that machine), you now have the formula t = 0.0080 * n * log (n). (Always better to guess high.)
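
Here’s a rough sketch of that procedure in code (the helper names are my own, and Log here is VB’s natural log, matching the coefficients above): estimate c from a few measured runs, keep the largest estimate to be pessimistic, and use it to predict run times for a warning threshold.

' Estimate the coefficient c in t = c * n * Log(n) from measured (size, time) pairs.
Public Function EstimateCoefficient(plngSizes() As Long, pdblTimes() As Double) As Double
    Dim i As Long
    Dim dblC As Double
    Dim dblMax As Double
    
    For i = LBound(plngSizes) To UBound(plngSizes)
        dblC = pdblTimes(i) / (plngSizes(i) * Log(plngSizes(i)))
        If dblC > dblMax Then dblMax = dblC       ' always better to guess high
    Next
    EstimateCoefficient = dblMax
End Function

' Predicted run time for a new list size, e.g. to warn the user before a long benchmark
Public Function PredictTime(ByVal plngSize As Long, ByVal pdblC As Double) As Double
    PredictTime = pdblC * plngSize * Log(plngSize)
End Function

With the example numbers above, EstimateCoefficient returns roughly 0.0078, and PredictTime(1000000, 0.008) comes out to about 110,000 in whatever units the original timings were taken in.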

Yes, exactly the idea I was hoping for.

Why are there almost a half dozen posts in this thread not only NOT offering this, but actively saying this doesn’t work or is invalid in some way?

For example, when I posted this:

The reply I got was:

WTF? Based on your sample methodology, I finally figured out that the correct answer is 6908. Why wouldn’t anybody just say that?
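
(For anyone else wondering where those two candidates come from: they are the same n log n with two different log bases. VB6’s Log function is the natural log, which is what the coefficient example above used.)

Public Sub ShowNLogN()
    Debug.Print "natural log: "; 1000 * Log(1000)            ' about 6908
    Debug.Print "base-10 log: "; 1000 * Log(1000) / Log(10)  ' about 3000
End Sub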

OK. Let T(n) be the function that describes the running time of your algorithm on an array of length n. The idea is that while T(n) can be a pretty complicated function, there may be a simple function f(n) with the property that T(n)/f(n) becomes approximately constant if n is big enough. For example, consider bubblesort. The actual running time is going to be of the form an[sup]2[/sup] + bn + c. If you take 1000000[sup]2[/sup]a + 1000000b + c and divide it by 1000000[sup]2[/sup], the answer is close enough to a that the difference doesn’t really matter. So we say that bubblesort is O(n[sup]2[/sup]), or that it’s asymptotic to n[sup]2[/sup].

Because there’s so much variability between computers, you generally only need to give your algorithm’s asymptotic running time in the best and worst cases. You’ll also want to give the asymptotic memory usage in case a programmer cares more about saving space than running very quickly.

First, you need a pseudocode write-up of the algorithm. This is so that you can concentrate on the algorithm itself instead of getting caught up in the details of a particular programming language. If I have your idea right, the algorithm should look something like this:



snakesort(A)
{
    // Preamble
    Q <- new queue
    i <- 2

    // Loop 1: record where each ascending run ends
    while i <= length(A)
    {
        if A[i - 1] > A[i]
        {
            Q.enqueue(i - 1)
        }
        i <- i + 1
    }
    Q.enqueue(length(A))

    // Loop 2: merge each run into the sorted prefix
    if Q has more than one element
    {
        j <- Q.dequeue
        while Q is not empty
        {
            k <- Q.dequeue
            merge(A[1...j], A[j + 1...k])
            j <- k
        }
    }
}


Any reasonable implementation of the preamble is going to be constant time (denoted O(1)). Loop 1 is pretty clearly O(n) as it only examines each element at most twice, so the running time up to this point is O(n).

Loop 2 is the tricky part. It consists of some number of calls to merge, which we assume is O(n). If there are only a constant number of calls, the loop is O(n) and so’s the whole algorithm. But if there are n calls to merge, you’re looking at O(n[sup]2[/sup]).

As you’ve already observed, the best case is where the list is already sorted and you never enqueue anything. In that case, loop 2 is O(1). In the worst case, where you’ve got a descending list of numbers, you make n - 1 calls, so loop 2 is O(n[sup]2[/sup]).

Any reasonable implementation of merge will use at most O(n) memory, and that’s as big as your queue is going to get, so your whole algorithm’s memory usage is O(n). Remember that quicksort doesn’t need an extra array, so its memory usage is O(1). That may make it more appealing than snakesort for, say, embedded systems programming.

There’s one extra bit we like to know about sorting algorithms: are they stable? Imagine that you’re given a list of database records that are already sorted by first name, and you want to sort them by last name. If two people have the same last name, will their records still be sorted by first name? If so, the sorting algorithm you used is said to be stable. Mergesort is stable, and the only operation it uses is merge, so that means your sort is stable as well. Note that quicksort is not stable, so you do have a leg up on it there.

Having said all that, this really just looks like a minor variation on mergesort that attempts to prevent it from merging together the sublists of a sorted list. It’s clever, but the quadratic worst case and linear storage are a bad combination. I’m sure there’s a more intelligent version of mergesort that does what you want and has an n*log(n) worst case, but this post is getting long enough, so I’ll skip it for now.

Because the constants aren’t guaranteed to be the same on every machine. The time required for assigning a value to an integer has one value on my home computer, and another on a Cray. You’re not assuming that everyone’s going to be running this algorithm on the same architecture, are you?