You are out in field with one hundred boulders in a line. They are numbered, but in random order. Naturally you want them ordered from one to a hundred.
e.g.: (46) (73) (15) (5) (66) (98)…
One strategy is to find rock #1 and carry it down to the beginning of the line. Then search for #2 and repeat. (Don’t worry about the insertions, you don’t have to move anything out of the way when you drop (2) between (1) and (46))
An improvement is to carry (46) back down the line until I find (2). This will save me some work later. When (2) is found, drop (46) and carry (2) back to it’s place.
One problem you have is that you are not sure of the absolute position of anything. If I am carrying (46) back down the line, I know approximately where half-way is, but not with any certainty. If I think I am past the 46th position, I should drop (46) and maybe pick up something else with a higher number and keep on scanning for (2).
Can this be improved upon? This is not homework… I am many years past that!
Sure.
Walk the line once, recording the positions of each boulder, then figure out the optimal re-ordering steps to minimize work on paper, then execute.
Ok, some limitations: I am a lazy human being, not a computer. It costs to traverse the line (i.e. amount of walking should be minimized). And I can carry only one thing at a time.
What I would do is to go the first one. If it is number 79, then I would walk till I had passed 78 of them and drop it there. I would pick up one of the adjacent ones, say the next one, and if it had, say, number 17 on it, I would take it back 63 (= 80-17) places and drop it there. The point is that I would never have to handle a rock more than once, nor would I have to look at all the rocks to find a particular one. I would just take each rock and put it where it belongs. I don’t see how to beat that algorithm.
I don’t think any of the standard computer science algorithms are going to be appropriate here, since the parameters are rather different. First, the cost of moving items is significantly higher than (say) the cost of doing comparisons, and second, we’re told that we don’t have to move anything out of the way to do insertions. Third, the OP seems to suggest that we know exactly what set of boulders we have, the numbers from 1 to 100.
I suspect that the optimal routine is something along the lines of finding the longest streak you can of not-necessarily-adjacent boulders in the right order, already, and then one at a time move the other boulders to interpolated positions between those. I’m not sure if there’s a good lazy-human algorithm for finding that longest existing streak, though.
Since the performance of an algorithm by a human would involve effort and endurance, the algorithm to choose would be the one that involves the least amount of rearrangement as possible.
This brings up some questions, though. Do you want to minimize the amount of walking up and down the line, or minimize how many boulders you move (really, you want to minimize both, but there are always worst case and best case scenarios in a stochastic distribution)?
Also, do the numbered positions have to remain static, and you have to move these boulders to those absolute positions, or can you move them relative to the distribution and the algorithm you choose to implement (Say, quick-sort by choosing pivot numbers, and rearranging the boulders around these pivots per iteration resulting in a shift of the “absolute” positions).
Another thing is, the algorithm would have to change for any given circumstance and individual.
Say you had a pool table full of the entire rack of billiard balls randomly scattered by the break.
The first thing I’d do is roll them into two groups: Solids and stripes (we’ll count the cue ball as #16).
Then, I’d find the 1 and 8 ball and roll them beside each other. Then I’d scope for the 4 ball and place it in between. Next, balls 2 and 3—then 6 and 7.
The same thing for the stripes, 9 through the cue ball, with 12 in the center and so on.
I’d do it this way, because while I know the 1 ball is traditionally yellow, 2 is blue, 3 is orange, 4 is dark green, 8 is black, and a few others, I haven’t entirely memorized all the colors according to their numbers.
Some might just simply ignore the numbers themselves and much more efficiently, order them by color alone.
There are only two factors involved: boulder-feet (how far you have to walk carrying boulders) and non-boulder-feet (distance you walk without carrying).
Obviously, each boulder is going to have to be carried at least as far as the distance from its current spot to its intended spot; no getting around that. So, as Hari, mentioned, you can’t do better that just carrying each boulder straight to it’s intended spot (minimum boulder-feet) and then picking up the boulder that was already there (minimum non-boulder-feet). The only variable would be deciding which boulder to start with. I suspect you couldn’t do better than just starting with the closest boulder.
Do the boulders have to occupy the same physical spaces at the end that they do at the beginning? If not, it should be a simple matter of moving the boulders one at a time into a parallel row, putting each in the appropriate spot. After all, the numbering of the boulders tells you the final ordinal position of each boulder. The number of moves is the number of boulders.
Using the same technique, if you consider the distance each boulder has to move, the cost could be seen as the sum of the differences between each boulder’s initial and final positions. The lowest cost is 0, when the boulders are already in order. When the boulders start out in reverse order, the cost is 5000 when measured this way. This neglects the lateral movement of putting the boulders in a parallel row.
It would be a little harder if the numbers on the boulders were guaranteed to be unique, but were not guaranteed to be the numbers 1 through 100 (or any other previously-known set of unique numbers). Then you’d have to do some kind of sorting to figure out where each boulder should go. You could use an efficient algorithm (like quicksort) on the numbers (not on the boulders themselves), and use the results to move the boulders one at a time into a parallel row. The sorting is at best an O(n * log(n)) algorithm, while the moving is still an O(n) algorithm. Adding the costs together results in O(n * log(n)). Realistically, though, for 100 boulders the cost of moving the boulders would be much higher than the cost of sorting 100 numbers. I don’t know how big n would have to be for the cost of sorting to become significant. The cost of moving the boulders would be the same as described above.
The problem gets more interesting if you only allow boulders to be placed in the 100 locations where they started out - that is, if you don’t allow the boulders to be placed in a parallel row. This would force you to move the boulders by swapping them - when you moved boulder A to a position that was occupied by boulder B, you’d have to pick up B in order to put down A. Now the question is how much extra cost is added by this new restriction. I don’t believe that it adds any cost. Each boulder begins at some distance from where it is supposed to be. If you start by picking up some boulder and carrying it to where it is supposed to be, you will leave behind a space without a boulder and encounter another space with a boulder. If you then pick that boulder up and put down the one you’ve been carrying, you can then carry the new boulder to where it should be. If you repeat this process you will eventually come back to the empty space you left behind when you picked up the first boulder. Then you can go find another boulder that is out of place and repeat the process. After 100 moves all the boulders will be in their correct spots, and you will have made the minimum number of moves with the minimum carrying distance.
Through all of this I have been neglecting the distance you have to move while not carrying boulders. It’s obviously a non-zero cost, but I don’t know how to account for it.
The problem is that this is a trivial sort for a computer, if I’m interpreting your scenario correctly. You seem to imply the boulders are numbered [1…100], in which case this becomes a trivial degeneration of Bucket Sort: you make an array of size 100, and for the boulder labeled “n” you put it in slot array[n-1]. This is also a trivial sort for an average human.
However, you introduce a problem which puts this into a different class altogether. The issue is that you posit a continuous space, knowing only the starting point and ending point. This could end up really easy or really hard depending on a single constraint: can the boulders be variable sized or not? If the boulders have no uniform size (or at least if you don’t know the sizes of each individual boulder ahead of time), this becomes a lot more complicated, and starts getting into spacial partitioning algorithms.
If they are uniform and you have a known size, then it’s relatively easy still: from the start point measure out 100 uniform spaces of size BoulderSize and then use your trivial sort. If they’re non-uniform but have known sizes, then you can still do the measuring trick. It’s just when they start to have unknown sizes (or even if you have a list of known sizes, but not what boulders those sizes correspond to) that you get issues.
If you really mean you don’t care about the space for insertions all you have to do is carry every rock to the middle of the line. That minimizes the distance travelled carrying rocks. If the rocks are not uniformly spread out it becomes more complicated, as the point where you want to deposit the rocks is not the middle point, but it probably isn’t far away.
If you want each location that currently holds a rock to hold a rock once you are done the problem is easy to specify. The distance you must carry rocks is fixed. So long as you don’t actually carry a rock away from its destination, each rock simply travels the distance from its current location to its correct location. Thus the optimisation is to reduce the amount of time you walk whilst not carrying a rock. So whenever possible, when you drop a rock, you want to pick up another rock. You want to avoid the situation where towards the end of the sort process you are walking a significant distance to the remaining rocks that need moving.
This suggests a few tactics. You want to get the ends sorted early and move towards the centre shuffling rocks. This avoids suddenly having to traverse a long distance to pick up a remaining rock. You can exchange rocks any-time you pass a rock that it would be better to be carrying, even if you reverse direction, so long as you are always taking the rock in the right direction.
Off the top of my head an algorithm like this:
Start at the beginning. If the rock next to you is not in its right place, pick it up. Start walking. Every time you reach a new rock, keep count of where you are. When you reach a rock that has a higher number than the one you are carrying, swap rocks and continue to walk. You will eventually be carrying the last unsorted rock. When you reach the location it belongs, swap rocks and start walking back to the beginning. Whenever you reach a rock that has a lower number than the one you are carrying, swap rocks. Eventually you will be carrying the first unsorted rock. Now repeat. Eventually you will end up in the middle, and the rocks on both sides will have been sorted.
It occurs to me that this is actually a travelling problem. You have 200 locations, with cost D being the cost of moving from one location to another while not carrying a rock, and cost m*D while carrying a rock (where m is some constant scaling factor due to the weight of the rock). If we assume we don’t have to search for the rocks (i.e., we know which ones in the initial slots have which numbers) – and assume that we start at some position like Empty Slot 50 – or that we can magically start in any empty slot, we can relax the problem a bit into saying that a single action involves going to a rock and bringing it to its corresponding slot meaning that you’ll make exactly 100 moves where each has cost D[sub]1[/sub]+mD[sub]2[/sub] where D[sub]1[/sub] is the distance from state s to the intermediate state where the next rock is, and D[sub]2[/sub]m is the weighted distance to state (s+1), i.e., the next filled slot.
The problem is, then, figuring out the least-cost path. In which case A* with a properly admissible and consistent heuristic (based on minimum spanning tree maybe?) is probably what we’re looking at. Or uniform cost search, but given that we have a depth 100 tree with a branching factor of (100-k), where k is the current depth, that would probably take a while.
This clearly isn’t optimal. If all the boulders happened to be in the correct order to start with you’ll move them all to the center when you didn’t need to do anything.
It seems to me the math of the problem is this. Suppose you want N boulders to end on a line running from 0 to L. The initial position of bounder with number i is X(i). Assuming you wish to minimize the total distance you carry boulders, you want to chose locations Z(i) such that 0 < Z(1) < … < Z(N) < L that minimizes sum |X(i) - Z(i)|.
You can set this up as a straightforward optimization defining the variables u(i) = max(Z(i) - X(i),0) v(i) = max(X(i) - Z(i), 0) then minimizing sum 1 to N of u(i) + v(i) subject to X(i) + u(i) - v(i) < X(i+1) + u(i+1) - v(i+1) all i.
I don’t even think you have to add the constraint that v(i)*u(i) = 0 (i.e., one of them is zero). I’d think that will be automatically satisfied. If I’m right, this is just a linear programing problem.
I think this can be set up as a linear program by using 2N choice variables one each for max(X(i) - Z(i),0) and max(Z(i) - X(i),0)
On a practical level, this question is also complicated by the fact that some things that are traditionally “slow” for a computer are in fact nearly instantaneous for a human. If I’m carrying a boulder with the number “37” painted on it and walk up to a line of less than 100 boulders with numbers on them which are already sorted, the amount of time it takes me to find where to insert 37 is basically nonexistant (assuming the numbers are painted nice and large, not something thing where I have to walk up to each boulder individually and inspect it via magnifying class to see its number.) On the other hands, some things that are trivial for a computer (counting how many boulders you walk past and totally inerrantly remembering that number and adding and subtracting things from it) are difficult and not dependable for humans.
So, since we have this magic inserting power, I think an insertion sort has to be close to optimal. Walk up the beginning of the line, see what number is there. Look at the 2nd boulder. If it’s larger than the first boulder, do nothing, otherwise, swap them. Look at the 3rd boulder. If it’s larger than #2, do nothing, otherwise pick it up and put it in its correct place in the line, etc. Of course it’s somewhat undefined how our magic inserting power works…