I’m trying to find an algorithm to solve an optimization problem. I’ve come up with several potential algorithms on my own, but I’ve punched holes in all of them (too slow, won’t find the minimum in some simple case, etc.). Somebody has probably figured it out, but I can’t find anything because I don’t know search terms that are specific enough. So I turn to the SDMB. If you have a reference to an algorithm that would solve this problem, the name of this type of problem, or any other insights about it, I would really appreciate your help. I realize there may not be a good algorithm for a global minimum, but an algorithm that would find a local minimum that is known to be at least close to the global minimum would be acceptable.
Problem particulars
Given:
n lists of m random positive real numbers per list
Let x[sub]ji[/sub] = number j in list i
Let p[sub]i[/sub] = parameter for list i
sum(i = 1 to n, p[sub]i[/sub]) = constant
Let ceiling(r) = smallest integer greater than or equal to r
Let Q = sum(j = 1 to m, ceiling(max(i = 1 to n, x[sub]ji[/sub]/p[sub]i[/sub])))
Find:
Parameters p1 through pn such that Q is minimized
I would be solving with a computer, with n < about 10 and m < about 10,000.
Example calculation of Q
(I’ve used integers for x[sub]ji[/sub] and p[sub]i[/sub], for simplicity)
n = 3 and m = 5
p[sub]1[/sub] = 15, p[sub]2[/sub] = 18, p[sub]3[/sub] = 12
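In Python, the computation of Q looks like this (the x values below are made up, since only the p’s were given; the point is just the mechanics):
[code]
import math

# Hypothetical data: n = 3 lists of m = 5 positive reals each.
# xs[i][j] is x_ji, the j-th number in list i.
xs = [
    [23.0, 47.5,  8.2, 61.0, 30.9],   # list 1
    [12.4, 55.1, 39.8, 20.6, 74.3],   # list 2
    [33.7,  9.9, 48.0, 27.2, 50.5],   # list 3
]
ps = [15.0, 18.0, 12.0]   # the p_i above; their sum (45) is the fixed constant

# Q = sum over j of ceiling( max over i of x_ji / p_i )
Q = sum(math.ceil(max(xs[i][j] / ps[i] for i in range(len(ps))))
        for j in range(len(xs[0])))
print(Q)   # prints 21 for this made-up data
[/code]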
For the problem as stated, I’d think the answer is to select the p’s to be very large positive or negative numbers; then all the terms you sum will be very close to zero, and you can make Q as close to zero as you want. So I assume you probably mean to constrain the p’s to be positive.
Second, I doubt any standard optimization algorithm is going to handle the ceiling function easily without much trial and error.
One thing to check would be to find the total of each column and assign the p’s so that the sum of (each column total)/p is minimized. I think that’s the best thing to do (approximately, ignoring the ceiling) on average if the numbers are i.i.d.
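If I’m reading that right, that sub-problem has a closed form: minimizing sum(S[sub]i[/sub]/p[sub]i[/sub]) subject to sum(p[sub]i[/sub]) = constant gives, by Lagrange multipliers, p[sub]i[/sub] proportional to sqrt(S[sub]i[/sub]), where S[sub]i[/sub] is the column total. A sketch:
[code]
import math

def heuristic_p(xs, total):
    """Minimize sum_i S_i/p_i subject to sum_i p_i = total, p_i > 0,
    where S_i = sum of list i. Setting the derivative of the Lagrangian
    to zero gives p_i proportional to sqrt(S_i)."""
    roots = [math.sqrt(sum(lst)) for lst in xs]
    scale = total / sum(roots)
    return [scale * r for r in roots]
[/code]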
As above, the presence of the ceiling function pretty well wrecks things. Without it you could likely solve it in closed form. But once you have a component that creates discontinuities you are in deep trouble. Most optimisation heuristics are about coping with discontinuities in one form or another, but no one expects them to get to the actual true optimum, and sometimes they behave quite badly.
For some reason I suspect a simulated annealing technique might work reasonably well. But you are always going to be finding a local optimum.
If the p[sub]i[/sub] values are constrained to be positive only, you can enumerate the entire search space. You may be able to search that tractably.
OldGuy and Francis Vaughan–thanks for your thoughts.
Yes, the p[sub]i[/sub] values are constrained to be > 0, so they can’t be arbitrarily large.
In my (limited) experience, max and min functions can be as troublesome as something like ceiling. If I ignored the ceiling function and just tried to minimize the sum of the maxima, is there a straightforward algorithm? If so, maybe it would at least be a good starting point.
I don’t mind a local minimum as long as I can be confident that it’s not too much larger than the global minimum. I’m not sure if that can be determined without knowing the actual value of the global minimum, but I guess I could compare the solution to some simple lower bound and hope the lower bound isn’t too far from the unknown global minimum.
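One simple bound I can see: since max(x[sub]ji[/sub]/p[sub]i[/sub]) over i is at least the p-weighted average of those ratios, which works out to the row total divided by the fixed constant, Q can never be less than sum over j of ceiling(row total / constant), no matter how the p’s are chosen. A sketch:
[code]
import math

def lower_bound(xs, total):
    """p-independent lower bound on Q: for each j,
    max_i(x_ji/p_i) >= sum_i(x_ji)/sum_i(p_i) = row_total/total,
    because a max is at least any weighted average. Hence
    Q >= sum_j ceil(row_total_j / total) for every feasible p."""
    n, m = len(xs), len(xs[0])
    return sum(math.ceil(sum(xs[i][j] for i in range(n)) / total)
               for j in range(m))
[/code]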
I’m also looking for terms I could search with to find more specific results than I’ve found so far. Is there a name for this type of problem or corresponding solution algorithm that I could search for? (e.g. “shortest path” algorithm for “traveling salesman” problem.) I’ve read about simulated annealing, but it’s been a couple years. I’ll revisit that and see if it gives me any ideas.
I have tried the following algorithm that tries to enumerate the entire search space, but it was so slow for n=4 and m=3500 I had to abandon it as I figured it would take about 6 hours (there are some other operations that I’ve left out in order to present the essence of the problem).
[ol]
[li]For each list, generate a list of Q vs. p, ignoring the other lists (i.e. figure out where the steps due to the ceiling function are; see the sketch just after this list). For my trial, I ended up with 3 values for 1 list, about 120 for the 2nd list, and about 70 for the 3rd list (and left the 4th to be calculated from the given sum minus the sum of the first three).[/li]
[li]Calculate Q for each combination (about 25,000 combinations).[/li][/ol]
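Step 1 in Python would look something like this (p_min and p_max are assumed search bounds, and p_min must be positive): since ceiling(x/p) only changes where p crosses x/k for an integer k, those quotients are the only candidate values of p worth tabulating.
[code]
import math

def step_points(xs_i, p_min, p_max):
    """All p in [p_min, p_max] where sum_j ceil(x_ji/p) can change for one
    list: ceil(x/p) steps exactly at p = x/k, k = 1, 2, 3, ...
    p_min must be > 0 or the loop below never terminates."""
    pts = set()
    for x in xs_i:
        k = max(1, math.ceil(x / p_max))   # smallest k with x/k <= p_max
        while x / k >= p_min:
            pts.add(x / k)
            k += 1
    return sorted(pts)
[/code]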
I’ve thought of a few different schemes to eliminate some of those combinations, but haven’t actually implemented any of them because they fail in some simple cases on paper. One scheme is to start with small values of p and always increase p for the list that has the highest Q for its current value of p (ignoring the last list, which is determined by the constraint). Another is to calculate Q if p[sub]1[/sub] were increased to its next value, then Q if p[sub]2[/sub] were increased instead, and so on for each list except the last; then actually implement the one that gives the lowest Q, and repeat. A third is similar, but implements the one that results in the greatest decrease in Q per increase in p. The 2nd and 3rd might get me a local minimum, but I’m not confident that it would be anywhere near the global minimum.
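For concreteness, here is the second scheme sketched in Python; q_of and next_step are assumed helpers (q_of evaluates Q for a full vector of p’s, and next_step returns a copy with p[i] advanced to its next candidate value from step 1 above, or None if there are no more):
[code]
def greedy_search(q_of, next_step, p0):
    """Scheme 2: try advancing each p_i to its next candidate value, keep
    whichever single advance yields the lowest Q, repeat; stop when no
    advance improves Q. Finds a local minimum only."""
    p, q = p0, q_of(p0)
    while True:
        best_p, best_q = None, q
        for i in range(len(p)):
            cand = next_step(i, p)
            if cand is None:
                continue
            cand_q = q_of(cand)
            if cand_q < best_q:
                best_p, best_q = cand, cand_q
        if best_p is None:
            return p, q
        p, q = best_p, best_q
[/code]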
I think your intuition that max is “more non-linear” than ceiling is correct.
Simulated annealing is my “go-to” solution for problems of this type. (If it doesn’t work, odds are only something more complicated would.) By re-running with several different random seeds, you may get a feel for whether poor local minima cause trouble. (I’d be willing to play with the problem for fun, but play results might be irrelevant without your real data.)
You could look into the Solver add-in for Excel. Here is some basic info, but Google will show a lot more.
Basically, it allows you to minimize (or maximize, or drive toward a target) a quantity by changing selected cells, subject to a set of constraints on the cells.
Reviewing the problem, I see that you must minimize the sum of a very large number (m) of unrelated dependent variables. It would be nice if annealing steps didn’t use O(m) time.
I think it would be much more complicated than what OP is looking for, but “Predator-Prey Co-optimization,” possibly introduced by Daniel Hillis, addresses that difficulty. In this case the secondary optimization would be to find a “best” subset of the m n-tuples, meaning the small subset whose optimal parameters best reflect the global optimal parameters.
(I doubt OP needs anything that complicated, but thought the little-known(?) Predator-Prey Co-optimization approach might warrant a “plug.” Again, I wouldn’t guess OP’s best approach without examining the application’s actual {x[sub]ji[/sub]}. )
I’ve had a chance to read the Wikipedia article on simulated annealing, and I think it looks promising. I’ve got to think about it a while to understand it well enough to implement it. (Good thing this is mostly for fun!) For example, the pseudocode algorithm presented is:
[code]
s ← s0; e ← E(s)                       // Initial state, energy.
sbest ← s; ebest ← e                   // Initial "best" solution
k ← 0                                  // Energy evaluation count.
while k < kmax and e > emax            // While time left & not good enough:
    T ← temperature(k/kmax)            // Temperature calculation.
    snew ← neighbour(s)                // Pick some neighbour.
    enew ← E(snew)                     // Compute its energy.
    if P(e, enew, T) > random() then   // Should we move to it?
        s ← snew; e ← enew             // Yes, change state.
    if enew < ebest then               // Is this a new best?
        sbest ← snew; ebest ← enew     // Save 'new neighbour' to 'best found'.
    k ← k + 1                          // One more evaluation done
return sbest                           // Return the best solution found.
[/code]
s is equivalent to my set of p[sub]i[/sub], and E(s) is equivalent to my Q. If I understand correctly, a possible next s is snew, which is a function of s. If E(snew) is better than E(s), there is a high probability I will change to snew; if E(snew) is worse, there is a small probability I will change to snew. However, in the latter case (and possibly the former as well) there is a probability that I do not change to snew, which means I go back and choose a new snew (assuming k < kmax). How do I choose a new snew, if snew is a function of (only) s? Obviously there has to be an allowance for choosing a new snew that is not always the same as the current snew. But do I allow for the possibility of picking the same snew a 2nd time after rejecting it once? If not, then I’ve got to keep track of which snew values have been proposed and rejected, and provide for the possibility of running out of potential snew values.
Let me see if I can get the n=4, m=3500 dataset together for you. It might be somewhat of a trivial case because the behavior is dominated by one list (but it is real). Let me know if you have a preferred format, otherwise I’ll just prepare a tab-delimited plain-text file. (And don’t feel like you have to do anything. I appreciate the offer in any case.)
Best to call s a vector, not a set. snew = s + d, where d is a random vector of small norm and zero sum. Don’t worry about snew not changing: the random generator will output different values each time.
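In Python, that neighbour step might look like this (resampling whenever a component would go non-positive, to respect p[sub]i[/sub] > 0):
[code]
import random

def neighbour(p, scale=1.0):
    """s_new = s + d, with d random, small, and zero-sum so that sum(p)
    is preserved; resample if any component would become non-positive."""
    while True:
        d = [random.gauss(0.0, scale) for _ in p]
        mean = sum(d) / len(d)
        cand = [pi + di - mean for pi, di in zip(p, d)]
        if all(c > 0.0 for c in cand):
            return cand
[/code]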
If snew = s + d outperforms s, then s + 1.5*d might perform better still: this suggests the “momentum” concept, one of the first heuristics in simulated annealing.
I’ll bet there are a variety of simulated annealing packages available online, though the task is simple enough that implementing it from scratch may be best, especially if a high value is placed on fun!
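A from-scratch sketch along those lines, reusing the neighbour function above (kmax, the starting temperature, the geometric cooling factor, and the step scale are all knobs you would have to tune):
[code]
import math
import random

def anneal(q_of, p0, kmax=100000, t0=5.0, cooling=0.9999, scale=1.0):
    """Plain simulated annealing: always accept an improvement, accept a
    worsening move with probability exp((e - e_new)/T), cool T
    geometrically, and remember the best state ever visited."""
    p, e = p0, q_of(p0)
    best_p, best_e = p, e
    t = t0
    for _ in range(kmax):
        cand = neighbour(p, scale)      # zero-sum step from the sketch above
        cand_e = q_of(cand)
        if cand_e < e or random.random() < math.exp((e - cand_e) / t):
            p, e = cand, cand_e
            if e < best_e:
                best_p, best_e = p, e
        t *= cooling
    return best_p, best_e
[/code]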
If you ignore the ceiling function and approximate the maximum by the log-sum-exp, you have a nice smooth nonlinear optimization problem that you can solve with gradient descent or Newton’s method. That will give you a good approximate solution that might be good enough if you don’t need the exact value, or can at least be a starting point for some other heuristic.
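A sketch of that idea with NumPy; beta controls how tightly log-sum-exp hugs the true max, the gradient gets projected onto the zero-sum subspace so sum(p) stays fixed, and the positivity clamp at the end is a crude fix-up:
[code]
import numpy as np

def smooth_descent(x, p0, beta=20.0, lr=1e-3, iters=20000):
    """Gradient descent on the relaxation
        F(p) = sum_j (1/beta) * log( sum_i exp(beta * x[j,i] / p[i]) ),
    i.e. Q with the ceiling dropped and the max smoothed by log-sum-exp.
    x has shape (m, n); p0 is the starting parameter vector."""
    p = np.asarray(p0, dtype=float).copy()
    for _ in range(iters):
        r = x / p                                  # ratios x_ji / p_i
        w = np.exp(beta * (r - r.max(axis=1, keepdims=True)))
        w /= w.sum(axis=1, keepdims=True)          # per-row softmax weights
        g = -(w * x).sum(axis=0) / p**2            # dF/dp_i
        g -= g.mean()                              # zero-sum: keeps sum(p) fixed
        p = np.maximum(p - lr * g, 1e-9)           # crude positivity clamp
    return p
[/code]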
I gather that finding some suitably generic algorithm is more important than finding the actual answer. We know the answer is that the P values should be infinite, but he wants to know what algorithm to use if we had a “black box” telling us function( Xji , Pi ).
The random numbers would have to be constrained to some range, and the parameters would have to be constrained.
But no matter, perhaps it’s the concept that is more important.
Slope… Try P= 100,100,100 then P=100,100,1000
Then P=100,1000,100
Then P=1000,100,100
Now you know that the direction is ‘above 100’.
If multiplying the P value by 10 improved the result, do it again
If not, estimate from the change in Q between the last two results what P should be.
I said multiply P by 10 so that it would rapidly allow parameters to go from 100 to a million. You didn’t constrain the random real numbers, so if you do constrain them to being less than a googol, you might replace 10 with 10^10.
I am interested more in concepts and algorithms than absolutes. But if it helps, the values of x[sub]ji[/sub] are typically between 5 and 500. I’m familiar with Newton’s method (including in multiple dimensions), but I also know that it will typically only find the local minimum nearest where it began (which will be the global minimum only if the function is convex).
I am most interested in algorithms to find the global minimum (or near enough) for a function involving a sum of maxima, and if there were anything more specific to my problem, then that would be a bonus.