Traveling Salesman Problem: Are solutions normally distributed?

KarlGauss · May 21, 2006, 2:00pm

I recognize that finding the optimum solution to the Traveling Salesman Problem is hard (NP-hard, to be precise). My question, then, has to do with the distribution of the values in the solution set for any particular instance of the Problem. Are such solutions normally distributed?

I know I can’t be the first, and probably not even the millionth to wonder about this, but still, let me ask - if the solutions are normally distributed, wouldn’t it be possible to compute a “representative” sample of solutions (requiring much less effort and time than computing them all) and, after, considering the distribution curve that those solutions generate, be able to determine a “nearly optimum” solution", i.e. one within x% of the optimum, where ‘x’ is whatever you want it to be?

As always, thanks!

LiveOnAPlane · May 21, 2006, 2:27pm

Yes, having experienced this in undergrad as well as grad school, there are indeed nearly optimal solutions to this.

But, and I am talking out of my butt here, when one goes to x% of optimal, then one has to compare this aginst the optimal solution.

Therein lies the rub, AFAIK.

chrisk · May 21, 2006, 2:28pm

KarlGauss:

I recognize that finding the optimum solution to the Traveling Salesman Problem is hard (NP-hard, to be precise). My question, then, has to do with the distribution of the values in the solution set for any particular instance of the Problem. Are such solutions normally distributed?

I know I can’t be the first, and probably not even the millionth to wonder about this, but still, let me ask - if the solutions are normally distributed, wouldn’t it be possible to compute a “representative” sample of solutions (requiring much less effort and time than computing them all) and, after, considering the distribution curve that those solutions generate, be able to determine a “nearly optimum” solution", i.e. one within x% of the optimum, where ‘x’ is whatever you want it to be?

As always, thanks!

Hard question to answer.

I would suspect that for the TSP problem in general, you cannot make assumptions about whether the solutions are normally distributed or not… based partly on how the individual graph-edges (costs to travel between any particular pair of cities) are distributed.

In general, I would think that calculating a representative sample and taking the best available would still be a good real-life algorithm to approximate a solution, but it would be hard to tell just how good of an approximation it is without solving for the optimal solution.

It would be an interesting exercise to generate TSP graphs that are as large as possible while still being solvable with brute-force in a decent amount of time… and see… hmm – that’d be the problem actually. If you solve TSP on 12 cities, you have 12! possible solutions, about half a billion, which would be many too many to graph. You might be able to sort them all and then take every millionth solution as a graphing sample. Just to see how the overall distribution tends to go.

ultrafilter · May 21, 2006, 2:32pm

It might be possible, but for any practical purpose, it’s easy enough to get a really good solution through genetic programming.

Not necessarily. You can place error bound on most approximation algorithms out there deductively.

ftg · May 21, 2006, 3:13pm

The OP lacks key information. In particular, what does “normally distributed” range over? So the following might not address the question.

It is trivial to show that approximate TSP is NP-hard. I.e., to find a solution that is only half (or whatever) as good as the optimal is just as hard as the original problem. It is so easy to prove I have given this as a homework problem in algorithms classes.

This is unlike some other NP-Complete problems like bin packing where easy polytime approximation algorithms are known. So this sort of thing varies from problem to problem.

Note that being NP-hard does not mean that the problem is truly difficult. It may turn out to be polytime. (Which would of course surprise most people.)

Chronos · May 22, 2006, 12:56am

Note that just knowing that something’s approximately normally distributed cannot tell you the bound, because normal distributions don’t have bounds.

seenidog · May 22, 2006, 3:21am

okay, I admit there is no chance whatsoever that I will be able to contribute, but would someone please clue me in as to what the traveling salesman problem is so perhaps I might learn something?

ultrafilter · May 22, 2006, 3:46am

There’s a nice write-up here.

KarlGauss · May 22, 2006, 4:49am

Indeed, I actually did know that, but expressed myself very poorly.

What I meant to say, and what I should have said, was that if you you know that the solutions are normally distributed, you should be able to come up with a solution that is “better” than x% of the entire solution set, with ‘x’ being whatever you want it to be.

And, thanks to everyone for their input!

Blaster_Master · May 22, 2006, 2:11pm

I have a hard time believing this problem could be normally distrubuted or even could be represented by a normal distribution. There can be only one (possibly with a few ties) optimal solution. If there’s a normal distribution, this would imply that there are some solutions on both sides of the solution, which violates the optimality of the solution.

There are much simpler ways to get good bounds on a TSPs optimal solution. For instance, a good lower bound is the MST, which can be calculated in O(m) where m is the number of edges. You can also get a reasonable upperbound guarenteed no more than 150% the optimal solution in O(m). There are better quick upper bounds, but they require more than linear time, and I’d have to go back and look at my Combinatorial Optimizations text to remember.

ultrafilter · May 22, 2006, 2:22pm

But there are solutions on both sides of the average solution.

ITR_champion · May 22, 2006, 2:33pm

KarlGauss:

I recognize that finding the optimum solution to the Traveling Salesman Problem is hard (NP-hard, to be precise). My question, then, has to do with the distribution of the values in the solution set for any particular instance of the Problem. Are such solutions normally distributed?

I know I can’t be the first, and probably not even the millionth to wonder about this, but still, let me ask - if the solutions are normally distributed, wouldn’t it be possible to compute a “representative” sample of solutions (requiring much less effort and time than computing them all) and, after, considering the distribution curve that those solutions generate, be able to determine a “nearly optimum” solution", i.e. one within x% of the optimum, where ‘x’ is whatever you want it to be?

As always, thanks!

Let me present my interpretation of what you’re trying to say here. I have a collection of cities, and I know the distances between every pair of these cities. I look at every possible route that goes through all cities without repetition and calculate the length of every route. You’re asking whether these lengths fall into a normal distribution.

In the general case, they do not. I can provide a counterexample.

According to the official definition of the TSP, we may present an instance of the problem with any distances we choose. We may even choose distances that are entirely against intuition. For instance, suppose we take these cities.

Atlanta
Boston
Chicago
Denver
El Paso
Fargo

I will create an instance of the TSP as follows: the distance from Atlanta to Boston is 500 miles. The distance between any other pair of cities is 100 miles.

In that case, what does the distribution of lengths of routes look likes? Given any route, the only relevant question is whether it includes that length from Atlanta to Boston. For any route including the Atlanta-Boston leg, the total length is 500 + 100 + 100 + 100 + 100 + 100 = 1000. For any route not including the Atlanta-Boston leg, the total length is 100 + 100 + 100 + 100 + 100 + 100 = 600. Thus, the distribution of solutions contains only two points, one at 1000 and one at 600.

This is an extreme cases. One might look at restricted cases. For instance, if I declared that all distances on the map would be chosen randomly from between 100 and 200 miles, then the distribution of solutions might well be normal.

Blaster_Master · May 22, 2006, 2:44pm

True. But what good is an average solution in this case in terms of locating an optimal solution? We have no way of knowing how much worse than optimal this average solution is. It would still take O(m) to caluclate an average solution, but we have no idea how close to optimal it is. Further, why would we care? If we calculated 100 different Hamiltonian tours to find a good guess at the average, clearly the minimum of those 100 tours would be a better guess at the optimum than the average of those 100 tours.

ultrafilter · May 22, 2006, 3:51pm

It’s obvious that the solution space is never normally distributed, because it’s a finite space, and the normal distribution is continuous. The question is whether the normal distribution is a reasonable approximation to the distribution of solutions. If it is, the degree to which it’s a good approximation should depend on the number of solution costs. It might be worthwhile to consider TSPs over a (finite) metric space.

It’s not. As you’ll notice, I mentioned in post #4 that there are relatively easy ways to get good solutions. It’s still an interesting theoretical question.

ftg · May 22, 2006, 4:04pm

The current best MST algorithm is not O(m) but almost linear.
What does MST have anything to do with this??? Completely different problems.
See my earlier post. Approximate TSP is NP-hard.

Blaster_Master · May 22, 2006, 6:34pm

No, the fastest IS linear. From the Wiki article you just cited:

As this states, there’s a well known linear algorithm for integer edge weights.

I just read this paper 2-3 weeks ago. Tarjan et al found a linear algorithm with [VERY] high probability for Real edge weights; this was in 1994. According to my professor (sorry, no cite), some time since then they’ve improved upon this result by making it non-randomized, and thus O(m) with no caveat for Real edge weights.

Yes, they are different problems; however, an MST is a fairly tight lower bound on the TSP (MST has n-1 edges, TSP has n eges). Also, MST is used in a lot of poly-time estimations.

As is normally definted, yes. However, if you’re okay with a larger error, you may not need to go through all that work. There ARE linear and higher order poly-time TSP approximating algorithms that guarentee you to be no more than x% greater than the optimal. I know of at least one off the top of my head that guarantees a solution no more than 150% the optimal TSP and can calculate that in O(m) and it requires finding the MST. If <=150% is good enough, why break out the big guns and tackle the harder problem?

chrisk · June 3, 2006, 2:08am

Thought I’d bring this thread back up just to note that I managed to work out every possible solution to a sample randomly generated TSP problem… obviously, the premises that I was working with to generate my problem could have biased the results, but I did generate a cost versus frequency graph to see what it looked like.

It was triangle-shaped. Pretty clearly, as far as I could tell. A sharp incline from the worst possible solutions up to the average, median solutions, and then a sharp decline from those down to the very best ones. Very little trace of curvature in the inclines, though on the ‘up’ slope there was maybe a very little hint of a bell curve. Might have been imagining it though.

ultrafilter · June 3, 2006, 2:10am

How many vertices?

chrisk · June 3, 2006, 11:40am

Twelve, if you mean the cities in my sample.

Topic		Replies	Views
How is the travelling salesman problem in NP? Factual Questions	27	2874	March 13, 2016
Algorithm to create groups with minimal internal duplicates Factual Questions	33	2531	September 27, 2011
How does Google or Mapquest figure out the quickest way to Point A to Point B? Factual Questions	22	10087	August 10, 2010
Shortest Path Problem? Factual Questions	19	6378	October 4, 2010
Is there a free mapping site or app that can do this? Factual Questions	8	2497	May 5, 2014

Traveling Salesman Problem: Are solutions normally distributed?

Related topics