The Dynamic Programming Algorithm for the Travelling Salesman Problem
A reader asked me for some information on how the dynamic programming algorithm for the TSP works. I was surprised to find that a Google search found no good resources. Wikipedia merely acknowledges its existence: “One of the earliest applications of dynamic programming is the Held–Karp algorithm that solves the problem in time O(n22n).”
Consider the TSP problem with N+1 points labeled 0, 1, …, N. We start at point 0. The distance from point i to point j is dist[i][j].
In dynamic programming, we seek to solve a problem by first solving smaller instances of the same problem. We start by looking at the really trivial size problem: What is the best order to visit just one of the destinations?
Assume we have N destinations. There are N size-1 problems, one for each of the N destinations. The best way to visit each is the shortest path from node 0 to node N. We store the best solution to each of these problems in a table:
best[subset][end] = dist[0][end]
Here, subset represents which destinations have been visited. More specifically, subset is a bitmask: an integer where bit i is 1 if destination i has been visited and 0 otherwise. Also note that the stored solution is the solution to the A-Z TSP problem, so it does not include the cost to return to the origin. The variable end is the destination that this A-Z problem ends at.
Now, lets expand to all the A-Z TSP problems of size one larger:
best[subset][end] = min(best[subset \ { end }][i] + dist[i][end])
Here, i is an intermediary node that is 1 in subset and not equal to the end variable. Before we move on to the next size problem, we have to fill in all the possible subsets of the current size. Finding all of those subsets of size s is what the function nextSetOf(s) in BpTspSolver.js does. How do we justify this expansion? It does not matter to the distance of the larger problem in what order the intermediary destinations of the 1-size smaller problem ending at i was visited in, so we are safe to only store and look at the optimal such sub-solution.
Finally, when we reach the desired size where all the destinations are visited, we only need to look up the best solution to the A-Z TSP problem in the table:
best[2^(N+1)-1][N]
(2^(N+1)-1 is simply the integer with all the first N+1 bits set to 1). If we want to return to the origin, we instead take the min over all i of
best[2^(N+1)-1][i] + dist[i][0]
This then becomes the best solution to the roundtrip problem. It is important to note that while this solution may look fast, there are 2^(N+1) different subsets involved (really 2^N, because 0 is always visited). Also, each subset solution needs to be stored, causing high memory usage. However, O(2^N) is still much better than the brute force O(N!) solution. 10! is 3 628 800 while 2^10 is only 1024. With the brute force solution, OptiMap could solve optimally up to 9 destinations. The dynamic programming solution now allows OptiMap to solve optimally up to 15 locations.