Archive for the ‘Mathematics’ Category

Behind the Scenes of OptiMap

Thursday, July 5th, 2007

What really happens when you hit the ‘Calculate fastest roundtrip’-button in OptiMap? The problem we attempt to solve is a classic and is called the ‘Travelling Salesman Problem’ or simply TSP.

Before we dive into this intriguing question, some definitions are useful. Each location will be referred to as a node, with location number denoted $i$. There are $n$ nodes in total. The trip between a pair of nodes is called an edge, and the trip from node $i$ to node $j$ is denoted $e_{ij}$. The full set of nodes and edges is called a graph and is denoted $G$.

We note that in the context of driving directions, there is either a way to go from each node to all of the other nodes, or there is no way to reach that node at all. We will assume that there is a way to reach all the nodes, otherwise there would be no solution at all. For ease of notation, we say that the time it takes to traverse the edge from node $i$ to node $j$ is $t_{ij}$.

Another characteristic of the problem we try to solve is that the time associated with an edge is not necessarily equal to the time associated with its reverse edge. This results from, for instance, one-way streets and the fact that it is easier to turn right than left at an intersection. Thus, our graph is directed.

The first part of solving the problem is defining it. We seek to find an ordering of the nodes, starting and ending at node 1, that visits each location exactly once. This ordering should minimize the total time it takes to traverse.
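Written out as a formula, the goal can be sketched as follows. The notation is an assumption made for illustration: $n$ is the number of nodes, $t_{ij}$ is the time to travel from node $i$ to node $j$, and $\pi$ is an ordering of the nodes with $\pi(1) = 1$ and the convention $\pi(n+1) = \pi(1)$ to close the roundtrip.

```latex
\min_{\pi \,:\, \pi(1) = 1} \; T(\pi),
\qquad
T(\pi) = \sum_{k=1}^{n} t_{\pi(k)\,\pi(k+1)}
```

Each of the $n$ terms is the time of one edge of the roundtrip, including the final edge back to node 1.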

There are $(n-1)!$ possible roundtrips that visit each location exactly once. Evaluating every possible roundtrip, the so-called brute-force approach, becomes infeasible somewhere between 10 and 20 nodes, depending on your patience and hardware. In fact, all known methods of finding the optimal solution use resources exponential in $n$.

With 9 or fewer nodes, the brute-force approach is feasible even in JavaScript. So in this case, you get the optimal roundtrip.
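The brute-force idea can be sketched in a few lines. This is a minimal illustration in Python, not OptiMap's actual JavaScript code. The names `tour_time` and `brute_force` and the matrix `times` (where `times[i][j]` is the time from node `i` to node `j`) are hypothetical, and nodes are numbered from 0, so node 0 plays the role of node 1 in the text.

```python
from itertools import permutations


def tour_time(times, order):
    """Total time of the roundtrip that follows `order` and returns to its start."""
    return sum(times[a][b] for a, b in zip(order, order[1:] + order[:1]))


def brute_force(times):
    """Evaluate every ordering that starts at node 0; return (best_time, best_order)."""
    n = len(times)
    best = None
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        t = tour_time(times, order)
        if best is None or t < best[0]:
            best = (t, order)
    return best


# Made-up asymmetric travel times for four nodes (times[i][j] != times[j][i]).
times = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
print(brute_force(times))  # (21, (0, 2, 3, 1))
```

The loop runs over all orderings of the remaining nodes, which is why the running time explodes as the number of nodes grows.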

If we’re willing to settle for near-optimal solutions, a number of methods are available. These are called heuristic methods. The simplest of these methods is to always travel to the unvisited node that is closest to the one you’re currently at. This method is referred to as ‘Greedy’ in the performance results below.
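The greedy rule described above can be sketched as follows. This is an illustrative Python version under assumed names (`greedy_roundtrip`, and a matrix `times` with `times[i][j]` the time from node `i` to node `j`); nodes are numbered from 0 and the example travel times are made up.

```python
def greedy_roundtrip(times, start=0):
    """Always travel to the closest unvisited node; return (total_time, order)."""
    order = [start]
    unvisited = set(range(len(times))) - {start}
    while unvisited:
        here = order[-1]
        # Closest unvisited node, measured by travel time from the current node.
        nxt = min(unvisited, key=lambda j: times[here][j])
        order.append(nxt)
        unvisited.remove(nxt)
    # Add up all edges, including the final edge back to the start.
    total = sum(times[a][b] for a, b in zip(order, order[1:] + order[:1]))
    return total, order


times = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
print(greedy_roundtrip(times))  # (33, [0, 1, 3, 2])
```

On this small example the greedy roundtrip takes 33 time units, while checking all six orderings by hand gives an optimum of 21. The cheap early choices force an expensive trip home at the end.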

There are better methods. ‘Ant Colony Optimization’ is one of them. Inspired by the way ants find food in nature, this method uses a swarm of ants, each of which is fairly dumb, to find an intelligent overall solution. Real ants leave pheromone trails when they walk. They can also smell these trails. If an ant finds food close to the ant colony, this trail will be traversed faster than the others and have a stronger smell, because the pheromone dissipates over time.

By letting virtual ants walk randomly around in the graph, we get roundtrips. The ant roundtrips are evaluated by the time they took to complete, with faster roundtrips being assigned a stronger smell. An important question is how the virtual ants choose which nodes to visit next. Probabilities are assigned to each edge going to an unvisited node, depending on its assigned time and how smelly it is:

Here, $\tau_{ij}$ is the smell assigned to the edge from node $i$ to node $j$. The ants pick edges according to these probabilities. The constants $\alpha$ and $\beta$ can be used to fine-tune the performance of the ant method.
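In the Ant Colony Optimization literature, this edge-choice rule is commonly written as $p_{ij} = \tau_{ij}^{\alpha} (1/t_{ij})^{\beta} / \sum_{k} \tau_{ik}^{\alpha} (1/t_{ik})^{\beta}$, where $t_{ij}$ is the travel time from node $i$ to node $j$ and the sum runs over the unvisited nodes $k$. Whether OptiMap uses exactly this form is an assumption. The Python sketch below implements the common form, with hypothetical function names and illustrative default values for the two constants.

```python
import random


def edge_probabilities(here, unvisited, times, smell, alpha=1.0, beta=2.0):
    """Probability of moving from `here` to each unvisited node.

    Short edges (small times[here][j]) and smelly edges (large smell[here][j])
    get a higher weight; the weights are normalised so they sum to 1.
    """
    weights = {
        j: smell[here][j] ** alpha * (1.0 / times[here][j]) ** beta
        for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def pick_next(here, unvisited, times, smell, alpha=1.0, beta=2.0, rng=random):
    """Draw the next node at random according to the edge probabilities."""
    probs = edge_probabilities(here, unvisited, times, smell, alpha, beta)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes])[0]
```

With `alpha = 0` the smell is ignored and the ants behave like a randomised greedy method; with `beta = 0` the travel times are ignored and only the smell matters.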

Another trick, called k2-opting, is used when an ant has completed its roundtrip. We pick two edges of the tour and see if they can be swapped to create a better tour.

The k2-opting procedure is repeated until no two edges can be swapped to create a faster roundtrip. k2-opting is a cheap way of improving many TSP heuristics.
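A minimal sketch of this improvement step, commonly known as 2-opt, is given below in Python. It is an illustration under stated assumptions, not OptiMap's implementation: swapping two edges is done by reversing the part of the tour between them, and because the graph is directed (so the reversed part may take a different time), the sketch simply re-evaluates the whole roundtrip for each candidate. The names `two_opt`, `tour_time` and `times` are hypothetical.

```python
def tour_time(times, order):
    """Total time of the roundtrip that follows `order` and returns to its start."""
    return sum(times[a][b] for a, b in zip(order, order[1:] + order[:1]))


def two_opt(times, order):
    """Repeat edge swaps until no swap gives a faster roundtrip."""
    best = list(order)
    best_time = tour_time(times, best)
    improved = True
    while improved:
        improved = False
        n = len(best)
        for i in range(1, n - 1):
            for j in range(i + 1, n):
                # Reverse the segment best[i..j]; the start node stays in place.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                t = tour_time(times, candidate)
                if t < best_time:
                    best, best_time = candidate, t
                    improved = True
    return best_time, best


times = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
print(two_opt(times, [0, 1, 3, 2]))  # (21, [0, 2, 3, 1])
```

In this example the starting roundtrip takes 33 time units and a single swap brings it down to 21. The loop always terminates, because every accepted swap strictly lowers the total time and there are only finitely many roundtrips.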

Performance:

| Test Case | Optimal Solution | Greedy | ACO | ACO k2-opt |
|---|---|---|---|---|
| n = 10 | 28 167 | 34 011 | 28 563 | 28 167 |
| n = 11 | 28 294 | 29 758 | 29 542 | 28 294 |
| n = 8 | 25 310 | 26 515 | 25 310 | 25 310 |
| n = 12 | 36 204 | 41 211 | 39 404 | 36 204 |
| n = 12 (Paris) | 11 141 | 12 705 | 12 062 | 11 141 |
| n = 12 (Berlin) | 10 570 | 11 429 | 11 789 | 10 570 |
| n = 12 (N.Y) | 7 608 | 8 714 | 8 361 | 7 608 |
| n = 12 (London) | 4 729 | 4 845 | 5 220 | 4 729 |

The numbers are in seconds. The ACO column uses 30 ants and 30 waves, while the ACO k2-opt column uses 10 ants and 10 waves to make up for the extra computation needed to do the rewiring. In every test case above, ACO with k2-opt found the optimal roundtrip, which suggests that the current method should find roundtrips very close to the optimal.

A Fastest Roundtrip Finder for Google Maps

Tuesday, July 3rd, 2007

Imagine you are a salesperson and have to visit a number of customers. However, you want to spend as little time as possible driving. If you only have to visit two or three locations, it is usually easy to find the optimal route. You can use regular map services such as Google Maps, Yahoo! Maps or MapQuest to find the shortest path between two places. However, as the number of locations to visit grows, the task of finding the order in which to visit the locations becomes daunting.

Despair not, I have created OptiMap, the answer to all your roundtrip troubles. At least the troubles that involve 20 or fewer locations to visit. There might be time and fuel costs (and thus greenhouse gas emissions) to save here, but don’t come sue me later if you find a better route. The application is only as accurate as the data that Google Maps supplies to it. Furthermore, when 10 or more locations are entered, a heuristic called Ant Colony Optimization (with some other tricks, too) is applied instead of trying every possible ordering, so there’s no guarantee of finding the optimal route. The heuristic usually finds a solution impressively close to the optimal, however.

A (Very) Simple Oil Field

Tuesday, June 19th, 2007

Consider two quantities: the amount of oil in the ground and the production rate. We model the behaviour of the oil field with a system of linear ordinary differential equations (ODEs).

In clear text, this means that the amount of oil in the ground decreases by the amount produced, and the production capacity increases if it is small compared to the amount of oil left in the field, and decreases if it is large compared to this amount.

Now, the system is easily transformed into a single ODE by differentiating the second equation and substituting the first equation into the result. This gives

which has a simple analytical solution. The general solution is

Now we impose the initial conditions

which translates to some finite amount of oil in the reservoir, and a production capacity of 0 at the time we start. These two conditions are used to determine the two coefficients of the general solution. The solution is then
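As a worked sketch of the steps described above, assume the symbols $x(t)$ for the oil in the ground and $y(t)$ for the production rate, unit coefficients in both equations, and an initial amount of oil equal to 1. All of these are assumptions, chosen because they match the verbal description of the model, the sine factor in the solution and the unit integral mentioned for the corrected model.

```latex
% Assumed system (first and second equation)
\dot{x} = -y, \qquad \dot{y} = x - y

% Differentiate the second equation and substitute the first
\ddot{y} = \dot{x} - \dot{y} = -y - \dot{y}
\quad\Longrightarrow\quad
\ddot{y} + \dot{y} + y = 0

% General solution (characteristic roots -1/2 \pm i\sqrt{3}/2)
y(t) = e^{-t/2}\left( A \cos\frac{\sqrt{3}\,t}{2} + B \sin\frac{\sqrt{3}\,t}{2} \right)

% Initial conditions x(0) = 1, y(0) = 0, hence \dot{y}(0) = x(0) - y(0) = 1
A = 0, \qquad B = \frac{2}{\sqrt{3}}

% Resulting production rate
y(t) = \frac{2}{\sqrt{3}}\, e^{-t/2} \sin\frac{\sqrt{3}\,t}{2}
```

Under these assumptions the sine factor first changes sign at $t = 2\pi/\sqrt{3} \approx 3.63$, which is where the production turns negative.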

This solution reveals that the model has some obvious flaws. The first thing that becomes apparent when plotting the solution (or simply noticing the sine factor) is that the production becomes negative at times. Another flaw is that if we integrate the production from the start to the time it becomes negative, the amount of oil extracted is larger than the amount that was originally in the field. Clearly, some modifications to the model are needed. A fix is to replace the first equation by

This new equation makes the system a lot harder to solve by hand. Using a computer program and Euler’s method for explicit time-integration, it was easy to plot the result, however.
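Euler's method itself takes only a few lines. The Python sketch below applies it to the original linear system rather than the modified one, under an assumed form: $\dot{x} = -y$ and $\dot{y} = x - y$, with $x$ the oil in the ground, $y$ the production rate, $x(0) = 1$ and $y(0) = 0$. The symbols, the unit coefficients and the function names are all assumptions made for illustration. The closed-form solution of this assumed system is included so the numerical result can be checked.

```python
import math


def euler_oil_field(x0=1.0, y0=0.0, dt=1e-3, t_end=1.0):
    """Integrate the assumed system x' = -y, y' = x - y with explicit Euler steps."""
    x, y = x0, y0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dx = -y          # oil in the ground decreases by the amount produced
        dy = x - y       # production grows while it is small compared to x
        x, y = x + dt * dx, y + dt * dy
    return x, y


def exact_production(t):
    """Closed-form production rate of the assumed system for x(0) = 1, y(0) = 0."""
    return 2.0 / math.sqrt(3.0) * math.exp(-t / 2.0) * math.sin(math.sqrt(3.0) * t / 2.0)


x_num, y_num = euler_oil_field()
print(y_num, exact_production(1.0))
```

The error of explicit Euler shrinks roughly in proportion to the step size `dt`, so halving `dt` should roughly halve the gap between the two printed numbers. Swapping in a different right-hand side only requires changing the two lines that compute `dx` and `dy`.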

This time, the integral is 1 (at least with numeric integration), and the production is never negative.

Compared to real-life oil fields, the model production is ramped up too fast. The model does not incorporate the effects of limited manpower, investment and equipment. A sharp cliff is present where the technical production capacity exceeds the geological capacity of the field. This cliff should not be observed in any well-planned oil project, since no-one would invest in bringing production capacity beyond what the field can handle.