Archive for the ‘Mathematics’ Category

The (Mis)Behavior of Markets

Saturday, December 6th, 2008

The global financial system is a house built on sand. This is the main argument that the highly acclaimed mathematician Benoit Mandelbrot makes in his 2004 book “The Misbehavior of Markets”. And here we are, at the dusk of 2008, having seen global markets plummet by an average of 50% over the last few months. Could he be on to something?

Mandelbrot wastes little time in the introduction, and you get the main ideas simply by reading the first part of the book, titled “The Old Way”. This part is nothing less than a full-scale assault on the underlying assumptions of modern finance. He shows how the Capital Asset Pricing Model (CAPM), Modern Portfolio Theory (MPT) and the Black-Scholes option pricing model – the most important elements of orthodox financial theory – all rest on two shaky assumptions:

1. The magnitudes of price moves are distributed according to a bell curve (the normal distribution), with small movements the most likely and huge movements nearly impossible.

2. Price moves are temporally independent. At any given moment, all available information is priced into all securities, and thus the next price move is completely independent of the previous one. This is the heart of the Efficient Market Hypothesis.

Mandelbrot makes short work of shooting these assumptions down. The first can be discarded fairly fast: take the daily closing prices of your favorite stock, currency or commodity and put them in a spreadsheet. Take the log of each price, so that differences become relative price moves, then calculate all the daily differences and compute their mean and standard deviation. Now look at the largest moves: how many standard deviations out are they? For instance, the 29% price move in the Dow Jones on October 19, 1987, has a probability of 1 in 10^50 by standard financial reckoning. So what, you may say – we can write such moves off as “measurement errors” or “freak acts of God”. But these are the days when fortunes are made or lost, and a model that simply discards the risk of such days instills a false sense of security.
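For the spreadsheet-averse, the same test is a few lines of code. Here is a minimal JavaScript sketch (mine, not Mandelbrot’s); the `closes` array of daily closing prices is assumed to come from your own data source:

    // Daily log returns from a series of closing prices (oldest first).
    function logReturns(closes) {
      const returns = [];
      for (let i = 1; i < closes.length; i++) {
        returns.push(Math.log(closes[i]) - Math.log(closes[i - 1]));
      }
      return returns;
    }

    // Sample mean and standard deviation.
    function meanStd(xs) {
      const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
      const variance =
        xs.reduce((a, x) => a + (x - mean) * (x - mean), 0) / (xs.length - 1);
      return { mean, std: Math.sqrt(variance) };
    }

    const returns = logReturns(closes);
    const { mean, std } = meanStd(returns);
    // The most extreme daily move, expressed in standard deviations.
    const worst = returns.reduce((a, x) => (Math.abs(x) > Math.abs(a) ? x : a), 0);
    console.log((worst - mean) / std);

Under the bell-curve assumption, anything beyond five or so standard deviations should essentially never happen; most long price series will show you moves far beyond that.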

The second assumption is harder to attack, but statistical tests show that the absolute sizes of price moves are correlated across days: if yesterday saw a huge price move, you are more likely to see a huge price move today, in either direction. Mandelbrot captures this in the concept of “trading time” – basically, that time moves faster on volatile days and slower on dull days.
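The clustering is easy to check for yourself. Reusing `logReturns` and `meanStd` from the sketch above, the lag-1 autocorrelation of the absolute moves comes out clearly positive for most real price series, whereas independence would put it near zero:

    // Lag-1 autocorrelation: how strongly does today's value follow yesterday's?
    function lag1Autocorr(xs) {
      const { mean } = meanStd(xs);
      let num = 0;
      let den = 0;
      for (let i = 0; i < xs.length; i++) {
        den += (xs[i] - mean) * (xs[i] - mean);
        if (i > 0) num += (xs[i] - mean) * (xs[i - 1] - mean);
      }
      return num / den;
    }

    const absMoves = logReturns(closes).map(Math.abs);
    console.log(lag1Autocorr(absMoves));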

Of course, the author of this book should be known to anyone with even the slightest interest in popular mathematics, programming or computer graphics. As the founder of fractal geometry – the theory of roughness – he has lent his name to the Mandelbrot set. Mandelbrot argues that price charts exhibit fractal behaviour: they are self-similar. If you remove the axis labels, you cannot tell whether you’re looking at a 5-minute intraday chart or a long-term chart spanning decades. He also shows that simple fractal generators can create much more convincing fake charts than conventional financial models can.
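Mandelbrot’s fake charts are built from “cartoons”: recursively replacing each straight segment of a price path with a small zigzag generator. The sketch below is a minimal version of the idea; the interior breakpoints (4/9, 2/3) and (5/9, 1/3) follow the book’s basic cartoon, while the random mirroring is just one simple way to keep the output from looking periodic:

    // Recursively replace the segment (x0,y0)-(x1,y1) with a three-piece
    // zigzag, `depth` levels deep. Appends the generated points to `out`.
    function cartoon(x0, y0, x1, y1, depth, out) {
      if (depth === 0) {
        out.push([x1, y1]);
        return;
      }
      // Interior breakpoints as (time, price) fractions of the segment,
      // mirrored at random.
      const fracs = Math.random() < 0.5
        ? [[4 / 9, 2 / 3], [5 / 9, 1 / 3]]
        : [[4 / 9, 1 / 3], [5 / 9, 2 / 3]];
      let px = x0, py = y0;
      for (const [fx, fy] of fracs) {
        const nx = x0 + fx * (x1 - x0);
        const ny = y0 + fy * (y1 - y0);
        cartoon(px, py, nx, ny, depth - 1, out);
        px = nx;
        py = ny;
      }
      cartoon(px, py, x1, y1, depth - 1, out);
    }

    const points = [[0, 0]];
    cartoon(0, 0, 1, 1, 8, points); // 3^8 segments; plot and squint

Plot the running “price” against “time” and the result looks remarkably like a real chart – at any zoom level.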

What are the implications of this? As I write this, my girlfriend is taking her Chartered Financial Analyst (CFA) exam. Her study material places great emphasis on the CAPM, MPT and Black-Scholes, and in the six volumes of study material there is but one meager warning that these methods may not always work… Great – we have a horde of financial advisors and analysts out there who think they know how to diversify portfolios for the optimal return at a given risk. But they are not even close to being able to quantify that risk: their alphas and betas are calculated from a mathematical distribution that does not fit the data. The map does not match the terrain, but our financial navigators are nevertheless taught that such discrepancies are small and that the current map is better than no map.

Is it better than no map? I would argue no. A man with no map is less likely to fall off a cliff than a man with a map that says there is no cliff.

In the context of banks, the picture gets particularly disturbing. A bank takes deposits and lends the money to others. In most countries there are reserve requirements, which dictate how much of the deposits must be available as cash or deposits in the central bank at all times. A common requirement is 10%, which is the US requirement. (As a side note, Canada has not had any reserve requirement since 1992, and Sweden also has none; I have not been able to find any numbers for Norway. Jordan is perhaps the safest place to have a bank account, with an 80% reserve requirement.) The way risk is calculated on the bank’s assets forms the foundation for how much capital the bank sets aside to weather a financial storm. If risk is severely underestimated, the bank may end up having insufficient capital and go bankrupt.

Of course, if banks had completely transparent book-keeping and disclosed all their losses continuously, a bank would declare bankruptcy exactly when its net value hits 0, and its assets could then be liquidated to back all of its liabilities. In reality, banks’ book-keeping these days is extremely opaque, and their assets (loans, many of which might go bad) extremely illiquid. It doesn’t help that mark-to-market accounting has been suspended, so that book values are far from what these assets would fetch in a forced liquidation. The end result is that when your bank finally declares bankruptcy, its assets will be inadequate to satisfy all the depositors’ demands.

But it says my deposits are insured! Indeed, sporadic bank failures are not a big problem, because bank deposits are backed by government deposit guarantees up to a certain amount. However, the funds set aside for these deposit insurance schemes are based on similar risk calculations. The Federal Deposit Insurance Corporation (FDIC) does not set aside money equal to the full amount of deposits it insures. If even one big bank fails, the FDIC will have to call on the government to do one of the following:

1. Spend more tax money to back the guarantees. You pay more taxes to pay yourself your money back. This is by far the most likely option, since most people will have trouble comprehending the previous sentence and are prone to accept this alternative.

2. Print money. The inflationary effect of this ensures that the money you receive is worth less than the money you had in the failed bank, and whatever money you earn every month is also going to be worth less. The net effect is likely to be that the bank deposit is lost. (This is also the kind of move that can lead to the collapse of currencies or even governments if it’s not carried out with extreme care.)

3. Don’t honor the deposit insurance. The deposit is lost. Prepare for riots in the streets!

Whichever option is chosen, your bank deposit is effectively lost. Anyone who tells you otherwise is saying you can have something for nothing. Depending on the option chosen, you may or may not get to vent your anger on bankers and politicians by going after them with torches and pitchforks. (You can tell that I’m favoring option #3.)

The end result is that wherever risk in securities is measured, a new formula is needed – one that can account for the sometimes violent swings in securities prices. This is particularly important in banking, where the public’s money and the stability of society are at risk. Stronger reserve requirements would follow, and bank failures could become a once-in-10^47-years event instead of a once-in-20-years one.

Behind the Scenes of OptiMap

Thursday, July 5th, 2007

What really happens when you hit the ‘Calculate fastest roundtrip’ button in OptiMap? The problem we attempt to solve is a classic, called the ‘Travelling Salesman Problem’ or simply TSP.

Before we dive into this intriguing question, some definitions are useful. Each location will be referred to as a node, with location number denoted $i$. There are $n$ nodes in total. The trip between a pair of nodes is called an edge, and the trip from node $i$ to node $j$ is denoted $e_{ij}$. The full set of nodes and edges is called a graph and is denoted $G$.

We note that in the context of driving directions, either there is a way to go from each node to all of the other nodes, or there is no way to reach that node at all. We will assume that all the nodes are reachable; otherwise there is no solution at all. For ease of notation, we say that the time it takes to traverse edge $e_{ij}$ is $t_{ij}$.

Another characteristic of the problem we try to solve is that the time associated with an edge is not necessarily equal to the time associated with its reverse edge, due for instance to one-way streets, or to it being easier to turn right than left at an intersection. Thus, our graph is directed.

The first part of solving the problem is defining it. We seek to find an ordering of the nodes, starting and ending at node 1, that visits each location exactly once. This ordering should minimize the total time it takes to traverse.
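Written out, with $(\sigma_1, \dots, \sigma_{n-1})$ denoting the order in which nodes $2, \dots, n$ are visited, we seek to minimize the total roundtrip time

$$T(\sigma) = t_{1\sigma_1} + \sum_{k=1}^{n-2} t_{\sigma_k \sigma_{k+1}} + t_{\sigma_{n-1} 1}.$$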

There are $(n-1)!$ possible roundtrips that visit each location exactly once. Evaluating each possible roundtrip – a so-called brute-force approach – becomes infeasible somewhere between 10 and 20 nodes, depending on your patience and hardware. In fact, all known methods of finding the optimal solution use resources exponential in $n$.

With 9 or fewer nodes, it is feasible to do the brute-force search in JavaScript, so in this case you get the optimal roundtrip.
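Such a search fits in a dozen lines. Here is a minimal sketch (not the actual OptiMap source), assuming `t` is an n-by-n matrix where `t[i][j]` is the travel time from node i to node j, with node 0 as home:

    // Try all (n-1)! orderings of nodes 1..n-1 and keep the fastest roundtrip.
    function bruteForceTsp(t) {
      const n = t.length;
      const rest = [];
      for (let i = 1; i < n; i++) rest.push(i);
      let best = { time: Infinity, order: null };
      permute(rest, 0);
      return best;

      function permute(arr, k) {
        if (k === arr.length) {
          let time = t[0][arr[0]];
          for (let i = 1; i < arr.length; i++) time += t[arr[i - 1]][arr[i]];
          time += t[arr[arr.length - 1]][0]; // drive back home
          if (time < best.time) best = { time, order: [0, ...arr, 0] };
          return;
        }
        for (let i = k; i < arr.length; i++) {
          [arr[k], arr[i]] = [arr[i], arr[k]]; // place arr[i] at position k
          permute(arr, k + 1);
          [arr[k], arr[i]] = [arr[i], arr[k]]; // undo the swap
        }
      }
    }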

If we’re willing to settle for near-optimal solutions, a number of methods are available. These are called heuristic methods. The simplest of these methods is to always travel to the unvisited node that is closest to the one you’re currently at. This method is referred to as ‘Greedy’ in the performance results below.
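In code, the greedy heuristic is about as simple as heuristics get. A sketch, with the same `t` matrix convention as above:

    // Always drive to the nearest unvisited node, then return home.
    function greedyTsp(t) {
      const n = t.length;
      const visited = new Array(n).fill(false);
      const order = [0];
      visited[0] = true;
      let current = 0;
      let time = 0;
      for (let step = 1; step < n; step++) {
        let next = -1;
        for (let j = 0; j < n; j++) {
          if (!visited[j] && (next === -1 || t[current][j] < t[current][next])) {
            next = j;
          }
        }
        time += t[current][next];
        visited[next] = true;
        order.push(next);
        current = next;
      }
      time += t[current][0]; // back home
      order.push(0);
      return { time, order };
    }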

There are better methods; ‘Ant Colony Optimization’ (ACO) is one of them. Inspired by the way ants find food in nature, this method uses a swarm of fairly dumb ants to find an intelligent overall solution. Real ants leave pheromone trails when they walk, and they can also smell these trails. If an ant finds food close to the colony, its trail will be traversed faster than the others and thus smell stronger, because the pheromone dissipates over time.

By letting virtual ants walk randomly around in the graph, we get candidate roundtrips. The ant roundtrips are evaluated by the time they took to complete, with faster roundtrips being assigned a stronger smell. An important question is how the virtual ants choose which node to visit next. Probabilities are assigned to each edge going to an unvisited node, depending on its traversal time and how smelly it is:

$$p_{ij} = \frac{\tau_{ij}^{\alpha} \, (1/t_{ij})^{\beta}}{\sum_{k \in U} \tau_{ik}^{\alpha} \, (1/t_{ik})^{\beta}}$$

Here, $\tau_{ij}$ is the smell assigned to edge $e_{ij}$ and $U$ is the set of unvisited nodes. The ants pick edges according to these probabilities. The constants $\alpha$ and $\beta$ can be used to fine-tune the performance of the ant method.
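In code, the selection rule might look like the sketch below (again illustrative, not the OptiMap source); `tau` holds the pheromone levels and `t` the travel times:

    // Pick the next node for an ant at node i, weighting each candidate edge
    // by pheromone^alpha * (1/time)^beta, then sampling proportionally.
    function pickNext(i, unvisited, tau, t, alpha, beta) {
      const weights = unvisited.map(
        (j) => Math.pow(tau[i][j], alpha) * Math.pow(1 / t[i][j], beta)
      );
      const total = weights.reduce((a, b) => a + b, 0);
      let r = Math.random() * total;
      for (let k = 0; k < unvisited.length; k++) {
        r -= weights[k];
        if (r <= 0) return unvisited[k];
      }
      return unvisited[unvisited.length - 1]; // guard against rounding error
    }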

Another trick, called k2-opting (a variant of the classic 2-opt move), is used when an ant has completed its roundtrip. We pick two edges of the tour and see if they can be swapped to create a better tour.

The k2-opting procedure is repeated until no two edges can be swapped to create a faster roundtrip. k2-opting is a cheap way of improving many TSP heuristics.
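A straightforward way to implement this – illustrative rather than the exact OptiMap code – is to try reversing every tour segment and keep any change that shortens the trip. Since our graph is directed, all the reversed edges change cost, so the safe (if lazy) check is to re-evaluate the whole candidate tour:

    // Total time of a tour given as a node sequence, e.g. [0, 3, 1, 2, 0].
    function tourTime(order, t) {
      let time = 0;
      for (let i = 1; i < order.length; i++) time += t[order[i - 1]][order[i]];
      return time;
    }

    // Repeatedly reverse segments while doing so yields a faster roundtrip.
    function twoOpt(order, t) {
      let best = order.slice();
      let bestTime = tourTime(best, t);
      let improved = true;
      while (improved) {
        improved = false;
        for (let i = 1; i < best.length - 2; i++) {
          for (let j = i + 1; j < best.length - 1; j++) {
            const candidate = best
              .slice(0, i)
              .concat(best.slice(i, j + 1).reverse(), best.slice(j + 1));
            const time = tourTime(candidate, t);
            if (time < bestTime) {
              best = candidate;
              bestTime = time;
              improved = true;
            }
          }
        }
      }
      return { time: bestTime, order: best };
    }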

Performance:

Test Case       | Optimal Solution | Greedy | ACO    | ACO k2-opt
n = 10          | 28 167           | 34 011 | 28 563 | 28 167
n = 11          | 28 294           | 29 758 | 29 542 | 28 294
n = 8           | 25 310           | 26 515 | 25 310 | 25 310
n = 12          | 36 204           | 41 211 | 39 404 | 36 204
n = 12 (Paris)  | 11 141           | 12 705 | 12 062 | 11 141
n = 12 (Berlin) | 10 570           | 11 429 | 11 789 | 10 570
n = 12 (N.Y.)   | 7 608            | 8 714  | 8 361  | 7 608
n = 12 (London) | 4 729            | 4 845  | 5 220  | 4 729

The numbers are in seconds. The ACO column is with 30 ants and 30 waves; the ACO k2-opt column is with 10 ants and 10 waves, to make up for the extra computation needed to do the rewiring. The results suggest that the current method finds roundtrips very close to the optimal.

A Fastest Roundtrip Finder for Google Maps

Tuesday, July 3rd, 2007

Imagine you are a salesperson and have to visit a number of customers. However, you want to spend as little time as possible driving. If you only have to visit two or three locations, it is usually easy to find the optimal route. You can use regular map services such as Google Maps, Yahoo! Maps or MapQuest to find the shortest path between two places. However, as the number of locations to visit grows, the task of finding the order in which to visit the locations becomes daunting.

Despair not – I have created OptiMap, the answer to all your roundtrip troubles. At least the troubles that involve 20 or fewer locations to visit. There might be time and fuel costs (and thus greenhouse gas emissions) to save here, but don’t sue me later if you find a better route: the application is only as accurate as the data that Google Maps supplies to it. Furthermore, when 10 or more locations are entered, a heuristic called Ant Colony Optimization (with some other tricks, too) is applied instead of trying every possible ordering, so there is no guarantee of finding the optimal route. The heuristic usually finds a solution impressively close to the optimal, however.

A (Very) Simple Oil Field

Tuesday, June 19th, 2007

Assume $Q(t)$ and $P(t)$ are the amount of oil in the ground and the production rate, respectively. We model the behaviour of the oil field with a system of linear ordinary differential equations (ODEs), here with time and amounts normalized so that all constants are 1:

$$\frac{dQ}{dt} = -P, \qquad \frac{dP}{dt} = Q - P$$

In clear text, this means that the amount of oil in the ground decreases by the amount produced, and the production capacity increases if it is small compared to the amount of oil left in the field, and decreases if it is large compared to this amount.

Now, the system is easily transformed into a single ODE by differentiating the second equation and substituting for $dQ/dt$ from the first equation. This gives

$$\frac{d^2P}{dt^2} + \frac{dP}{dt} + P = 0,$$

which has a simple analytical solution. The general solution is

$$P(t) = e^{-t/2}\left(A \cos\frac{\sqrt{3}}{2}t + B \sin\frac{\sqrt{3}}{2}t\right)$$

Now we impose the initial conditions

$$Q(0) = 1, \qquad P(0) = 0,$$

which translate to some finite amount of oil in the reservoir (normalized to 1) and zero production capacity at the time we start. These two conditions are used to determine the coefficients $A$ and $B$. The solution is then

$$P(t) = \frac{2}{\sqrt{3}}\, e^{-t/2} \sin\frac{\sqrt{3}}{2}t$$

This solution reveals that the model has some obvious flaws. The first thing that becomes apparent when plotting the solution (or simply noticing the sine factor) is that the production becomes negative at times. Another flaw is that if we integrate the production from the start to the time it first becomes negative, the amount of oil extracted is larger than the amount that was originally in the field. Clearly, some modifications to the model are needed. A fix is to replace the first equation by

$$\frac{dQ}{dt} = -\min(P, Q)$$

so that the field can never yield more than what is left in it.

This new equation makes the system a lot harder to solve by hand. Using a computer program and Euler’s method for explicit time integration, however, it is easy to plot the result.
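A minimal sketch of that integration in JavaScript, using the normalized constants and the min-form fix from above:

    // Explicit Euler: dQ/dt = -min(P, Q), dP/dt = Q - P, with Q(0)=1, P(0)=0.
    const dt = 0.001;
    let Q = 1;          // oil left in the ground (normalized)
    let P = 0;          // technical production capacity
    let extracted = 0;  // numeric integral of actual production
    const series = [];
    for (let time = 0; time < 30; time += dt) {
      const production = Math.min(P, Q); // geology caps technical capacity
      const dP = Q - P;
      extracted += production * dt;
      Q -= production * dt;
      P += dP * dt;
      series.push([time, production]);
    }
    console.log(extracted); // close to 1, and production never goes negative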

This time, the integral is 1 (at least with numeric integration), and the production is never negative.

Compared to real-life oil fields, the model’s production ramps up too fast. The model does not incorporate the effects of limited manpower, investment and equipment. A sharp cliff is also present where the technical production capacity exceeds the geological capacity of the field. Such a cliff should not be observed in any well-planned oil project, since no one would invest in bringing production capacity beyond what the field can handle.