I'm cycling across the United States. You can track my progress here.
Planning the trip took a long, long time. Years of dreaming about it. Months of pumping people up. Weeks and weeks of buying gear, going on training rides, and planning the trip with meticulous detail.
Of particular interest is the day-to-day itinerary. The itinerary was way more involved than I expected, and to my delight, took an interdisciplinary turn very quickly.
This should be easy right?
Here's the gist of our itinerary problem for biking across the US:
- Our support driver insisted that every single place be booked in advance, so we had to book the entire trip ahead of time.
- There were about 200 possible stops along the way in which we were willing to stay
- We leave a few days after graduation, and my work starts promptly on July 20th, so we have exactly 30 days to complete the journey, non-negotiable.
- This means we need to choose 29 stops in which to stay
- We wanted to maximize the "evenness", or minimize the variance, of the mileage per day. That is, it's better to bike 100, 100, and 100 miles on the first three days than 50, 60, and 190.
It's all pretty logical stuff. We needed a place to stay each night, and wanted to avoid days with unreasonable biking distances.
Attempt #1: Greed
First, let's be greedy. choosing the farthest distance on each day will guarantee that you complete it in time. For example, if your daily limit is 200 miles, choosing the last stop on the 200 mile journey each day will guarantee completion. This is called a greedy algorithm, and the proof for its correctness is fairly straightforward:
- For any given stop S, no stop before the greedy stop G on the day before lets you access S without it also being true for G
- Therefore, G could not have been a bad choice
Even though this guarantees completion within the time limit, it doesn't minimize evenness. It only ensures that the the basic human limit is not exceeded.
Attempt #2: Brute Force
Okay, this one went by quickly.
Wolfram Alpha tells me that there are 71873983337215398865064302392798400 ways to choose 29 stops from 200. Apparently this is far more than the number of nanoseconds since the age of the universe, so computing the solution by trying everything is not a good idea.
Attempt #3: Constrained Optimization
We can try to represent the problem in math as constrained optimization.
Let each possible stop Pi be a distance Di from the starting point. Given n points, we are trying to calculate the minimum of f(D), where f(D) happens to be:
subject to the caveat that the distances D actually must contain valid distances from the stops to the origin. This constraint is very problematic. To satisfy it, we need a polynomial function g(D) like this one:
This is problematic because we are dealing with polynomial programming, with a constraint that increases in power with n. Hardly a recipe for a feasible solution.
In light of things starting to look more and more difficult and math-y, I finally stumbled upon the solution.
Get ready for it:
there is no solution.
In mathematics, there are a class of problems that, barring a miracle in computer science, cannot be solved quickly by a computer. Some of these problems are deemed Non-deterministic Polynomial Time (NP) hard.
Not only can these problems most likely not be solved quickly, but when given the correct solution, one can't even verify its correctness quickly.
Unfortunately, with some number crunching, our problem can be represented as an NP-hard problem.
Suppose that, instead of representing a vector of stops D in the previous example, we represented a vector of miles called M. For the sake of simplicity, let's round each stop to the closest mile. We define M as such:
We now define a distance matrix Q as:
Since we are minimizing MQMT, this type of problem boils down to a Quadratic Unconstrained Binary Optimization (QUBO) problem in the best case. QUBO is NP-hard.
It turns out though, that we're in even worse shape. Since we must stop exactly 29 times, M has yet another constraint. It must contain exactly 29 ones.
Because of this constraint, we are no longer QUBO. We are a zero-one Quadratic Integer Programming problem, which is unsurprisingly NP-hard as well.
What do we do now?
We eyeballed everything on our maps repeatedly, divided the entire route into 29 roughly equal parts and finagled with everything endlessly. When we resolved an issue such as a 200 mile day, another one popped up elsewhere. Towns sometimes didn't let us book in advance. We started to grow weary.
And then we gave up. Even now, it almost works but not quite. We still have a 200 mile day towards the end of our trip, and distances do vary significantly by the day. Manual heuristics got us reasonably far with the itinerary planning, but I keep feeling like we could have done better.
Genetics and Blacksmiths
Even though finding a perfect solution is likely infeasible due to the problem's being NP-hard, solving QUBO problems has been researched extensively because of its applicability to real life situations.
If I had some more time to work on the problem, I would have tried at least the following two interdisciplinary approaches:
Genetic Algorithm: We can choose a series of k random routes by picking 29 random points from the 200. In each generation, These routes can then "reproduce" new routes by combining features from two parent routes with some probability of mutation. Then, the lowest scoring routes "die off" and are removed from the pool. Hopefully, at the end of a number of generations, we are left with some pretty good routes from which to choose.
Simulated Annealing: We start with a random route and make that our current route. We then pick neighboring routes that are similar to it and accept or reject that route in proportion to how good it is relative to the current route. If it's good enough, it becomes our current route. Like the metallurgic name suggests, periodically we become more liberal with our "accepts" and gradually become very conservative over time. Given many trials, hopefully we end up with a good route.
Planning is over though, and it's time to ride!
Once again, check us out at Two Tired to Care.