What the Heck is Dynamic Programming, and Why Should I Care?: Being the first in a series on dynamic programming

Dynamic programming. It sounds like one of those Agile methodologies, like Extreme Programming. If you look into it expecting to find project management philosophies, though, you’ll quickly become horrified by the amount of weird-looking math that’s involved. Still, quantitative measures of productivity are good, right?

Dynamic programming is actually not dynamic, and it’s not programming, and it’s definitely not an Agile methodology. The name ‘dynamic programming’ is one of the worst, most misleading names in computer science, right alongside ‘computer science’, which makes non-computer scientists think that we fix people’s computers, but we’re really pretentious about it and think fixing people’s computers puts us on the same level as Albert Einstein and Jonas Salk. But the only other name considered for dynamic programming was ‘bottom-up recursion with memoization’, and that’s about as imposing as it gets, right alongside ‘first-order homogeneous linear partial differential equation’, or the ‘infinite-dimensional Banach spaces’ that Guy Steele said in Coders at Work he couldn’t ever get the hang of.

Still, the name ‘bottom-up recursion with memoization’ at least has the advantage of being accurate, so let’s start with that and try to make it less imposing. After we’ve explored the many parts of that sadly unused name, I’ll sketch out a problem of the type that dynamic programming might be good for. With that example in hand, you’ll hopefully have a better understanding of all the parts. Then I’ll talk a little about why you should care.

Bottom-up recursion with memoization

Dynamic programming is recursion. Whether that helps you or not depends on how much you like recursion (most people don’t much like it).

Recursion tends to work on problems with recursive structure, i.e. a structure where big problems can be cut into smaller problems with the same form, which can be cut into even smaller problems, down and down and down, until you reach problems so small that, hey, you know the answer! Recall the factorial: n! = n(n-1)(n-2)\cdots3\cdot2\cdot1. This function also has a recursive definition: 1! = 1; n! = n(n-1)!. This says that the factorial of 1 is 1, and the factorial of any other natural number n is n times the factorial of n-1. So 4! = 4(3!) = 4\cdot3(2!) = 4\cdot3\cdot2(1!) = 4\cdot3\cdot2\cdot1.
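
If it helps to see that as code, here’s the recursive definition written out as a tiny Python function (a throwaway sketch of mine, with the same base case as the definition above):

def recursive_factorial(n):
    "n! computed straight from the recursive definition."
    if n == 1:            # a problem so small we already know the answer
        return 1
    else:
        return n * recursive_factorial(n-1)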

Since dynamic programming is recursion, it also tends to work on problems with recursive structure. In particular, it’s usually applied to problems with optimal substructure. This means the problem has a recursive structure, but you can cut a problem into smaller problems in lots of different ways. Each of those smaller problems has its own solutions; if you know the best answers to the smaller problems, you can combine them to get the best answer to the bigger problem. “Best” here means “best” in whatever sense you need it to; you can think of it like rating a solution on a scale of one to five stars, depending on how well it answers your original question. So in problems with optimal substructure, there are usually lots of correct answers, but they won’t all be equally good according to your criteria for what makes a good answer.

Dynamic programming is a technique for solving optimization problems. An optimization problem is one with lots of correct answers, but with some answers better than others according to some measure other than right vs. wrong. Finding the shortest path in a graph, finding the longest increasing subsequence of a sequence of numbers, and pricing some items to get maximum profit are all optimization problems. They involve words like “shortest”, “longest”, “maximum” that imply some kind of measure of quality on the answers you find. And all of these problems can be solved with dynamic programming.
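
Just to give a taste of what that looks like in code (real examples come next time), here’s a bare-bones sketch of my own for one of them: finding the length of the longest increasing subsequence. The best answer ending at each position is built out of the best answers ending at earlier positions, which is the optimal substructure at work.

def longest_increasing_subsequence_length(nums):
    "Length of the longest strictly increasing subsequence of nums."
    if not nums:
        return 0
    # best[i] holds the best (longest) answer for a subsequence ending at i.
    best = [1] * len(nums)
    for i in range(1, len(nums)):
        for j in range(i):
            if nums[j] < nums[i]:
                best[i] = max(best[i], best[j] + 1)
    return max(best)

print(longest_increasing_subsequence_length([3, 1, 4, 1, 5, 9, 2, 6]))   # 4, e.g. 1, 4, 5, 9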

With memoization

Memoization is actually the easiest part of dynamic programming to understand. If you understand hash tables, you understand memoization. You don’t even have to understand how hash tables work; you just have to understand the idea that they can look up stuff really fast.

In normal recursion, your intermediate results are all stored on the call stack. Let’s look at our good friend the Fibonacci sequence in Python.

def fib(n):
    "Calculates the nth Fibonacci number."
    if n < 2:
        return n  # F_0 = 0 and F_1 = 1.
    return fib_helper(0, 1, n-2)  # 0 and 1 are the first two.

def fib_helper(a, b, n):
    "Does the actual recursion for fib(n)."
    if n == 0:
        return a+b
    else:
        return fib_helper(b, a+b, n-1)

That works, but what if we need to calculate all the Fibonacci numbers between 1000 and 2000? Then we might end up writing something like this:

for i in range(2000):
    f = fib(i)
    if 1000 < f < 2000:
        print(f)
    if f >= 2000:
        break  # the rest will only be bigger

Every time we call fib(i) inside that loop, it calculates the Fibonacci numbers from 0 to i, even though the first i-1 Fibonacci numbers haven’t changed at all since the last time we calculated them. We can’t start at the first Fibonacci number above 1000 because we have no idea what value to give fib to get it. Even if we looked it up somewhere, we’d still end up calculating all the Fibonacci numbers below it a bunch of times and then throwing them away.

This is where memoization comes in. Because looking things up in hash tables (or lists, or arrays) is so fast, we can save a lot of time if we save our calculations in a data structure and look them up when needed. We only calculate a value if it’s not stored in the data structure already.

Here’s a memoizing version of the Fibonacci numbers.

memo = {0: 0, 1: 1}

def fib_memo(n):
    "Calculates the nth Fibonacci number with memoization."
    if n in memo:
        return memo[n]
    else:
        memo[n] = fib_helper(0, 1, n - 2)
        return memo[n]

def fib_helper(a, b, n):
    "Does the actual recursion for fib_memo(n)."
    if n <= 0:
        return a + b
    else:
        return fib_helper(b, a+b, n-1)

We first declare a memo, a dictionary whose keys are values of n and whose values are the values of F_n, the nth Fibonacci number. Whenever fib_memo is called, it checks the memo to see if we’ve ever calculated F_n before and stored it there. If so, it just returns that stored value. If not, it calls fib_helper to calculate F_n and stores it in the memo for future reference.

Now when we run the code

for i in range(2000):
    f = fib_memo(i)
    if 1000 < f < 2000:
        print(f)
    if f >= 2000:
        break  # the rest will only be bigger

then any Fibonacci number we’ve asked for before comes straight out of the dictionary instead of being recalculated. Dictionary lookups are super-fast; Python implements its dictionaries as open-addressed hash tables in C, so looking something up usually amounts to hashing the key and checking a slot or two in an array.

That’s pretty much all there is to memoization; it just means storing stuff in a data structure with fast lookup in case you need it again, so you can look it up instead of recalculating.

Bottom-up recursion with memoization

Most common cases of recursion that you see in programming classes are cases of top-down recursion. This means that you start with a big instance of the problem and cut it down until you get instances that are easy to solve.

For instance, with the factorial, you start with a big instance, n!. You cut it into a smaller instance of the problem by rewriting it as n \cdot (n-1)!—you replace an instance with a big factorial by an instance with a smaller factorial. Then you cut down (n-1)! by replacing it with an instance that has an even smaller factorial, (n-2)!.

Mergesort is another example of top-down recursion; you start with a big array and cut it in half. Then you cut the halves in half. Then you cut the fourths in half, and so on, until you have a bunch of arrays with one element, which are already sorted, because there’s nothing to compare them to.
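
In code, that top-down splitting might look something like this rough sketch (mine, and it leans on Python slicing instead of sorting in place):

def merge_sort(items):
    "Sorts a list by splitting it in half, sorting the halves, and merging them."
    if len(items) <= 1:          # one element or none: already sorted
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves back into one sorted list.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]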

But think about how the Fibonacci function above worked. We didn’t start with a big instance and cut it down into smaller instances. Instead, we started with two small instances—calculating F_0 and F_1, which we already knew were 0 and 1—and built them up into a bigger instance. We could do this because of the formula F_n = F_{n-1} + F_{n-2}, which tells us how to get a bigger Fibonacci number from two smaller ones. In the fib_helper function, in order to know when to stop, I introduced the parameter n, which counts down. When it reaches 0, the recursion stops.

This type of recursion, where you start with small instances and build up bigger ones, is called bottom-up recursion. Dynamic programming algorithms are pretty much all bottom-up recursive, and the reason is optimal substructure. If a problem has optimal substructure, it means that the best solution to a big instance can be computed using the best solutions to smaller instances. We need to know the results from smaller instances to calculate the results of bigger instances.

In the Fibonacci number problem, the formula for F_n depends on the previous two numbers, so to calculate F_n, we have to have F_{n-1} and F_{n-2}. In some problems with optimal substructure, the solution to an n-sized instance depends on the solutions to all the smaller instances, from n-1 all the way down to 1.

Many problems can be solved with either top-down or bottom-up recursion; for instance, we could calculate n! using bottom-up recursion with a function like this:

def bottom_up_factorial(n):
    return factorial_helper(1, 1, n)

def factorial_helper(a, x, n):
    # a is the current number, x is a! so far.
    if a >= n:
        return x
    else:
        return factorial_helper(a+1, x*(a+1), n)

And here’s a function that calculates Fibonacci numbers with top-down recursion:

def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

You can even memoize top-down recursion, although it doesn’t always buy you anything over the bottom-up version. I don’t know of any problems that really benefit from using top-down recursion with memoization over using dynamic programming.
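
(If you’re curious, here’s roughly what that looks like in Python; functools.lru_cache does the memo bookkeeping for you, and the function name is just something I made up for this sketch.)

from functools import lru_cache

@lru_cache(maxsize=None)   # caches the result of each distinct call
def fib_top_down_memo(n):
    if n < 2:
        return n
    return fib_top_down_memo(n-1) + fib_top_down_memo(n-2)

With the cache in place, fib_top_down_memo(40) takes a few dozen calls, where the plain top-down fib above would take hundreds of millions.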

Bottom-up recursion with memoization

The hook that ties together these three things—bottom-up, recursion, and memoization—is optimal substructure. This will probably become more clear next time when we look at a problem, but for now just try to understand as much as you can.

Having optimal substructure implies that instances of a problem are somehow composed of smaller instances of that problem. This leads us to recursion.

In most of the problems solved by dynamic programming, the same smaller instances of the problem get examined over and over again as we work out the answers for larger instances. We use memoization to make this more efficient: instead of recomputing those answers every time we need them, we memoize them the first time and just look them up after that.

We benefit from using bottom-up recursion in this situation for two reasons: it’s more efficient, and it’s easier to understand. If we start with the small instances and calculate their solutions, then we have all those solutions memoized. When we start working on a bigger instance, we know all the smaller answers we need are already stored in the memo. If we did top-down recursion, it might be the case that some of the smaller solutions we needed hadn’t been memoized yet. Then we would have to stop and calculate them. We’d need extra checks to make sure we had the solution, and extra recursive calls to calculate ones we didn’t have, and the program flow is a lot messier. With bottom-up recursion, every value we need is already memoized, so it’s fast and the program flow is cleaner.
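
To make that concrete, here’s a tiny sketch (not code from earlier in the post) that fills a Fibonacci memo strictly from the bottom up; by the time we compute entry n, entries n-1 and n-2 are guaranteed to be sitting in the memo already:

def fill_fib_memo(n):
    "Returns a memo holding F_0 through F_n, computed smallest-first."
    memo = {0: 0, 1: 1}
    for i in range(2, n + 1):
        # Both values we need are already memoized: no checks, no extra calls.
        memo[i] = memo[i-1] + memo[i-2]
    return memo

print(fill_fib_memo(10)[10])   # 55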

That’s vaguely why problems with optimal substructure are so nicely solved by using bottom-up recursion with memoization. Next time we’ll go over some example problems, and hopefully this will all be more clear.

Postscript

I want to talk a little about why you should care about dynamic programming.

Dynamic programming is an advanced algorithm design technique. If you don’t think you need it, you probably don’t. But there are some very nasty problems that we really want to solve, and dynamic programming turns out to be perfect for them.

Weirdly, dynamic programming turns out to be a really good technique for optimization problems on strings. If you’ve ever used TeX, you might have noticed how nicely it justifies and breaks the paragraphs in its output. In Digital Typography, Donald Knuth talks a little about his algorithm for breaking up text into lines that form nice paragraphs. He frames it as an optimization problem: given a big mass of text, there are tons of ways you can choose to break the text into lines that form a paragraph. The problem is to find the “best” way. Knuth invented a way of measuring the “badness” of putting a break in a particular spot. The badness depends on things like how much extra space would need to be added to make the line stretch to the edge of the margin, and whether you have to split a word in half with a hyphen to put a break there. The algorithm uses dynamic programming to build up the paragraph line by line, bottom-up, and finds the arrangement with the smallest total badness over all its line breaks.

Another place where dynamic programming is useful is bioinformatics, the analysis of genetic sequences gathered in biological research. DNA is made of molecules called nucleotides, strung together into chains. What distinguishes one nucleotide from another is its nitrogenous base; a nucleotide can have one of the four bases adenine, guanine, cytosine, and thymine. RNA is similar, but it has uracil instead of thymine.

We can represent a strand of DNA as a string of the letters A, G, C, and T. These strings can run to megabytes or gigabytes of text. (You can think of them as a kind of biological machine code, except that instead of bits with two states, 0 and 1, each position has four possible states.) There are various tasks we might want to do with these strings, like searching for matches between them; maybe we’ve found a particular gene in gorillas, and we want to know if humans also have it.

Dynamic programming is an excellent way to analyze these genetic strings. If you’re interested in learning more about algorithms on genetic data, the book Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, by Dan Gusfield, is very good. Gusfield taught my undergraduate algorithms class at UC Davis and is responsible for most of what I know about dynamic programming; he’s an excellent teacher with a knack for clear explanations, and he’s fairly well known in the bioinformatics field. The book is just as clear as his lectures, and beautifully typeset.

(I’d like to say I was hanging out with Gusfield and soaking up his knowledge while I was in his class, but the truth is we never spoke. I had a two-hour drive to get home, and his office hours were at six at night.)

Anyway, those are a few reasons why you should care, even if you probably won’t ever have to invent a dynamic programming algorithm.
