CS 512 Homework 6

Homework 6 - Dynamic Programming and Greed

CS 512, Algorithm Design, Spring 2000

The Weight Watchers monthly meeting orders a n-inch sub, for some positive integer n, with everything on it. Unfortunately, the sub maker is careless and doesn't evenly distribute the ingredients along the sub, which means different parts of the sub have different calorie counts; assume the calorie count for integer inch i is given by c(i). Each month k people are expected at the meeting, and, to avoid leftovers, the sub should be divided into no more than k pieces, each piece an integral number of positive inches (that is, 0 <= k <= n). Finally, to avoid arguments over who gets what piece, the sub should be divided such that the largest calorie count is as small as possible.
For example, suppose a three-inch sub has calorie counts (100, 150, 120) and k = 2. Leaving the sub undivided results in one piece with calorie count 100 + 150 + 120 = 370. The sub can be divided into two pieces in two ways: (1 inch, 2 inches) and (2 inches, 1 inch). In the first case, the calorie counts are (100, 150 + 120 = 270); in the second case the calorie counts are (100 + 150 = 250, 120). Because the maximum-calorie piece in the second case (250) is smaller than the maximum-calorie piece in the first case (270) and in the undivided case (370), it is the way the sub should be divided.
As you might expect, dividing up the sub gets more complicated as n and k get larger. Recognizing your expertise as an algorithm designer, the Weight Watchers club asks you for an algorithm to divide the sub according to the rules given above. Also, because the sub usually arrives a few minutes before the meeting starts, they ask that your algorithm be tractable.
A number of people tried to solve this by first arranging the inches of the sub in increasing (or decreasing) order by calorie count, in preparation for a greedy solution. Unfortunately, the problem says you can only cut the sub, not re-arrange it. A few other people tried to draw an analogy between this problem and the matrix multiplication problem. Nobody could convincingly describe in what way the two problems are analogous, and I find the analog murky (to start, the matrix chain had to be fully parenthesized, but the sub had to be cut into no more than k pieces).
Divide the sub as required. Assume the cuts were made from left to right and consider the leftmost (that is, first) cut. Because the maximum piece should be as small as possible, the sub to the right of the cut should be divided to minimize the maximum piece; that is, the division of the sub to the right of the first cut is optimum too. The one piece to the left of the first cut is optimum, so the optimum subproblem holds.
The previous paragraph leads naturally to a recursive solution:
```
int divide-sub(n, k)

  m <- infinity
  for i <- 1 to n
    m = min(m, max(divide-sub(i, k - 1, sum(j = i + 1 to n, c(i))))
  return m
```
divide-sub() does exponential work, but there are only polynomially many subproblems, which you can tell by counting the parameter values to divide-sub(). The repeated subproblem property holds, and there should be a dynamic programming solution to this problem.
Let M be an n by k matrix if integers. M[i, j] contains the size of the largest sum of the last i inches of the sub when divided into at most j pieces. The initial conditions are M[1, for 1 <= j <= k and M[n = sum(i = 1 to n, s_i). Assuming M has been initialized, the algorithm is
```
divide-sub(c, n, k)
for i <- 2 to n
  for j <- 2 to k
    M[i, j] <- infinity
    for k <- 1 to i - 1
      s = max(M[k, j - 1], sum(l = x to i, c(l)))
      if M[i, j] > s 
        M[i, j] = s
```
This algorithm has a triple loop nest, for an O(n³) running time.
Hearing of the Weight Watchers' problems with their subs, Tracy's Deli decides to help them out in the following way: rather than specify subs by length, the Weight Watchers specify subs by total calorie count C; Tracy's will then make a sub of exactly C calories. Do do so, they measure the calories each ingredient adds to the sub - a slice of ingredient i has c(i) calories) - and stop when the sub has C calories. Also, in the interest of saving time and money, Tracy's would like to use the fewest slices possible in a sub. Assuming a slice remains whole, what tractable algorithm can Tracy's use to build the Weight Watchers' subs? Will your algorithm always be able to build a Weight Watchers' sub with a minimum of slices?
Many people drew the analogy between this problem and the 0-1 knapsack problem. At first I thought the analogy was not particularly useful but otherwise harmless. However, it turned out many people focused on the wrong part of the analogy: in the 0-1 knapsack problem, item k is either in the knapsack or it isn't (that's why the knapsack is 0-1). In this problem, however, as many slices as possible from ingredient i can be included in the sub. A direct application of the 0-1 knapsack solution to this problem will fail because you need to keep track of how many slices of i are in the sub, not just whether or not there are slices of i.
First, note that instances of this problem may not have a solution; for example, slices with an even number of calories can't make a sub with an odd number of calories. On the other hand, if one of the slices has one calorie, a solution always exists for any sub.
We can use either greed or dynamic programming to derive an approximation algorithm for this problem. Greedy algorithms are usually simpler and faster to develop (and run) than are dynamic programming algorithms, so let's go with that.
Assume there are n ingredients available for making subs, and let the array cal store the calories per slice in descending order so that ingredient i has cal[i] calories per slice. The greedy algorithm makesub() returns the number of slices of each ingredient required to make a sub of no more than C calories:
```
[int] makesub(C)

  i <- 0
  slices <- [0, ..., 0]
  while (i < n) and (C > 0)
    if C > cal[i]
      C <- C - cal[i]
      slices[i]++

  return slices
```
Does makesub() find an optimal answer when one exists? Unfortunately, it does not. Suppose C = 35 and cal = [31, 20, 15, 1]. The optimal solution is slices = [0, 1, 1, 0], but makesub() returns [1, 0, 0, 4]. This problem does not exhibit the greedy choice property.
The Acme Assembly Co. uses a conveyer belt to feed an assembly robot a sequence of parts p₁, p₂, ..., p_n. In the interest of increasing productivity and saving money, Acme would like to have a second assembly robot use the same conveyer belt to assemble the sequence of parts q₁, q₂, ..., q_m. The parts p and q may be inter-mixed in any order on the conveyer belt, but each robot must receive its parts in the order p₁ (or q₁), p₂ (or q₂) and so on.
Unfortunately, the program that determines the conveyer-belt part sequence occasionally produces an sequence that's incorrect (for example, p₁, q₂, p₂, q₁, ...). Acme would like you to write a program that accepts as input the conveyer belt sequence c₁, c₂, ..., c_{m + n} and the two part sequences p₁, ..., p_n and q₁, ..., q_m and determine if the conveyer-belt sequence is a proper interleaving of the part sequences. Acme would like your algorithm to be as efficient as possible.
To restate the problem more compactly: Given the three string S, S₁ and S with |S| = |S₁| + |S₂|, do S₁ and S₂ interleave to produce S?
First, let's take a look at what doesn't work: a linear scan through S. Linear scan may incorrectly fail when S₁ and S₂ have common parts. For example, consider

S = abcb

S₁ = b

S₂ = abc

A linear scan through S to find S₁ latches onto the first b in S; removing S₁ turns S into acb, which is not equal to S₂ and the linear scan concludes that S is not an interleaving of S₁ and S₂. However, that S = S₂S₁.
This problem has a simple brute-force solution: generate all possible interleavings of S₁ and S₂ and check to see if each interleaving equals S. Form an n-bit vector where n = |S| and bit i corresponds to the ith character in S. When i is 1, assume the corresponding character in S came from S₁, otherwise the character came from S₂. Generate all possible n-bit vectors, extracting each interleaving from S and checking it against S₁ and S₂. This is an exponential algorithm.
How does this problem subdivide? If S is an interleaving of S₁ and S₂ then either S[0] = S₁[0] or S[0] = S₂[0]. If, for example, S[0] = S₁[0], then S[2 ..] is a interleaving of S₁[2 ..] and S₂; that is, the optimal subproblem property holds. The recursive algorithm follows directly:
```
bool interleaving(S, S1, S2)

  if |S1| + |S2| = 0
    return |S| = 0

  if S[0] = S1[0] and interleaving(S[2 ..], S1[2 ..]), S2)
    return true

  return S[0] = S2[0] and interleaving(S[2 ..], S1, S2[2 ..])
```
A quick worst-case analysis gives the recurrence relation T(n) = 2T(n - 1), which has exponential asymptotic behavior. However, there are only |S₁||S₂| subproblems to solve; that is, the repeated subproblem property holds.
We can exploit the repeated subproblem property with a boolean matrix M of dimension |S₁|x|S₂|. The matrix element M[i, j] is true if and only if the substring S[0 .. i+j-1] is an interleaving of the strings S₁[0 .. i-1] and S₂[0 .. j-1]. M[i, j] is true if and only if
- M[i-1, j] is true and S[i + j - 1] = S₁[i] or
- M[i, j-1] is true and S[i + j - 1] = S₂[j]
The dynamic programming version follows immediately
```
bool interleaving(S, S1, S2)

  M[0, 0] <- true
  for i <- 1 to |S1| - 1
    M[i, 0] = M[i - 1, 0] and S[i - 1] = S1[i - 1]
  for i <- 1 to |S2| - 1
    M[0, j] = M[0, j - 1] and S[i - 1] = S1[i - 1]

  for i <- 1 to |S1| - 1
    for j <- 1 to |S2| - 1
      M[i, j] = (M[i - 1, j] and S[i + j - 1] = S1[i - 1]) or
                (M[i, j - 1] and S[i + j - 1] = S2[j - 1])

  return M[|S1| - 1, |S2| - 1]
```
A museum has 2n galleries, and each gallery has hallways to at least n other galleries. Currently there's a guard in every gallery, but the museum would like to cut the number of guards in half by having a guard walk between two galleries connected by a common hallway. Each gallery should be covered by exactly one guard, and every guard should cover two galleries.
The museum would like you to devise an efficient algorithm to determine which guards are assigned to which galleries.
The museum can be represented as a graph G = (V, E) where each vertex represents a gallery and each edge represents a hallway between two galleries. Under this interpretation, the algorithm should create a subset E^* of E containing exactly n edges where each vertex in V is adjacent to exactly one edge E^*.
The greedy algorithm starts with E^* empty and iterates until E^* contains n edges. On each iteration, there are two cases to consider:
1. There exists an edge (v₁, v₂) and both v₁ and v₂ are not yet adjacent to any edges in E^*. Add (v₁, v₂) to E^*.
2. There exists two vertices v₁ and v₂ not connected by an edge in G and neither of which are adjacent to any edge in E^*. The graph
  shows an example when n = 3 (dotted edges belong to E^*). v₁ and v₂ are adjacent to a total of at least 2n edges, all of which are adjacent to edges in E^* (otherwise case 1 would hold). However there are less than n edges in E^*, which means that at least one edge in E^* is adjacent to at least three edges that are also adjacent to v₁ and v₂.
  In this case, the algorithm deletes the edge (v₃, v₄) in E^* and replaces it with edges adjacent the vertex pairs (v₁, v₃) and (v₂, v₄):
The algorithm terminates because each iteration adds an edge to E^*, and the iterations stop when E^* has n edges (and |E| >= n). When the algorithm terminates, each edge in E^* is adjacent to two vertices that no other edge in E^* is adjacent to, so all 2n vertices of G are covered. The algorithm is greedy because it makes a strictly local decision about which edge to add next to E^*.

This page last modified on 29 April 2000.

S	=	abcb
S₁	=	b
S₂	=	abc