Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 15 Prof. Charles E. Leiserson Introduction to Algorithms Day 26 L15.2? 2001 by Charles E. Leiserson Dynamic programming Design technique, like divide-and-conquer. Example: Longest Common Subsequence (LCS) ? Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both. x:ABCBDAB y:B D C A B A “a” not “the” BCBA = LCS(x, y) functional notation, but not a function Introduction to Algorithms Day 26 L15.3? 2001 by Charles E. Leiserson Brute-force LCS algorithm Check every subsequence of x[1 . . m] to see if it is also a subsequence of y[1 . . n]. Analysis ? Checking = O(n) time per subsequence. ? 2 m subsequences of x (each bit-vector of length m determines a distinct subsequence of x). Worst-case running time = O(n2 m ) = exponential time. Introduction to Algorithms Day 26 L15.4? 2001 by Charles E. Leiserson Towards a better algorithm Simplification: 1. Look at the length of a longest-common subsequence. 2. Extend the algorithm to find the LCS itself. Strategy: Consider prefixes of x and y. ? Define c[i, j] = | LCS(x[1 . . i], y[1 . . j]) |. ? Then, c[m, n] = | LCS(x, y) |. Notation: Denote the length of a sequence s by | s |. Introduction to Algorithms Day 26 L15.5? 2001 by Charles E. Leiserson Recursive formulation Theorem. c[i, j] = c[i–1, j–1] + 1 if x[i] = y[j], max{c[i–1, j], c[i, j–1]} otherwise. Let z[1 . . k]=LCS(x[1 . . i], y[1 . . j]), where c[i, j] = k. Then, z[k] = x[i], or else z could be extended. Thus, z[1 . . k–1] is CS of x[1 . . i–1] and y[1 . . j–1]. Proof. Case x[i] = y[j]: L 12 im L 12 j n x: y: = Introduction to Algorithms Day 26 L15.6? 2001 by Charles E. Leiserson Proof (continued) Claim: z[1 . . k–1] = LCS(x[1 . . i–1], y[1 . . j–1]). Suppose w is a longer CS of x[1 . . i–1] and y[1 . . j–1], that is, |w| > k–1. Then, cut and paste: w || z[k] (w concatenated with z[k]) is a common subsequence of x[1 . . i] and y[1 . . j] with |w || z[k]| > k. Contradiction, proving the claim. Thus, c[i–1, j–1] = k–1, which implies that c[i, j] = c[i–1, j–1] + 1. Other cases are similar. Introduction to Algorithms Day 26 L15.7? 2001 by Charles E. Leiserson Dynamic-programming hallmark #1 Optimal substructure An optimal solution to a problem (instance) contains optimal solutions to subproblems. If z = LCS(x, y), then any prefix of z is an LCS of a prefix of x and a prefix of y. Introduction to Algorithms Day 26 L15.8? 2001 by Charles E. Leiserson Recursive algorithm for LCS LCS(x, y, i, j) if x[i] = y[ j] then c[i, j] ← LCS(x, y, i–1, j–1) + 1 else c[i, j] ← max{LCS(x, y, i–1, j), LCS(x, y, i, j–1)} Worst-case: x[i] ≠ y[ j], in which case the algorithm evaluates two subproblems, each with only one parameter decremented. Introduction to Algorithms Day 26 L15.9? 2001 by Charles E. Leiserson same subproblem , but we’re solving subproblems already solved! Recursion tree m = 3, n = 4: 3,4 3,4 2,4 2,4 1,4 1,4 3,3 3,3 3,2 3,2 2,3 2,3 1,3 1,3 2,2 2,2 Height = m + n ? work potentially exponential. 2,3 2,3 1,3 1,3 2,2 2,2 m+n Introduction to Algorithms Day 26 L15.10? 2001 by Charles E. Leiserson Dynamic-programming hallmark #2 Overlapping subproblems A recursive solution contains a “small” number of distinct subproblems repeated many times. The number of distinct LCS subproblems for two strings of lengths m and n is only mn. Introduction to Algorithms Day 26 L15.11? 2001 by Charles E. Leiserson Memoization algorithm Memoization: After computing a solution to a subproblem, store it in a table. Subsequent calls check the table to avoid redoing work. Time = Θ(mn) = constant work per table entry. Space = Θ(mn). LCS(x, y, i, j) if c[i, j] = NIL then if x[i] = y[j] then c[i, j] ← LCS(x, y, i–1, j–1) + 1 else c[i, j] ← max{LCS(x, y, i–1, j), LCS(x, y, i, j–1)} same as before Introduction to Algorithms Day 26 L15.12? 2001 by Charles E. Leiserson 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 2 2 2 2D 2 2 0 0 0 0 1 1 2 2 2 2 2 2 2 2C 2 2 0 0 1 1 1 1 2 2 2 2 2 2 3 3A 3 3 0 0 1 1 2 2 2 2 3 3 3 3 3 3B 4 4 0 0 1 1 2 2 2 2 3 3 3 3 A Dynamic-programming algorithm IDEA: Compute the table bottom-up. ABCBD B B A 4 4 4 4 Time = Θ(mn). Introduction to Algorithms Day 26 L15.13? 2001 by Charles E. Leiserson 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 2 2 2 2D 2 2 0 0 0 0 1 1 2 2 2 2 2 2 2 2C 2 2 0 0 1 1 1 1 2 2 2 2 2 2 3 3A 3 3 0 0 1 1 2 2 2 2 3 3 3 3 3 3B 4 4 0 0 1 1 2 2 2 2 3 3 3 3 A Dynamic-programming algorithm IDEA: Compute the table bottom-up. ABCBD B B A 4 4 4 4 Time = Θ(mn). Reconstruct LCS by tracing backwards. A B C B D Space = Θ(mn). Exercise: O(min{m, n}).