Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 9 Prof. Charles E. Leiserson Introduction to Algorithms Day 17 L9.2? 2001 by Charles E. Leiserson 3 3 Binary-search-tree sort T ←? ? Create an empty BST for i = 1 to n do TREE-INSERT(T, A[i]) Perform an inorder tree walk of T. Example: A = [3 1 8 2 6 7 5] 8 8 1 1 2 2 6 6 5 5 7 7 Tree-walk time = O(n), but how long does it take to build the BST? Introduction to Algorithms Day 17 L9.3? 2001 by Charles E. Leiserson Analysis of BST sort BST sort performs the same comparisons as quicksort, but in a different order! 3 1 8 2 6 7 5 1 2 8 6 7 5 26 7 5 75 The expected time to build the tree is asymptot- ically the same as the running time of quicksort. Introduction to Algorithms Day 17 L9.4? 2001 by Charles E. Leiserson Node depth The depth of a node = the number of comparisons made during TREE-INSERT. Assuming all input permutations are equally likely, we have () )(lg )lg( 1 nodeinsert toscomparison# 1 1 nO nnO n iE n n i = = ? ? ? ? ? ? = ∑ = Average node depth . (quicksort analysis) Introduction to Algorithms Day 17 L9.5? 2001 by Charles E. Leiserson Expected tree height But, average node depth of a randomly built BST = O(lg n) does not necessarily mean that its expected height is also O(lg n) (although it is). Example. ≤ lg n nh = )(lg 2 lg 1 nO nn nn n = ? ? ? ? ? ? ? +?≤Ave. depth Introduction to Algorithms Day 17 L9.6? 2001 by Charles E. Leiserson Height of a randomly built binary search tree ? Prove Jensen’s inequality, which says that f(E[X]) ≤ E[f(X)] for any convex function f and random variable X. ? Analyze the exponential height of a randomly built BST on n nodes, which is the random variable Y n = 2 X n , where X n is the random variable denoting the height of the BST. ? Prove that 2 E[X n ] ≤ E[2 X n ] = E[Y n ] = O(n 3 ), and hence that E[X n ] = O(lg n). Outline of the analysis: Introduction to Algorithms Day 17 L9.7? 2001 by Charles E. Leiserson Convex functions A function f : R → R is convex if for all α,β≥0 such that α + β = 1, we have f(αx + βy) ≤αf(x) + βf(y) for all x,y ∈ R. αx + βy αf(x) + βf(y) f(αx + βy) xy f(x) f(y) f Introduction to Algorithms Day 17 L9.8? 2001 by Charles E. Leiserson Convexity lemma Lemma. Let f : R → R be a convex function, and let {α 1 , α 2 , …, α n } be a set of nonnegative constants such that ∑ k α k = 1. Then, for any set {x 1 , x 2 , …, x n } of real numbers, we have )( 11 ∑∑ == ≤ ? ? ? ? ? ? ? ? n k kk n k kk xfxf αα Proof. By induction on n. For n = 1, we have α 1 = 1, and hence f(α 1 x 1 ) ≤α 1 f(x 1 ) trivially. . Introduction to Algorithms Day 17 L9.9? 2001 by Charles E. Leiserson Proof (continued) ? ? ? ? ? ? ? ? ? ?+= ? ? ? ? ? ? ? ? ∑∑ ? == 1 11 1 )1( n k k n k nnn n k kk xxfxf α α ααα Inductive step: Algebra. Introduction to Algorithms Day 17 L9.10? 2001 by Charles E. Leiserson Proof (continued) ? ? ? ? ? ? ? ? ? ?+≤ ? ? ? ? ? ? ? ? ? ?+= ? ? ? ? ? ? ? ? ∑ ∑∑ ? = ? == 1 1 1 11 1 )1()( 1 )1( n k k n k nnn n k k n k nnn n k kk xfxf xxfxf α α αα α α ααα Inductive step: Convexity. Introduction to Algorithms Day 17 L9.11? 2001 by Charles E. Leiserson Proof (continued) ∑ ∑ ∑∑ ? = ? = ? == ? ?+≤ ? ? ? ? ? ? ? ? ? ?+≤ ? ? ? ? ? ? ? ? ? ?+= ? ? ? ? ? ? ? ? 1 1 1 1 1 11 )( 1 )1()( 1 )1()( 1 )1( n k k n k nnn n k k n k nnn n k k n k nnn n k kk xfxf xfxf xxfxf α α αα α α αα α α ααα Inductive step: Induction. Introduction to Algorithms Day 17 L9.12? 2001 by Charles E. Leiserson Proof (continued) )( )( 1 )1()( 1 )1()( 1 )1( 1 1 1 1 1 1 11 ∑ ∑ ∑ ∑∑ = ? = ? = ? == = ? ?+≤ ? ? ? ? ? ? ? ? ? ?+≤ ? ? ? ? ? ? ? ? ? ?+= ? ? ? ? ? ? ? ? n k kk n k k n k nnn n k k n k nnn n k k n k nnn n k kk xf xfxf xfxf xxfxf α α α αα α α αα α α ααα Inductive step: Algebra. . Introduction to Algorithms Day 17 L9.13? 2001 by Charles E. Leiserson Jensen’s inequality Lemma. Let f be a convex function, and let X be a random variable. Then, f (E[X]) ≤ E[ f (X)]. ? ? ? ? ? ? ? ? =?= ∑ ∞ ?∞=k kXkfXEf }Pr{])[( Proof. Definition of expectation. Introduction to Algorithms Day 17 L9.14? 2001 by Charles E. Leiserson Jensen’s inequality ∑ ∑ ∞ ?∞= ∞ ?∞= =?≤ ? ? ? ? ? ? ? ? =?= k k kXkf kXkfXEf }Pr{)( }Pr{])[( Proof. Convexity lemma (generalized). Lemma. Let f be a convex function, and let X be a random variable. Then, f (E[X]) ≤ E[ f (X)]. Introduction to Algorithms Day 17 L9.15? 2001 by Charles E. Leiserson Jensen’s inequality )]([ }Pr{)( }Pr{])[( XfE kXkf kXkfXEf k k = =?≤ ? ? ? ? ? ? ? ? =?= ∑ ∑ ∞ ?∞= ∞ ?∞= . Proof. Tricky step, but true—think about it. Lemma. Let f be a convex function, and let X be a random variable. Then, f (E[X]) ≤ E[ f (X)]. Introduction to Algorithms Day 17 L9.16? 2001 by Charles E. Leiserson Analysis of BST height Let X n be the random variable denoting the height of a randomly built binary search tree on n nodes, and let Y n = 2 X n be its exponential height. If the root of the tree has rank k, then X n = 1 + max{X k–1 ,X n–k } , since each of the left and right subtrees of the root are randomly built. Hence, we have Y n = 2· max{Y k–1 ,Y n–k } . Introduction to Algorithms Day 17 L9.17? 2001 by Charles E. Leiserson Analysis (continued) Define the indicator random variable Z nk as Z nk = 1 if the root has rank k, 0 otherwise. Thus, Pr{Z nk = 1} = E[Z nk ] = 1/n, and () ∑ = ?? ?= n k knknkn YYZY 1 1 },max{2 . Introduction to Algorithms Day 17 L9.18? 2001 by Charles E. Leiserson Exponential height recurrence [] () ? ? ? ? ? ? ?= ∑ = ?? n k knknkn YYZEYE 1 1 },max{2 Take expectation of both sides. Introduction to Algorithms Day 17 L9.19? 2001 by Charles E. Leiserson Exponential height recurrence [] () ()[] ∑ ∑ = ?? = ?? ?= ? ? ? ? ? ? ?= n k knknk n k knknkn YYZE YYZEYE 1 1 1 1 },max{2 },max{2 Linearity of expectation. Introduction to Algorithms Day 17 L9.20? 2001 by Charles E. Leiserson Exponential height recurrence [] () ()[] ∑ ∑ ∑ = ?? = ?? = ?? ?= ?= ? ? ? ? ? ? ?= n k knknk n k knknk n k knknkn YYEZE YYZE YYZEYE 1 1 1 1 1 1 }],[max{][2 },max{2 },max{2 Independence of the rank of the root from the ranks of subtree roots. Introduction to Algorithms Day 17 L9.21? 2001 by Charles E. Leiserson Exponential height recurrence [] () ()[] ∑ ∑ ∑ ∑ = ?? = ?? = ?? = ?? +≤ ?= ?= ? ? ? ? ? ? ?= n k knk n k knknk n k knknk n k knknkn YYE n YYEZE YYZE YYZEYE 1 1 1 1 1 1 1 1 ][ 2 }],[max{][2 },max{2 },max{2 The max of two nonnegative numbers is at most their sum, and E[Z nk ] = 1/n. Introduction to Algorithms Day 17 L9.22? 2001 by Charles E. Leiserson Exponential height recurrence [] () ()[] ∑ ∑ ∑ ∑ ∑ ? = = ?? = ?? = ?? = ?? = +≤ ?= ?= ? ? ? ? ? ? ?= 1 0 1 1 1 1 1 1 1 1 ][ 4 ][ 2 }],[max{][2 },max{2 },max{2 n k k n k knk n k knknk n k knknk n k knknkn YE n YYE n YYEZE YYZE YYZEYE Each term appears twice, and reindex. Introduction to Algorithms Day 17 L9.23? 2001 by Charles E. Leiserson Solving the recurrence Use substitution to show that E[Y n ] ≤ cn 3 for some positive constant c, which we can pick sufficiently large to handle the initial conditions. [] ∑ ? = = 1 0 ][ 4 n k kn YE n YE Introduction to Algorithms Day 17 L9.24? 2001 by Charles E. Leiserson Solving the recurrence Use substitution to show that E[Y n ] ≤ cn 3 for some positive constant c, which we can pick sufficiently large to handle the initial conditions. [] ∑ ∑ ? = ? = ≤ = 1 0 3 1 0 4 ][ 4 n k n k kn ck n YE n YE Substitution. Introduction to Algorithms Day 17 L9.25? 2001 by Charles E. Leiserson Solving the recurrence Use substitution to show that E[Y n ] ≤ cn 3 for some positive constant c, which we can pick sufficiently large to handle the initial conditions. [] ∫ ∑ ∑ ≤ ≤ = ? = ? = n n k n k kn dxx n c ck n YE n YE 0 3 1 0 3 1 0 4 4 ][ 4 Integral method. Introduction to Algorithms Day 17 L9.26? 2001 by Charles E. Leiserson Solving the recurrence Use substitution to show that E[Y n ] ≤ cn 3 for some positive constant c, which we can pick sufficiently large to handle the initial conditions. [] ? ? ? ? ? ? = ≤ ≤ = ∫ ∑ ∑ ? = ? = 4 4 4 4 ][ 4 4 0 3 1 0 3 1 0 n n c dxx n c ck n YE n YE n n k n k kn Solve the integral. Introduction to Algorithms Day 17 L9.27? 2001 by Charles E. Leiserson Solving the recurrence Use substitution to show that E[Y n ] ≤ cn 3 for some positive constant c, which we can pick sufficiently large to handle the initial conditions. [] 3 4 0 3 1 0 3 1 0 4 4 4 4 ][ 4 cn n n c dxx n c ck n YE n YE n n k n k kn = ? ? ? ? ? ? = ≤ ≤ = ∫ ∑ ∑ ? = ? = .Algebra. Introduction to Algorithms Day 17 L9.28? 2001 by Charles E. Leiserson The grand finale 2 E[X n ] ≤ E[2 X n ] Putting it all together, we have Jensen’s inequality, since f(x) = 2 x is convex. Introduction to Algorithms Day 17 L9.29? 2001 by Charles E. Leiserson The grand finale 2 E[X n ] ≤ E[2 X n ] = E[Y n ] Putting it all together, we have Definition. Introduction to Algorithms Day 17 L9.30? 2001 by Charles E. Leiserson The grand finale 2 E[X n ] ≤ E[2 X n ] = E[Y n ] ≤ cn 3 . Putting it all together, we have What we just showed. Introduction to Algorithms Day 17 L9.31? 2001 by Charles E. Leiserson The grand finale 2 E[X n ] ≤ E[2 X n ] = E[Y n ] ≤ cn 3 . Putting it all together, we have Taking the lg of both sides yields E[X n ] ≤ 3lg n +O(1) . Introduction to Algorithms Day 17 L9.32? 2001 by Charles E. Leiserson Post mortem Q. Does the analysis have to be this hard? Q. Why bother with analyzing exponential height? Q. Why not just develop the recurrence on X n = 1 + max{X k–1 ,X n–k } directly? Introduction to Algorithms Day 17 L9.33? 2001 by Charles E. Leiserson Post mortem (continued) A. The inequality max{a, b} ≤ a + b . provides a poor upper bound, since the RHS approaches the LHS slowly as |a – b| increases. The bound max{2 a ,2 b } ≤ 2 a + 2 b allows the RHS to approach the LHS far more quickly as |a – b| increases. By using the convexity of f(x) = 2 x via Jensen’s inequality, we can manipulate the sum of exponentials, resulting in a tight analysis. Introduction to Algorithms Day 17 L9.34? 2001 by Charles E. Leiserson Thought exercises ? See what happens when you try to do the analysis on X n directly. ? Try to understand better why the proof uses an exponential. Will a quadratic do? ? See if you can find a simpler argument. (This argument is a little simpler than the one in the book—I hope it’s correct!)