Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 9
Prof. Charles E. Leiserson
Introduction to Algorithms Day 17 L9.2? 2001 by Charles E. Leiserson
3
3
Binary-search-tree sort
T ←? ? Create an empty BST
for i = 1 to n
do TREE-INSERT(T, A[i])
Perform an inorder tree walk of T.
Example:
A = [3 1 8 2 6 7 5]
8
8
1
1
2
2
6
6
5
5
7
7
Tree-walk time = O(n),
but how long does it
take to build the BST?
Introduction to Algorithms Day 17 L9.3? 2001 by Charles E. Leiserson
Analysis of BST sort
BST sort performs the same comparisons as
quicksort, but in a different order!
3 1 8 2 6 7 5
1 2 8 6 7 5
26 7 5
75
The expected time to build the tree is asymptot-
ically the same as the running time of quicksort.
Introduction to Algorithms Day 17 L9.4? 2001 by Charles E. Leiserson
Node depth
The depth of a node = the number of comparisons
made during TREE-INSERT. Assuming all input
permutations are equally likely, we have
()
)(lg
)lg(
1
nodeinsert toscomparison#
1
1
nO
nnO
n
iE
n
n
i
=
=
?
?
?
?
?
?
=
∑
=
Average node depth
.
(quicksort analysis)
Introduction to Algorithms Day 17 L9.5? 2001 by Charles E. Leiserson
Expected tree height
But, average node depth of a randomly built
BST = O(lg n) does not necessarily mean that its
expected height is also O(lg n) (although it is).
Example.
≤ lg n
nh =
)(lg
2
lg
1
nO
nn
nn
n
=
?
?
?
?
?
?
?
+?≤Ave. depth
Introduction to Algorithms Day 17 L9.6? 2001 by Charles E. Leiserson
Height of a randomly built
binary search tree
? Prove Jensen’s inequality, which says that
f(E[X]) ≤ E[f(X)] for any convex function f and
random variable X.
? Analyze the exponential height of a randomly
built BST on n nodes, which is the random
variable Y
n
= 2
X
n
, where X
n
is the random
variable denoting the height of the BST.
? Prove that 2
E[X
n
]
≤ E[2
X
n
] = E[Y
n
] = O(n
3
),
and hence that E[X
n
] = O(lg n).
Outline of the analysis:
Introduction to Algorithms Day 17 L9.7? 2001 by Charles E. Leiserson
Convex functions
A function f : R → R is convex if for all
α,β≥0 such that α + β = 1, we have
f(αx + βy) ≤αf(x) + βf(y)
for all x,y ∈ R.
αx + βy
αf(x) + βf(y)
f(αx + βy)
xy
f(x)
f(y)
f
Introduction to Algorithms Day 17 L9.8? 2001 by Charles E. Leiserson
Convexity lemma
Lemma. Let f : R → R be a convex function,
and let {α
1
, α
2
, …, α
n
} be a set of nonnegative
constants such that ∑
k
α
k
= 1. Then, for any set
{x
1
, x
2
, …, x
n
} of real numbers, we have
)(
11
∑∑
==
≤
?
?
?
?
?
?
?
?
n
k
kk
n
k
kk
xfxf αα
Proof. By induction on n. For n = 1, we have
α
1
= 1, and hence f(α
1
x
1
) ≤α
1
f(x
1
) trivially.
.
Introduction to Algorithms Day 17 L9.9? 2001 by Charles E. Leiserson
Proof (continued)
?
?
?
?
?
?
?
?
?
?+=
?
?
?
?
?
?
?
?
∑∑
?
==
1
11
1
)1(
n
k
k
n
k
nnn
n
k
kk
xxfxf
α
α
ααα
Inductive step:
Algebra.
Introduction to Algorithms Day 17 L9.10? 2001 by Charles E. Leiserson
Proof (continued)
?
?
?
?
?
?
?
?
?
?+≤
?
?
?
?
?
?
?
?
?
?+=
?
?
?
?
?
?
?
?
∑
∑∑
?
=
?
==
1
1
1
11
1
)1()(
1
)1(
n
k
k
n
k
nnn
n
k
k
n
k
nnn
n
k
kk
xfxf
xxfxf
α
α
αα
α
α
ααα
Inductive step:
Convexity.
Introduction to Algorithms Day 17 L9.11? 2001 by Charles E. Leiserson
Proof (continued)
∑
∑
∑∑
?
=
?
=
?
==
?
?+≤
?
?
?
?
?
?
?
?
?
?+≤
?
?
?
?
?
?
?
?
?
?+=
?
?
?
?
?
?
?
?
1
1
1
1
1
11
)(
1
)1()(
1
)1()(
1
)1(
n
k
k
n
k
nnn
n
k
k
n
k
nnn
n
k
k
n
k
nnn
n
k
kk
xfxf
xfxf
xxfxf
α
α
αα
α
α
αα
α
α
ααα
Inductive step:
Induction.
Introduction to Algorithms Day 17 L9.12? 2001 by Charles E. Leiserson
Proof (continued)
)(
)(
1
)1()(
1
)1()(
1
)1(
1
1
1
1
1
1
11
∑
∑
∑
∑∑
=
?
=
?
=
?
==
=
?
?+≤
?
?
?
?
?
?
?
?
?
?+≤
?
?
?
?
?
?
?
?
?
?+=
?
?
?
?
?
?
?
?
n
k
kk
n
k
k
n
k
nnn
n
k
k
n
k
nnn
n
k
k
n
k
nnn
n
k
kk
xf
xfxf
xfxf
xxfxf
α
α
α
αα
α
α
αα
α
α
ααα
Inductive step:
Algebra.
.
Introduction to Algorithms Day 17 L9.13? 2001 by Charles E. Leiserson
Jensen’s inequality
Lemma. Let f be a convex function, and let X
be a random variable. Then, f (E[X]) ≤ E[ f (X)].
?
?
?
?
?
?
?
?
=?=
∑
∞
?∞=k
kXkfXEf }Pr{])[(
Proof.
Definition of expectation.
Introduction to Algorithms Day 17 L9.14? 2001 by Charles E. Leiserson
Jensen’s inequality
∑
∑
∞
?∞=
∞
?∞=
=?≤
?
?
?
?
?
?
?
?
=?=
k
k
kXkf
kXkfXEf
}Pr{)(
}Pr{])[(
Proof.
Convexity lemma (generalized).
Lemma. Let f be a convex function, and let X
be a random variable. Then, f (E[X]) ≤ E[ f (X)].
Introduction to Algorithms Day 17 L9.15? 2001 by Charles E. Leiserson
Jensen’s inequality
)]([
}Pr{)(
}Pr{])[(
XfE
kXkf
kXkfXEf
k
k
=
=?≤
?
?
?
?
?
?
?
?
=?=
∑
∑
∞
?∞=
∞
?∞=
.
Proof.
Tricky step, but true—think about it.
Lemma. Let f be a convex function, and let X
be a random variable. Then, f (E[X]) ≤ E[ f (X)].
Introduction to Algorithms Day 17 L9.16? 2001 by Charles E. Leiserson
Analysis of BST height
Let X
n
be the random variable denoting
the height of a randomly built binary
search tree on n nodes, and let Y
n
= 2
X
n
be its exponential height.
If the root of the tree has rank k, then
X
n
= 1 + max{X
k–1
,X
n–k
} ,
since each of the left and right subtrees
of the root are randomly built. Hence,
we have
Y
n
= 2· max{Y
k–1
,Y
n–k
} .
Introduction to Algorithms Day 17 L9.17? 2001 by Charles E. Leiserson
Analysis (continued)
Define the indicator random variable Z
nk
as
Z
nk
=
1 if the root has rank k,
0 otherwise.
Thus, Pr{Z
nk
= 1} = E[Z
nk
] = 1/n, and
()
∑
=
??
?=
n
k
knknkn
YYZY
1
1
},max{2
.
Introduction to Algorithms Day 17 L9.18? 2001 by Charles E. Leiserson
Exponential height recurrence
[] ()
?
?
?
?
?
?
?=
∑
=
??
n
k
knknkn
YYZEYE
1
1
},max{2
Take expectation of both sides.
Introduction to Algorithms Day 17 L9.19? 2001 by Charles E. Leiserson
Exponential height recurrence
[] ()
()[]
∑
∑
=
??
=
??
?=
?
?
?
?
?
?
?=
n
k
knknk
n
k
knknkn
YYZE
YYZEYE
1
1
1
1
},max{2
},max{2
Linearity of expectation.
Introduction to Algorithms Day 17 L9.20? 2001 by Charles E. Leiserson
Exponential height recurrence
[] ()
()[]
∑
∑
∑
=
??
=
??
=
??
?=
?=
?
?
?
?
?
?
?=
n
k
knknk
n
k
knknk
n
k
knknkn
YYEZE
YYZE
YYZEYE
1
1
1
1
1
1
}],[max{][2
},max{2
},max{2
Independence of the rank of the root
from the ranks of subtree roots.
Introduction to Algorithms Day 17 L9.21? 2001 by Charles E. Leiserson
Exponential height recurrence
[] ()
()[]
∑
∑
∑
∑
=
??
=
??
=
??
=
??
+≤
?=
?=
?
?
?
?
?
?
?=
n
k
knk
n
k
knknk
n
k
knknk
n
k
knknkn
YYE
n
YYEZE
YYZE
YYZEYE
1
1
1
1
1
1
1
1
][
2
}],[max{][2
},max{2
},max{2
The max of two nonnegative numbers
is at most their sum, and E[Z
nk
] = 1/n.
Introduction to Algorithms Day 17 L9.22? 2001 by Charles E. Leiserson
Exponential height recurrence
[] ()
()[]
∑
∑
∑
∑
∑
?
=
=
??
=
??
=
??
=
??
=
+≤
?=
?=
?
?
?
?
?
?
?=
1
0
1
1
1
1
1
1
1
1
][
4
][
2
}],[max{][2
},max{2
},max{2
n
k
k
n
k
knk
n
k
knknk
n
k
knknk
n
k
knknkn
YE
n
YYE
n
YYEZE
YYZE
YYZEYE
Each term appears
twice, and reindex.
Introduction to Algorithms Day 17 L9.23? 2001 by Charles E. Leiserson
Solving the recurrence
Use substitution to
show that E[Y
n
] ≤ cn
3
for some positive
constant c, which we
can pick sufficiently
large to handle the
initial conditions.
[]
∑
?
=
=
1
0
][
4
n
k
kn
YE
n
YE
Introduction to Algorithms Day 17 L9.24? 2001 by Charles E. Leiserson
Solving the recurrence
Use substitution to
show that E[Y
n
] ≤ cn
3
for some positive
constant c, which we
can pick sufficiently
large to handle the
initial conditions.
[]
∑
∑
?
=
?
=
≤
=
1
0
3
1
0
4
][
4
n
k
n
k
kn
ck
n
YE
n
YE
Substitution.
Introduction to Algorithms Day 17 L9.25? 2001 by Charles E. Leiserson
Solving the recurrence
Use substitution to
show that E[Y
n
] ≤ cn
3
for some positive
constant c, which we
can pick sufficiently
large to handle the
initial conditions.
[]
∫
∑
∑
≤
≤
=
?
=
?
=
n
n
k
n
k
kn
dxx
n
c
ck
n
YE
n
YE
0
3
1
0
3
1
0
4
4
][
4
Integral method.
Introduction to Algorithms Day 17 L9.26? 2001 by Charles E. Leiserson
Solving the recurrence
Use substitution to
show that E[Y
n
] ≤ cn
3
for some positive
constant c, which we
can pick sufficiently
large to handle the
initial conditions.
[]
?
?
?
?
?
?
=
≤
≤
=
∫
∑
∑
?
=
?
=
4
4
4
4
][
4
4
0
3
1
0
3
1
0
n
n
c
dxx
n
c
ck
n
YE
n
YE
n
n
k
n
k
kn
Solve the integral.
Introduction to Algorithms Day 17 L9.27? 2001 by Charles E. Leiserson
Solving the recurrence
Use substitution to
show that E[Y
n
] ≤ cn
3
for some positive
constant c, which we
can pick sufficiently
large to handle the
initial conditions.
[]
3
4
0
3
1
0
3
1
0
4
4
4
4
][
4
cn
n
n
c
dxx
n
c
ck
n
YE
n
YE
n
n
k
n
k
kn
=
?
?
?
?
?
?
=
≤
≤
=
∫
∑
∑
?
=
?
=
.Algebra.
Introduction to Algorithms Day 17 L9.28? 2001 by Charles E. Leiserson
The grand finale
2
E[X
n
]
≤ E[2
X
n
]
Putting it all together, we have
Jensen’s inequality, since
f(x) = 2
x
is convex.
Introduction to Algorithms Day 17 L9.29? 2001 by Charles E. Leiserson
The grand finale
2
E[X
n
]
≤ E[2
X
n
]
= E[Y
n
]
Putting it all together, we have
Definition.
Introduction to Algorithms Day 17 L9.30? 2001 by Charles E. Leiserson
The grand finale
2
E[X
n
]
≤ E[2
X
n
]
= E[Y
n
]
≤ cn
3
.
Putting it all together, we have
What we just showed.
Introduction to Algorithms Day 17 L9.31? 2001 by Charles E. Leiserson
The grand finale
2
E[X
n
]
≤ E[2
X
n
]
= E[Y
n
]
≤ cn
3
.
Putting it all together, we have
Taking the lg of both sides yields
E[X
n
] ≤ 3lg n +O(1) .
Introduction to Algorithms Day 17 L9.32? 2001 by Charles E. Leiserson
Post mortem
Q. Does the analysis have to be this hard?
Q. Why bother with analyzing exponential
height?
Q. Why not just develop the recurrence on
X
n
= 1 + max{X
k–1
,X
n–k
}
directly?
Introduction to Algorithms Day 17 L9.33? 2001 by Charles E. Leiserson
Post mortem (continued)
A. The inequality
max{a, b} ≤ a + b .
provides a poor upper bound, since the RHS
approaches the LHS slowly as |a – b| increases.
The bound
max{2
a
,2
b
} ≤ 2
a
+ 2
b
allows the RHS to approach the LHS far more
quickly as |a – b| increases. By using the
convexity of f(x) = 2
x
via Jensen’s inequality,
we can manipulate the sum of exponentials,
resulting in a tight analysis.
Introduction to Algorithms Day 17 L9.34? 2001 by Charles E. Leiserson
Thought exercises
? See what happens when you try to do the
analysis on X
n
directly.
? Try to understand better why the proof
uses an exponential. Will a quadratic do?
? See if you can find a simpler argument.
(This argument is a little simpler than the
one in the book—I hope it’s correct!)