Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 4 Prof. Charles E. Leiserson Day 6 Introduction to Algorithms L4.2 Quicksort ? Proposed by C.A.R. Hoare in 1962. ? Divide-and-conquer algorithm. ? Sorts “in place” (like insertion sort, but not like merge sort). ? Very practical (with tuning). Day 6 Introduction to Algorithms L4.3 Divide and conquer Quicksort an n-element array: 1. Divide: Partition the array into two subarrays around a pivot x such that elements in lower subarray ≤ x ≤ elements in upper subarray. 2. Conquer: Recursively sort the two subarrays. 3. Combine: Trivial. ≤ x ≤ x x x ≥ x ≥ x Key: Linear-time partitioning subroutine. Day 6 Introduction to Algorithms L4.4 x x Running time = O(n) for n elements. Running time = (n) for n elements. Partitioning subroutine PARTITION(A, p, q) ? A[p . . q] x ← A[p] ? pivot = A[p] i ← p for j ← p + 1 to q do if A[ j] ≤ x then i ← i + 1 exchange A[i] ? A[ j] exchange A[p] ? A[i] return i ≤ x ≤ x ≥ x ≥ x ? ? pi qj Invariant: Day 6 Introduction to Algorithms L4.5 Example of partitioning ij 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.6 Example of partitioning ij 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.7 Example of partitioning ij 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.8 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 ij 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.9 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 ij 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.10 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 ij 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.11 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 ij 6 6 5 5 3 3 10 10 8 8 13 13 2 2 11 11 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.12 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 ij 6 6 5 5 3 3 10 10 8 8 13 13 2 2 11 11 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 Day 6 Introduction to Algorithms L4.13 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 6 6 5 5 3 3 10 10 8 8 13 13 2 2 11 11 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 ij 6 6 5 5 3 3 2 2 8 8 13 13 10 10 11 11 Day 6 Introduction to Algorithms L4.14 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 6 6 5 5 3 3 10 10 8 8 13 13 2 2 11 11 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 ij 6 6 5 5 3 3 2 2 8 8 13 13 10 10 11 11 Day 6 Introduction to Algorithms L4.15 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 6 6 5 5 3 3 10 10 8 8 13 13 2 2 11 11 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 ij 6 6 5 5 3 3 2 2 8 8 13 13 10 10 11 11 Day 6 Introduction to Algorithms L4.16 Example of partitioning 6 6 10 10 13 13 5 5 8 8 3 3 2 2 11 11 6 6 5 5 3 3 10 10 8 8 13 13 2 2 11 11 6 6 5 5 13 13 10 10 8 8 3 3 2 2 11 11 6 6 5 5 3 3 2 2 8 8 13 13 10 10 11 11 i 2 2 5 5 3 3 6 6 8 8 13 13 10 10 11 11 Day 6 Introduction to Algorithms L4.17 Pseudocode for quicksort QUICKSORT(A, p, r) if p < r then q ← PARTITION(A, p, r) QUICKSORT(A, p, q–1) QUICKSORT(A, q+1, r) Initial call: QUICKSORT(A, 1, n) Day 6 Introduction to Algorithms L4.18 Analysis of quicksort ? Assume all input elements are distinct. ? In practice, there are better partitioning algorithms for when duplicate input elements may exist. ? Let T(n) = worst-case running time on an array of n elements. Day 6 Introduction to Algorithms L4.19 Worst-case of quicksort ? Input sorted or reverse sorted. ? Partition around min or max element. ? One side of partition always has no elements. )( )()1( )()1()1( )()1()0()( 2 n nnT nnT nnTTnT Θ= Θ+?= Θ+?+Θ= Θ+?+= (arithmetic series) Day 6 Introduction to Algorithms L4.20 Worst-case recursion tree T(n) = T(0) + T(n–1) + cn Day 6 Introduction to Algorithms L4.21 Worst-case recursion tree T(n) = T(0) + T(n–1) + cn T(n) Day 6 Introduction to Algorithms L4.22 cn T(0) T(n–1) Worst-case recursion tree T(n) = T(0) + T(n–1) + cn Day 6 Introduction to Algorithms L4.23 cn T(0) c(n–1) Worst-case recursion tree T(n) = T(0) + T(n–1) + cn T(0) T(n–2) Day 6 Introduction to Algorithms L4.24 cn T(0) c(n–1) Worst-case recursion tree T(n) = T(0) + T(n–1) + cn T(0) c(n–2) T(0) Θ(1) O Day 6 Introduction to Algorithms L4.25 cn T(0) c(n–1) Worst-case recursion tree T(n) = T(0) + T(n–1) + cn T(0) c(n–2) T(0) Θ(1) O () 2 1 nk n k Θ= ? ? ? ? ? ? ? ? Θ ∑ = Day 6 Introduction to Algorithms L4.26 cn Θ(1) c(n–1) Worst-case recursion tree T(n) = T(0) + T(n–1) + cn Θ(1) c(n–2) Θ(1) Θ(1) O () 2 1 nk n k Θ= ? ? ? ? ? ? ? ? Θ ∑ = T(n)= Θ(n) + Θ(n 2 ) = Θ(n 2 ) h = n Day 6 Introduction to Algorithms L4.27 Best-case analysis (For intuition only!) If we’re lucky, PARTITION splits the array evenly: T(n)= 2T(n/2) + Θ(n) = Θ(n lg n) (same as merge sort) What if the split is always 10 9 10 1 : ? ( ) ( ) )()( 10 9 10 1 nnTnTnT Θ++= What is the solution to this recurrence? Day 6 Introduction to Algorithms L4.28 Analysis of “almost-best” case )(nT Day 6 Introduction to Algorithms L4.29 Analysis of “almost-best” case cn ( )nT 10 1 ( )nT 10 9 Day 6 Introduction to Algorithms L4.30 Analysis of “almost-best” case cn cn 10 1 cn 10 9 ( )nT 100 1 ( )nT 100 9 ( )nT 100 9 ( )nT 100 81 Day 6 Introduction to Algorithms L4.31 Analysis of “almost-best” case cn cn 10 1 cn 10 9 cn 100 1 cn 100 9 cn 100 9 cn 100 81 Θ(1) Θ(1) … … log 10/9 n cn cn cn … O(n) leaves (n) leaves Day 6 Introduction to Algorithms L4.32 log 10 n Analysis of “almost-best” case cn cn 10 1 cn 10 9 cn 100 1 cn 100 9 cn 100 9 cn 100 81 Θ(1) Θ(1) … … log 10/9 n cn cn cn T(n) ≤ cn log 10/9 n + Ο(n) … cn log 10 n ≤ O(n) leaves (n) leaves Θ(n lg n) Lucky! Day 6 Introduction to Algorithms L4.33 More intuition Suppose we alternate lucky, unlucky, lucky, unlucky, lucky, …. L(n)= 2U(n/2) + Θ(n) lucky U(n)= L(n –1) + Θ(n) unlucky Solving: L(n)= 2(L(n/2 – 1) + Θ(n/2)) + Θ(n) = 2L(n/2 – 1) + Θ(n) = Θ(n lg n) How can we make sure we are usually lucky? Lucky! Day 6 Introduction to Algorithms L4.34 Randomized quicksort IDEA: Partition around a random element. ? Running time is independent of the input order. ? No assumptions need to be made about the input distribution. ? No specific input elicits the worst-case behavior. ? The worst case is determined only by the output of a random-number generator. Day 6 Introduction to Algorithms L4.35 Randomized quicksort analysis Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent. For k = 0, 1, …, n–1, define the indicator random variable X k = 1 if PARTITION generates a k : n–k–1 split, 0 otherwise. E[X k ] = Pr{X k = 1} = 1/n, since all splits are equally likely, assuming elements are distinct. Day 6 Introduction to Algorithms L4.36 Analysis (continued) T(n) = T(0) + T(n–1) + Θ(n) if 0:n–1 split, T(1) + T(n–2) + Θ(n) if 1:n–2 split, M T(n–1) + T(0) + Θ(n) if n–1 : 0 split, () ∑ ? = Θ+??+= 1 0 )()1()( n k k nknTkTX . Day 6 Introduction to Algorithms L4.37 Calculating expectation () ? ? ? ? ? ? Θ+??+= ∑ ? = 1 0 )()1()()]([ n k k nknTkTXEnTE Take expectations of both sides. Day 6 Introduction to Algorithms L4.38 Calculating expectation () [] ∑ ∑ ? = ? = Θ+??+= ? ? ? ? ? ? Θ+??+= 1 0 1 0 )()1()( )()1()()]([ n k k n k k nknTkTXE nknTkTXEnTE Linearity of expectation. Day 6 Introduction to Algorithms L4.39 Calculating expectation () [] [][ ] ∑ ∑ ∑ ? = ? = ? = Θ+??+?= Θ+??+= ? ? ? ? ? ? Θ+??+= 1 0 1 0 1 0 )()1()( )()1()( )()1()()]([ n k k n k k n k k nknTkTEXE nknTkTXE nknTkTXEnTE Independence of X k from other random choices. Day 6 Introduction to Algorithms L4.40 Calculating expectation () [] [][ ] [] [] ∑∑∑ ∑ ∑ ∑ ? = ? = ? = ? = ? = ? = Θ+??+= Θ+??+?= Θ+??+= ? ? ? ? ? ? Θ+??+= 1 0 1 0 1 0 1 0 1 0 1 0 )( 1 )1( 1 )( 1 )()1()( )()1()( )()1()()]([ n k n k n k n k k n k k n k k n n knTE n kTE n nknTkTEXE nknTkTXE nknTkTXEnTE Linearity of expectation; E[X k ] = 1/n . Day 6 Introduction to Algorithms L4.41 Calculating expectation () [] [][ ] [] [] [] )()( 2 )( 1 )1( 1 )( 1 )()1()( )()1()( )()1()()]([ 1 1 1 0 1 0 1 0 1 0 1 0 1 0 nkTE n n n knTE n kTE n nknTkTEXE nknTkTXE nknTkTXEnTE n k n k n k n k n k k n k k n k k Θ+= Θ+??+= Θ+??+?= Θ+??+= ? ? ? ? ? ? Θ+??+= ∑ ∑∑∑ ∑ ∑ ∑ ? = ? = ? = ? = ? = ? = ? = Summations have identical terms. Day 6 Introduction to Algorithms L4.42 Hairy recurrence [] )()( 2 )]([ 1 2 nkTE n nTE n k Θ+= ∑ ? = (The k = 0, 1 terms can be absorbed in the Θ(n).) Prove: E[T(n)] ≤ anlg n for constant a > 0 . Use fact: 2 1 2 8 1 2 2 1 lglg nnnkk n k ∑ ? = ?≤ (exercise). ? Choose a large enough so that anlg n dominates E[T(n)] for sufficiently small n ≥ 2. Day 6 Introduction to Algorithms L4.43 Substitution method [] )(lg 2 )( 1 2 nkak n nTE n k Θ+≤ ∑ ? = Substitute inductive hypothesis. Day 6 Introduction to Algorithms L4.44 Substitution method [] )( 8 1 lg 2 12 )(lg 2 )( 22 1 2 nnnn n a nkak n nTE n k Θ+ ? ? ? ? ? ? ?≤ Θ+≤ ∑ ? = Use fact. Day 6 Introduction to Algorithms L4.45 Substitution method [] ? ? ? ? ? ? Θ??= Θ+ ? ? ? ? ? ? ?≤ Θ+≤ ∑ ? = )( 4 lg )( 8 1 lg 2 12 )(lg 2 )( 22 1 2 n an nan nnnn n a nkak n nTE n k Express as desired – residual. Day 6 Introduction to Algorithms L4.46 Substitution method [] nan n an nan nnnn n a nkak n nTE n k lg )( 4 lg )( 8 1 lg 2 12 )(lg 2 )( 22 1 2 ≤ ? ? ? ? ? ? Θ??= Θ+ ? ? ? ? ? ? ?= Θ+≤ ∑ ? = if a is chosen large enough so that an/4 dominates the Θ(n). , Day 6 Introduction to Algorithms L4.47 Quicksort in practice ? Quicksort is a great general-purpose sorting algorithm. ? Quicksort is typically over twice as fast as merge sort. ? Quicksort can benefit substantially from code tuning. ? Quicksort behaves well even with caching and virtual memory.