Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 5
Prof. Erik Demaine
Introduction to Algorithms Day 8      L5.2? 2001 by Charles E. Leiserson
How fast can we sort?
All the sorting algorithms we have seen so far 
are comparison sorts: only use comparisons to 
determine the relative order of elements.
? E.g., insertion sort, merge sort, quicksort, 
heapsort.
The best worst-case running time that we’ve 
seen for comparison sorting is O(n lg n) .
Is O(n lg n) the best we can do?
Decision trees can help us answer this question. 
Introduction to Algorithms Day 8      L5.3? 2001 by Charles E. Leiserson
Decision-tree example
1:2
1:2
2:3
2:3
123
123
1:3
1:3
132
132
312
312
1:3
1:3
213
213
2:3
2:3
231
231
321
321
Each internal node is labeled i:j for i, j ∈ {1, 2,…, n}.
?The left subtree shows subsequent comparisons if a
i
≤ a
j
.
?The right subtree shows subsequent comparisons if a
i
≥ a
j
.
Sort ?a
1
, a
2
, …, a
n
?
Introduction to Algorithms Day 8      L5.4? 2001 by Charles E. Leiserson
Decision-tree example
1:2
1:2
2:3
2:3
123
123
1:3
1:3
132
132
312
312
1:3
1:3
213
213
2:3
2:3
231
231
321
321
Each internal node is labeled i:j for i, j ∈ {1, 2,…, n}.
?The left subtree shows subsequent comparisons if a
i
≤ a
j
.
?The right subtree shows subsequent comparisons if a
i
≥ a
j
.
9 ≥ 4
Sort ?a
1
, a
2
, a
3
?
= ? 9, 4, 6 ?:
Introduction to Algorithms Day 8      L5.5? 2001 by Charles E. Leiserson
Decision-tree example
1:2
1:2
2:3
2:3
123
123
1:3
1:3
132
132
312
312
1:3
1:3
213
213
2:3
2:3
231
231
321
321
Each internal node is labeled i:j for i, j ∈ {1, 2,…, n}.
?The left subtree shows subsequent comparisons if a
i
≤ a
j
.
?The right subtree shows subsequent comparisons if a
i
≥ a
j
.
9 ≥ 6
Sort ?a
1
, a
2
, a
3
?
= ? 9, 4, 6 ?:
Introduction to Algorithms Day 8      L5.6? 2001 by Charles E. Leiserson
Decision-tree example
1:2
1:2
2:3
2:3
123
123
1:3
1:3
132
132
312
312
1:3
1:3
213
213
2:3
2:3
231
231
321
321
Each internal node is labeled i:j for i, j ∈ {1, 2,…, n}.
?The left subtree shows subsequent comparisons if a
i
≤ a
j
.
?The right subtree shows subsequent comparisons if a
i
≥ a
j
.
4 ≤ 6
Sort ?a
1
, a
2
, a
3
?
= ? 9, 4, 6 ?:
Introduction to Algorithms Day 8      L5.7? 2001 by Charles E. Leiserson
Decision-tree example
1:2
1:2
2:3
2:3
123
123
1:3
1:3
132
132
312
312
1:3
1:3
213
213
2:3
2:3
231
231
321
321
Each leaf contains a permutation ?π(1), π(2),…, π(n)? to 
indicate that the ordering a
π(1)
≤ a
π(2)
≤L≤ a
π(n)
has been 
established.
4 ≤ 6 ≤ 9
Sort ?a
1
, a
2
, a
3
?
= ? 9, 4, 6 ?:
Introduction to Algorithms Day 8      L5.8? 2001 by Charles E. Leiserson
Decision-tree model
A decision tree can model the execution of 
any comparison sort:
? One tree for each input size n. 
? View the algorithm as splitting whenever 
it compares two elements.
? The tree contains the comparisons along 
all possible instruction traces.
? The running time of the algorithm = the 
length of the path taken.
? Worst-case running time = height of tree.
Introduction to Algorithms Day 8      L5.9? 2001 by Charles E. Leiserson
Lower bound for decision-
tree sorting
Theorem. Any decision tree that can sort n 
elements must have height ?(n lg n) .
Proof. The tree must contain ≥ n! leaves, since 
there are n! possible permutations.  A height-h
binary tree has ≤ 2
h
leaves.  Thus, n! ≤ 2
h
.
∴ h ≥ lg(n!) (lg is mono. increasing)
≥ lg ((n/e)
n
) (Stirling’s formula)
= n lg n – n lg e
= ?(n lg n) .
Introduction to Algorithms Day 8      L5.10? 2001 by Charles E. Leiserson
Lower bound for comparison 
sorting
Corollary. Heapsort and merge sort are 
asymptotically optimal comparison sorting 
algorithms.
Introduction to Algorithms Day 8      L5.11? 2001 by Charles E. Leiserson
Sorting in linear time
Counting sort: No comparisons between elements.
? Input: A[1 . . n], where A[ j]∈{1, 2, …, k} .
? Output: B[1 . . n], sorted.
? Auxiliary storage: C[1 . . k] .
Introduction to Algorithms Day 8      L5.12? 2001 by Charles E. Leiserson
Counting sort
for i ← 1 to k
do C[i] ← 0
for j ← 1 to n
do C[A[ j]] ← C[A[ j]] + 1 ? C[i] = |{key = i}|
for i ← 2 to k
do C[i] ← C[i] + C[i–1] ? C[i] = |{key ≤ i}|
for j ← n downto 1
do B[C[A[ j]]] ← A[ j]
C[A[ j]] ← C[A[ j]] – 1
Introduction to Algorithms Day 8      L5.13? 2001 by Charles E. Leiserson
Counting-sort example
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1234
Introduction to Algorithms Day 8      L5.14? 2001 by Charles E. Leiserson
Loop 1
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
0
0
0
0
0
0
0
0
1234
for i ← 1 to k
do C[i] ← 0
Introduction to Algorithms Day 8      L5.15? 2001 by Charles E. Leiserson
Loop 2
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
0
0
0
0
0
0
1
1
1234
for j ← 1 to n
do C[A[ j]] ← C[A[ j]] + 1 ? C[i] = |{key = i}|
Introduction to Algorithms Day 8      L5.16? 2001 by Charles E. Leiserson
Loop 2
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1
1
0
0
0
0
1
1
1234
for j ← 1 to n
do C[A[ j]] ← C[A[ j]] + 1 ? C[i] = |{key = i}|
Introduction to Algorithms Day 8      L5.17? 2001 by Charles E. Leiserson
Loop 2
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1
1
0
0
1
1
1
1
1234
for j ← 1 to n
do C[A[ j]] ← C[A[ j]] + 1 ? C[i] = |{key = i}|
Introduction to Algorithms Day 8      L5.18? 2001 by Charles E. Leiserson
Loop 2
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1
1
0
0
1
1
2
2
1234
for j ← 1 to n
do C[A[ j]] ← C[A[ j]] + 1 ? C[i] = |{key = i}|
Introduction to Algorithms Day 8      L5.19? 2001 by Charles E. Leiserson
Loop 2
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1
1
0
0
2
2
2
2
1234
for j ← 1 to n
do C[A[ j]] ← C[A[ j]] + 1 ? C[i] = |{key = i}|
Introduction to Algorithms Day 8      L5.20? 2001 by Charles E. Leiserson
Loop 3
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1
1
0
0
2
2
2
2
1234
C':
1
1
1
1
2
2
2
2
for i ← 2 to k
do C[i] ← C[i] + C[i–1] ? C[i] = |{key ≤ i}|
Introduction to Algorithms Day 8      L5.21? 2001 by Charles E. Leiserson
Loop 3
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1
1
0
0
2
2
2
2
1234
C':
1
1
1
1
3
3
2
2
for i ← 2 to k
do C[i] ← C[i] + C[i–1] ? C[i] = |{key ≤ i}|
Introduction to Algorithms Day 8      L5.22? 2001 by Charles E. Leiserson
Loop 3
A:
4
4
1
1
3
3
4
4
3
3
B:
12345
C:
1
1
0
0
2
2
2
2
1234
C':
1
1
1
1
3
3
5
5
for i ← 2 to k
do C[i] ← C[i] + C[i–1] ? C[i] = |{key ≤ i}|
Introduction to Algorithms Day 8      L5.23? 2001 by Charles E. Leiserson
Loop 4
A:
4
4
1
1
3
3
4
4
3
3
B:
3
3
12345
C:
1
1
1
1
3
3
5
5
1234
C':
1
1
1
1
2
2
5
5
for j ← n downto 1
do B[C[A[ j]]] ← A[ j]
C[A[ j]] ← C[A[ j]] – 1
Introduction to Algorithms Day 8      L5.24? 2001 by Charles E. Leiserson
Loop 4
A:
4
4
1
1
3
3
4
4
3
3
B:
3
3
4
4
12345
C:
1
1
1
1
2
2
5
5
1234
C':
1
1
1
1
2
2
4
4
for j ← n downto 1
do B[C[A[ j]]] ← A[ j]
C[A[ j]] ← C[A[ j]] – 1
Introduction to Algorithms Day 8      L5.25? 2001 by Charles E. Leiserson
Loop 4
A:
4
4
1
1
3
3
4
4
3
3
B:
3
3
3
3
4
4
12345
C:
1
1
1
1
2
2
4
4
1234
C':
1
1
1
1
1
1
4
4
for j ← n downto 1
do B[C[A[ j]]] ← A[ j]
C[A[ j]] ← C[A[ j]] – 1
Introduction to Algorithms Day 8      L5.26? 2001 by Charles E. Leiserson
Loop 4
A:
4
4
1
1
3
3
4
4
3
3
B:
1
1
3
3
3
3
4
4
12345
C:
1
1
1
1
1
1
4
4
1234
C':
0
0
1
1
1
1
4
4
for j ← n downto 1
do B[C[A[ j]]] ← A[ j]
C[A[ j]] ← C[A[ j]] – 1
Introduction to Algorithms Day 8      L5.27? 2001 by Charles E. Leiserson
Loop 4
A:
4
4
1
1
3
3
4
4
3
3
B:
1
1
3
3
3
3
4
4
4
4
12345
C:
0
0
1
1
1
1
4
4
1234
C':
0
0
1
1
1
1
3
3
for j ← n downto 1
do B[C[A[ j]]] ← A[ j]
C[A[ j]] ← C[A[ j]] – 1
Introduction to Algorithms Day 8      L5.28? 2001 by Charles E. Leiserson
Analysis
for i ← 1 to k
do C[i] ← 0
Θ(n)
Θ(k)
Θ(n)
Θ(k)
for j ← 1 to n
do C[A[ j]] ← C[A[ j]] + 1
for i ← 2 to k
do C[i] ← C[i] + C[i–1]
for j ← n downto 1
do B[C[A[ j]]] ← A[ j]
C[A[ j]] ← C[A[ j]] – 1
Θ(n + k)
Introduction to Algorithms Day 8      L5.29? 2001 by Charles E. Leiserson
Running time
If k = O(n), then counting sort takes Θ(n) time.
? But, sorting takes ?(n lg n) time!
? Where’s the fallacy?
Answer:
? Comparison sorting takes ?(n lg n) time.
? Counting sort is not a comparison sort.
? In fact, not a single comparison between 
elements occurs!
Introduction to Algorithms Day 8      L5.30? 2001 by Charles E. Leiserson
Stable sorting
Counting sort is a stable sort: it preserves 
the input order among equal elements.
A:
4
4
1
1
3
3
4
4
3
3
B:
1
1
3
3
3
3
4
4
4
4
Exercise: What other sorts have this property?
Introduction to Algorithms Day 8      L5.31? 2001 by Charles E. Leiserson
Radix sort
? Origin: Herman Hollerith’s card-sorting 
machine for the 1890 U.S. Census.  (See 
Appendix     .)
? Digit-by-digit sort.
? Hollerith’s original (bad) idea: sort on 
most-significant digit first.
? Good idea: Sort on least-significant digit 
first with auxiliary stable sort.
Introduction to Algorithms Day 8      L5.32? 2001 by Charles E. Leiserson
Operation of radix sort
3 2 9
4 5 7
6 5 7
8 3 9
4 3 6
7 2 0
3 5 5
7 2 0
3 5 5
4 3 6
4 5 7
6 5 7
3 2 9
8 3 9
7 2 0
3 2 9
4 3 6
8 3 9
3 5 5
4 5 7
6 5 7
3 2 9
3 5 5
4 3 6
4 5 7
6 5 7
7 2 0
8 3 9
Introduction to Algorithms Day 8      L5.33? 2001 by Charles E. Leiserson
? Sort on digit t
Correctness of radix sort
Induction on digit position 
? Assume that the numbers 
are sorted by their low-order 
t –1digits.
7 2 0
3 2 9
4 3 6
8 3 9
3 5 5
4 5 7
6 5 7
3 2 9
3 5 5
4 3 6
4 5 7
6 5 7
7 2 0
8 3 9
Introduction to Algorithms Day 8      L5.34? 2001 by Charles E. Leiserson
? Sort on digit t
Correctness of radix sort
Induction on digit position 
? Assume that the numbers 
are sorted by their low-order 
t –1digits.
7 2 0
3 2 9
4 3 6
8 3 9
3 5 5
4 5 7
6 5 7
3 2 9
3 5 5
4 3 6
4 5 7
6 5 7
7 2 0
8 3 9
 ? Two numbers that differ in 
digit t are correctly sorted.
Introduction to Algorithms Day 8      L5.35? 2001 by Charles E. Leiserson
? Sort on digit t
Correctness of radix sort
Induction on digit position 
? Assume that the numbers 
are sorted by their low-order 
t –1digits.
7 2 0
3 2 9
4 3 6
8 3 9
3 5 5
4 5 7
6 5 7
3 2 9
3 5 5
4 3 6
4 5 7
6 5 7
7 2 0
8 3 9
 ? Two numbers that differ in 
digit t are correctly sorted.
 ? Two numbers equal in digit t
are put in the same order as 
the input ? correct order.
Introduction to Algorithms Day 8      L5.36? 2001 by Charles E. Leiserson
Analysis of radix sort
? Assume counting sort is the auxiliary stable sort.
? Sort n computer words of b bits each.
? Each word can be viewed as having b/r base-2
r
digits.
Example: 32-bit word
8888
r = 8 ? b/r =4passes of counting sort on 
base-2
8 
digits; or r = 16 ? b/r =2passes of 
counting sort on base-2
16 
digits.
How many passes should we make?
Introduction to Algorithms Day 8      L5.37? 2001 by Charles E. Leiserson
Analysis (continued)
Recall: Counting sort takes Θ(n + k) time to 
sort n numbers in the range from 0 to k –1.
If each b-bit word is broken into r-bit pieces, 
each pass of counting sort takes Θ(n + 2
r
) time.  
Since there are b/r passes, we have
( )
?
?
?
?
?
?
+Θ=
r
n
r
b
bnT 2),(
.
Choose r to minimize T(n, b):
? Increasing r means fewer passes, but as 
r >  lg n, the time grows exponentially.>
Introduction to Algorithms Day 8      L5.38? 2001 by Charles E. Leiserson
Choosing r
( )
?
?
?
?
?
?
+Θ=
r
n
r
b
bnT 2),(
Minimize T(n, b) by differentiating and setting to 0.
Or, just observe that we don’t want 2
r
>  n, and 
there’s no harm asymptotically in choosing r as 
large as possible subject to this constraint.
>
Choosing r = lg n implies T(n, b) = Θ(bn/lg n) .
? For numbers in the range from 0 to n
d
–1, we 
have b = d lg n ? radix sort runs in Θ(dn) time.
Introduction to Algorithms Day 8      L5.39? 2001 by Charles E. Leiserson
Conclusions
Example (32-bit numbers):
? At most 3 passes when sorting ≥ 2000 numbers.
? Merge sort and quicksort do at least ?lg 2000? = 
11 passes.
In practice, radix sort is fast for large inputs, as 
well as simple to code and maintain.
Downside: Unlike quicksort, radix sort displays 
little locality of reference, and thus a well-tuned 
quicksort fares better on modern processors, 
which feature steep memory hierarchies.
Introduction to Algorithms Day 8      L5.40? 2001 by Charles E. Leiserson
Appendix: Punched-card 
technology
? Herman Hollerith (1860-1929)
? Punched cards
? Hollerith’s tabulating system
? Operation of the sorter
? Origin of radix sort
? “Modern” IBM card
? Web resources on punched-
card technology
Introduction to Algorithms Day 8      L5.41? 2001 by Charles E. Leiserson
Herman Hollerith
(1860-1929)
? The 1880 U.S. Census took almost
10 years to process.
? While a lecturer at MIT, Hollerith 
prototyped punched-card technology.
? His machines, including a “card sorter,” allowed 
the 1890 census total to be reported in 6 weeks.
? He founded the Tabulating Machine Company in 
1911, which merged with other companies in 1924 
to form International Business Machines.
Introduction to Algorithms Day 8      L5.42? 2001 by Charles E. Leiserson
Punched cards
? Punched card = data record.
? Hole = value. 
? Algorithm = machine + human operator.
Replica of punch 
card from the 
1900 U.S. census:  
[Howells 2000]
Introduction to Algorithms Day 8      L5.43? 2001 by Charles E. Leiserson
Hollerith’s 
tabulating 
system
?Pantograph card 
punch
?Hand-press reader
?Dial counters
?Sorting box
See figure from 
[Howells 2000].
Introduction to Algorithms Day 8      L5.44? 2001 by Charles E. Leiserson
Operation of the sorter
? An operator inserts a card into 
the press.
? Pins on the press reach through 
the punched holes to make 
electrical contact with  mercury-
filled cups beneath the card.
? Whenever a particular digit 
value is punched, the lid of the 
corresponding sorting bin lifts.
? The operator deposits the card 
into the bin and closes the lid.
? When all cards have been processed, the front panel is opened, and 
the cards are collected in order, yielding one pass of a stable sort.
Introduction to Algorithms Day 8      L5.45? 2001 by Charles E. Leiserson
Origin of radix sort
Hollerith’s original 1889 patent alludes to a most-
significant-digit-first radix sort:
“The most complicated combinations can readily be 
counted with comparatively few counters or relays by first 
assorting the cards according to the first items entering 
into the combinations, then reassorting each group 
according to the second item entering into the combination, 
and so on, and finally counting on a few counters the last 
item of the combination for each group of cards.”
Least-significant-digit-first radix sort seems to be 
a folk invention originated by machine operators.
Introduction to Algorithms Day 8      L5.46? 2001 by Charles E. Leiserson
“Modern” IBM card
So, that’s why text windows have 80 columns!
See examples on the WWW Virtual Punch-Card Server. 
 
.
? One character per column.
Introduction to Algorithms Day 8      L5.47? 2001 by Charles E. Leiserson
Web resources on punched-
card technology
? Doug Jones’s punched card index
? Biography of Herman Hollerith
? The 1890 U.S. Census
? Early history of IBM
? Pictures of Hollerith’s inventions
? Hollerith’s patent application (borrowed 
from  Gordon Bell’s CyberMuseum)
? Impact of punched cards on U.S. history


