CHAPTER 6  SORTING
§ 1 Preliminaries
Sorting: there is a series of data in random order; we sort them according to a certain key.
Datalist: a finite set of data waiting to be sorted.
Key: a data object has many attribute fields, i.e. many data elements; one of these elements can be used to distinguish objects. We use it as the sorting key, also called the sorting code.
Stability of a sorting method: if two objects r[i] and r[j] in the object series have equal sorting codes k[i] == k[j], and r[i] is in front of r[j] before sorting, and r[i] is still in front of r[j] after sorting, then the sorting method is called stable; otherwise it is called unstable.
Inner sort and outer sort: inner sort means that all the data objects stay in main memory during the sorting. Outer sort means that the number of objects is too large for all of them to fit in memory at the same time, so data must be moved between internal and external memory on demand during the sorting.
Time cost of sorting: the time cost is an important standard for measuring whether a sorting method is good or not. It can be measured by the number of key comparisons and the number of data moves.
The kinds of inner sort
According to principle: insertion sort, exchange sort, selection sort, merge sort, counting sort.
According to workload:
    simple sort --- time complexity O(n²)
    advanced sort --- time complexity O(n log n)
    radix sort --- time complexity O(d·n)
Direct Insertion sort
0 1 2 3 4 5 temp i = 1
i = 2
0 1 2 3 4 5 temp
21 0825 49 25* 16
21 25 0849 25* 16 25
21 25 0849 25* 16
21 25 0849 25* 16 49
21 25 0849 25* 16
i = 3 21 25 0849 25* 16 25*
21 25 084925* 16
§ 2 Insertion Sort
i = 4
i = 5
21 25 084925* 16 16
21 25 084925*16
21 25 084925*16
21 2508 4925*16
08
Direct Insertion Sort Algorithm
void InsertionSort ( ElementType A[ ],int N )
{
int j,P;
ElementType Tmp;
for ( P = 1; P < N; P++ ) {
Tmp = A[ P ]; /* the next coming card */
for ( j = P; j > 0 && A[ j - 1 ] > Tmp; j-- )
A[ j ] = A[ j - 1 ];
/* shift sorted cards to provide a position
for the new coming card */
A[ j ] = Tmp; /* place the new card at the proper position */
} /* end for-P-loop */
}
The worst case: input A[ ] is in reverse order, T( N ) = O( N² )
The best case: input A[ ] is already sorted, T( N ) = O( N )
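A minimal usage sketch (not part of the original slides; it assumes ElementType is int, which the slides leave unspecified):
#include <stdio.h>
typedef int ElementType;                        /* assumption: keys are plain ints */

void InsertionSort( ElementType A[ ], int N );  /* defined above */

int main( void )
{
    ElementType A[ ] = { 21, 25, 49, 25, 16, 8 };   /* the key set of the trace above */
    int i, N = 6;
    InsertionSort( A, N );
    for ( i = 0; i < N; i++ )
        printf( "%d ", A[ i ] );                /* prints: 8 16 21 25 25 49 */
    printf( "\n" );
    return 0;
}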
Binary Insertion Sort
Basic idea: there are n sequential objects V[0], V[1], …, V[n−1] in the SeqList. Insertion sort makes use of the fact that the elements in positions 0 through i−1 are already in sorted order, so we use binary search to find the insertion position of V[i] when inserting element V[i].
Algorithm of binary insertion sort
typedef int SortData;
void BinInsSort ( SortData V[ ], int n ) {
    SortData temp; int Left, Right;
    for ( int i = 1; i < n; i++ ) {
        Left = 0; Right = i-1; temp = V[i];
        while ( Left <= Right ) {              /* binary search for the insertion position */
            int mid = ( Left + Right ) / 2;
            if ( temp < V[mid] ) Right = mid - 1;
            else Left = mid + 1;
        }
        for ( int k = i-1; k >= Left; k-- )
            V[k+1] = V[k];                     /* shift records backward */
        V[Left] = temp;                        /* insert */
    }
}
Binary insertion sort
[Figure: trace of binary insertion sort on a six-element key sequence in positions 0–5 (temp holds the key being inserted), for i = 1, …, 5.]
Binary search is quicker than sequential search, so the average performance of binary insertion sort is better than that of direct insertion sort.
The number of key comparisons has nothing to do with the initial order of the objects to be sorted; it depends only on the number of objects. Inserting the i-th object needs ⌊log₂ i⌋ + 1 key comparisons, so inserting n objects needs
    Σ_{i=1}^{n−1} ( ⌊log₂ i⌋ + 1 ) ≈ n log₂ n
key comparisons.
Algorithm analysis
When n is large, the number of key comparisons is less than that of direct insertion sort in the worst case, but more than in the best case.
When the initial sequence is in order or nearly in order, direct insertion sort needs fewer key comparisons than binary insertion sort; the number of element moves is the same for both and depends on the original order.
Binary insertion sort is a stable sorting method.
The time complexity is O(n²).
§ 3 A Lower Bound for Simple Sorting Algorithms
【 Definition】 An inversion in an array of numbers is any
ordered pair ( i,j ) having the property that i < j but A[i] >
A[j].
〖 Example〗 The input list 34, 8, 64, 51, 32, 21 has 9 inversions:
(34,8) (34,32) (34,21) (64,51) (64,32) (64,21) (51,32) (51,21) (32,21)
Hence 9 swaps are needed to sort this list by insertion sort.
Swapping two adjacent elements that are
out of place removes exactly one inversion.
T ( N, I ) = O( I + N ) where I is the number of inversions in the original array.
Fast if the list is almost sorted.
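A brute-force sketch (not in the original slides) that just counts inversions, useful for checking the example above:
/* Count the inversions ( i, j ) with i < j but A[i] > A[j]; brute force, O( N^2 ). */
int CountInversions( const int A[ ], int N )
{
    int i, j, count = 0;
    for ( i = 0; i < N; i++ )
        for ( j = i + 1; j < N; j++ )
            if ( A[ i ] > A[ j ] )
                count++;
    return count;            /* for 34, 8, 64, 51, 32, 21 this returns 9 */
}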
【 Theorem】 The average number of inversions in an array of N distinct numbers is N( N − 1 ) / 4.
【 Theorem】 Any algorithm that sorts by exchanging adjacent elements requires Ω( N² ) time on average.
§ 4 Shellsort ---- by Donald Shell
〖 Example〗 Sort:
original:  81 94 11 96 12 35 17 95 28 58 41 75 15
5-sort:    35 17 11 28 12 41 75 15 96 58 81 94 95
3-sort:    28 12 11 35 15 41 58 17 94 75 81 96 95
1-sort:    11 12 15 17 28 35 41 58 75 81 94 95 96
Define an increment sequence h1 < h2 < … < ht ( h1 = 1 )
Define an hk-sort at each phase, for k = t, t−1, …, 1
Note: an hk-sorted file that is then hk−1-sorted remains hk-sorted.
Shell's increment sequence:
ht = ⌊ N / 2 ⌋,  hk = ⌊ hk+1 / 2 ⌋
void Shellsort( ElementType A[ ],int N )
{
int i,j,Increment;
ElementType Tmp;
for ( Increment = N / 2; Increment > 0; Increment /= 2 )
/*h sequence */
for ( i = Increment; i < N; i++ ) { /* insertion sort */
Tmp = A[ i ];
for ( j = i; j >= Increment; j -= Increment )
if( Tmp < A[ j - Increment ] )
A[ j ] = A[ j - Increment ];
else
break;
A[ j ] = Tmp;
} /* end for-I and for-Increment loops */
}
Worst-Case Analysis:
【 Theorem】 The worst-case running time of Shellsort, using Shell's increments, is Θ( N² ).
〖 Example〗 A bad case:
initial: 1 9 2 10 3 11 4 12 5 13 6 14 7 15 8 16
8-sort:  1 9 2 10 3 11 4 12 5 13 6 14 7 15 8 16
4-sort:  1 9 2 10 3 11 4 12 5 13 6 14 7 15 8 16
2-sort:  1 9 2 10 3 11 4 12 5 13 6 14 7 15 8 16
1-sort:  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Pairs of increments are not necessarily relativelyly prime, thus the smaller increment can have little effect.
Hibbard's Increment Sequence:
hk = 2^k − 1 ---- consecutive increments have no common factors.
【 Theorem】 The worst-case running time of Shellsort, using Hibbard's increments, is Θ( N^(3/2) ).
Conjectures:
T_avg–Hibbard ( N ) = O ( N^(5/4) )
Sedgewick's best sequence is { 1, 5, 19, 41, 109, … }, in which the terms are either of the form 9·4^i − 9·2^i + 1 or 4^i − 3·2^i + 1. T_avg ( N ) = O ( N^(7/6) ) and T_worst ( N ) = O ( N^(4/3) ).
Shellsort is a very simple algorithm, yet with an extremely complex analysis. It is good for sorting up to moderately large input (tens of thousands).
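As a sketch (an adaptation of the Shellsort code above, not part of the original slides), the same double loop can be driven by Hibbard's increments 2^k − 1 instead of Shell's:
/* Shellsort driven by Hibbard's increments 1, 3, 7, 15, ...
   Note that if Increment = 2^k - 1, then Increment / 2 = 2^(k-1) - 1. */
void ShellsortHibbard( ElementType A[ ], int N )
{
    int i, j, Increment;
    ElementType Tmp;
    for ( Increment = 1; 2 * Increment + 1 < N; Increment = 2 * Increment + 1 )
        ;                                         /* pick the starting Hibbard increment */
    for ( ; Increment > 0; Increment /= 2 )       /* Hibbard sequence, decreasing */
        for ( i = Increment; i < N; i++ ) {       /* insertion sort with stride Increment */
            Tmp = A[ i ];
            for ( j = i; j >= Increment && A[ j - Increment ] > Tmp; j -= Increment )
                A[ j ] = A[ j - Increment ];
            A[ j ] = Tmp;
        }
}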
§ 5 Mergesort
1. Merge two sorted lists
[Figure: merging the sorted lists 1 13 24 26 and 2 15 27 38 using pointers Aptr and Bptr into an output list via Cptr: at each step the smaller of the two current elements is copied to the output and the corresponding pointers advance.]
T ( N ) = O ( N ) where N is the total number of elements.
2. Mergesort
void MSort( ElementType A[ ],ElementType TmpArray[ ],
int Left,int Right )
{ int Center;
if ( Left < Right ) { /* if there are elements to be sorted */
Center = ( Left + Right ) / 2;
MSort( A,TmpArray,Left,Center ); /* T( N / 2 ) */
MSort( A,TmpArray,Center + 1,Right ); /* T( N / 2 ) */
Merge( A,TmpArray,Left,Center + 1,Right ); /* O( N ) */
}
}
void Mergesort( ElementType A[ ],int N )
{ ElementType *TmpArray; /* need O(N) extra space */
TmpArray = malloc( N * sizeof( ElementType ) );
if ( TmpArray != NULL ) {
MSort( A,TmpArray,0,N - 1 );
free( TmpArray );
}
else FatalError( "No space for tmp array!!!" );
}
/*If a TmpArray is declared locally for each call of Merge,then S(N) = O( NlogN)*/
/* Lpos = start of left half,Rpos = start of right half */
void Merge( ElementType A[ ],ElementType TmpArray[ ],
int Lpos,int Rpos,int RightEnd )
{ int i,LeftEnd,NumElements,TmpPos;
LeftEnd = Rpos - 1;
TmpPos = Lpos;
NumElements = RightEnd - Lpos + 1;
while( Lpos <= LeftEnd && Rpos <= RightEnd ) /* main loop */
if ( A[ Lpos ] <= A[ Rpos ] )
TmpArray[ TmpPos++ ] = A[ Lpos++ ];
else
TmpArray[ TmpPos++ ] = A[ Rpos++ ];
while( Lpos <= LeftEnd ) /* Copy rest of first half */
TmpArray[ TmpPos++ ] = A[ Lpos++ ];
while( Rpos <= RightEnd ) /* Copy rest of second half */
TmpArray[ TmpPos++ ] = A[ Rpos++ ];
for( i = 0; i < NumElements; i++, RightEnd-- )
/* Copy TmpArray back */
A[ RightEnd ] = TmpArray[ RightEnd ];
}
3. Analysis
T ( 1 ) = 1
T ( N ) = 2T ( N / 2 ) + O( N )
= 2k T ( N / 2k ) + k * O( N )
= N * T ( 1 ) + log N * O( N )
= O( N + N log N )
Iterative version:
[Figure: bottom-up mergesort of A[ 0 .. n−1 ]: adjacent runs are merged pairwise, and the run length doubles on every pass.]
Note: Mergesort requires linear extra memory, and copying an array is slow. It is hardly ever used for internal sorting, but is quite useful for external sorting.
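A minimal sketch of the bottom-up (iterative) version pictured above, assuming the same Merge routine and ElementType as before (not part of the original slides):
#include <stdlib.h>
void MergesortIterative( ElementType A[ ], int N )
{
    ElementType *TmpArray = malloc( N * sizeof( ElementType ) );
    int Length, Left;
    if ( TmpArray == NULL ) return;                   /* no space for tmp array */
    for ( Length = 1; Length < N; Length *= 2 )       /* run length doubles each pass */
        for ( Left = 0; Left + Length < N; Left += 2 * Length ) {
            int Center = Left + Length - 1;           /* end of the left run */
            int Right  = Left + 2 * Length - 1;       /* end of the right run */
            if ( Right > N - 1 ) Right = N - 1;
            Merge( A, TmpArray, Left, Center + 1, Right );
        }
    free( TmpArray );
}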
§ 6 Exchange Sort
Bubble Sort
Basic idea: compare the sorting codes of the objects to be sorted pairwise; whenever a reversed pair occurs (their order before sorting is the opposite of the required order), exchange the two objects, until all the objects are well sorted.
Basic method: if the number of objects to be sorted is n, then in pass i bubble sort compares the keys of every two neighboring objects from position 1 to n−i+1 and exchanges them whenever a reversed pair occurs; as a result, the object with the largest key arrives at position n−i+1. At most n−1 such passes are needed.
[Figure: the process of bubble sort on the initial key sequence 21, 25, 49, 25*, 16, 08; in each pass the smallest remaining key bubbles toward the front, giving 08 16 21 25 25* 49 after at most n−1 passes.]
Algorithm of bubble sort
typedef int SortData;
void BubbleSort ( SortData V[ ], int n ) {
    int i = 1;
    int exchange = 1;
    while ( i < n && exchange ) {
        exchange = 0;                          /* reset flag: assume no exchange in this pass */
        for ( int j = n-1; j >= i; j-- )
            if ( V[j-1] > V[j] ) {             /* reversed pair */
                Swap ( V[j-1], V[j] );         /* exchange */
                exchange = 1;                  /* an exchange occurred */
            }
        i++;
    }
}
Thinking: how would you implement a double-direction (cocktail) bubble sort? See the sketch below.
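One possible answer, as a sketch (not in the original slides): alternate a forward pass that sinks the largest remaining key to the back with a backward pass that bubbles the smallest remaining key to the front. It assumes the same Swap helper used by BubbleSort above.
typedef int SortData;
void CocktailSort ( SortData V[ ], int n ) {
    int low = 0, high = n - 1, exchange = 1;
    while ( low < high && exchange ) {
        exchange = 0;
        for ( int j = low; j < high; j++ )        /* forward pass: largest sinks to the back */
            if ( V[j] > V[j+1] ) { Swap ( V[j], V[j+1] ); exchange = 1; }
        high--;
        for ( int j = high; j > low; j-- )        /* backward pass: smallest bubbles to the front */
            if ( V[j-1] > V[j] ) { Swap ( V[j-1], V[j] ); exchange = 1; }
        low++;
    }
}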
Pass i works on the object series V[i−1], V[i], …, V[n−1]; as a result the object with the smallest sorting code arrives at position i−1, and the other objects also move toward their final positions. All the objects are sorted after at most n−1 passes.
If the objects are already arranged by sorting code from small to large, then only one bubble pass is needed, with n−1 key comparisons and no object moves. This is the best situation.
The worst situation needs n−1 passes; pass i ( 1 ≤ i ≤ n−1 ) makes n−i key comparisons and exchanges objects n−i times. In this worst situation the total number of comparisons KCN and the total number of object moves RMN are:
    KCN = Σ_{i=1}^{n−1} ( n − i ) = n( n − 1 ) / 2
    RMN = 3 Σ_{i=1}^{n−1} ( n − i ) = 3n( n − 1 ) / 2
Bubble sort is a stable sort method.
Quicksort -- the fastest known sorting algorithm in practice
1. The Algorithm
void Quicksort ( ElementType A[ ],int N )
{
if ( N < 2 ) return;
pivot = pick any element in A[ ];
Partition S = { A[ ] \ pivot } into two disjoint sets:
A1 = { a ∈ S | a ≤ pivot } and A2 = { a ∈ S | a ≥ pivot };
A = Quicksort ( A1, N1 ) ∪ { pivot } ∪ Quicksort ( A2, N2 );
}
[Figure: quicksort of 13 81 92 43 65 31 57 26 75 0 with pivot 65: partition into { 13 43 31 57 26 0 } and { 81 92 75 }, sort the two parts recursively, and concatenate to obtain 0 13 26 31 43 57 65 75 81 92.]
The best case: T( N ) = O( N log N ); every pivot is placed at its final position once and for all.
2. Picking the Pivot
A Wrong Way: Pivot = A[ 0 ]
The worst case: A[ ] is presorted – quicksort will take O( N² ) time to do nothing.
A Safe Maneuver: Pivot = a randomly selected element of A[ ]
But random number generation is expensive.
Median-of-Three Partitioning:
Pivot = median ( left, center, right )
Eliminates the bad case for sorted input and actually reduces the running time by about 5%.
3. Partitioning Strategy
[Figure: partitioning the keys 8 1 4 9 0 3 5 2 7 6 with pivot = 6: i scans from the left and stops at an element ≥ pivot, j scans from the right and stops at an element ≤ pivot; while i < j the two elements are swapped, and the scans continue until i and j cross.]
4. Small Arrays
Problem: quicksort is slower than insertion sort for small N ( N ≤ 20 ).
Solution: cut off when N gets small ( e.g. N = 10 ) and use other efficient algorithms (such as insertion sort).
5. Implementation
void Quicksort( ElementType A[ ],int N )
{
Qsort( A,0,N - 1 );
/* A,the array */
/* 0,Left index */
/* N – 1,Right index */
}
/* Return median of Left,Center,and Right */
/* Order these and hide the pivot */
ElementType Median3( ElementType A[ ],int Left,int Right )
{
int Center = ( Left + Right ) / 2;
if ( A[ Left ] > A[ Center ] )
Swap( &A[ Left ],&A[ Center ] );
if ( A[ Left ] > A[ Right ] )
Swap( &A[ Left ],&A[ Right ] );
if ( A[ Center ] > A[ Right ] )
Swap( &A[ Center ],&A[ Right ] );
/* Invariant,A[ Left ] <= A[ Center ] <= A[ Right ] */
Swap( &A[ Center ],&A[ Right - 1 ] ); /* Hide pivot */
/* only need to sort A[ Left + 1 ] … A[ Right – 2 ] */
return A[ Right - 1 ]; /* Return pivot */
}
void Qsort( ElementType A[ ],int Left,int Right )
{ int i,j;
ElementType Pivot;
if ( Left + Cutoff <= Right ) { /* if the sequence is not too short */
Pivot = Median3( A,Left,Right ); /* select pivot */
i = Left; j = Right - 1; /* why not set Left+1 and Right-2? */
for( ; ; ) {
while ( A[ ++i ] < Pivot ) { } /* scan from left */
while ( A[ --j ] > Pivot ) { } /* scan from right */
if ( i < j )
Swap( &A[ i ],&A[ j ] ); /* adjust partition */
else break; /* partition done */
}
Swap( &A[ i ],&A[ Right - 1 ] ); /* restore pivot */
Qsort( A,Left,i - 1 ); /* recursively sort left part */
Qsort( A,i + 1,Right ); /* recursively sort right part */
} /* end if - the sequence is long */
else /* do an insertion sort on the short subarray */
InsertionSort( A + Left,Right - Left + 1 );
}
6. Analysis
T( N ) = T( i ) + T( N − i − 1 ) + cN
The Worst Case:
T( N ) = T( N − 1 ) + cN   so   T( N ) = O( N² )
The Best Case: the pivot always splits the list into two halves of equal size:
T( N ) = 2T( N / 2 ) + cN   so   T( N ) = O( N log N )
The Average Case:
Assume the average value of T( i ) for any i is (1/N) Σ_{j=0}^{N−1} T( j ). Then
    T( N ) = (2/N) Σ_{j=0}^{N−1} T( j ) + cN,
which solves to T( N ) = O( N log N ).
〖 Example〗 Given a list of N elements and an integer k, find the kth largest element. (One quicksort-based solution, quickselect, is sketched below.)
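A sketch of quickselect (not from the original slides): after one partition, recurse only into the side that contains the target position. It assumes the same Cutoff constant and the Median3, Swap, and InsertionSort routines used by Qsort above; on return, A[ k - 1 ] holds the kth smallest element, so the kth largest is obtained by calling Qselect( A, N - k + 1, 0, N - 1 ) and reading A[ N - k ].
void Qselect( ElementType A[ ], int k, int Left, int Right )
{
    int i, j;
    ElementType Pivot;
    if ( Left + Cutoff <= Right ) {               /* long subarray: partition it */
        Pivot = Median3( A, Left, Right );
        i = Left; j = Right - 1;
        for ( ; ; ) {
            while ( A[ ++i ] < Pivot ) { }
            while ( A[ --j ] > Pivot ) { }
            if ( i < j ) Swap( &A[ i ], &A[ j ] );
            else break;
        }
        Swap( &A[ i ], &A[ Right - 1 ] );         /* restore pivot; it is final at position i */
        if ( k <= i )                             /* target lies in the left part */
            Qselect( A, k, Left, i - 1 );
        else if ( k > i + 1 )                     /* target lies in the right part */
            Qselect( A, k, i + 1, Right );
    }
    else                                          /* short subarray: just sort it */
        InsertionSort( A + Left, Right - Left + 1 );
}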
§ 7 Selection Sort
Basic thought: among the objects not yet sorted, select the one with the smallest sorting code and make it the i-th object of the ordered sequence, for each pass i ( i = 0, 1, …, n−2 ). After these n−1 passes only one object is left, and no further selection is needed.
Direct selection sort is a simple sorting method; its basic step is:
① In the series of objects V[i] ~ V[n−1], select the object with the smallest sorting code;
② If it is not the first object of this series, exchange it with the first object of the series;
③ Remove the object with the smallest sorting code from the series; in the remaining objects V[i+1] ~ V[n−1], repeat steps ① and ② until only one object is left.
Direct Selection Sort
[Figure: the process of direct selection sort on the keys 21 25 49 25* 16 08 in positions 0–5: i = 0, the smallest is 08, exchange 21 and 08; i = 1, the smallest is 16, exchange 25 and 16; i = 2, the smallest is 21, exchange 49 and 21; i = 3, the smallest is 25*, no exchange; i = 4, the smallest is 25, no exchange. Result after sorting: 08 16 21 25* 25 49 (note that 25* now precedes 25).]
Algorithm of direct selection sort
typedef int SortData;
void SelectSort ( SortData V[ ], int n ) {
    for ( int i = 0; i < n-1; i++ ) {
        int k = i;                    /* position of the object with the smallest sorting code */
        for ( int j = i+1; j < n; j++ )
            if ( V[j] < V[k] )
                k = j;                /* the object currently owning the smallest sorting code */
        if ( k != i )                 /* exchange it to position i */
            Swap ( V[i], V[k] );
    }
}
The number of comparisons KCN of direct selection sort has no relationship with the initial order. If the total number of objects in the series is n, then selecting the object with the smallest sorting code at pass i needs n−i−1 comparisons, so the total number of comparisons is:
    KCN = Σ_{i=0}^{n−2} ( n − i − 1 ) = n( n − 1 ) / 2
The number of object moves depends on the initial order. When the initial sequence is already ordered by sorting code from small to large, the number of moves is RMN = 0, the minimum.
The worst situation is that every pass needs an exchange; then the total number of object moves is RMN = 3( n − 1 ).
Direct selection sort is an unstable sorting method (in the trace above, 25* ends up in front of 25).
PRIORITY QUEUES (HEAPS)
ADT Model
Objects: a finite ordered list with zero or more elements.
Operations:
PriorityQueue Initialize( int MaxElements );
void Insert( ElementType X, PriorityQueue H );
ElementType DeleteMin( PriorityQueue H );
ElementType FindMin( PriorityQueue H );
—— delete the element with the highest \ lowest priority
Simple Implementations
Array:
Insertion — add one item at the end ~ Θ( 1 )
Deletion — find the largest \ smallest key ~ Θ( n )
           remove the item and shift array ~ O( n )
Linked List:
Insertion — add to the front of the chain ~ Θ( 1 )
Deletion — find the largest \ smallest key ~ Θ( n )
           remove the item ~ Θ( 1 )
Ordered Array:
Insertion — find the proper position ~ O( n )
           shift array and add the item ~ O( n )
Deletion — remove the first \ last item ~ Θ( 1 )
Ordered Linked List:
Insertion — find the proper position ~ O( n )
           add the item ~ Θ( 1 )
Deletion — remove the first \ last item ~ Θ( 1 )
Binary Heap
1. Structure Property:
【 Definition】 A binary tree with n nodes and height h is
complete iff its nodes correspond to the nodes numbered
from 1 to n in the perfect binary tree of height h.
[Figure: the perfect binary tree of height 3 with its nodes numbered 1–15.]
A complete binary tree of height h has between 2^h and 2^(h+1) − 1 nodes; hence h = ⌊ log₂ N ⌋.
Array Representation: BT[ n + 1 ] ( BT[ 0 ] is not used )
[Figure: a complete binary tree with nodes A, B, C, D, E, F, G, H, I, J stored in BT[ 1 ] … BT[ 10 ]; positions 11–13 are unused.]
[Figure: two complete binary trees on the keys 09, 17, 23, 31, 45, 53, 65, 78, 87, illustrating the two heap order conditions for the 0-based array representation: Ki ≤ K2i+1 and Ki ≤ K2i+2 (a min heap, 09 at the root) versus Ki ≥ K2i+1 and Ki ≥ K2i+2 (a max heap, 87 at the root).]
【 Lemma】 If a complete binary tree with n nodes is represented sequentially, then for any node with index i, 1 ≤ i ≤ n, we have:
(1) index of parent( i ) = ⌊ i / 2 ⌋ if i ≠ 1; no parent if i = 1.
(2) index of left_child( i ) = 2i if 2i ≤ n; no left child if 2i > n.
(3) index of right_child( i ) = 2i + 1 if 2i + 1 ≤ n; no right child if 2i + 1 > n.
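The priority-queue code below assumes a declaration along the following lines (a sketch; the field names are taken from how they are used in Initialize, Insert, and DeleteMin):
struct HeapStruct {
    int Capacity;                 /* maximum number of elements the heap can hold */
    int Size;                     /* current number of elements */
    ElementType *Elements;        /* Elements[ 1 .. Size ]; Elements[ 0 ] is the sentinel */
};
typedef struct HeapStruct *PriorityQueue;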
PriorityQueue Initialize( int MaxElements )
{
PriorityQueue H;
if ( MaxElements < MinPQSize )
return Error( "Priority queue size is too small" );
H = malloc( sizeof ( struct HeapStruct ) );
if ( H ==NULL )
return FatalError( "Out of space!!!" );
/* Allocate the array plus one extra for sentinel */
H->Elements = malloc(( MaxElements + 1 ) * sizeof( ElementType ));
if ( H->Elements == NULL )
return FatalError( "Out of space!!!" );
H->Capacity = MaxElements;
H->Size = 0;
H->Elements[ 0 ] = MinData; /* set the sentinel */
return H;
}
2. Heap Order Property:
【 Definition】 A min tree is a tree in which the key value in each node is no larger than the key values in its children (if any). A min heap is a complete binary tree that is also a min tree.
Note: analogously, we can declare a max heap by changing the heap order property.
[Figure: a max heap with keys 9, 6, 5, 3 in positions [1]–[4] (the largest key at the root) and a min heap with keys 10, 20, 50, 83 in positions [1]–[4] (the smallest key at the root).]
3. Basic Heap Operations:
Create a MinHeap from an array ( array[0] is used )
Process of creation ( FilterDown() ): adjust into a min heap from the bottom up.
[Figure: building a MinHeap from the keys 53, 17, 78, 09, 23, 45, 65, 87: FilterDown is applied to each branch node in turn, from the last one up to the root, and the heap grows from the bottom up until 09 reaches the root of the MinHeap.]
Create MinHeap
void Crt_MinHeap ( MinHeap *H, HeapData arr[ ], int n ) {
    /* build the heap from the data in the given array and its size */
    for ( int i = 0; i < n; i++ ) H->data[i] = arr[i];
    H->CSize = n;                    /* copy the array and set the current heap size */
    int CPos = (H->CSize-2)/2;       /* the last branch (non-leaf) node */
    while ( CPos >= 0 ) {            /* enlarge the heap step by step, from bottom to top */
        FilterDown ( H, CPos, H->CSize-1 );
        CPos--;
    }
}
FilterDown algorithm of MinHeap
void FilterDown ( MinHeap *H, int start, int EndOfHeap ) {
    int i = start, j = 2*i+1;            /* j is the left child of i */
    HeapData temp = H->data[i];
    while ( j <= EndOfHeap ) {
        if ( j < EndOfHeap &&            /* choose the smaller of the two children */
             H->data[j] > H->data[j+1] ) j++;
        if ( temp <= H->data[j] ) break;
        else { H->data[i] = H->data[j];  /* slide down */
               i = j; j = 2*j+1; }
    }
    H->data[i] = temp;
}
Process of creating a MaxHeap
[Figure: the key set 21, 25, 25*, 49, 16, 08 stored in positions 0–5 at the beginning; FilterDown is applied at i = 2, i = 1 and i = 0 in turn (partial adjustments), and after the i = 0 adjustment the array 49 25 25* 21 16 08 is a MaxHeap.]
FilterDown algorithm of MaxHeap
void FilterDown ( MaxHeap *H,
int start,int EndOfHeap ) {
int i = start; int j = 2*i+1;
HeadData temp = H->data[i];
while ( j <= EndOfHeap ) { //最后位置
if ( j < EndOfHeap &&
H->data[j] < H->data[j+1] )
j = j+1; //选两子女的大者
if ( temp >= H->data[j] ) break;
else {
H->data[i] = H->data[j]; //大 子女上移
i = j; j = 2*j+1; //向下继续调整
}
}
H->data[i] = temp; //回放到合适位置
}
Transform array into Maxheap
for ( int i = ( H->CSize-2) / 2 ; i >= 0; i-- )
FilterDown ( H,i,H->CSize-1 );
Insertion
Sketch of the idea:
[Figure: inserting into a min heap with keys 10, 12, 15, 20, 18: the only possible position for a new node is the next free slot of the complete binary tree; the new item then percolates up along the path to the root (here through the ancestors 20 and 10). Case 1: new_item = 21 > 20, so it stays in the new position. Case 2: new_item = 17 < 20 but 17 > 10, so 20 moves down one level and 17 takes its place. Case 3: new_item = 9 < 20 and 9 < 10, so 9 percolates all the way up to the root.]
/* H->Element[ 0 ] is a sentinel */
void Insert( ElementType X,PriorityQueue H )
{
int i;
if ( IsFull( H ) ) {
Error( "Priority queue is full" );
return;
}
for ( i = ++H->Size; H->Elements[ i / 2 ] > X; i /= 2 )
H->Elements[ i ] = H->Elements[ i / 2 ];
H->Elements[ i ] = X;
}
T (N) = O ( log N )
DeleteMin
Sketch of the idea:
[Figure: DeleteMin on the min heap 10, 12, 15, 20, 18: the root 10 is returned; to keep a complete binary tree the last node (18) must leave its position, and it is percolated down from the root, repeatedly moving the smaller child up (the figure shows the comparisons of 18 with 12 and with 15) until 18 finds its proper position.]
T (N) = O ( log N )
ElementType DeleteMin( PriorityQueue H )
{
int i,Child;
ElementType MinElement,LastElement;
if ( IsEmpty( H ) ) {
Error( "Priority queue is empty" );
return H->Elements[ 0 ]; }
MinElement = H->Elements[ 1 ]; /* save the min element */
LastElement = H->Elements[ H->Size-- ]; /* take last and reset size */
for ( i = 1; i * 2 <= H->Size; i = Child ) { /* Find smaller child */
Child = i * 2;
if (Child != H->Size && H->Elements[Child+1] < H->Elements[Child])
Child++;
if ( LastElement > H->Elements[ Child ] ) /* Percolate one level */
H->Elements[ i ] = H->Elements[ Child ];
else break; /* find the proper position */
}
H->Elements[ i ] = LastElement;
return MinElement;
}
Algorithm 1 ( from min to max ):
{
    Crt_MinHeap ( H, arr[ ], n );          /* O( N ) */
    for ( i = 0; i < N; i++ )
        TmpH[ i ] = DeleteMin( H );        /* O( log N ) each */
    for ( i = 0; i < N; i++ )
        H[ i ] = TmpH[ i ];                /* O( 1 ) each */
}
T ( N ) = O ( N log N )
The space requirement is doubled.
HeapSort
[Figure: heapsort of the keys a, b, c, d stored in A[0]–A[3]: build a heap, then repeatedly swap the root with the last element of the current heap and FilterDown the new root, shrinking the heap by one each time until the array is sorted.]
Algorithm 2: void Heapsort( ElementType A[ ], int N )
{   int i;
    for ( i = N / 2; i >= 0; i-- )      /* BuildHeap */
        FilterDown( A, i, N );
    for ( i = N - 1; i > 0; i-- ) {
        Swap( &A[ 0 ], &A[ i ] );       /* DeleteMin */
        FilterDown( A, 0, i );
    }
}
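Algorithm 2 calls FilterDown on a plain array with the current heap size as the third argument, unlike the MinHeap/MaxHeap versions earlier whose third argument is the last index. A matching array-based sketch (an assumption, written with max-heap order so that repeatedly swapping the root to the end leaves A[ ] in ascending order):
void FilterDown( ElementType A[ ], int i, int N )   /* N = current heap size */
{
    int Child;
    ElementType Tmp;
    for ( Tmp = A[ i ]; 2 * i + 1 < N; i = Child ) {
        Child = 2 * i + 1;                          /* left child (0-based) */
        if ( Child + 1 < N && A[ Child + 1 ] > A[ Child ] )
            Child++;                                /* pick the larger child */
        if ( Tmp < A[ Child ] )
            A[ i ] = A[ Child ];                    /* move the larger child up */
        else
            break;
    }
    A[ i ] = Tmp;                                   /* place the saved key */
}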
【 Theorem】 The average number of comparisons used to heapsort a random permutation of N distinct items is 2N log N − O( N log log N ).
Note: although Heapsort gives the best average time, in practice it is slower than a version of Shellsort that uses Sedgewick's increment sequence.
4. Other Heap Operations:
Note: finding any key except the minimum one will have to take a linear scan through the entire heap.
DecreaseKey ( P, Δ, H )
    Lower the value of the key in the heap H at position P by a positive amount Δ; then percolate up.
    ( sys. admin.: "… so my programs can run with highest priority." )
IncreaseKey ( P, Δ, H )
    Increase the value of the key in the heap H at position P by a positive amount Δ; then percolate down.
    ( sys. admin.: "… drop the priority of a process that is consuming excessive CPU time." )
Delete ( P, H )
    Remove the node at position P from the heap H: DecreaseKey( P, ∞, H ) followed by DeleteMin( H ).
    ( sys. admin.: "… delete the process that is terminated (abnormally) by a user." )
BuildHeap ( H )
    Place N input keys into an empty heap H. N successive Insertions?
[Figure: BuildHeap on the keys 150, 80, 40, 30, 10, 70, 110, 100, 20, 90, 60, 50, 120, 140, 130: PercolateDown( i ) is applied for i = 7, 6, 5, 4, 3, 2, 1 in turn, producing a min heap with 10 at the root.]
T (N) = ?
【 Theorem】 For the perfect binary tree of height h containing 2^(h+1) − 1 nodes, the sum of the heights of the nodes is 2^(h+1) − 1 − ( h + 1 ).
T ( N ) = O ( N )
Applications of Priority Queues
〖 Example〗 Given a list of N elements and an integer k, find the kth largest element. (A heap-based solution is sketched below.)
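One heap-based sketch (not from the original slides), using the priority-queue operations declared above and assuming 1 ≤ k ≤ N: keep a min heap of the k largest elements seen so far; its minimum is the answer, at a cost of O( N log k ).
ElementType KthLargest( ElementType A[ ], int N, int k )
{
    PriorityQueue H = Initialize( k );
    int i;
    for ( i = 0; i < k; i++ )
        Insert( A[ i ], H );                 /* the first k elements */
    for ( i = k; i < N; i++ )
        if ( A[ i ] > FindMin( H ) ) {       /* A[i] beats the current kth largest */
            DeleteMin( H );
            Insert( A[ i ], H );
        }
    return FindMin( H );                     /* the kth largest element */
}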
d-Heaps ---- all nodes have d children
[Figure: a 3-heap on 13 keys with the minimum key 1 at the root.]
Question: shall we make d as large as possible?
Note:
DeleteMin will take d − 1 comparisons to find the smallest child; hence the total time complexity would be O( d log_d N ).
*2 or /2 is merely a bit shift, but *d or /d is not.
When the priority queue is too large to fit entirely in main memory, a d-heap becomes interesting.
§ 8 Sorting Large Structures
Problem: swapping large structures can be very expensive.
Solution: add a pointer field to the structure and swap pointers instead (indirect sorting); physically rearrange the structures at the end only if it is really necessary.
〖 Example〗 Table Sort
[Figure: records list[0]–list[5] with keys d, b, f, c, a, e and an index table initialized to 0 1 2 3 4 5; after indirect sorting, table = ( 4, 1, 3, 0, 5, 2 ).]
The sorted list is list[ table[0] ], list[ table[1] ], ……, list[ table[n−1] ].
Note: every permutation is made up of disjoint cycles.
[Figure: physically rearranging the records by following the cycles of table = ( 4, 1, 3, 0, 5, 2 ): temp = list[0] ( = d ), current = 0, next = 4; records a, e, f, c are pulled along the cycle 0 → 4 → 5 → 2 → 3, and finally temp ( d ) is placed at position 3.]
In the worst case there are ⌊ N / 2 ⌋ cycles, requiring ⌊ 3N / 2 ⌋ record moves.
T = O( m N ) where m is the size of a structure.
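A sketch of the physical rearrangement step pictured above (an assumption of how it could be coded): follow each cycle of the permutation in table[ ], so every record is moved at most once, plus one temp copy per cycle.
void Rearrange( ElementType list[ ], int table[ ], int n )
{
    int i, current, next;
    ElementType temp;
    for ( i = 0; i < n; i++ ) {
        if ( table[ i ] == i ) continue;       /* already in place, or its cycle is done */
        temp = list[ i ];                      /* start a new cycle at position i */
        current = i;
        while ( table[ current ] != i ) {
            next = table[ current ];
            list[ current ] = list[ next ];    /* pull the next record of the cycle forward */
            table[ current ] = current;        /* mark this position as settled */
            current = next;
        }
        list[ current ] = temp;                /* close the cycle */
        table[ current ] = current;
    }
}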
§ 9 A General Lower Bound for Sorting
【 Theorem】 Any algorithm that sorts by comparisons only must have a worst-case computing time of Ω( N log N ).
Proof:
[Figure: the decision tree for insertion sort on R0, R1, R2: each internal node compares two keys (K0 vs K1, K1 vs K2, K0 vs K2), and each of the 3! = 6 leaves is one possible output order: [0,1,2], [0,2,1], [2,0,1], [1,0,2], [1,2,0], [2,1,0].]
When sorting N distinct elements, there are N! different possible results, thus any decision tree must have at least N! leaves.
If the height of the tree is k, then N! ≤ 2^(k−1) ( the number of leaves of a complete binary tree of height k ), so k ≥ log₂( N! ) + 1.
Since N! ≥ ( N/2 )^(N/2) and log₂ N! ≥ ( N/2 ) log₂( N/2 ) = Θ( N log₂ N ),
therefore T( N ) = k ≥ c · N log₂ N.
§ 10 Bucket Sort and Radix Sort
Bucket Sort
〖 Example〗 Suppose that we have N students,each has a
grade record in the range 0 to 100 (thus there are M = 101
possible distinct grades),How to sort them according to their
grades in linear time?
[Figure: an array count[ 0 .. 100 ] of buckets (linked lists), one for each possible grade.]
Algorithm
{
initialize count[ ];
while (read in a student’s record)
insert to list count[stdnt.grade];
for (i=0; i<M; i++) {
if (count[i])
output list count[i];
}
}
T(N,M) = O( M+N )
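A concrete sketch of the bucket idea above (not in the original slides), assuming grades are ints in 0 .. 100 and that only the sorted grades, not the full records, are needed:
#define M 101                               /* number of possible distinct grades */
void BucketSortGrades( int grade[ ], int N )
{
    int count[ M ] = { 0 };
    int i, g, p = 0;
    for ( i = 0; i < N; i++ )
        count[ grade[ i ] ]++;              /* distribute into buckets: O( N ) */
    for ( g = 0; g < M; g++ )               /* collect the buckets in order: O( M + N ) */
        while ( count[ g ]-- > 0 )
            grade[ p++ ] = g;
}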
〖 Example〗 Given N = 10 integers in the range 0 to 999 ( M = 1000 )
Is it possible to sort them in linear time?
Radix Sort
Input,64,8,216,512,27,729,0,1,343,125
Bucket 0 1 2 3 4 5 6 7 8 9
Sort according to the Least Significant Digit first.
Pass 1 ( ones digit ):     0, 1, 512, 343, 64, 125, 216, 27, 8, 729
Pass 2 ( tens digit ):     0, 1, 8, 512, 216, 125, 27, 729, 343, 64
Pass 3 ( hundreds digit ): 0, 1, 8, 27, 64, 125, 216, 343, 512, 729
Output: 0, 1, 8, 27, 64, 125, 216, 343, 512, 729
T = O( P( N + B ) ) where P is the number of passes, N is the number of elements to sort, and B is the number of buckets.
What if we sort according to the Most Significant Digit first?
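An LSD radix sort sketch for non-negative ints (not from the original slides), matching the 3-pass decimal example above with B = 10 buckets; each pass is a stable counting-sort distribution on one digit:
#include <stdlib.h>
void RadixSortLSD( int A[ ], int N, int P )      /* P = number of decimal passes */
{
    int *Tmp = malloc( N * sizeof( int ) );
    int count[ 10 ], pass, i, digit, divisor = 1;
    if ( Tmp == NULL ) return;                   /* no space for tmp array */
    for ( pass = 0; pass < P; pass++, divisor *= 10 ) {
        for ( i = 0; i < 10; i++ ) count[ i ] = 0;
        for ( i = 0; i < N; i++ )                /* count the keys in each bucket */
            count[ ( A[ i ] / divisor ) % 10 ]++;
        for ( i = 1; i < 10; i++ )               /* prefix sums: bucket end positions */
            count[ i ] += count[ i - 1 ];
        for ( i = N - 1; i >= 0; i-- ) {         /* stable placement, back to front */
            digit = ( A[ i ] / divisor ) % 10;
            Tmp[ --count[ digit ] ] = A[ i ];
        }
        for ( i = 0; i < N; i++ )                /* copy back for the next pass */
            A[ i ] = Tmp[ i ];
    }
    free( Tmp );
}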
Ki^j := the j-th key of record Ri
Ki^0 := the most significant key of record Ri
Suppose that the record Ri has r keys.
Ki^(r−1) := the least significant key of record Ri
A list of records R0, …, R(n−1) is lexically sorted with respect to the keys K^0, K^1, …, K^(r−1) iff
    ( Ki^0, Ki^1, …, Ki^(r−1) ) ≤ ( Ki+1^0, Ki+1^1, …, Ki+1^(r−1) ),  0 ≤ i < n−1.
That is, Ki^0 = Ki+1^0, …, Ki^l = Ki+1^l, Ki^(l+1) < Ki+1^(l+1) for some l < r − 1.
〖 Example〗 A deck of cards sorted on 2 keys
K^0 [ Suit ]:        ♣ < ♦ < ♥ < ♠
K^1 [ Face value ]:  2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < J < Q < K < A
Sorting result:  2♣ … A♣  2♦ … A♦  2♥ … A♥  2♠ … A♠
MSD ( Most Significant Digit ) Sort
Sort on K^0 first: for example, create 4 buckets for the suits.
[Figure: the four suit buckets, each holding its own cards.]
Sort each bucket independently (using any sorting technique).
LSD ( Least Significant Digit ) Sort
Sort on K^1 first: for example, create 13 buckets for the face values.
[Figure: the 13 face-value buckets, holding the 2s, 3s, 4s, 5s, …, Aces.]
Reform them into a single pile, then create 4 buckets (for the suits) and resort.
Question:
Is LSD always faster than MSD?
Comparison of all sorting methods

Sort method              Comparing number (best / worst)   Moving number (best / worst)   Stability   Additional storage
Direct insertion sort    n / n²                            0 / n²                         stable      1
Binary insertion sort    n log₂n                           0 / n²                         stable      1
Bubble sort              n / n²                            0 / n²                         stable      1
Quicksort                n log₂n / n²                      n log₂n / n²                   unstable    log₂n / n
Direct selection sort    n²                                0 / 3(n−1)                     unstable    1
Tournament sort          n log₂n                           n log₂n                        stable      n
Heapsort                 n log₂n                           n log₂n                        unstable    1
Mergesort                n log₂n                           n log₂n                        stable      n