Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 11 Prof. Erik Demaine Introduction to Algorithms Day 20 L11.2? 2001 by Charles E. Leiserson Dynamic order statistics OS-SELECT(i, S): returns the ith smallest element in the dynamic set S. OS-RANK(x, S): returns the rank of x ∈ S in the sorted order of S’s elements. IDEA: Use a red-black tree for the set S, but keep subtree sizes in the nodes. key size key size Notation for nodes: Introduction to Algorithms Day 20 L11.3? 2001 by Charles E. Leiserson Example of an OS-tree M 9 9 C 5 C 5 A 1 A 1 F 3 F 3 N 1 N 1 Q 1 Q 1 P 3 P 3 H 1 H 1 D 1 D 1 size[x] = size[left[x]] + size[right[x]] + 1 Introduction to Algorithms Day 20 L11.4? 2001 by Charles E. Leiserson Selection OS-SELECT(x, i) ? ith smallest element in the subtree rooted at x k ← size[left[x]] + 1 ? k = rank(x) if i = k then return x if i < k then return OS-SELECT(left[x], i ) else return OS-SELECT(right[x], i – k ) Implementation trick: Use a sentinel (dummy record) for NIL such that size[NIL] = 0. (OS-RANK is in the textbook.) Introduction to Algorithms Day 20 L11.5? 2001 by Charles E. Leiserson Example M 9 9 C 5 C 5 A 1 A 1 F 3 F 3 N 1 N 1 Q 1 Q 1 P 3 P 3 H 1 H 1 D 1 D 1 OS-SELECT(root, 5) i = 5 k = 6 i = 5 k = 2 i = 3 k = 2 i = 1 k = 1 Running time = O(h) = O(lg n) for red-black trees. Introduction to Algorithms Day 20 L11.6? 2001 by Charles E. Leiserson Data structure maintenance Q. Why not keep the ranks themselves in the nodes instead of subtree sizes? A. They are hard to maintain when the red-black tree is modified. Modifying operations: INSERT and DELETE. Strategy: Update subtree sizes when inserting or deleting. Introduction to Algorithms Day 20 L11.7? 2001 by Charles E. Leiserson Example of insertion M 9 9 C 5 C 5 A 1 A 1 F 3 F 3 N 1 N 1 Q 1 Q 1 P 3 P 3 H 1 H 1 D 1 D 1 INSERT(“K”) 10 10 6 6 4 4 2 2 K 1 K 1 Introduction to Algorithms Day 20 L11.8? 2001 by Charles E. Leiserson Handling rebalancing Don’t forget that RB-INSERT and RB-DELETE may also need to modify the red-black tree in order to maintain balance. ? Recolorings: no effect on subtree sizes. ? Rotations: fix up subtree sizes in O(1) time. Example: C 11 C 11 E 16 E 16 7 3 4 C 16 C 16 E 8 E 8 7 34 ∴RB-INSERT and RB-DELETE still run in O(lg n) time. Introduction to Algorithms Day 20 L11.9? 2001 by Charles E. Leiserson Data-structure augmentation Methodology: (e.g., order-statistics trees) 1. Choose an underlying data structure (red- black trees). 2. Determine additional information to be stored in the data structure (subtree sizes). 3. Verify that this information can be maintained for modifying operations (RB- INSERT, RB-DELETE — don’t forget rotations). 4. Develop new dynamic-set operations that use the information (OS-SELECT and OS-RANK). These steps are guidelines, not rigid rules. Introduction to Algorithms Day 20 L11.10? 2001 by Charles E. Leiserson Interval trees Goal: To maintain a dynamic set of intervals, such as time intervals. low[i] = 7 10 = high[i] i = [7, 10] 5 4152 1711 818 19 23 Query: For a given query interval i, find an interval in the set that overlaps i. Introduction to Algorithms Day 20 L11.11? 2001 by Charles E. Leiserson Following the methodology 1. Choose an underlying data structure. ? Red-black tree keyed on low (left) endpoint. int m int 2. Determine additional information to be stored in the data structure. ? Store in each node x the largest value m[x] in the subtree rooted at x, as well as the interval int[x] corresponding to the key. Introduction to Algorithms Day 20 L11.12? 2001 by Charles E. Leiserson 17,19 23 17,19 23 Example interval tree 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 m[x] = max high[int[x]] m[left[x]] m[right[x]] Introduction to Algorithms Day 20 L11.13? 2001 by Charles E. Leiserson Modifying operations 3. Verify that this information can be maintained for modifying operations. ? INSERT: Fix m’s on the way down. 6,20 30 6,20 30 11,15 19 11,15 19 19 19 14 14 30 30 11,15 30 11,15 30 6,20 30 6,20 30 30 30 14 14 19 19 ? Rotations — Fixup = O(1) time per rotation: Total INSERT time = O(lg n); DELETE similar. Introduction to Algorithms Day 20 L11.14? 2001 by Charles E. Leiserson New operations 4. Develop new dynamic-set operations that use the information. INTERVAL-SEARCH(i) x ← root while x ≠ NIL and (low[i] > high[int[x]] or low[int[x]] > high[i]) do ? i and int[x] don’t overlap if left[x] ≠ NIL and low[i] ≤ m[left[x]] then x ← left[x] else x ← right[x] return x Introduction to Algorithms Day 20 L11.15? 2001 by Charles E. Leiserson Example 1: INTERVAL-SEARCH([14,16]) 17,19 23 17,19 23 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 x x ← root [14,16] and [17,19] don’t overlap 14 ≤ 18 ? x ← left[x] Introduction to Algorithms Day 20 L11.16? 2001 by Charles E. Leiserson Example 1: INTERVAL-SEARCH([14,16]) 17,19 23 17,19 23 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 x [14,16] and [5,11] don’t overlap 14 > 8 ? x ← right[x] Introduction to Algorithms Day 20 L11.17? 2001 by Charles E. Leiserson Example 1: INTERVAL-SEARCH([14,16]) 17,19 23 17,19 23 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 x [14,16] and [15,18] overlap return [15,18] Introduction to Algorithms Day 20 L11.18? 2001 by Charles E. Leiserson Example 2: INTERVAL-SEARCH([12,14]) 17,19 23 17,19 23 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 x x ← root [12,14] and [17,19] don’t overlap 12 ≤ 18 ? x ← left[x] Introduction to Algorithms Day 20 L11.19? 2001 by Charles E. Leiserson Example 2: INTERVAL-SEARCH([12,14]) 17,19 23 17,19 23 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 x [12,14] and [5,11] don’t overlap 12 > 8 ? x ← right[x] Introduction to Algorithms Day 20 L11.20? 2001 by Charles E. Leiserson Example 2: INTERVAL-SEARCH([12,14]) 17,19 23 17,19 23 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 x [12,14] and [15,18] don’t overlap 12 > 10 ? x ← right[x] Introduction to Algorithms Day 20 L11.21? 2001 by Charles E. Leiserson Example 2: INTERVAL-SEARCH([12,14]) 17,19 23 17,19 23 5,11 18 5,11 18 4,8 8 4,8 8 15,18 18 15,18 18 7,10 10 7,10 10 22,23 23 22,23 23 x x = NIL ? no interval that overlaps [12,14] exists Introduction to Algorithms Day 20 L11.22? 2001 by Charles E. Leiserson Analysis Time = O(h) = O(lg n), since INTERVAL-SEARCH does constant work at each level as it follows a simple path down the tree. List all overlapping intervals: ? Search, list, delete, repeat. ? Insert them all again at the end. This is an output-sensitive bound. Best algorithm to date: O(k + lg n). Time = O(k lg n), where k is the total number of overlapping intervals. Introduction to Algorithms Day 20 L11.23? 2001 by Charles E. Leiserson Correctness Theorem. Let L be the set of intervals in the left subtree of node x, and let R be the set of intervals in x’s right subtree. ? If the search goes right, then { i′∈L : i′ overlaps i } = ?. ? If the search goes left, then {i′∈L : i′ overlaps i } = ? ? {i′∈R : i′ overlaps i } = ?. In other words, it’s always safe to take only 1 of the 2 children: we’ll either find something, or nothing was to be found. Introduction to Algorithms Day 20 L11.24? 2001 by Charles E. Leiserson Correctness proof Proof. Suppose first that the search goes right. ? If left[x] = NIL, then we’re done, since L = ?. ? Otherwise, the code dictates that we must have low[i] > m[left[x]]. The value m[left[x]] corresponds to the right endpoint of some interval j ∈ L, and no other interval in L can have a larger right endpoint than high( j). L high( j) =m[left[x]] i low(i) ? Therefore, {i′∈L : i′ overlaps i } = ?. Introduction to Algorithms Day 20 L11.25? 2001 by Charles E. Leiserson Proof (continued) Suppose that the search goes left, and assume that {i′∈L : i′ overlaps i } = ?. ? Then, the code dictates that low[i] ≤ m[left[x]] = high[ j] for some j ∈ L. ? Since j ∈ L, it does not overlap i, and hence high[i] < low[ j]. ? But, the binary-search-tree property implies that for all i′∈R, we have low[ j] ≤ low[i′]. ? But then {i′∈R : i′ overlaps i } = ?. L i j i′