1 1清华大学清华大学 宋斌恒宋斌恒 第六讲:分摊分析法第六讲:分摊分析法 Amortized Analysis 宋斌恒宋斌恒 清华大学清华大学 宋斌恒宋斌恒 2 Overview Aggregate analysis The accounting method The potential method Example: Dynamic tables 清华大学清华大学 宋斌恒宋斌恒 3 Overview 由分摊分析法解决的问题 Sequence of operation on a dada structure Average cost per operation 【比如网站的整体效率问题与此问题的关系】 三类主要分析手段 Aggregate, worst case + average Accounting, Amortized cost Potential, is a whole index for the data structure 清华大学清华大学 宋斌恒宋斌恒 4 聚集分析法聚集分析法 A sequence of n operations takes worst- case time T(n) in total The average cost T(n)/n is called the amortized cost, the amortized cost applies to each operation, even they are different. 清华大学清华大学 宋斌恒宋斌恒 5 Stack operation Let S be a stack object, it has two basic operations: Pop() Push() Adding another operation MultiPop(k), which pops k elements at once, its cost is at most k pop()’s cost Attention: what it occurs when S.getSize()<k? 清华大学清华大学 宋斌恒宋斌恒 6 粗略分析粗略分析 In the worst case of these operations, the worst cost is n, the size of the dada structure; A sequence of m operations cost at most m*n The amortized cost of the operations is less than O(n) It is right but not tight! 2 清华大学清华大学 宋斌恒宋斌恒 7 聚集分析法聚集分析法 Consider the entire sequence of operations: A MultiPop(k) takes at most m = min{k, S.getSize()} operations It has at least m push operation had taken before the pops. On an initial empty Stack, a sequence of n operations of push, pop, multipop takes at most O(n) steps of operations. Amortized cost = O(n)/n = O(1) 清华大学清华大学 宋斌恒宋斌恒 8 示例:计数器示例:计数器 Consider the problem of incrementing a k- bit binary counter that counts upward from 0. Array A=[0,1,…,k-1] represents the k bits 清华大学清华大学 宋斌恒宋斌恒 9 Algorithm Increment(A) 1. i ? 0 2. while i < length[A] and A[i]==1 3. { 1. A[i]?0; 2. i++; 4. } 5. if i < length[A] then A[i]?1 清华大学清华大学 宋斌恒宋斌恒 10 3100001016 2611110015 2501110014 2310110013 2200110012 1911010011 1801010010 161001009 150001008 111110007 100110006 81010005 70010004 41100003 30100002 11000001 00000000 Total cost A[0]A[1]A[2]A[3]A[4]A[5]Counter value 清华大学清华大学 宋斌恒宋斌恒 11 Analysis the algorithm The most costing operation is k, the length of the bits, the worst cost will not exceed O(nk) where n is the operation numbers; A tighten analysis Half the operations take 1 operation Quarter of the operations takes 2 operations 2 -i of operations takes i operations 清华大学清华大学 宋斌恒宋斌恒 12 Total cost The total number of flips. ?? nn i i i n i i 22/1 2 0 lg 0 << ∑∑ ∞ == 3 清华大学清华大学 宋斌恒宋斌恒 13 讨论讨论 长整数二进制加法的线性性证明: ?高位相加快?低位相加快?或者一样? 清华大学清华大学 宋斌恒宋斌恒 14 小结小结 聚集分析方法关键点 所有 n个运算集中考虑 不考虑不同运算的消费不同 只考虑所有运算的总复杂度 最后分摊到每个运算 清华大学清华大学 宋斌恒宋斌恒 15 The accounting method In the accounting method we assign different amortized cost for different operations, some of the amortized cost is less or greater than its actual cost. Let a i be the amortized cost for the i th operation and c i is its actual cost than it should satisfies for all n. 【分摊约束条件】 ∑∑ == ≥ n i i n i i ca 11 清华大学清华大学 宋斌恒宋斌恒 16 Stack operation The actual cost of operations were Push 1 Pop 1 Multipop min{k, s} Amortized cost Push 2 Pop 0 Multipop 0 清华大学清华大学 宋斌恒宋斌恒 17 “花花 ”账中成本的含义账中成本的含义 Push 的实际成本为 1,同时为其在 Pop时花 的费用存入资金 1,故其分摊成本为 2。 Pop 的实际成本为 1,但是由于已经预付, 故其分摊成本为 0。 MultiPop的实际成本为 m,但由于已经预 付,故其分摊成本为 0。 清华大学清华大学 宋斌恒宋斌恒 18 Amortized analysis for incrementing the binary counter Actual cost for a flip Set a bit from 0 to 1 cost 1 dollar. Set a bit from 1 to 0 cost 1 dollar. Amortized cost for a flip Set a bit from 0 to 1 cost 2 dollar, pay for its reverse transfer. Set a bit from 1 to 0 cost 0 dollar, because it is paid already . 4 清华大学清华大学 宋斌恒宋斌恒 19 Amortized Cost for an incrementing Actual cost Several 1 to 0 flips plus 1 flip of 0 to1 Amortized cost n*0+1*2=2 清华大学清华大学 宋斌恒宋斌恒 20 做账分摊法小结做账分摊法小结 可以为不同的操作赋予不同的分摊成本 分摊成本与实际成本要满足分摊约束条件 分摊的经验 如果一个操作的作用可以被另外一个或几个操 作消除,则可以把消除函数的实际成本分摊到 此操作本身,而其消除操作的分摊成本为 0 复杂情况可以通过上述情况的方程式进行求解 清华大学清华大学 宋斌恒宋斌恒 21 The potential method The potential method represent the prepaid work as credit stored with the specific objects in the data structure. We call the prepaid work as “potential energy” or just the “potential” 清华大学清华大学 宋斌恒宋斌恒 22 势方法势方法 Start with an initial data structure D 0 On which there are n operations being performed After each operation the data structure change to D i 转下页 清华大学清华大学 宋斌恒宋斌恒 23 势方法(续)势方法(续) Φ: D i ?R Actual cost is c i , amortized cost is a i then a i =c i + Φ(D i ) – Φ(D i-1 ) Require Φ(D i ) – Φ(D 0 ) ≥ 0 清华大学清华大学 宋斌恒宋斌恒 24 如何理解势方法如何理解势方法 势方法适合那种有状态的数据结构上的各 种操作成本的分摊 每个操作的分摊成本是 实际成本+势能的增加 以上分摊成本可以理解为,一个操作不仅 要为自己付帐,同时要为自己对环境的影 响而付帐【可能是收益】 5 清华大学清华大学 宋斌恒宋斌恒 25 Stack operation Φ(D i )=length(D i ) Push a i =c i + Φ(D i ) – Φ(D i-1 )=1+1=2 Pop a i =c i + Φ(D i ) – Φ(D i-1 )=1-1=0 Multipop(k) a i =c i + Φ(D i ) – Φ(D i-1 )=k- k=0 解释:【课堂提问】 清华大学清华大学 宋斌恒宋斌恒 26 Incrementing a binary counter Φ= b i the number of 1’s in the counter after the i th opertion, The i th opertaion reset t i bits b i ≤b i-1 –t i +1 a i =c i + Φ(D i ) – Φ(D i-1 ) ≤(t i +1) + (1-t i ) ≤2 As Φ(D i ) is nonnegative hence the n sequence of operation cannot exceed 2n of cost hence the amortized cost is O(1) 清华大学清华大学 宋斌恒宋斌恒 27 势方法小结势方法小结 当因素太多时,做账分摊太难 势方法把帐做到一起:势 操作对势有贡献【一般是原始操作,为其后的 消除操作付帐】,增加势 操作对势有索取【一般是消除操作,有人已经 付帐】,势减少 混合操作,则根据不同操作分别累计即可 清华大学清华大学 宋斌恒宋斌恒 28 Sample: Dynamic tables Dynamic table is a base data structure for store variation objects in a table. 动态表上可以加许多操作,我们先考虑一 下两种: Insert【事实上是 Append】 Delete【事实上是删除最后的数据】 清华大学清华大学 宋斌恒宋斌恒 29 Insert an element 插入过程: If there are enough space to store the element, just put it after the last element, If there is not enough space, than apply a double more space, and then copy the original value to the new one and append the new element. 清华大学清华大学 宋斌恒宋斌恒 30 TABLE-INSERT(T,x) 1 if size[T]=0 2 then allocate table[T] with 1 slot 3 size[T]←←1 4 if num[T]=size[T] 5 then allocate new-table with 2 ? size[T] slots 6 insert all items in table[T]into new-table 7 free table[T] 8 table[T]←←new-table 9 size[T]←←2 ? size[T] 10 insert x into table[T] 11 num[T]←←num[T]+1 6 清华大学清华大学 宋斌恒宋斌恒 31 Analysis a dynamic table from zero size with only insertion operation A sequence of n insert operations The initial capacity of the table is 1 Analysis the cost of these n insertion: 清华大学清华大学 宋斌恒宋斌恒 32 The total cost of n TABLE-INSERT operations is therefore ? ? ? ? = .1 ,21 otherwise ofpowerexactanisiifi c i nn 2+< ,3n= [] ∑∑ == +≤ n j j n i i nc lg 01 2 清华大学清华大学 宋斌恒宋斌恒 33 Amortized feeling Insert实际成本 没有扩张: 1 有扩张: n+ 1,不计算分配空间的成本 Insert均摊成本 :把未来的扩张成本预付 插入: 1 为消费掉一个空间付帐: 1【为扩张安排自己 的位置做准备】 为准备一个新空间付帐: 1【为扩张安排同样 多的其它数据做准备】 清华大学清华大学 宋斌恒宋斌恒 34 势函数势函数 Potential function【 capacity = size】 Φ(T) = 2num[T] –capacity[T] 清华大学清华大学 宋斌恒宋斌恒 35 注意: capacity i = capacity i-1 ,没有变化 1 ? ? Φ?Φ+= iiii cc )capacity2()capacity2(1 11 ?? ?????+= iiii numnum )capacity)1(2()capacity2(1 1? ?????+= iiii numnum .3= If the i th operation does not trigger an expansion, then we have: 清华大学清华大学 宋斌恒宋斌恒 36 If the i th operation does trigger an expansion, then we have capacity i =2· capacity i-1 and capacity i-1 =num i-1 =num i -1,which implies that capacity i =2·(num i -1). Thus, the amortized cost of the operation is 7 清华大学清华大学 宋斌恒宋斌恒 37 1 ? ? Φ?Φ+= iiii cc )capacity 2()capacity 2( 11 ?? ?????+= iiiii numnumnum ))1()1(2())1(22( ????????+= iiiii numnumnumnumnum )1(2 ??+= ii numnum .3= 清华大学清华大学 宋斌恒宋斌恒 38 0 4 8 12 16 20 24 28 32 0246810121416182022426283032 势能 数量 容量 清华大学清华大学 宋斌恒宋斌恒 39 带有删除操作的动态表带有删除操作的动态表 ? ? ? <? ≥?? =Φ = .2/1)(][2/][ ,2/1)(][][2 )( ][/][)( TifTnumTsize TifTsizeTnum T TsizeTnumT α α α 清华大学清华大学 宋斌恒宋斌恒 40 当当 α i <1/2时插入时插入 1 ? ? Φ?Φ+= iiii cc )2/()2/(1 11 ?? ???+= iiii numsizenumsize ))1(2/()2/(1 ????+= iiii numsizenumsize 0= 清华大学清华大学 宋斌恒宋斌恒 41 当当 αα i-1 <1/2 但但 αα i ≥≥ 1/2, 则则 1 ? ? Φ?Φ+= iiii cc )2/()2(1 11 ?? ????+= iiii numsizesizenum )2/())1(2(1 1111 ???? ???++= iiii numsizesizenum 3 2 3 3 11 +??= ?? ii sizenum 3 2 3 3 111 +?= ??? iii sizesizeα 3 2 3 112 3 +?< ?? ii sizesize .3= 清华大学清华大学 宋斌恒宋斌恒 42 删除操作:删除操作: α i <1/2 1 ? ? Φ?Φ+= iiii cc )2/()2/(1 11 ?? ???+= iiii numsizenumsize ))1(2/()2/(1 +???+= iiii numsizenumsize .2= 8 清华大学清华大学 宋斌恒宋斌恒 43 删除操作:删除操作: α i ≥≥ 1/2 1 ? ? Φ?Φ+= iiii cc )2/()2/()1( 11 ?? ???++= iiiii numsizenumsizenum ))1()22(())1(()1( +?+???+++= iiiii numnumnumnumnum .1= 清华大学清华大学 宋斌恒宋斌恒 44 思考思考 动态表的实现: 1. 考虑该表地址查找频繁 2. 考虑该表地址查找稀少 清华大学清华大学 宋斌恒宋斌恒 45 总结总结 分摊分析把一个系统作为整体考虑,有三 种常用方法: 聚集 做账 势 统筹安排 ——另一种设计思路: 利用提高某些操作的成本,来达到减少其它操 作成本,最后达到整体操作成本最优。