清华大学：算法分析与设计：算法分析与设计2005春01

分类：计算机格式：pdf 日期：2007年03月28日

1 算法分析与设计清华大学宋斌恒清华大学宋斌恒 2 授课内容【按照层次分】 n 设计理论 n 分析方法 n 实现技术 n 测试技术 n 应用范围清华大学宋斌恒 3 Contents【按照内容分】 n Divide and conquer【分治】 n Dynamic Programming【动态规划】 n Greedy method【贪婪算法】 n Amortized Analysis【均摊分析法】 n Algorithms in Graphics【图论中算法】 n NP-Completeness【 NP完全问题，算法理论】 n Selected Topics【专题讲座】 n Comprehensive Training【综合训练】清华大学宋斌恒 4 教学目的 n 理论分析能力培养：掌握算法分析与设计的基本理论和方法，具有设计新算法和分析复杂性的能力 n 实践能力的培养：学会如何实现设计好的算法，如何测试其正确性和效率，应用与实际问题 n 团队能力培养：和各种人员合作工作能力 n 交流能力的培养：表达和接受能力 n 独立研究能力培养：具有独立开展研究的能力清华大学宋斌恒 5 上课的必要条件 n 数据结构（没修过的请修过后再选） n 掌握一门对面向对象编程语言（ C＋＋或 Java）清华大学宋斌恒 6 基本要求 n 上课不应迟到，迟到一次扣 1分，自己申报，如果迟到没有申报被发现扣 10分。 n 不得抄袭、剽窃。参考文献、著作、教材和包括其它同学的作业在内的所有资料，必须指明出处。其中 n 文献、著作、教材类公开出版的参考资料，按照文献索引方式引用。有可能的话最好提供电子拷贝。 n 网络资料，除指出网络路径 URL，还应当提供资料电子拷贝。 n 参考同学作业应当指出作业编号和提供原作业拷贝。多人讨论的成果，应当在作业中反映。 n 引用他人成果而没有指出出处的以抄袭论处。如有抄袭，第一次发现，总成绩减去该阶段分值，第二次发现，成绩记 0分，并以考试作弊向上汇报。 2 清华大学宋斌恒 7 鼓励 n 创新 n 提出问题 n 讨论 n 根据平时情况，可以得到加分。清华大学宋斌恒 8 教学网站 n 清华大学网络学堂清华大学宋斌恒 9 主要参考书 n Introduction to algorithms, Thomas H. Cormen, etc., second edition, The MIT Press. n The Art of Computer Programming, Donald E. Knuth. Volume 1-3, Second Edition. n Data Structures, Algorithms, and Applications in C++(Part 3) Sartaj Sahni, China Machine Press n算法设计与分析，王晓东，清华大学出版社清华大学宋斌恒 10 综合训练报告框架和要求 1. 问题表述 2. 该问题的研究历史综述【该问题与参考资料的关系】 1. 该问题的最新算法【如果有比上述典型算法效率更好的算法】介绍 3. 该问题的典型算法介绍： 1. 该算法主要思想 2. 算法描述 3. 算法正确性说明或者证明 4. 算法复杂度理论分析 5. *涉及到的理论方法总结和推广 4. 算法的实现与测试【分别用 Java和 C＋＋】： 1. 算法接口设计 2. 算法使用说明 3. 典型算法的实现，包括异常处理和性能估计 4. 典型算法实现的测试分析报告： 1. 结果是否正确？ 2. 不同规模输入情况下的效率分析，是否与理论分析一致，如果不一致，为什么？ 5. 该问题的应用介绍： 1. 应用背景，使用条件等等 2. *部分典型算法的演示软件。 6. *提出该问题自己的算法 7. 参考资料清华大学宋斌恒 11 综合训练问题集 n 大整数运算【大整数表示、加法、减法、乘法、除法、密指数，及其模运算下的上述运算】 n 密码支撑运算 1【随机数生成】 n 密码支撑运算 2【素数生成算法】 n 密码支撑运算 3【标准和经典文献算法 5个左右 Hash运算】 n 对称加密算法【标准和经典文献算法 5个左右对称加密算法】 n 非对称加密算法【标准和经典文献算法 5个左右非对称加密算法】清华大学宋斌恒 12 综合练习问题集【续】 n 字符串【模式】匹配 n 字符串相似度理论【字符串距离编辑等】 n 索引与查找 n 图文排版优化 n 网络任务调度 3 清华大学宋斌恒 13 综合练习问题集【续 2】 n 图的所有点最短距离问题 n 应用：纹理合成中的缝合法。 n 图的单源最短路径 n 贪婪算法理论： Matroid、 Greedoid、 GreedSet n 旅行商问题【精确解、近似解】 n 背包问题【同上】 n 自选清华大学宋斌恒 14 课程安排介绍 n 前面以介绍设计方法为主 n 后面以解决问题为主清华大学宋斌恒 15 算法简介 n 什么是算法：算法是一个定义好的用来计算，或者更加一般地说，是用于把一定的输入转换成需要的输出的过程。大家最熟悉不过的算法就是算术中的加法和乘法。下面我们通过一个二进制的加法来说明问题： n 10011011101 n ＋ 111011010001 n ＝ 1001110101110 n 算法：一个循环： n i=0; i++; i<length(第二个数 ): n 如果第二个数 [i]==1 plus(第一个数 , i) n plus(数， i)： //在数的第 i位加 1 n 如果数 [i]==1 数 [i]=0; plus（数， i＋ 1）；清华大学宋斌恒 16 本课程要回答算法的几个基本问题 n 它会停下来吗？ n 它的结果正确吗？ n 它运算快吗？ n 它使用了多少内存？ n 进一步我们需要回答： n 它能够应用到那些领域？ n 利用不同语言实现需要那些技巧 ? 清华大学宋斌恒 17 算术 n 历史上，算术运算是一件非常有意思的工作，现在看起来人人都会用的算术，其广泛应用也只是最近的事。算术的关键是表示，如十进制、罗马进制。利用罗马进制进行算术运算并不是一件轻松的事。清华大学宋斌恒 18 进制问题 n 中国一直使用十进制来表示数字，而通过算盘这种 2－ 5－ 10进制可以进行快速算术计算，宋朝开始普及。 4 清华大学宋斌恒 19 半坡时代 n 中国最早使用十进制，可以追溯到半坡时代（约公元前 4000年，陕西地区）。 n 根据出土的陶器上面记录的数字符号，半坡人至少掌握了三十以内的自然数，并且使用十进制计数 ⑴ 。 n 不过这时候的计数称之为十进制有一些勉强。因为它还没有 “零 ”和 “空位 ”的概念。 n 半坡人只会记录 “10”， “20”， “30”这些整十倍的数，而对于其他十以上的数就没有办法了。清华大学宋斌恒 20 商朝 n 到了商朝（约公元前 1400年），出现了甲骨文。甲骨文比半坡计数更进了一步，出现了表示 “百 ”“千 ”“万 ”的专门符号，并通过这些符号与基本数字的组合可以记录高达几万的整数 ⑴ 。 n 说明这时的人已经有了 “位 ”的概念。周朝《周易》书中，就有 “万有一千五百二十 ”的记载。清华大学宋斌恒 21 春秋战国时期 n 到春秋战国时期出现了 “算筹 ”这种计算工具。算筹分横式和纵式两种摆法。在计数时个位放纵式，十位放横式，依此类推。若是两个横式或者纵式连在了一起，就表示中间存在一个 “空位 ”⑴ 。 “空位 ”概念的出现，应该是现代计数方法的一个里程碑。根据《孙子算经》中记载： “凡算之法，先识其位，一从十横，百立千僵，千十相望，万百相当 ”。可以说，这时候的计数虽然还没有表示 “零 ”的符号，但已经意识到了 “零 ”这个概念。除了不能准确表达多个连续的空位等问题以外，与阿拉伯十进制计数已经没什么区别了。这个时期的田齐量制，也采用了 “升 — 斗 — 石 ”的十进位制 ⑷ ，可见十进制在这个时期开始已经开始被广泛应用了。清华大学宋斌恒 22 汉代 n 汉代蔡伦（约公元 100年）发明造纸术，大大推动了数学的发展。不久之后为了避免混淆采用方框来表示空位，后来又演变为圆圈 ⑴ 。可以说，完备的十进制计数法最终形成了。清华大学宋斌恒 23 全球 n 作为四大文明古国之一，中国在十进制计数方面走在了最前面。古巴比伦人（公元前 2400～前 1600年）使用的是六十进制；古埃及（公元前 1850～前 1650年）的数学虽然以十为基数，但没有形成 “位 ”的概念；而印度直到公元 825年才有著作描述了 “完备的印度数系 ”⑴ ，我们所采用的阿拉伯数字及计数法，便是这之前不久（约公元 600年）由印度经阿拉伯流传到世界各地的 ⑶ 。清华大学宋斌恒 24 参考文献 n ⑴ 潘天骥，十进位值制的记数法首创于中国，九江师专学报， 1995年 05期 n ⑵ 王永建，进位制的由来，江苏教育， 1997年 01期 n ⑶ C.Marchal.，刘义燧，计数法的简史及展望，自然杂志， 1995年 05期 n ⑷ 宁可，有关汉代农业生产的几个数字 n http://www.guoxue.com/economics/ReadNews.asp? NewsID=455&BigClassID=26&SmallClassID=44&Sp ecialID=120 n ⑸ 数学大事年表 http://jamesjoe.51.net/refrence/matheven.html 5 清华大学宋斌恒 25 算盘 n 算盘被誉作中国贡献于世界的 “第五大发明 ”，可谓地道中国特产了。 n 随着对陜西歧山出土文物西周陶丸的研究，中国算盘的发明时间已经提前到二千多年前的西周时期。陕西岐山县西周宫室遗址中出土的 90 粒青黄两色陶丸，青色 20粒，黄色 70粒。【但有学术争论】清华大学宋斌恒 26 陈聪同学 2004年秋天的文献综述 n “陶丸是珠算工具 ”的这种说法，可以在网上搜索到很多，姑且找一个比较权威的网站。新华网 2003年 10月 2日报道 “记者日前在岐山县采访时获悉，全县最小的乡 ――― 京当乡近年来出土了大批的珍贵文物，其中有 18件占据了中国考古之最 ”， “出土的陶丸，是我国最早的计算工具算珠 ”。标明稿件来源：西安晚报。 ① “中国珠算协会副会长、陕西数学史研究会会长李培业对数学史最大的贡献是在珠算史研究方面 ”， “根据陕西岐山县西周宫室遗址中出土的 90粒青黄两色陶丸，结合《数术记遗》中对珠算算具的文字注释，证明了这是珠算工具，从而把中国古珠算的历史年代推前了 1000余年，证明中国是最早发明珠算的国家 ”② 清华大学宋斌恒 27 续 n 可见，新华网的这种说法的根据是李培业先生的研究结果。而李培业的文章《对西周宫室遗址出土陶丸之考察》发表于 1984年《珠算》 04 期，同时期研究者发表类似意见的文章还有：陕西周原岐山文管所刘亮的《试说周原遗址出土的陶丸》（《珠算》 1984年 04期），户谷清一（日本）的《从西周宫室遗址出土的陶丸探索算盘的起源》（《珠算》 1985年 05期，《齐鲁珠坛》 1994年 04期） ③ ，以及姜克华的《西周陶丸之研究》（《齐鲁珠坛》 1994年 02期）等等。清华大学宋斌恒 28 n 而近十年来考古界的研究成果，很多都否定了这种说法。 1996 年 12 月 11 日至 13 日在陕西省岐山县召开的 “中国珠算协会第四届珠算史专业委员会暨陶丸鉴定会 ” 认为陶丸与 “三才算 ”有关，而不是珠算。 ④ 王为桐和王世玉的《驳西周陶丸异义》（《齐鲁珠坛》 1997年 06 期）也认为陶丸与珠算无关。 n 考古界对此看法纷纭。张福汉和肖宗史认为 “三才算 ”也属于 “珠算 ”的范畴，关键是确定陶丸是 “哪一种珠算的算珠 ”⑤ ；王为桐，王世玉和公维萍更是认为 “西周陶丸不是珠算用珠 , 而是弹丸 ”⑥ n 从以上资料可以看出，学术界对于西周陶丸究竟何用并无一个统一的说法。清华大学宋斌恒 29 参考文献 n ① 新华网《陕西岐山最小乡出土 18件文物居中国考古之最》 http://www.xinhuanet.com/chinanews/2003- 10/23/content_1094597.htm n ② 新华网《康熙数学专著发现经过大解密》 http://news.xinhuanet.com/newscenter/2003- 03/03/content_755023.htm n ③ 齐鲁珠坛， 1997年 04期，王为桐，王世玉，贺西周陶丸研究的突破 n ④ 齐鲁珠坛， 1997年 06期，王为桐，王世玉，驳西周陶丸异义 n ⑤ 陕西经贸学院学报， 1997年 03期，张福汉，肖宗史，岐山 “西周陶丸 ”再考 n ⑥ 齐鲁珠坛， 1999年 01期，王为桐，王世玉，公维萍，西周陶丸用途的结论清华大学宋斌恒 30 Some of major successes fields n Parsing algorithms - these form the basis of the field of programming languages n Fast Fourier transform - the field of digital signal processing is built upon this algorithm. n Linear programming - this algorithm is extensively used in resource scheduling. n Sorting algorithms - until recently, sorting used up the bulk of computer cycles. n String matching algorithms - these are extensively used in computational biology. 6 清华大学宋斌恒 31 Number of major successes fields n Number theoretic algorithms - these algorithms make it possible to implement cryptosystems such as the RSA public key cryptosystem. n Compression algorithms - these algorithms allow us to transmit data more efficiently over, for example, phone lines. n Geometric algorithms - displaying images quickly on a screen often makes use of sophisticated algorithmic techniques. 清华大学宋斌恒 32 Related aspect n Abstract model of computer: n High level of languanges 清华大学宋斌恒 33 Divide and Conquer n Divide and Conquer is a method n Applying cases: n A large size problem, how to solve it? n We can solve small size problems! n Divide it into several small size problems, n Solve these small size sub-problems! n Next, what shall we do? n Combine the sub-solution to get the solution of the original problem. 清华大学宋斌恒 34 Example of a Divide and Conquer n Problem: sorting {12,3,45,5,21,7,9,1,10} n Divide: it is too big to solve, if it is smaller we may solve it, so we divide it into two problem: n Sub-problem1: sorting{12,3,45,5,21} n Sub-problem2: sorting{7,9,1,10} 清华大学宋斌恒 35 n Conquer: we use some methods (any kind of) to solve it and get the sub-problems’solutions, that means we sort the two sub-problems: n Solution of sub-problem1:{3,5,12,21,45} n Solution of sub-problem2:{1,7,9,10} n Combine: we should use these two solutions to construct the solution of the original problem, see next page 清华大学宋斌恒 36 1. Merge(A, B) //A and B are two sorted Arrays 1. r=s=0; 2. Create array C //to store the result 3. i=0 //index of C 4. While not end of A and B //r<A.length & s<B.length 1. If(A[r]<B[s]) then 1. C[i]=A[r]; 2. r++; i++; 2. Else 1. C[i]=B[s]; 2. s++; i++; 5. If(r==A.length) 1. Append B[s,… ,B.length-1] to C 6. Else 1. Append A[r,… ,A.length-1] to C 7. Return C 7 清华大学宋斌恒 37 Solution! n Using the merge algorithm we get the solution:{1,3,5,7,9,10,12,21,45} n Any left problems? 清华大学宋斌恒 38 Correctness n Is your solution correct? n Does your merge algorithm outcome a sorted array in any case? n Is the condition for merge algorithm fulfilled? n How to fulfill the condition? n Using other method n Using the same divide and conquer method n To a trivial state: for example there are only one element for sorting. 清华大学宋斌恒 39 Complexity Analysis n How many memories does the algorithm use? n How many steps does the algorithm use? n Let T(n) be the number of steps used in the algorithm for an array of length n, then we know that by the divide and conquer that: n T(n)=2T(n/2)+Θ( n), where Θ is an asymptotic expression means tighten bound, its definition will be given in the next page【我们假设每次均分，而且正好可以均分】清华大学宋斌恒 40 Notation Θ n If there exists two positive constants c and d such that when n tends to infinity we have cg(n) ≤ f(n) ≤ dg(n) then we call g(n) is an asymptotically tight bounds of f(n) and denote it as f(n)= Θ(g(n)) n Other notations o[not tight upper bound], O[upper bound] and ?[lower bound], ω[not tight lower bound] can be found in p.41 in the reference book. 清华大学宋斌恒 41 Left problems n What is the upper bounds and lower bounds for the sorting problem? n How to solve the recursive equality(or inequality)? n T(n)=2T(n/2)+Θ( n) n How to converge the above equality to a well posed problem? n How to solve the problem?[Master theorem see p.76] 清华大学宋斌恒 42 Exercise n Big integer’s expression, addition and multiplication in a 32-bits microcomputer. n How to express a big integer? n We use an n-length of array of integers A[0,… ,n-1] to express a big integer which is less than or equal to 232(n+1)-1 but is bigger than 232n n Attention! We use 232 radix. 8 清华大学宋斌恒 43 Addition of big integer n Can you ever do a plus for big integers? n can you do a plus for small integers which is 32-bits or less? n How to divide the big integers summation to a series of small integers summations? n How to combine it together? n How many steps do you use in your algorithm? 清华大学宋斌恒 44 Multiplication of big integers n A=a1 232r + a2 n B=b1 232s + b 2 n Where r=A.length/2, s=B.length/2, hence the size of a1, a2 are half size of A, it is similar for B, b1, b2 n AB=a1 b1 232(s+r) +b1 a2 232s +a1 b2 232r +a2 b2 n Converge the multiplication to 4 smaller integers multiplications and summations 清华大学宋斌恒 45 Complexity analysis n T(n,m)----the steps used for multiplication of a 32n-bits and 32m -bits integers n T(n,m)=4T(n/2,m/2)+Θ(m+n) n Result? n Let n=m then n T(n)=4T(n/2)+Θ(n) n T(n)= Θ( n2 ) 清华大学宋斌恒 46 The above performance can be obtained by a directed method n Can you give a na?ve algorithm for multiplication of big integers? 清华大学宋斌恒 47 An improved algorithm n A=a1 232r + a2 n B=b1 232r + b2 n Let n=A.length= B.length and r=n/2, n AB=a1 b1 264r +(b1 a2 +a1 b2) 232r +a2 b2 n We can converge the multiplication to 4 smaller integers multiplications and 3 summations n Can we do better? 清华大学宋斌恒 48 We need only 3 multiplications! n u= a1 b1 n v= a2 b2 n (b1 a2 +a1 b2) =(a1 + a2) (b1 + b 2)-u-v n We needs only 3 multiplications to get: n a1 b1, a2 b2, (b1 a2 +a1 b2) 9 清华大学宋斌恒 49 More Divide Conquer Examples n Can you give more? n二路归并排序 n最接近点对 n二分搜索

课件简介

课件名称：	清华大学：算法分析与设计
课件分类：	计算机
课件类型：	电子教案
文件大小：	2.69MB
下载次数：	5
评论次数：	8
用户评分：	6.8

用户列表