1 1清华大学清华大学宋斌恒宋斌恒 第七讲第七讲图的基本算法图的基本算法 清华大学清华大学 宋斌恒宋斌恒 清华大学清华大学宋斌恒宋斌恒2 内容提要内容提要 Representation of graphics Breadth-first search Depth-first search Topological sort Strongly connected components 清华大学清华大学宋斌恒宋斌恒3 图的应用背景图的应用背景 可以利用图描述的经典问题有 Petri-Net Work Flow 城市与连接城市的道路 区域与隔离区域间的分界线 人与人之间的认识关系 任何二元关系都可以用图来表示 基本算法的应用 许多与图有关的算法需要做一种搜索 清华大学清华大学宋斌恒宋斌恒4 Representation of graphs Definition of a graph: G = (V, E) where V is a set of vertices, E is another set call Edges is a sub set of {(u,v): u and v is element of V} Concepts of graphs: Dense graph: if |E| is close to |V| 2 . Sparse graph: If it is not dense 清华大学清华大学宋斌恒宋斌恒5 1 2 5 4 3 2 1 2 2 4 5 5 4 5 1 / / 3 3 2 / / 4 / 01011 10110 01010 11101 10010 Figure 22.1 Two representations of an undirected graph. (a) An undirected graph G having five vertices and seven edges. (b) An adjacency-list representation of G. (c) The adjacency-matrix representation of G. 清华大学清华大学宋斌恒宋斌恒6 1 2 5 4 3 2 5 / 6 2 / 4 / 5 / / 4 / 000000 001000 000010 110000 010000 001010 6 2 清华大学清华大学宋斌恒宋斌恒7 Adjacency-list representation Let G be a graph, for any u ∈ V, we denote Adj[u] = { v: (u,v) ∈ E}, call the list of all adjacency vertices of u. In general element in Adj[u] is unordered In an undirected graph Sum(Adj[u], for all u ∈ V) = 2 |E| But in a directed graph Sum(Adj[u], for all u ∈ V) = 1 |E| 清华大学清华大学宋斌恒宋斌恒8 Data structure of Adjacency-list representation A set of vertex objects Vertex object contains one vertex u, and a set of vertices Adj[u], which contains all its adjacency vertices of u. ? Using what data structure to represent a set for the vertex objects ? Using what data structure to represent a set for Adj[u] 清华大学清华大学宋斌恒宋斌恒9 Weighted Graph We can assign value to the edges of a graph, then we call such a graph is a weighted graph. Classically, a weighted graph has its edge value belong to R, but we can extend it to values from any particular spaces, say, spaces containing complex objects w: E ?R is the weight function 清华大学清华大学宋斌恒宋斌恒10 Adjacency-list representation for Weighted Graphics We reconstruct the set Adj[u] as Adj[u]={(v,r): v ∈ V, r ∈ R} where R is the space of real number or generally any object space 清华大学清华大学宋斌恒宋斌恒11 More general graphics We can also assign values to the vertices of a graph, then it needs a more general representation to present such a graph !Give an Adjacency-list like data structure to represent such a graph. 清华大学清华大学宋斌恒宋斌恒12 Matrix representation A = (a ij ) n*n . a ij = 1 if (i,j) ∈E, 0 else. It is called the adjacency matrix ?How to present it for the case of weighted graph? ?For more general case: there are information both on the vertices and edges? 3 清华大学清华大学宋斌恒宋斌恒13 基本事实基本事实 Given a graph represented by adjacency- list, Compute the out-degree and in- degree of every vertices. Transpose of a graph Square of a directed graph: G 2 =(V,E 2 ), where E 2 is a set of (u,v) such that there exists at least one w in V such that both (u,w) and (w,v) are in V. Give an algorithm to compute it. 清华大学清华大学宋斌恒宋斌恒14 BFS: Breadth first search 1. BFS(G,s) 1. For each vertex u ∈ V[G]-{s} 1. Color[u]=white 2. d[u]=∞ 3. π [u]=null 2. color[s]=grey 3. d[s]=0 4. π [s]=null 5. Q=φ 6. Q.enqueue(s) 清华大学清华大学宋斌恒宋斌恒15 BFS 7. while(Q != φ) 1. u=Q.dequeue() 2. For each v ∈ Adj[u] 1. If (color[v] == white) 1. color[v]=gray 2. d[v]=d[u]+1 3. π [v]=u 4. Q.enqueue(v) 3. color[u]=black 清华大学清华大学宋斌恒宋斌恒16 BFS ∞ ∞ ∞ ∞ ∞∞∞∞ v w t ur x y s ∞ 0 ∞ ∞ ∞∞∞∞ v w t ur x y s Q={} Q={s} 清华大学清华大学宋斌恒宋斌恒17 BFS[续续] 1 0 ∞ ∞ ∞∞1∞ v w t ur x y s Q={r,w} 1 0 ∞ ∞ ∞∞12 v w t ur x y s Q={w,v} 清华大学清华大学宋斌恒宋斌恒18 BFS[续续] 1 0 2 ∞ ∞212 v w t ur x y s Q={v,t,x} 1 0 2 ∞ ∞212 v w t ur x y s Q={t,x} 4 清华大学清华大学宋斌恒宋斌恒19 BFS[续续] 1 0 2 3 ∞212 v w t ur x y s Q={x,u} 1 0 2 3 3212 v w t ur x y s Q={u,y} 清华大学清华大学宋斌恒宋斌恒20 BFS[续续] 1 0 2 3 3212 v w t ur x y s Q={y} 1 0 2 3 3212 v w t ur x y s Q={} 清华大学清华大学宋斌恒宋斌恒21 Analysis Using aggregate analysis,the complexity of BFS is O(E+V). Can you? 清华大学清华大学宋斌恒宋斌恒22 Shortest path 什么是最短路径【definition?】 By a BFS, we got a shortest-path distance δ(s,v) from s to v as the minimum number of edges in any path from vertex s to vertex v. If there is no path from s to v, then δ(s,v)=∞. 清华大学清华大学宋斌恒宋斌恒23 Lemma 22.1 Lemma 22.1. if (u,v)∈E[G], then δ(s,v) ≤δ(s,u)+1【三角不等式】 Proof. Case 1. u is reachable from s, then the distance from v is less than the length of the shortest path to u + (u,v),hence … Case 2. u is not reachable from s, then δ(s,u)=∞, hence … 清华大学清华大学宋斌恒宋斌恒24 Lemma 22.2 After the BFS run on the graph G from a given source vertex s, then upon termination, for each vertex v in V, d[v]≥δ(s,v) 5 清华大学清华大学宋斌恒宋斌恒25 Proof By induction of the number of enqueue operations. The inductive hypothesis is that d[v]≥δ(s,v) for all v ∈V all the time while computing. 【d[v]在下降】 清华大学清华大学宋斌恒宋斌恒26 证明(续)证明(续) Just after the initializing, that is the time that s enqueued. We can verify that the hypothesis is hold here: 1)d[s]= 0 = δ[s,s] 2)d[v]= ∞≥δ[s,v] 清华大学清华大学宋斌恒宋斌恒27 证明(续)证明(续) For the inductive step, consider a white vertex v that is discovered during the search from a vertex u. The inductive hypothesis implies that d[u] ≥δ[s,u]. s v u 清华大学清华大学宋斌恒宋斌恒28 证明(续)证明(续) from lemma 22.1, we obtain d[v]=d[u]+1≥δ[s,u]+1 ≥δ[s,v] Vertex v is enqueued, and d[v] never changed during the following computing, thus the hypothesis is maintained 清华大学清华大学宋斌恒宋斌恒29 Lemma 22.3 Let Q={v 1 ,v 2 ,…,v r }, v 1 is the head of the queue, and v r is the tail, then: ? d[v r ] ≤d[v 1 ]+1 ? d[v i ] ≤d[v i+1 ] for all i = 1,2,…,r-1. 清华大学清华大学宋斌恒宋斌恒30 When v enqueues, the last dequeued element is u, now its head v 1 satisfies d[u]≤d[v 1 ]. d[v r+1 ]=d[v]=d[u]+1≤d[v 1 ]+1 d[v r ]≤d[u]+1 (by induction) Implies d[v r ] ≤d[u]+1 =d[v]= d[v r+1 ] Hence the enqueue operation maintains the hypothesis. s v u Q={v 1 ,…, v r , v} Q={u, v 1 ,…, v r } 6 清华大学清华大学宋斌恒宋斌恒31 Corollary 22.4 d[u]≤d[v] if u enqueued before v. Proof. Immediate from the above lemma. 清华大学清华大学宋斌恒宋斌恒32 Theorem 22.5 (Correctness of the BFS): Let G be a directed or undirected graph, and suppose that BFS is run on G from a given s in V. Then, During the searching, it discovers all the vertices is reachable from s, and upon termination, d[v] = δ(s,v) One of the shortest path from s to v contains (π[v],v) if v is reachable from s. 清华大学清华大学宋斌恒宋斌恒33 Proof of Theorem 22.4 反证法:设顶点v是不满足以上定理的距离最 短的顶点之一,则d[v] ≠δ[s,v], 很显然, v≠s, 由于d[v] ≥δ[s,v]得到:d[v] >δ[s,v]. v 是从s可到达的,否则矛盾,故有最短路径 到达v,且其前一个顶点为u【一定有】,进 而得到:δ[s,v] =δ[s,u]+1? s v u 清华大学清华大学宋斌恒宋斌恒34 证明(续)证明(续) d[v]> δ[s,v] = δ[s,u]+1=d[u]+1.【利用假 设】 下边说明在u出队列之时,不论v是白、 灰、黑,都矛盾: 白:d[v] = d[u]+1,【白即缺省色:蓝】 灰:d[v] ≤d[u]+1, 黑:d[v] ≤d[u]。 而不是d[v]> d[u]+1 !! 清华大学清华大学宋斌恒宋斌恒35 Breadth first tree The procedure BFS builds a breadth-first tree. s is the root. 1 0 2 3 3212 v w t ur x y s 清华大学清华大学宋斌恒宋斌恒36 Diameter of a tree The diameter of a tree T = (V, E) is given by Max(δ(u,v), for u,v ∈V) Give an efficient solution. 7 清华大学清华大学宋斌恒宋斌恒37 DFS: Depth-first search 1/12 2/11 3/104/5 15/16 13/14 6/7 8/9 清华大学清华大学宋斌恒宋斌恒38 1. DFS(G) 1. For each vertex u ∈ V[G] 1. color[u]?white 2. π[u]?null 2. time?0 3. For each vertex u ∈ V[G] 1. If color[u] == white then DFS-Visit(u) 2. DFS-Visit(u) 1. color[u]?grey 2. time++ 3. d[u]?time 4. For each v ∈ Adj[u] 1. If color[v] == white then 1. π[v]?u 2. DFS-Visit(v) 5. color[u]?black 6. time++ 7. f(u)?time 清华大学清华大学宋斌恒宋斌恒39 1/ 1/ 2/ 1/ 2/ 3/ 1/ 2/ 4/ 3/ 清华大学清华大学宋斌恒宋斌恒40 1/ 2/ 4/ 3/ 1/ 2/ 4/5 1/ 2/ 4/5 3/6 1/ 2/7 3/ 3/64/5 清华大学清华大学宋斌恒宋斌恒41 1/ 2/7 3/64/5 1/8 2/7 3/64/5 1/8 2/7 9/ 3/64/5 1/8 2/7 9/ 3/64/5 清华大学清华大学宋斌恒宋斌恒42 1/8 2/7 9/ 10/3/64/5 1/8 2/7 9/ 10/3/64/5 1/8 2/7 9/ 10/113/64/5 1/8 2/7 9/12 10/113/64/5 8 清华大学清华大学宋斌恒宋斌恒43 DFS: Depth-first search To search as “deeper” as possible: If there is more edges can explore during the search, explore it. DFS result: It composes a DFS forest by a predecessor π. G π =(V,E π ), where E π ={(π[v],v): v ∈ V and π[v] != null} G π call the depth-first forest E π call the tree edges 清华大学清华大学宋斌恒宋斌恒44 Properties of depth-first search Theorem 22.7 (Parenthesis theorem) for any different vertices u and v in a graph, after a DFS, it satisfies: 1. The (d[u],f[u]) and (d[v],f[v]) are disjoint or one contains another entirely. 1. If (d[u],f[u]) contains (d[v],f[v]) entirely, then v is the descendent of u in a depth-first tree,反之亦 然 2. If (d[u],f[u]) and (d[v],f[v]) are disjoint, then u and v are not descendent each other. 清华大学清华大学宋斌恒宋斌恒45 Theorem 22.9 (White path theorem) Vertex v is a descendent of u, if at the discovery time d[u] of u, vertex v can be reached from u along a path consisting entirely of white vertices. 清华大学清华大学宋斌恒宋斌恒46 Classification of edges Tree edges Back edges Forward edges Cross edges 1/8 2/7 9/12 10/113/64/5 清华大学清华大学宋斌恒宋斌恒47 Concepts on topological sort On a DAG – directed acyclic graph, it can sort the vertices in an ordered sequence such that all of the edges is a subset of {(u,v): u appears earlier than v} 1/8 2/7 9/12 10/113/64/5 1/8 2/7 9/12 10/113/64/5 清华大学清华大学宋斌恒宋斌恒48 Topological sort Topological-sort(G) 1. Call DFS(G) 2. At each vertex is finished, insert it onto the front of a linked list 3. Return the linked list 9 清华大学清华大学宋斌恒宋斌恒49 Lemma22.11 A directed graph G is acyclic if and only if a depth-first search of G yields no back edges Proof. It makes a cycle in the graph if there exists a back edges. 清华大学清华大学宋斌恒宋斌恒50 Theorem 22.12 Topological-Sort(G) produces a topological sort of a directed acyclic graph G. 清华大学清华大学宋斌恒宋斌恒51 证明证明 设DFS 在图G=(V,E)运行完毕并得到所有顶点的 完成时间。我们只需要证明对任意的边(u,v)都有 f[v]<f[u]. 考虑对G中任意边(u,v),当此边被搜索到时,【此时u 是灰色的,】v不能是灰色的。 如果这样,v是u的祖先,而(u,v)成为回退边矛盾,故v只 能是白或黑, 如果v是白的,它将成为u的孩子【后代】,故有 f[v]<f[u], 如果是黑色,则它已经被搜索完毕,而u还未结束,故 有f[v]<f[u]。 按照完成时间逆序排列后所有的边将有序。 清华大学清华大学宋斌恒宋斌恒52 穿衣实例穿衣实例 袜子 鞋 裤子裤带 手表 衬衣领带内裤 手套大衣 清华大学清华大学宋斌恒宋斌恒53 Strongly connected components 强连通分支! Definition: in a directed graph, a strongly connected component contains maximal set of vertices C subset of V such that for every pair of vertices u and v in C, we have both u --->v and v -->u. 清华大学清华大学宋斌恒宋斌恒54 Strongly-Connected-Components(G) 1. Call DFS(G) to compute finishing time f[u] 2. Compute G T 3. Call DFS(G T ), but in the main loop of DFS, the vertices is selected in the order of decreasing f[u] 4. Output the vertices of each tree in the second DFS(G T ), we got the separate strongly connected components 10 清华大学清华大学宋斌恒宋斌恒55 强连通分支的性质强连通分支的性质 Component Graph: G SCC =(V SCC ,E SCC ), where V SCC is the representation of the strongly connected component, E SCC is the representation of the edges between the components. G SCC is a dag. Two way connection 清华大学清华大学宋斌恒宋斌恒56 Lemma 22.13 如果C和C’是图G的两个不同的强连通分 支,设u,v属于C;u’, v’属于C’,若有u到 u’的通路,则必没有v’到v的通路。 证明:平凡 清华大学清华大学宋斌恒宋斌恒57 引理引理22.14 设C和C’是图G的两个强连通分支,设(u,v) 是一条连接C和C’的边,其中u属于C,v属 于C’,则当DFS完成后有:f(C)>f(C’). 证明: 情况1:C中元素首先发现:白色路径定理 情况2:C’中元素首先发现:C’完成后才能进入 C 清华大学清华大学宋斌恒宋斌恒58 推论推论22.15 对于G T 则上述引理的不等式正好相反。 证明:平凡 清华大学清华大学宋斌恒宋斌恒59 定理定理22.16 上述算法可以得到一个图的强连通分支。 证明:提示:对G T 上进行DFS过程中生产 的每一棵树是强连通分支进行归纳法。 k=0即开始时是正确的【平凡】 设前k棵树是强连通分支,我们来分析第k+1 棵树:设此树之根为u,而u在图的强连通分支 C内,故有 f[u]=f(C)>f(C’),其中C’是任何没有被发现的连通 分支 清华大学清华大学宋斌恒宋斌恒60 证明(续)证明(续) 此时,C中其它顶点必为白色,由白色路径 定理得知,C中的其它顶点一定是此次DFS 搜索中u的后代, 另外,所有G T 其它从C引出的边都在已经发 现的强连通分支内,故不会有其它非C的未 访问的点在完成u的搜索中被访问到。 即u为根的树是G的一个强连通分支。 11 清华大学清华大学宋斌恒宋斌恒61 总结总结 BFS 距离 实质上的贪婪算法 聚集方法 DFS 白色路径定理 DAG的拓扑排序 强连通分支算法