1
1清华大学清华大学宋斌恒宋斌恒
第七讲第七讲图的基本算法图的基本算法
清华大学清华大学
宋斌恒宋斌恒
清华大学清华大学宋斌恒宋斌恒2
内容提要内容提要
Representation of graphics
Breadth-first search
Depth-first search
Topological sort
Strongly connected components
清华大学清华大学宋斌恒宋斌恒3
图的应用背景图的应用背景
可以利用图描述的经典问题有
Petri-Net
Work Flow
城市与连接城市的道路
区域与隔离区域间的分界线
人与人之间的认识关系
任何二元关系都可以用图来表示
基本算法的应用
许多与图有关的算法需要做一种搜索
清华大学清华大学宋斌恒宋斌恒4
Representation of graphs
Definition of a graph:
G = (V, E) where V is a set of vertices, E is
another set call Edges is a sub set of {(u,v): u
and v is element of V}
Concepts of graphs:
Dense graph: if |E| is close to |V|
2
.
Sparse graph: If it is not dense
清华大学清华大学宋斌恒宋斌恒5
1 2
5 4
3
2
1
2
2
4
5
5
4
5
1
/
/
3
3
2
/
/
4 /
01011
10110
01010
11101
10010
Figure 22.1 Two representations of an undirected graph.
(a) An undirected graph G having five vertices and seven edges.
(b) An adjacency-list representation of G.
(c) The adjacency-matrix representation of G.
清华大学清华大学宋斌恒宋斌恒6
1
2
5
4
3
2
5 /
6
2 /
4
/
5
/
/
4 /
000000
001000
000010
110000
010000
001010
6
2
清华大学清华大学宋斌恒宋斌恒7
Adjacency-list representation
Let G be a graph, for any u ∈ V, we
denote Adj[u] = { v: (u,v) ∈ E}, call the list
of all adjacency vertices of u.
In general element in Adj[u] is unordered
In an undirected graph
Sum(Adj[u], for all u ∈ V) = 2 |E|
But in a directed graph
Sum(Adj[u], for all u ∈ V) = 1 |E|
清华大学清华大学宋斌恒宋斌恒8
Data structure of Adjacency-list
representation
A set of vertex objects
Vertex object contains one vertex u, and a
set of vertices Adj[u], which contains all its
adjacency vertices of u.
? Using what data structure to represent a
set for the vertex objects
? Using what data structure to represent a
set for Adj[u]
清华大学清华大学宋斌恒宋斌恒9
Weighted Graph
We can assign value to the edges of a
graph, then we call such a graph is a
weighted graph.
Classically, a weighted graph has its edge
value belong to R, but we can extend it to
values from any particular spaces, say,
spaces containing complex objects
w: E ?R is the weight function
清华大学清华大学宋斌恒宋斌恒10
Adjacency-list representation for
Weighted Graphics
We reconstruct the set Adj[u] as
Adj[u]={(v,r): v ∈ V, r ∈ R} where R is the
space of real number or generally any object
space
清华大学清华大学宋斌恒宋斌恒11
More general graphics
We can also assign values to the vertices
of a graph, then it needs a more general
representation to present such a graph
!Give an Adjacency-list like data structure
to represent such a graph.
清华大学清华大学宋斌恒宋斌恒12
Matrix representation
A = (a
ij
)
n*n
.
a
ij
= 1 if (i,j) ∈E, 0 else.
It is called the adjacency matrix
?How to present it for the case of weighted
graph?
?For more general case: there are
information both on the vertices and edges?
3
清华大学清华大学宋斌恒宋斌恒13
基本事实基本事实
Given a graph represented by adjacency-
list, Compute the out-degree and in-
degree of every vertices.
Transpose of a graph
Square of a directed graph: G
2
=(V,E
2
),
where E
2
is a set of (u,v) such that there
exists at least one w in V such that both
(u,w) and (w,v) are in V. Give an algorithm
to compute it.
清华大学清华大学宋斌恒宋斌恒14
BFS: Breadth first search
1. BFS(G,s)
1. For each vertex u ∈ V[G]-{s}
1. Color[u]=white
2. d[u]=∞
3. π [u]=null
2. color[s]=grey
3. d[s]=0
4. π [s]=null
5. Q=φ
6. Q.enqueue(s)
清华大学清华大学宋斌恒宋斌恒15
BFS
7. while(Q != φ)
1. u=Q.dequeue()
2. For each v ∈ Adj[u]
1. If (color[v] == white)
1. color[v]=gray
2. d[v]=d[u]+1
3. π [v]=u
4. Q.enqueue(v)
3. color[u]=black
清华大学清华大学宋斌恒宋斌恒16
BFS
∞ ∞ ∞ ∞
∞∞∞∞
v w
t ur
x y
s
∞ 0 ∞ ∞
∞∞∞∞
v w
t ur
x y
s
Q={}
Q={s}
清华大学清华大学宋斌恒宋斌恒17
BFS[续续]
1 0 ∞ ∞
∞∞1∞
v w
t ur
x y
s
Q={r,w}
1 0 ∞ ∞
∞∞12
v w
t ur
x y
s
Q={w,v}
清华大学清华大学宋斌恒宋斌恒18
BFS[续续]
1 0 2 ∞
∞212
v w
t ur
x y
s
Q={v,t,x}
1 0 2 ∞
∞212
v w
t ur
x y
s
Q={t,x}
4
清华大学清华大学宋斌恒宋斌恒19
BFS[续续]
1 0 2 3
∞212
v w
t ur
x y
s
Q={x,u}
1 0 2 3
3212
v w
t ur
x y
s
Q={u,y}
清华大学清华大学宋斌恒宋斌恒20
BFS[续续]
1 0 2 3
3212
v w
t ur
x y
s
Q={y}
1 0 2 3
3212
v w
t ur
x y
s
Q={}
清华大学清华大学宋斌恒宋斌恒21
Analysis
Using aggregate analysis,the complexity
of BFS is O(E+V).
Can you?
清华大学清华大学宋斌恒宋斌恒22
Shortest path
什么是最短路径【definition?】
By a BFS, we got a shortest-path distance
δ(s,v) from s to v as the minimum number
of edges in any path from vertex s to
vertex v.
If there is no path from s to v, then
δ(s,v)=∞.
清华大学清华大学宋斌恒宋斌恒23
Lemma 22.1
Lemma 22.1. if (u,v)∈E[G], then δ(s,v)
≤δ(s,u)+1【三角不等式】
Proof.
Case 1. u is reachable from s, then the
distance from v is less than the length of the
shortest path to u + (u,v),hence …
Case 2. u is not reachable from s, then
δ(s,u)=∞, hence …
清华大学清华大学宋斌恒宋斌恒24
Lemma 22.2
After the BFS run on the graph G from a
given source vertex s, then upon
termination, for each vertex v in V,
d[v]≥δ(s,v)
5
清华大学清华大学宋斌恒宋斌恒25
Proof
By induction of the number of enqueue
operations.
The inductive hypothesis is that d[v]≥δ(s,v)
for all v ∈V all the time while computing.
【d[v]在下降】
清华大学清华大学宋斌恒宋斌恒26
证明(续)证明(续)
Just after the initializing, that is the time
that s enqueued. We can verify that the
hypothesis is hold here:
1)d[s]= 0 = δ[s,s]
2)d[v]= ∞≥δ[s,v]
清华大学清华大学宋斌恒宋斌恒27
证明(续)证明(续)
For the inductive step, consider a white
vertex v that is discovered during the
search from a vertex u. The inductive
hypothesis implies that
d[u] ≥δ[s,u].
s
v
u
清华大学清华大学宋斌恒宋斌恒28
证明(续)证明(续)
from lemma 22.1, we obtain
d[v]=d[u]+1≥δ[s,u]+1 ≥δ[s,v]
Vertex v is enqueued, and d[v] never
changed during the following computing,
thus the hypothesis is maintained
清华大学清华大学宋斌恒宋斌恒29
Lemma 22.3
Let Q={v
1
,v
2
,…,v
r
}, v
1
is the head of the
queue, and v
r
is the tail, then:
? d[v
r
] ≤d[v
1
]+1
? d[v
i
] ≤d[v
i+1
] for all i = 1,2,…,r-1.
清华大学清华大学宋斌恒宋斌恒30
When v enqueues, the last dequeued
element is u, now its head v
1
satisfies
d[u]≤d[v
1
].
d[v
r+1
]=d[v]=d[u]+1≤d[v
1
]+1
d[v
r
]≤d[u]+1 (by induction)
Implies d[v
r
] ≤d[u]+1 =d[v]= d[v
r+1
]
Hence the enqueue operation maintains
the hypothesis.
s
v
u
Q={v
1
,…, v
r
, v}
Q={u, v
1
,…, v
r
}
6
清华大学清华大学宋斌恒宋斌恒31
Corollary 22.4
d[u]≤d[v] if u enqueued before v.
Proof. Immediate from the above lemma.
清华大学清华大学宋斌恒宋斌恒32
Theorem 22.5
(Correctness of the BFS): Let G be a
directed or undirected graph, and suppose
that BFS is run on G from a given s in V.
Then,
During the searching, it discovers all the
vertices is reachable from s, and upon
termination,
d[v] = δ(s,v)
One of the shortest path from s to v contains
(π[v],v) if v is reachable from s.
清华大学清华大学宋斌恒宋斌恒33
Proof of Theorem 22.4
反证法:设顶点v是不满足以上定理的距离最
短的顶点之一,则d[v] ≠δ[s,v], 很显然,
v≠s, 由于d[v] ≥δ[s,v]得到:d[v] >δ[s,v].
v 是从s可到达的,否则矛盾,故有最短路径
到达v,且其前一个顶点为u【一定有】,进
而得到:δ[s,v] =δ[s,u]+1 ?
s
v
u
清华大学清华大学宋斌恒宋斌恒34
证明(续)证明(续)
d[v]> δ[s,v] = δ[s,u]+1=d[u]+1.【利用假
设】
下边说明在u出队列之时,不论v是白、
灰、黑,都矛盾:
白:d[v] = d[u]+1,【白即缺省色:蓝】
灰:d[v] ≤d[u]+1,
黑:d[v] ≤d[u]。
而不是d[v]> d[u]+1 !!
清华大学清华大学宋斌恒宋斌恒35
Breadth first tree
The procedure BFS builds a breadth-first
tree. s is the root.
1 0 2 3
3212
v w
t ur
x y
s
清华大学清华大学宋斌恒宋斌恒36
Diameter of a tree
The diameter of a tree T = (V, E) is given
by
Max(δ(u,v), for u,v ∈V)
Give an efficient solution.
7
清华大学清华大学宋斌恒宋斌恒37
DFS: Depth-first search
1/12 2/11
3/104/5
15/16
13/14
6/7 8/9
清华大学清华大学宋斌恒宋斌恒38
1. DFS(G)
1. For each vertex u ∈ V[G]
1. color[u] ?white
2. π[u] ?null
2. time ?0
3. For each vertex u ∈ V[G]
1. If color[u] == white then DFS-Visit(u)
2. DFS-Visit(u)
1. color[u] ?grey
2. time++
3. d[u] ?time
4. For each v ∈ Adj[u]
1. If color[v] == white then
1. π[v] ?u
2. DFS-Visit(v)
5. color[u] ?black
6. time++
7. f(u) ?time
清华大学清华大学宋斌恒宋斌恒39
1/ 1/ 2/
1/ 2/
3/
1/ 2/
4/ 3/
清华大学清华大学宋斌恒宋斌恒40
1/ 2/
4/ 3/
1/ 2/
4/5
1/ 2/
4/5 3/6
1/ 2/7
3/
3/64/5
清华大学清华大学宋斌恒宋斌恒41
1/ 2/7
3/64/5
1/8 2/7
3/64/5
1/8 2/7 9/
3/64/5
1/8 2/7 9/
3/64/5
清华大学清华大学宋斌恒宋斌恒42
1/8 2/7 9/
10/3/64/5
1/8 2/7 9/
10/3/64/5
1/8 2/7 9/
10/113/64/5
1/8 2/7 9/12
10/113/64/5
8
清华大学清华大学宋斌恒宋斌恒43
DFS: Depth-first search
To search as “deeper” as possible:
If there is more edges can explore during the
search, explore it.
DFS result:
It composes a DFS forest by a predecessor π.
G
π
=(V,E
π
), where
E
π
={(π[v],v): v ∈ V and π[v] != null}
G
π
call the depth-first forest
E
π
call the tree edges
清华大学清华大学宋斌恒宋斌恒44
Properties of depth-first search
Theorem 22.7 (Parenthesis theorem) for
any different vertices u and v in a graph,
after a DFS, it satisfies:
1. The (d[u],f[u]) and (d[v],f[v]) are disjoint or
one contains another entirely.
1. If (d[u],f[u]) contains (d[v],f[v]) entirely, then v is
the descendent of u in a depth-first tree,反之亦
然
2. If (d[u],f[u]) and (d[v],f[v]) are disjoint, then u and
v are not descendent each other.
清华大学清华大学宋斌恒宋斌恒45
Theorem 22.9 (White path
theorem)
Vertex v is a descendent of u, if at the
discovery time d[u] of u, vertex v can be
reached from u along a path consisting
entirely of white vertices.
清华大学清华大学宋斌恒宋斌恒46
Classification of edges
Tree edges
Back edges
Forward edges
Cross edges
1/8 2/7 9/12
10/113/64/5
清华大学清华大学宋斌恒宋斌恒47
Concepts on topological sort
On a DAG – directed acyclic graph, it can
sort the vertices in an ordered sequence
such that all of the edges is a subset of
{(u,v): u appears earlier than v}
1/8 2/7 9/12
10/113/64/5
1/8 2/7 9/12
10/113/64/5
清华大学清华大学宋斌恒宋斌恒48
Topological sort
Topological-sort(G)
1. Call DFS(G)
2. At each vertex is finished, insert it onto the
front of a linked list
3. Return the linked list
9
清华大学清华大学宋斌恒宋斌恒49
Lemma22.11
A directed graph G is acyclic if and only if
a depth-first search of G yields no back
edges
Proof. It makes a cycle in the graph if
there exists a back edges.
清华大学清华大学宋斌恒宋斌恒50
Theorem 22.12
Topological-Sort(G) produces a
topological sort of a directed acyclic
graph G.
清华大学清华大学宋斌恒宋斌恒51
证明证明
设DFS 在图G=(V,E)运行完毕并得到所有顶点的
完成时间。我们只需要证明对任意的边(u,v)都有
f[v]<f[u].
考虑对G中任意边(u,v),当此边被搜索到时,【此时u
是灰色的,】v不能是灰色的。
如果这样,v是u的祖先,而(u,v)成为回退边矛盾,故v只
能是白或黑,
如果v是白的,它将成为u的孩子【后代】,故有
f[v]<f[u],
如果是黑色,则它已经被搜索完毕,而u还未结束,故
有f[v]<f[u]。
按照完成时间逆序排列后所有的边将有序。
清华大学清华大学宋斌恒宋斌恒52
穿衣实例穿衣实例
袜子
鞋
裤子裤带
手表
衬衣领带内裤
手套大衣
清华大学清华大学宋斌恒宋斌恒53
Strongly connected components
强连通分支!
Definition: in a directed graph, a strongly
connected component contains maximal
set of vertices C subset of V such that for
every pair of vertices u and v in C, we
have both u --->v and v -->u.
清华大学清华大学宋斌恒宋斌恒54
Strongly-Connected-Components(G)
1. Call DFS(G) to compute finishing time f[u]
2. Compute G
T
3. Call DFS(G
T
), but in the main loop of DFS,
the vertices is selected in the order of
decreasing f[u]
4. Output the vertices of each tree in the
second DFS(G
T
), we got the separate
strongly connected components
10
清华大学清华大学宋斌恒宋斌恒55
强连通分支的性质强连通分支的性质
Component Graph: G
SCC
=(V
SCC
,E
SCC
),
where V
SCC
is the representation of the
strongly connected component, E
SCC
is the
representation of the edges between the
components.
G
SCC
is a dag.
Two way connection
清华大学清华大学宋斌恒宋斌恒56
Lemma 22.13
如果C和C’是图G的两个不同的强连通分
支,设u,v属于C;u’, v’属于C’,若有u到
u’的通路,则必没有v’到v的通路。
证明:平凡
清华大学清华大学宋斌恒宋斌恒57
引理引理22.14
设C和C’是图G的两个强连通分支,设(u,v)
是一条连接C和C’的边,其中u属于C,v属
于C’,则当DFS完成后有:f(C)>f(C’).
证明:
情况1:C中元素首先发现:白色路径定理
情况2:C’中元素首先发现:C’完成后才能进入
C
清华大学清华大学宋斌恒宋斌恒58
推论推论22.15
对于G
T
则上述引理的不等式正好相反。
证明:平凡
清华大学清华大学宋斌恒宋斌恒59
定理定理22.16
上述算法可以得到一个图的强连通分支。
证明:提示:对G
T
上进行DFS过程中生产
的每一棵树是强连通分支进行归纳法。
k=0即开始时是正确的【平凡】
设前k棵树是强连通分支,我们来分析第k+1
棵树:设此树之根为u,而u在图的强连通分支
C内,故有
f[u]=f(C)>f(C’),其中C’是任何没有被发现的连通
分支
清华大学清华大学宋斌恒宋斌恒60
证明(续)证明(续)
此时,C中其它顶点必为白色,由白色路径
定理得知,C中的其它顶点一定是此次DFS
搜索中u的后代,
另外,所有G
T
其它从C引出的边都在已经发
现的强连通分支内,故不会有其它非C的未
访问的点在完成u的搜索中被访问到。
即u为根的树是G的一个强连通分支。
11
清华大学清华大学宋斌恒宋斌恒61
总结总结
BFS
距离
实质上的贪婪算法
聚集方法
DFS
白色路径定理
DAG的拓扑排序
强连通分支算法