Massive Algorithms: Geek: Kosaraju's Algorithm to Find Strongly Connected Components

Geek: Kosaraju's Algorithm to Find Strongly Connected Components

http://en.wikipedia.org/wiki/Kosaraju's_algorithm

Kosaraju's algorithm works as follows:

Let G be a directed graph and S be an empty stack.
While S does not contain all vertices:
- Choose an arbitrary vertex v not in S. Perform a depth-first search starting at v. Each time that depth-first search finishes expanding a vertex u, push u onto S.
Reverse the directions of all arcs to obtain the transpose graph.
While S is nonempty:
- Pop the top vertex v from S. Perform a depth-first search starting at v in the transpose graph. The set of visited vertices will give the strongly connected component containing v; record this and remove all these vertices from the graph G and the stack S. Equivalently, breadth-first search (BFS) can be used instead of depth-first search.

Provided the graph is described using an adjacency list, Kosaraju's algorithm performs two complete traversals of the graph and so runs in Θ(V+E) (linear) time, which isasymptotically optimal because there is a matching lower bound (any algorithm must examine all vertices and edges). It is the conceptually simplest efficient algorithm, but is not as efficient in practice as Tarjan's strongly connected components algorithm and the path-based strong component algorithm, which perform only one traversal of the graph.

If the graph is represented as an adjacency matrix, the algorithm requires Ο(V²) time.

How it Works?

http://scienceblogs.com/goodmath/2007/10/30/computing-strongly-connected-c/

Take the graph G, and
do a recursive depth-first traversal of the graph, along with an initially empty stack of vertices. As the recursion started at a vertex V finishes, push V onto the stack. At the end of the traversal, you’ll
have all of the vertices of G in the stack. The order of the reverse traversal will be starting
with the vertices on the top of that stack.

So you reverse all of the edges in the graph, creating the reverse graph, G’. Start with the
vertex on top of the stack, and to a traversal from that vertex. All of the nodes reachable from
that vertex form one strongly connected component. Remove everything in that SCC from the stack, and
then repeat the process with the new top of the stack. When the stack is empty, you’ll have accumulated all of the SCCs.

i-56b36b0280bacb6dc765751d3ba52d56-example-scc-graph.png

As usual, things are a whole lot clearer with an example. Let’s do that using the graph to the right as an example.

First, we do a depth-first traversal of the graph, putting vertices on a stack as we finish the
recursive step started with that node. We’ll start the traversal with “A”. A,F,E,C,G,H – H is finished,
so it goes on the stack. Then G goes on the stack; then C, then E, then F, and we’re back to A. So
at this point, the stack is [H, G, C, E, F]. Then we keep going from A: A, B, D. D is done, so it goes
on the stack; then B, then A. So the final stack is [H, G, C, E, F, D, B, A]

i-00b24fb44c66c3d127eb9338bffee76f-example-scc-graph-rev.png

Then, we reverse the graph, and start traversing from whatever is on top of the stack. So first, we start from A in the reversed graph. That gives us A, D, B. So {A, D, B} form one strongly connected
component. Now, the top of the stack that hasn’t been traversed yet is F. So we do F, E. {F,E} is
another SCC. The remaining stack is now [H, G, C], which traverses straightforwardly as C, H, G. So
we end up with {A, B, D}, {E, F}, and {C, G, H} as our SCCs.

Can we do better than that? Order of complexity, no. You can’t do better than the cost of a single traversal of the graph. You can eliminate the second traversal. Instead of doing that second
traversal, you can do the forward traversal using the stack, and then pull things off the stack
checking if they are the root of a strongly connected component, and if so, removing all members
of that component. Tarjan’s algorithm works this way, and ends up doing one traversal, but considering
each edge in the graph twice. So it’s slightly, but not dramatically faster. It’s generally preferred
just because it avoids the computation of the reverse graph.

http://www.acmerblog.com/strongly-connected-components-6099.html
Kosaraju算法、Tarjan算法、Gabow算法皆为寻找有向图强连通分量的有效算法。但是在Tarjan 算法和 Gabow 算法的过程中，只需要进行一次的深度优先搜索，而Kosaraju算法需要两次DFS，因而相对 Kosaraju 算法较有效率。这些算法可简称为SSC(strongly connected components)算法;
Kosaraju 算法即为算法导论一书给出的算法，比较直观和易懂。这个算法可以说是最容易理解，最通用的算法，其比较关键的部分是同时应用了原图G和反图GT。它利用了有向图的这样一个性质，一个图和他的transpose graph(边全部反向)具有相同的强连通分量！

算法思路：
1， DFS遍历原图，对每个访问到的节点标记时间戳（访问完成时间）。
2，按照时间戳的降序DFS遍历反向图，得到的每个连通块就是一个强连通分量。
算法步骤如下：
1) 创建一个空的栈 ‘S’ ，然后对图做DFS遍历. 在顶点访问完成后加入栈中。访问完成是说回溯返回时，而不是第一次发现该节点时。fillOrder()函数
2) 得到图转置，即将所有边反向。
3) 从S中依次弹出每个顶点，设为 ‘v’. 将v看做遍历的源点 (调用 DFSUtil(v)). 做第二次DFS遍历，可以找到以v为起点的强连通分量，打印出即可。

http://greatpowerlaw.wordpress.com/2013/05/03/scc-kosarajus-algorithm/

1. 第一次对图进行DFS遍历，获得所有节点的“reverse postordering”;
2. 将原图做逆向，即边 a -> b 改成 b -> a;
3. 按照节点的“reverse postordering”，对逆向图做DFS;

算法的依据就在于一个图的SCC，在逆图中也是SCC。所以，对逆图的DFS遍历，其实是对之前找到的连通子图再做一次过滤。后一个DFS理论上也能改成BFS。

算法最难理解的地方，在于为什么第二次DFS必须按照节点的“reverse postordering”来进行？我找了很多资料，最后在算法导论22.5中找到了最清楚的解释。

先解释一下如何获得节点的“reverse postordering”：
void dfs1 ( int v )
{
vis[v]=true;
for (int i=0 ; i<adj[v].size() ; i++ )
if ( !vis[adj[v][i]] )
dfs1(u);

ord.push_back(v); //reverse postordering
}
只有当节点的所有邻居都被DFS遍历完，然后将节点压栈，得到的才是reverse postordering，也是graph的topological ordering。

对于一个biGraph，一定存在一个SCC，它和其它的SCC之间只有out-edge而没有in-edge。如上图中的SCC-abe。对图做DFS，节点的“reverse postordering”的第一个节点一定会落到这样的SCC中。例如上图(a)，从节点c开始做DFS，最后完成的节点是b，位于SCC-abe中。

当对逆图进行DFS遍历时，因为SCC-abe在原图中没有in-edge，所有它在逆图中没有out-edge。因此，如果从b节点开始遍历，能够保证不会遍历到其他的SCC中去。所以“reverse postordering”的作用，在于找到逆图中没有out-edge的SCC中的节点。（简单的解释，更详细的还是得看算法导论）

006class Graph {

007    int V;    // 顶点个数

008    list<int> *adj;    //邻接表

010    // 最晚的完成时间的顶点放在栈顶

011    void fillOrder(int v, bool visited[], stack<int> &Stack);

012

013    // DFS打印以v起点的边

014    void DFSUtil(int v, bool visited[]);

015public:

016    Graph(int V);

017    void addEdge(int v, int w);

018

019    //打印所有的强连通分量

020    void printSCCs();

021

022    //得到当前图的转置图

023    Graph getTranspose();

024};

025

026Graph::Graph(int V) {

027    this->V = V;

028    adj = new list<int> [V];

029}

030

031void Graph::DFSUtil(int v, bool visited[]) {

032    visited[v] = true;

033    cout << v << " ";

034    list<int>::iterator i;

035    for (i = adj[v].begin(); i != adj[v].end(); ++i)

036        if (!visited[*i])

037            DFSUtil(*i, visited);

038}

039

040Graph Graph::getTranspose() {

041    Graph g(V);

042    for (int v = 0; v < V; v++) {

043        list<int>::iterator i;

044        for (i = adj[v].begin(); i != adj[v].end(); ++i) {

045            g.adj[*i].push_back(v);

046        }

047    }

048    return g;

049}

050

051void Graph::addEdge(int v, int w) {

052    adj[v].push_back(w);

053}

054

055void Graph::fillOrder(int v, bool visited[], stack<int> &Stack) {

056    visited[v] = true;

057    list<int>::iterator i;

058    for (i = adj[v].begin(); i != adj[v].end(); ++i)

059        if (!visited[*i])

060            fillOrder(*i, visited, Stack);

061

062    //所有从v顶点可达的顶点都已处理完，v顶点访问完成

063    Stack.push(v);

064}

065

066void Graph::printSCCs() {

067    stack<int> Stack;

068

069    bool *visited = new bool[V];

070    for (int i = 0; i < V; i++)

071        visited[i] = false;

072

073    // 根据完成时间压入栈中，栈顶是完成时间最晚的顶点

074    for (int i = 0; i < V; i++)

075        if (visited[i] == false)

076            fillOrder(i, visited, Stack);

077

078    // 创建转置图

079    Graph gr = getTranspose();

080

081    // 准备第二次DFS

082    for (int i = 0; i < V; i++)

083        visited[i] = false;

084

085    while (Stack.empty() == false) {

086        int v = Stack.top();

087        Stack.pop();

088        // 打印以v为起点的强连通分量

089        if (visited[v] == false) {

090            gr.DFSUtil(v, visited);

091            cout << endl;

092        }

093    }

094}

http://www.sanfoundry.com/java-program-kosaraju-algorithm/

```
public class Kosaraju
```
```
{
```
```
    /** DFS function **/
```

  public void DFS(List<Integer>[] graph, int v, boolean[] visited,

```
     List<Integer> comp) 
```
```
  {
```
```
        visited[v] = true;
```

        for (int i = 0; i < graph[v].size(); i++)

            if (!visited[graph[v].get(i)])

                DFS(graph, graph[v].get(i), visited, comp);

```
        comp.add(v);
```
```
  }
```
```
    /** function fill order **/
```

  public List<Integer> fillOrder(List<Integer>[] graph, boolean[] visited)

```
  {        
```
```
        int V = graph.length;
```

        List<Integer> order = new ArrayList<Integer>();

```
 
```
```
        for (int i = 0; i < V; i++)
```
```
            if (!visited[i])
```

                DFS(graph, i, visited, order);

```
        return order;
```
```
    }
```

    /** function to get transpose of graph **/

    public List<Integer>[] getTranspose(List<Integer>[] graph)

```
    {
```
```
        int V = graph.length;
```

        List<Integer>[] g = new List[V];

```
        for (int i = 0; i < V; i++)
```

            g[i] = new ArrayList<Integer>();

```
        for (int v = 0; v < V; v++)
```

            for (int i = 0; i < graph[v].size(); i++)

                g[graph[v].get(i)].add(v);

```
        return g;
```
```
    }
```
```
    /** function to get all SCC **/
```

    public List<List<Integer>> getSCComponents(List<Integer>[] graph)

```
    {
```
```
        int V = graph.length;
```

        boolean[] visited = new boolean[V];

        List<Integer> order = fillOrder(graph, visited);

        /** get transpose of the graph **/

        List<Integer>[] reverseGraph = getTranspose(graph);

```
        /** clear visited array **/
```
```
        visited = new boolean[V];
```

        /** reverse order or alternatively use a stack for order **/

```
        Collections.reverse(order);
```
```
 
```
```
        /** get all scc **/
```

        List<List<Integer>> SCComp = new ArrayList<>();

        for (int i = 0; i < order.size(); i++)

```
        {
```

            /** if stack is used for order pop out elements **/

```
            int v = order.get(i);
```
```
            if (!visited[v]) 
```
```
            {
```

                List<Integer> comp = new ArrayList<>();

                DFS(reverseGraph, v, visited, comp);

```
                SCComp.add(comp);
```
```
            }
```
```
        }
```
```
        return SCComp;
```
```
    }
```

Java code - DFS: Strongly connected components. Kosaraju's algorithm
http://www.keithschwarz.com/interesting/code/kosaraju/Kosaraju.java.html
http://algs4.cs.princeton.edu/42directed/KosarajuSharirSCC.java.html

The Proof
http://lcm.csa.iisc.ernet.in/dsa/node171.html
http://www.cse.yorku.ca/~andy/courses/3101/lecture-notes/SCC.pdf

Also check http://www.geeksforgeeks.org/strongly-connected-components/
http://blog.sina.com.cn/s/blog_6383bcba0100wdge.html
Read full article from Geek: Kosaraju's Algorithm to Find Strongly Connected Components

Geek: Kosaraju's Algorithm to Find Strongly Connected Components

Labels

Popular Posts