Massive Algorithms: Greedy Algorithms | Set 2 (Kruskal's Minimum Spanning Tree Algorithm)

Greedy Algorithms | Set 2 (Kruskal's Minimum Spanning Tree Algorithm) | GeeksforGeeks

http://www.geeksforgeeks.org/greedy-algorithms-set-2-kruskals-minimum-spanning-tree-mst/

What is a Minimum Spanning Tree?
Given a connected, undirected graph, a spanning tree of that graph is a subgraph that is a tree and connects all the vertices together. A single graph can have many different spanning trees. A minimum spanning tree (MST), or minimum weight spanning tree, for a weighted, connected, undirected graph is a spanning tree whose weight is less than or equal to the weight of every other spanning tree. The weight of a spanning tree is the sum of the weights of its edges.

How many edges does a minimum spanning tree have?
A minimum spanning tree has (V – 1) edges where V is the number of vertices in the given graph.
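
To make the definition concrete, the sketch below enumerates every set of V-1 edges of a small graph by brute force, keeps the sets that are acyclic (and therefore spanning trees), and reports how many there are and the smallest total weight. The class name SpanningTreeBruteForce, the {src, dest, weight} array encoding, and the sample graph are assumptions made for illustration; this approach is only practical for tiny graphs.

```java
class SpanningTreeBruteForce {
    // Follow parent links to the representative (root) of x's set.
    static int find(int[] parent, int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Returns {number of spanning trees, minimum spanning-tree weight}
    // for a graph with v vertices and edges given as {src, dest, weight}.
    static int[] enumerate(int v, int[][] edges) {
        int count = 0;
        int best = Integer.MAX_VALUE;
        // Try every subset of edges, encoded as a bitmask.
        for (int mask = 0; mask < (1 << edges.length); mask++) {
            if (Integer.bitCount(mask) != v - 1) continue; // a spanning tree has V-1 edges
            int[] parent = new int[v];
            for (int i = 0; i < v; i++) parent[i] = i;
            int weight = 0;
            boolean acyclic = true;
            for (int i = 0; i < edges.length && acyclic; i++) {
                if ((mask & (1 << i)) == 0) continue;
                int a = find(parent, edges[i][0]);
                int b = find(parent, edges[i][1]);
                if (a == b) {
                    acyclic = false;       // this edge closes a cycle
                } else {
                    parent[a] = b;         // merge the two components
                    weight += edges[i][2];
                }
            }
            // V-1 acyclic edges on V vertices always connect all of them.
            if (acyclic) {
                count++;
                best = Math.min(best, weight);
            }
        }
        return new int[] {count, best};
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1, 10}, {0, 2, 6}, {0, 3, 5}, {1, 3, 15}, {2, 3, 4} };
        int[] r = enumerate(4, edges);
        // prints "8 spanning trees, minimum weight 19"
        System.out.println(r[0] + " spanning trees, minimum weight " + r[1]);
    }
}
```

This 4-vertex, 5-edge graph has 8 spanning trees, each with 3 edges, and the lightest one weighs 19.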

Kruskal's algorithm builds an MST in three steps:
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest remaining edge. Check whether it forms a cycle with the spanning tree formed so far. If it does not, include the edge; otherwise, discard it.
3. Repeat step 2 until there are (V-1) edges in the spanning tree.


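Equivalently, in terms of connected components:
Sort all the edges of the original graph by weight, from smallest to largest.
Starting from the lowest-weight edge, if the two vertices joined by the edge are not in the same connected component of graph G (the graph being built), add the edge to G.
Repeat the previous step until all the vertices of G are in the same connected component.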


Step 2 uses the Union-Find algorithm to detect cycles.
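
As a minimal sketch of that cycle check: each vertex starts in its own set, and an edge whose two endpoints already share a root would close a cycle. The class name CycleCheck and the {src, dest} array encoding are assumptions for illustration, and the sketch leaves out path compression and union by rank for brevity.

```java
class CycleCheck {
    // Follow parent links to the representative (root) of x's set.
    static int find(int[] parent, int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Returns true if the undirected graph with n vertices and the
    // given {src, dest} edges contains a cycle.
    static boolean hasCycle(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i; // every vertex starts alone
        for (int[] e : edges) {
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            if (a == b) return true; // endpoints already connected: cycle
            parent[a] = b;           // union the two sets
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasCycle(3, new int[][] { {0, 1}, {1, 2} }));         // prints false
        System.out.println(hasCycle(3, new int[][] { {0, 1}, {1, 2}, {2, 0} })); // prints true
    }
}
```

Kruskal's algorithm performs the same test once per candidate edge, but discards the offending edge instead of stopping.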

The algorithm is a Greedy Algorithm. The Greedy Choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed so far.

Time Complexity: O(E log E) or O(E log V). Sorting the edges takes O(E log E) time. After sorting, we iterate through all the edges and apply the union-find algorithm. The find and union operations take at most O(log V) time each, so the overall complexity is O(E log E + E log V). Since E is at most V^2, log E is at most 2 log V, so O(log E) and O(log V) are the same. Therefore, the overall time complexity is O(E log E) or, equivalently, O(E log V).
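
The step from log E to log V can be spelled out as follows, assuming a simple connected graph (no parallel edges), so that V - 1 <= E <= V^2:

```latex
\log(V - 1) \le \log E \le \log V^2 = 2\log V
\quad\Longrightarrow\quad
\log E = \Theta(\log V)
\quad\Longrightarrow\quad
O(E \log E + E \log V) = O(E \log V).
```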

import java.util.Arrays;

class Graph
{
    // A class to represent a graph edge
    class Edge implements Comparable<Edge>
    {
        int src, dest, weight;

        // Comparator function used for sorting edges based on
        // their weight. Integer.compare avoids the overflow that
        // subtracting two ints can cause.
        public int compareTo(Edge compareEdge)
        {
            return Integer.compare(this.weight, compareEdge.weight);
        }
    }

    // A class to represent a subset for union-find
    class subset
    {
        int parent, rank;
    }

    int V, E;    // V -> no. of vertices, E -> no. of edges
    Edge edge[]; // collection of all edges

    // Creates a graph with V vertices and E edges
    Graph(int v, int e)
    {
        V = v;
        E = e;
        edge = new Edge[E];
        for (int i = 0; i < e; ++i)
            edge[i] = new Edge();
    }

    // A utility function to find set of an element i
    // (uses path compression technique)
    int find(subset subsets[], int i)
    {
        // find root and make root as parent of i (path compression)
        if (subsets[i].parent != i)
            subsets[i].parent = find(subsets, subsets[i].parent);
        return subsets[i].parent;
    }

    // A function that does union of two sets of x and y
    // (uses union by rank)
    void Union(subset subsets[], int x, int y)
    {
        int xroot = find(subsets, x);
        int yroot = find(subsets, y);

        // Attach smaller rank tree under root of high rank tree
        // (Union by Rank)
        if (subsets[xroot].rank < subsets[yroot].rank)
            subsets[xroot].parent = yroot;
        else if (subsets[xroot].rank > subsets[yroot].rank)
            subsets[yroot].parent = xroot;
        // If ranks are same, then make one as root and increment
        // its rank by one
        else
        {
            subsets[yroot].parent = xroot;
            subsets[xroot].rank++;
        }
    }

    // The main function to construct MST using Kruskal's algorithm
    void KruskalMST()
    {
        Edge result[] = new Edge[V];  // This will store the resultant MST
        int e = 0;  // An index variable, used for result[]
        int i = 0;  // An index variable, used for sorted edges

        // Step 1:  Sort all the edges in non-decreasing order of their
        // weight.  If we are not allowed to change the given graph, we
        // can create a copy of array of edges
        Arrays.sort(edge);

        // Allocate memory for creating V subsets
        subset subsets[] = new subset[V];
        for (i = 0; i < V; ++i)
            subsets[i] = new subset();

        // Create V subsets with single elements
        for (int v = 0; v < V; ++v)
        {
            subsets[v].parent = v;
            subsets[v].rank = 0;
        }

        i = 0;  // Index used to pick next edge

        // Number of edges to be taken is equal to V-1. The i < E check
        // stops the loop if the graph is not connected and the edges
        // run out first.
        while (e < V - 1 && i < E)
        {
            // Step 2: Pick the smallest edge. And increment the index
            // for next iteration
            Edge next_edge = edge[i++];

            int x = find(subsets, next_edge.src);
            int y = find(subsets, next_edge.dest);

            // If including this edge doesn't cause a cycle, include it
            // in result and increment the index of result for next edge
            if (x != y)
            {
                result[e++] = next_edge;
                Union(subsets, x, y);
            }
            // Else discard the next_edge
        }

        // print the contents of result[] to display the built MST
        System.out.println("Following are the edges in the constructed MST");
        for (i = 0; i < e; ++i)
            System.out.println(result[i].src + " -- " + result[i].dest + " == " +
                               result[i].weight);
    }

}

Java Implementation
http://algs4.cs.princeton.edu/43mst/KruskalMST.java.html
http://codingrecipies.blogspot.com/2013/09/kruskals-algorithm_17.html

 private void KruskalMST()
 {
  // sort the edge list by weight
  Collections.sort(mEdgeList);

  UnionFind uf = new UnionFind(mNumVertices);

  // Iterate over every edge in the sorted list (bounded by the number
  // of edges, not the number of vertices)
  for (int i = 0; i < mEdgeList.size(); i++)
  {
   Edge edge = mEdgeList.get(i);
   int v1 = uf.Find(edge.src);  // root of the source's set
   int v2 = uf.Find(edge.dest); // root of the destination's set
   // if the roots differ, the edge joins two components:
   // add it to the MST and union the two sets
   if (v1 != v2)
   {
    mResultantEdgeList.add(edge);
    uf.Union(v1, v2);
   }
  }
  // print the final MST
  printKruskalEdges();
 }

class UFNode {
 int parent; // parent of this vertex in the nodeHolder
 int rank; // number of vertices in the tree (cluster) rooted here; despite the name, this is a size

 UFNode(int parent, int rank) {
  this.parent = parent;
  this.rank = rank;
 }
}
public class UnionFind {
 // one UFNode per vertex
 private UFNode[] nodeHolder;

 // number of disjoint sets (components) remaining
 private int count;

 public UnionFind(int size) {
  if (size < 0)
   throw new IllegalArgumentException();

  count = size;
  nodeHolder = new UFNode[size];
  for (int i = 0; i < size; i++) {
   nodeHolder[i] = new UFNode(i, 1); // default values, node points to
            // itself and rank is 1
  }
 }

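 /**
  * Finds the root (set representative) of a given vertex, using recursion
  * with path compression.
  * 
  * @param vertex the vertex to look up
  * @return the root of the set containing the vertex
  */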
 public int Find(int vertex) {
  if (vertex < 0 || vertex >= nodeHolder.length)
   throw new IndexOutOfBoundsException();

  if (nodeHolder[vertex].parent != vertex)
   nodeHolder[vertex].parent = Find(nodeHolder[vertex].parent);

  return nodeHolder[vertex].parent;
 }

 public int getCount() {
  return count;
 }

 /**
  * @return true if both vertices have the same root
  */
 public boolean isConnected(int v1, int v2) {
  return Find(v1) == Find(v2);
 }

 public void Union(int v1, int v2) {
  int i = Find(v1);
  int j = Find(v2);

  if (i == j)
   return;

  if (nodeHolder[i].rank < nodeHolder[j].rank) {
   nodeHolder[i].parent = j;
   nodeHolder[j].rank = nodeHolder[j].rank + nodeHolder[i].rank;
  } else {
   nodeHolder[j].parent = i;
   nodeHolder[i].rank = nodeHolder[i].rank + nodeHolder[j].rank;
  }
  count--;
 }
}

http://www.keithschwarz.com/interesting/code/kruskal/Kruskal.java.html
        /* Now, sweep over the edges, adding each edge if its endpoints aren't
         * in the same partition.
         */
        for (Edge<T> edge: edges) {
            /* If the endpoints are connected, skip this edge. */
            if (unionFind.find(edge.start) == unionFind.find(edge.end))
                continue;

            /* Otherwise, add the edge. */
            result.addEdge(edge.start, edge.end, edge.cost);

            /* Link the endpoints together. */
            unionFind.union(edge.start, edge.end);

            /* If we've added enough edges already, we can quit. A spanning
             * tree has one edge fewer than the graph has nodes.
             */
            if (++numEdges == graph.size() - 1) break;
        }
http://www.codebytes.in/2015/03/kruskals-algorithm-implementation-in.html
http://algs4.cs.princeton.edu/43mst/
Kruskal's algorithm processes the edges in order of their weight values (smallest to largest), taking for the MST (coloring black) each edge that does not form a cycle with edges previously added, stopping after adding V-1 edges.
We use a priority queue to consider the edges in order by weight, a union-find data structure to identify those that cause cycles, and a queue to collect the MST edges.

Kruskal's algorithm computes the MST of any connected edge-weighted graph with E edges and V vertices using extra space proportional to E and time proportional to E log E (in the worst case).
http://algs4.cs.princeton.edu/43mst/KruskalMST.java.html
This implementation does not sort all the edges up front; instead, it uses a priority queue:

    private double weight;                        // weight of MST
    private Queue<Edge> mst = new Queue<Edge>();  // edges in MST

    /**
     * Compute a minimum spanning tree (or forest) of an edge-weighted graph.
     * @param G the edge-weighted graph
     */
    public KruskalMST(EdgeWeightedGraph G) {
        // more efficient to build heap by passing array of edges
        MinPQ<Edge> pq = new MinPQ<Edge>();
        for (Edge e : G.edges()) {
            pq.insert(e);
        }

        // run greedy algorithm
        UF uf = new UF(G.V());
        while (!pq.isEmpty() && mst.size() < G.V() - 1) {
            Edge e = pq.delMin();
            int v = e.either();
            int w = e.other(v);
            if (!uf.connected(v, w)) { // v-w does not create a cycle
                uf.union(v, w);  // merge v and w components
                mst.enqueue(e);  // add edge e to mst
                weight += e.weight();
            }
        }

        // check optimality conditions
        assert check(G);
    }

    // check optimality conditions (takes time proportional to E V lg* V)
    private boolean check(EdgeWeightedGraph G) {

        // check total weight
        double total = 0.0;
        for (Edge e : edges()) {
            total += e.weight();
        }
        if (Math.abs(total - weight()) > FLOATING_POINT_EPSILON) {
            System.err.printf("Weight of edges does not equal weight(): %f vs. %f\n", total, weight());
            return false;
        }

        // check that it is acyclic
        UF uf = new UF(G.V());
        for (Edge e : edges()) {
            int v = e.either(), w = e.other(v);
            if (uf.connected(v, w)) {
                System.err.println("Not a forest");
                return false;
            }
            uf.union(v, w);
        }

        // check that it is a spanning forest
        for (Edge e : G.edges()) {
            int v = e.either(), w = e.other(v);
            if (!uf.connected(v, w)) {
                System.err.println("Not a spanning forest");
                return false;
            }
        }

        // check that it is a minimal spanning forest (cut optimality conditions)
        for (Edge e : edges()) {

            // all edges in MST except e
            uf = new UF(G.V());
            for (Edge f : mst) {
                int x = f.either(), y = f.other(x);
                if (f != e) uf.union(x, y);
            }
            
            // check that e is min weight edge in crossing cut
            for (Edge f : G.edges()) {
                int x = f.either(), y = f.other(x);
                if (!uf.connected(x, y)) {
                    if (f.weight() < e.weight()) {
                        System.err.println("Edge " + f + " violates cut optimality conditions");
                        return false;
                    }
                }
            }

        }

        return true;
    }
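
The same priority-queue idea can be sketched with only the standard library. The version below uses java.util.PriorityQueue in place of MinPQ and a plain parent array in place of UF; the class name KruskalWithHeap and the {src, dest, weight} array encoding are assumptions for illustration, not part of the algs4 code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class KruskalWithHeap {
    // Find the root of x's set, halving the path along the way.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // Returns the MST (or spanning forest) edges of a graph with v vertices;
    // each edge is {src, dest, weight}.
    static List<int[]> mst(int v, int[][] edges) {
        // Min-heap ordered by edge weight.
        PriorityQueue<int[]> pq =
            new PriorityQueue<>((a, b) -> Integer.compare(a[2], b[2]));
        for (int[] e : edges) pq.add(e);

        int[] parent = new int[v];
        for (int i = 0; i < v; i++) parent[i] = i;

        List<int[]> result = new ArrayList<>();
        // Stop as soon as V-1 edges are chosen; the remaining edges
        // are never removed from the heap.
        while (!pq.isEmpty() && result.size() < v - 1) {
            int[] e = pq.poll();
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            if (a != b) {       // e joins two different components
                parent[a] = b;
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1, 10}, {0, 2, 6}, {0, 3, 5}, {1, 3, 15}, {2, 3, 4} };
        int total = 0;
        for (int[] e : mst(4, edges)) {
            System.out.println(e[0] + " -- " + e[1] + " == " + e[2]);
            total += e[2];
        }
        System.out.println("Total weight: " + total); // prints "Total weight: 19"
    }
}
```

The potential saving over a full sort is that the loop can stop after V-1 accepted edges, leaving the heavier edges unordered in the heap.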

https://www.quora.com/What-is-the-difference-in-Kruskals-and-Prims-algorithm
The basic difference is in which edge you choose to add next to the spanning tree in each step.

In Prim's, you always keep a connected component, starting with a single vertex. You look at all edges from the current component to other vertices and find the smallest among them. You then add the neighbouring vertex to the component, increasing its size by 1. In N-1 steps, every vertex would be merged to the current one if we have a connected graph.

In Kruskal's, you do not keep one connected component but a forest. At each stage, you look at the globally smallest edge that does not create a cycle in the current forest. Such an edge has to necessarily merge two trees in the current forest into one. Since you start with N single-vertex trees, in N-1 steps, they would all have merged into one if the graph was connected.

In Kruskal's algorithm at any point of time, the set of selected edges need not belong to the same tree. But at the end we will have a single spanning tree.

In terms of implementation, the difference is:

In Kruskal's, every edge is considered only once: it is either selected or rejected.
In Prim's, an edge can be examined more than once, because the set of candidate edges changes every time a vertex joins the tree. Kruskal's can therefore fix the processing order of the edges once, up front, with a single sort or heap, whereas Prim's has to keep updating its priority queue as the tree grows.
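
For contrast with the Kruskal code earlier in this post, here is a minimal sketch of a "lazy" variant of Prim's algorithm, which grows a single tree from vertex 0 and refills its priority queue whenever a vertex joins the tree. The class name LazyPrim and the {src, dest, weight} array encoding are assumptions for illustration, and the sketch assumes the graph is connected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class LazyPrim {
    // Returns the MST weight of a connected graph with v vertices;
    // each edge is {src, dest, weight}.
    static int mstWeight(int v, int[][] edges) {
        if (v == 0) return 0;
        // Adjacency lists: every undirected edge is stored at both endpoints
        // as {neighbour, weight}.
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < v; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(new int[] {e[1], e[2]});
            adj.get(e[1]).add(new int[] {e[0], e[2]});
        }

        boolean[] inTree = new boolean[v];
        // Heap entries are {vertex, weight of the edge reaching it}.
        PriorityQueue<int[]> pq =
            new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pq.add(new int[] {0, 0}); // grow the single tree from vertex 0
        int total = 0;
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            if (inTree[cur[0]]) continue; // stale entry: vertex already in the tree
            inTree[cur[0]] = true;
            total += cur[1];
            // New candidate edges appear each time the tree gains a vertex.
            for (int[] next : adj.get(cur[0]))
                if (!inTree[next[0]]) pq.add(next);
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1, 10}, {0, 2, 6}, {0, 3, 5}, {1, 3, 15}, {2, 3, 4} };
        System.out.println(mstWeight(4, edges)); // prints 19
    }
}
```

Unlike Kruskal's, there is no union-find here: a boolean array is enough, because only one tree is ever being grown.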

http://stackoverflow.com/questions/1195872/kruskal-vs-prim

Use Prim's algorithm when you have a graph with lots of edges.

For a graph with V vertices and E edges, Kruskal's algorithm runs in O(E log V) time and Prim's algorithm can run in O(E + V log V) amortized time, if you use a Fibonacci Heap.

Prim's algorithm is significantly faster in the limit when you've got a really dense graph with many more edges than vertices. Kruskal performs better in typical situations (sparse graphs) because it uses simpler data structures.
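
Plugging the two extremes into the bounds quoted above shows where the gap comes from. The first line assumes a dense graph with E = Θ(V^2), the second a sparse graph with E = Θ(V):

```latex
E = \Theta(V^2):\quad
\text{Kruskal } O(E \log V) = O(V^2 \log V), \qquad
\text{Prim (Fibonacci heap) } O(E + V \log V) = O(V^2).
\\
E = \Theta(V):\quad
\text{Kruskal } O(E \log V) = O(V \log V), \qquad
\text{Prim (Fibonacci heap) } O(E + V \log V) = O(V \log V).
```

On dense graphs Prim's bound is smaller by a factor of log V; on sparse graphs the two bounds coincide, so constant factors and the simplicity of the data structures decide.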
