Super detailed decomposition of strongly connected components

Posted by InfinityRogue on Fri, 05 Nov 2021 03:54:33 +0100

(It's a little small. Take your time and you'll get something out of it.)
(1)
First, we need to understand what strong connectivity is.
A subset of the vertices of a directed graph is called strongly connected if, for any two vertices u and v in the subset, there is a path from u to v (and therefore, swapping the roles of u and v, also a path from v to u).

(2)
Second, we need to understand what a strongly connected component is.
If a strongly connected vertex set stops being strongly connected as soon as any other vertices are added to it, that is, if it is a maximal strongly connected set, it is called a strongly connected component.

(3)
Finally, we need to understand what strongly connected component decomposition is.
Any directed graph can be decomposed into several disjoint strongly connected components (disjoint here means that no vertex belongs to two different components). This is the strongly connected component decomposition.

Note: we generally perform this decomposition on directed graphs that contain cycles. In a directed acyclic graph every strongly connected component is just a single vertex, and in an undirected graph every connected component is already strongly connected, so in those two cases the decomposition is meaningless.
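
To make the three definitions concrete, here is a minimal sketch that computes the components straight from the definition: two vertices share a component exactly when each can reach the other. The names markReachable and sccByDefinition are made up for this illustration, vertices are assumed to be numbered from 0, and this brute-force check is not the method described in the rest of the post. It is only practical for small graphs.

```cpp
#include <cassert>
#include <vector>

// Marks every vertex reachable from `start` (including `start` itself).
static void markReachable(int start, const std::vector<std::vector<int>>& adj,
                          std::vector<bool>& seen) {
    assert(start >= 0 && start < static_cast<int>(adj.size()));
    seen[start] = true;
    for (int next : adj[start]) {
        if (!seen[next]) markReachable(next, adj, seen);
    }
}

// Returns comp[v] = index of the strongly connected component containing v.
// Cost is O(V * (V + E)) time and O(V^2) memory, so this is for small graphs.
std::vector<int> sccByDefinition(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<std::vector<bool>> reach(n, std::vector<bool>(n, false));
    for (int v = 0; v < n; v++) markReachable(v, adj, reach[v]);

    std::vector<int> comp(n, -1);
    int count = 0;
    for (int u = 0; u < n; u++) {
        if (comp[u] != -1) continue;  // u already belongs to a component
        for (int v = u; v < n; v++) {
            // Mutual reachability is exactly the definition from (1).
            if (reach[u][v] && reach[v][u]) comp[v] = count;
        }
        count++;
    }
    return comp;
}
```

On the graph 0 -> 1 -> 2 -> 0 with an extra edge 2 -> 3, vertices 0, 1 and 2 end up in one component and vertex 3 in its own. On an acyclic chain such as 0 -> 1 -> 2, every vertex gets its own component.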

(4)
So how do we decompose a graph into its strongly connected components?

First, we perform a DFS: pick any vertex as the starting point, visit every vertex reachable from it that has not been visited yet, and label each vertex just before backtracking from it (post-order traversal). Then repeat this process from the remaining unvisited vertices until none are left.
The point of this labeling is that the closer a vertex is to the tail of the graph, the smaller its label, which paves the way for the later steps.

How do we label?
We can use a vector container and push vertices into it, so that the closer a vertex is to the tail of the graph, the earlier it is pushed. A vertex's position in the vector then serves as its label.

Why can we start at any vertex?
Because we put the tail in first. No matter which vertex we start from, the traversal reaches the vertex closest to the tail among the remaining vertices, so this vertex and all the vertices behind it are put in correctly regardless of where we start.
For example, if we start traversing from vertices 1, 2 and 3 in that order, the vertices we put in are the circled parts.
It is easy to see that the operation completes correctly no matter which vertex we start from.

Why do we need post-order traversal? (i.e. recurse down first, and push into the vector while backtracking)
Because the closer a vertex is to the tail of the graph, the earlier we want to push it.
With a pre-order traversal, the vertex closest to the head would have to be pushed first, but we start from an arbitrary vertex. For example, if we traverse in the order 1, 2, 3, vertex 1 is pushed first even though vertex 1 is not the head, so the order comes out wrong.
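
The first pass can be sketched on its own as follows. This is a minimal illustration, assuming vertices are numbered from 0 and the graph is given as adjacency lists; the names visitPostOrder and postOrder and the roots parameter (the order in which starting vertices are tried) are made up for this example and do not appear in the code at the end of the post.

```cpp
#include <cassert>
#include <vector>

// Pushes v only while backtracking, i.e. after all of its still-unvisited
// successors have been fully explored (post-order).
static void visitPostOrder(int v, const std::vector<std::vector<int>>& adj,
                           std::vector<bool>& used, std::vector<int>& order) {
    used[v] = true;
    for (int next : adj[v]) {
        if (!used[next]) visitPostOrder(next, adj, used, order);
    }
    order.push_back(v);
}

// Runs the labeling DFS, trying the starting vertices in the order given by
// `roots`. To label the whole graph, `roots` should list every vertex.
std::vector<int> postOrder(const std::vector<std::vector<int>>& adj,
                           const std::vector<int>& roots) {
    std::vector<bool> used(adj.size(), false);
    std::vector<int> order;
    for (int root : roots) {
        assert(root >= 0 && root < static_cast<int>(adj.size()));
        if (!used[root]) visitPostOrder(root, adj, used, order);
    }
    return order;
}
```

On the chain 0 -> 1 -> 2, the result is 2, 1, 0 whether the starting vertices are tried as 0, 1, 2 or as 1, 0, 2 or as 2, 1, 0: the tail always goes in first and the head last. A pre-order push with starting order 1, 0, 2 would instead record 1, 2, 0, which puts vertex 1 ahead of the real head.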

Next, we perform a second DFS. First reverse all edges, then start a DFS from the unvisited vertex with the largest label, and repeat until every vertex has been visited. Each time, the set of vertices reached by one DFS forms a strongly connected component. We use an array to record which strongly connected component each vertex belongs to.

Reversing the edges simply means recording one extra reverse edge whenever we record an edge.
Starting from the vertex with the largest label simply means traversing the vector from back to front.
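
If the graph has already been built without reverse edges, the reversed adjacency lists can also be derived afterwards. The sketch below is one possible way to do that, assuming vertices are numbered from 0; reverseGraph is a hypothetical helper name, not something used in the code at the end of the post.

```cpp
#include <cassert>
#include <vector>

// Builds the reversed graph: every edge from -> to becomes to -> from.
std::vector<std::vector<int>> reverseGraph(
        const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<std::vector<int>> radj(n);
    for (int from = 0; from < n; from++) {
        for (int to : adj[from]) {
            assert(to >= 0 && to < n);  // edge targets must be valid vertices
            radj[to].push_back(from);
        }
    }
    return radj;
}
```

Recording both directions at insertion time keeps the two lists in sync automatically; building the reversed lists in a separate step like this costs one extra pass over all edges.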

The idea behind this is as follows:
Because we are now traversing from head to tail, there are only two cases to consider.
Example: when we reach vertex 1, the vertices that have not been visited yet may be "next to" it or "behind" it
(the shaded part has already been visited)
For "next to": because the upstream vertices have already been visited, the vertices next to it cannot be reached by the recursion, so this case does not need to be considered.
For "behind": suppose vertex v is "behind" vertex u. Because it is "behind", there is a path from u to v. If, after the edges are reversed, there is still a path from u to v, that is equivalent to a path from v to u in the forward graph. This proves that u and v are strongly connected.

The code is as follows:

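#include <cstring>
#include <vector>

// std:: is written out instead of "using namespace std;" because the array
// name data would clash with std::data from C++17 onwards.
int N;                          // number of vertices, numbered 0 .. N-1; set before calling scc()
std::vector<int> data[10005];   // forward adjacency lists
std::vector<int> rdata[10005];  // reversed adjacency lists
std::vector<int> flag;          // vertices in the post-order of the first dfs
bool used[10005];               // visited marks, shared by both passes
int kind[10005];                // kind[v] = index of the component that contains v

// Records the edge i -> j and, at the same time, its reverse j -> i.
void add_edge(int i, int j)
{
    data[i].push_back(j);
    rdata[j].push_back(i);
}

// First pass: DFS on the forward graph, labeling vertices in post-order.
void dfs(int v)
{
    used[v] = true;
    for(int i = 0; i < (int)data[v].size(); i++)
    {
        if(!used[data[v][i]])
        {
            dfs(data[v][i]);
        }
    }
    flag.push_back(v);  // pushed while backtracking
}

// Second pass: DFS on the reversed graph; every vertex reached gets component k.
void rdfs(int v, int k)
{
    used[v] = true;
    kind[v] = k;
    for(int i = 0; i < (int)rdata[v].size(); i++)
    {
        if(!used[rdata[v][i]])
        {
            rdfs(rdata[v][i], k);
        }
    }
}

// Returns the number of strongly connected components and fills kind[].
int scc()
{
    std::memset(used, 0, sizeof(used));
    flag.clear();  // allow scc() to be called more than once
    for(int i = 0; i < N; i++)
    {
        if(!used[i]) dfs(i);
    }
    std::memset(used, 0, sizeof(used));
    int num = 0;
    // Walk the vector from back to front, i.e. from the largest label down.
    for(int i = (int)flag.size() - 1; i >= 0; i--)
    {
        if(!used[flag[i]])
        {
            rdfs(flag[i], num);
            num++;
        }
    }
    return num;
}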
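
For trying the method on a small graph, here is a self-contained variant of the same two passes that takes an edge list and uses local vectors instead of global arrays. This two-pass approach is commonly known as Kosaraju's algorithm. The names SccResult, decompose, forwardDfs and reverseDfs are made up for this sketch, and vertices are assumed to be numbered 0 to n-1.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct SccResult {
    int count;              // number of strongly connected components
    std::vector<int> kind;  // kind[v] = component index of vertex v
};

// First pass: post-order labeling on the forward graph.
static void forwardDfs(int v, const std::vector<std::vector<int>>& adj,
                       std::vector<bool>& used, std::vector<int>& order) {
    used[v] = true;
    for (int next : adj[v]) {
        if (!used[next]) forwardDfs(next, adj, used, order);
    }
    order.push_back(v);
}

// Second pass: collect one component on the reversed graph.
static void reverseDfs(int v, const std::vector<std::vector<int>>& radj,
                       std::vector<bool>& used, std::vector<int>& kind, int k) {
    used[v] = true;
    kind[v] = k;
    for (int prev : radj[v]) {
        if (!used[prev]) reverseDfs(prev, radj, used, kind, k);
    }
}

SccResult decompose(int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 0);
    std::vector<std::vector<int>> adj(n), radj(n);
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < n && e.second >= 0 && e.second < n);
        adj[e.first].push_back(e.second);   // forward edge
        radj[e.second].push_back(e.first);  // reverse edge
    }

    std::vector<bool> used(n, false);
    std::vector<int> order;
    for (int v = 0; v < n; v++) {
        if (!used[v]) forwardDfs(v, adj, used, order);
    }

    used.assign(n, false);
    SccResult result{0, std::vector<int>(n, -1)};
    // Largest label first: walk the post-order from back to front.
    for (int i = n - 1; i >= 0; i--) {
        int v = order[i];
        if (!used[v]) {
            reverseDfs(v, radj, used, result.kind, result.count);
            result.count++;
        }
    }
    return result;
}
```

Calling decompose(5, {{0,1},{1,2},{2,0},{2,3},{3,4},{4,3}}) groups vertices 0, 1, 2 into one component and vertices 3, 4 into another. Both DFS passes are recursive, so very deep graphs may need an explicit stack instead.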

Topics: C++ Algorithm Graph Theory