Collated Notes on Zhang Ming's Data Structures and Algorithms

## 7. Graphs

### 7.1 Graph concepts and abstract data types

#### 7.1.1 Definitions and terminology of graphs

1. G=(V,E) representation

- V is the set of vertices
- E is a set of edges

2. Complete graph

3. Sparse graph - sparsity (sparsity factor)
- The number of edges is less than 5% of the number of edges in the complete graph

4. Dense graph

5. Undirected graph - the pair of vertices joined by each edge is unordered
- In effect, every edge is bidirectional

6. Directed graph - the pair of vertices joined by each edge is ordered

7. Labeled graph

8. Weighted graph

9. Degree of a vertex - the number of edges incident to the vertex; in a directed graph it splits into in-degree and out-degree

10. Subgraph

11. Cycle (circuit)

#### 7.1.2 Abstract data type of graphs
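The code later in these notes calls a common set of graph operations (`VerticesNum`, `EdgesNum`, `FirstEdge`, `NextEdge`, `IsEdge`, `FromVertex`, `ToVertex`, `Weight`, plus the `Mark` and `Indegree` arrays). The notes do not show their declarations, so the following is only a minimal sketch of one interface that is consistent with those calls. The `Edge` fields, the `SetEdge` and `scan` helpers, the matrix storage, and the convention that weight 0 means "no edge" are all assumptions made for illustration.

```cpp
#include <vector>

enum MarkState { UNVISITED, VISITED };

// Assumed edge record: two end points plus a weight; to == -1 marks "no edge".
struct Edge {
    int from = -1, to = -1, weight = 0;
};

// Hypothetical adjacency-matrix graph exposing the operations the notes call.
class Graph {
public:
    std::vector<MarkState> Mark;   // per-vertex visited flag
    std::vector<int> Indegree;     // per-vertex in-degree

    explicit Graph(int n)
        : Mark(n, UNVISITED), Indegree(n, 0), n_(n), e_(0),
          w_(n, std::vector<int>(n, 0)) {}

    int VerticesNum() const { return n_; }
    int EdgesNum() const { return e_; }

    // Insert the directed edge (from, to); weight 0 means "no edge" in this sketch.
    void SetEdge(int from, int to, int weight) {
        if (w_[from][to] == 0) { ++e_; ++Indegree[to]; }
        w_[from][to] = weight;
    }

    // First outgoing edge of v, or an invalid edge if v has none.
    Edge FirstEdge(int v) const { return scan(v, 0); }
    // Next outgoing edge after e that leaves the same vertex.
    Edge NextEdge(const Edge& e) const { return scan(e.from, e.to + 1); }
    bool IsEdge(const Edge& e) const { return e.to >= 0; }
    int FromVertex(const Edge& e) const { return e.from; }
    int ToVertex(const Edge& e) const { return e.to; }
    int Weight(const Edge& e) const { return e.weight; }

private:
    // Scan row v of the matrix from column start for the next non-zero entry.
    Edge scan(int v, int start) const {
        Edge e;
        e.from = v;
        for (int j = start; j < n_; ++j)
            if (w_[v][j] != 0) { e.to = j; e.weight = w_[v][j]; return e; }
        return e;  // e.to stays -1: no further edge
    }
    int n_, e_;
    std::vector<std::vector<int>> w_;
};
```

With this shape, the loop `for (Edge e = G.FirstEdge(v); G.IsEdge(e); e = G.NextEdge(e))` used throughout the notes enumerates the out-edges of `v`.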

### 7.2 Storage structures of graphs

Adjacency matrix

- For a graph with n vertices, the space cost of the adjacency matrix is O(n^2), independent of the number of edges.
- Sparsity factor

If an m×n matrix has t non-zero elements, its sparse factor is t/(m×n).

If the sparse factor is less than 0.05, the matrix is considered sparse.
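As a small illustration, the sparse factor can be computed by counting non-zero entries and dividing by the total number of entries (t divided by m×n). This is a sketch only: the function names `sparseFactor` and `isSparse` are made up here, and the matrix is assumed to be rectangular.

```cpp
#include <cstddef>
#include <vector>

// Sparse factor of an m*n matrix: (number of non-zero entries) / (m * n).
// Assumes every row has the same length as the first row.
double sparseFactor(const std::vector<std::vector<int>>& a) {
    if (a.empty() || a[0].empty()) return 0.0;
    std::size_t nonZero = 0;
    for (const auto& row : a)
        for (int x : row)
            if (x != 0) ++nonZero;
    return static_cast<double>(nonZero) / (a.size() * a[0].size());
}

// 0.05 is the threshold quoted in the notes.
bool isSparse(const std::vector<std::vector<int>>& a) {
    return sparseFactor(a) < 0.05;
}
```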

Adjacency list representation

- In an undirected graph, each edge appears twice in the adjacency list
- Weighted adjacency list representation
- Adjacency list for directed graphs (out-edge list)
- Inverse adjacency list for directed graphs (in-edge list)

Space cost of the adjacency list

- An undirected graph with n vertices and e edges requires (n+2e) storage units
- A directed graph with n vertices and e edges requires (n+e) storage units
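The two storage counts can be checked with a small adjacency-list sketch. `ListGraph`, `addEdge`, and `storageUnits` are hypothetical names, and "one unit per vertex header plus one unit per edge node" is the counting convention assumed here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical adjacency-list graph: adj[v] lists the neighbors of v.
struct ListGraph {
    std::vector<std::vector<int>> adj;
    bool directed;

    ListGraph(int n, bool isDirected) : adj(n), directed(isDirected) {}

    void addEdge(int u, int v) {
        adj[u].push_back(v);                 // out-edge u -> v
        if (!directed) adj[v].push_back(u);  // mirrored entry for undirected graphs
    }

    // One unit per vertex header plus one unit per stored edge node.
    std::size_t storageUnits() const {
        std::size_t units = adj.size();
        for (const auto& list : adj) units += list.size();
        return units;
    }
};
```

For 4 vertices and 3 edges this gives 4 + 2×3 = 10 units for an undirected graph and 4 + 3 = 7 for a directed one.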

### 7.3 Graph traversal

Given a graph G and any vertex v0 in it, visit all the vertices of G systematically starting from v0, visiting each vertex exactly once

- Depth-first search (DFS)
- Breadth-first search (BFS)
- Topological sorting

Issues in graph traversal: 1) disconnected graphs? 2) cycles?

Solution: mark bits (one visited flag per vertex)

    // A framework for graph traversal algorithms
    void graph_traverse(Graph& G) {
        // Initialize the mark bits of all vertices
        for (int i = 0; i < G.VerticesNum(); i++)
            G.Mark[i] = UNVISITED;
        // Scan all vertices; whenever an unmarked vertex is found,
        // start a new traversal from it.
        // do_traverse is either depth-first or breadth-first
        for (int i = 0; i < G.VerticesNum(); i++)
            if (G.Mark[i] == UNVISITED)
                do_traverse(G, i);
    }

#### 7.3.1 Depth-first traversal

- Select an unvisited vertex v0 as the source and visit it, then recursively search depth-first from each unvisited vertex adjacent to v0, repeating this process until every vertex reachable from v0 by a path has been visited
- Then select another unvisited vertex as the source and search again, until all vertices have been visited.

    void DFS(Graph& G, int v) {          // Recursive depth-first search
        G.Mark[v] = VISITED;             // Set the mark bit to VISITED
        Visit(G, v);                     // Pre-visit vertex v
        for (Edge e = G.FirstEdge(v); G.IsEdge(e); e = G.NextEdge(e))
            if (G.Mark[G.ToVertex(e)] == UNVISITED)  // End point of the edge not yet visited
                DFS(G, G.ToVertex(e));
        PostVisit(G, v);                 // Post-visit vertex v
    }

#### 7.3.2 Breadth-first traversal

- Starting from a vertex v0, visit and mark v0, then visit all the vertices adjacent to v0, and continue outward one level at a time from those neighbors, until every vertex reachable from v0 by a path has been visited
- Then select another unvisited vertex as the source and search again, until all vertices have been visited

    void BFS(Graph& G, int v) {
        using std::queue;
        queue<int> Q;                    // Use the STL queue
        Visit(G, v);                     // Visit vertex v
        G.Mark[v] = VISITED;             // Set the mark bit to VISITED
        Q.push(v);                       // Enqueue v
        while (!Q.empty()) {             // While the queue is not empty
            int u = Q.front();           // Take the front element
            Q.pop();
            // Visit and enqueue every unvisited neighbor of u
            for (Edge e = G.FirstEdge(u); G.IsEdge(e); e = G.NextEdge(e))
                if (G.Mark[G.ToVertex(e)] == UNVISITED) {
                    Visit(G, G.ToVertex(e));
                    G.Mark[G.ToVertex(e)] = VISITED;
                    Q.push(G.ToVertex(e));
                }
        }
    }

Time complexity of graph search

DFS and BFS visit each vertex once and process each edge once (each edge of an undirected graph is processed in both directions)

- With an adjacency list, the cost is O(n+e) for a directed graph and O(n+2e) for an undirected graph
- With an adjacency matrix, the total cost is O(n^2)

#### 7.3.3 Topological Sorting

For a directed acyclic graph G=(V, E), a linear sequence of the vertices in V is called a topological sequence if it satisfies:

If there is a path from vertex vi to vertex vj in G, then vi must precede vj in the sequence

Topological sorting refers to the process of arranging all vertices in a directed acyclic graph into a linear sequence without violating the prerequisite relationships.

    // Topological sort of a graph using a queue
    void TopsortbyQueue(Graph& G) {
        for (int i = 0; i < G.VerticesNum(); i++)
            G.Mark[i] = UNVISITED;               // Initialization
        using std::queue;
        queue<int> Q;
        for (int i = 0; i < G.VerticesNum(); i++)
            if (G.Indegree[i] == 0)              // Enqueue vertices with in-degree 0
                Q.push(i);
        while (!Q.empty()) {                     // While the queue is not empty
            int v = Q.front();                   // Take the front element and dequeue it
            Q.pop();
            Visit(G, v);
            G.Mark[v] = VISITED;                 // Set the mark bit to VISITED
            for (Edge e = G.FirstEdge(v); G.IsEdge(e); e = G.NextEdge(e)) {
                G.Indegree[G.ToVertex(e)]--;     // In-degree of the neighbor minus 1
                if (G.Indegree[G.ToVertex(e)] == 0)
                    Q.push(G.ToVertex(e));       // Enqueue once the in-degree drops to 0
            }
        }
        for (int i = 0; i < G.VerticesNum(); i++)   // An unvisited vertex means the graph has a cycle
            if (G.Mark[i] == UNVISITED) {
                cout << "This graph has a cycle";
                break;
            }
    }

### 7.4 Shortest Path

#### 7.4.1 Single-source shortest path - Dijkstra algorithm

Given a weighted graph G=<V, E> in which the weight W[vi,vj] on each edge (vi,vj) is a non-negative real number, compute the shortest paths from a given source s to all other vertices

    class Dist {            // Holds shortest-path information
    public:
        int index;          // Index of the vertex
        int length;         // Current shortest path length
        int pre;            // Previous vertex on the path
    };

    void Dijkstra(Graph& G, int s, Dist*& D) {    // s is the source
        D = new Dist[G.VerticesNum()];            // Records the shortest paths
        for (int i = 0; i < G.VerticesNum(); i++) {   // Initialization
            G.Mark[i] = UNVISITED;
            D[i].index = i;
            D[i].length = INFINITE;
            D[i].pre = s;
        }
        D[s].length = 0;
        MinHeap<Dist> H(G.EdgesNum());
        H.Insert(D[s]);
        for (int i = 0; i < G.VerticesNum(); i++) {
            bool FOUND = false;
            Dist d;
            while (!H.isEmpty()) {
                d = H.RemoveMin();                    // Entry with the smallest distance from s
                if (G.Mark[d.index] == UNVISITED) {   // Stop at the first unvisited vertex
                    FOUND = true;
                    break;
                }
            }
            if (!FOUND) break;      // No reachable unvisited vertex remains
            int v = d.index;
            G.Mark[v] = VISITED;    // Set the mark bit to VISITED
            // Refresh the shortest paths of the neighbors of v
            for (Edge e = G.FirstEdge(v); G.IsEdge(e); e = G.NextEdge(e))
                if (D[G.ToVertex(e)].length > (D[v].length + G.Weight(e))) {
                    D[G.ToVertex(e)].length = D[v].length + G.Weight(e);
                    D[G.ToVertex(e)].pre = v;
                    H.Insert(D[G.ToVertex(e)]);
                }
        }
    }

Time cost of the Dijkstra algorithm

Each time D[i].length changes, the old entry is not deleted; the new (smaller) value is inserted into the heap as a new element. When an old value is later removed from the heap, its vertex is already marked VISITED, so it is ignored. In the worst case this grows the heap from O(|V|) to O(|E|) elements, for an overall time cost of O((|V|+|E|) log |E|).
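The "insert a new entry and skip stale ones" strategy can be sketched with the standard library. This is an illustrative version, not the notes' own code: it uses `std::priority_queue` in place of the `MinHeap` class, and the `AdjList` type and the function name `dijkstraLazy` are assumptions.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbor, weight) pairs; weights are assumed non-negative.
using AdjList = std::vector<std::vector<std::pair<int, int>>>;

std::vector<int> dijkstraLazy(const AdjList& adj, int s) {
    const int INF = std::numeric_limits<int>::max();
    std::vector<int> dist(adj.size(), INF);
    std::vector<bool> visited(adj.size(), false);
    // Min-heap of (distance, vertex); stale entries are never removed eagerly.
    using Item = std::pair<int, int>;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    dist[s] = 0;
    heap.push(Item(0, s));
    while (!heap.empty()) {
        int u = heap.top().second;
        heap.pop();
        if (visited[u]) continue;   // old, larger entry for a finished vertex
        visited[u] = true;
        for (const auto& edge : adj[u]) {
            int v = edge.first, w = edge.second;
            if (!visited[v] && dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                heap.push(Item(dist[v], v));   // add a new entry, keep the old one
            }
        }
    }
    return dist;   // unreachable vertices keep INF
}
```

Each edge pushes at most one entry, so the heap holds at most O(|E|) elements, which is where the log |E| factor comes from.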

The Dijkstra algorithm works on graphs that contain cycles, but it does not support negative edge weights.

#### 7.4.2 Shortest path between each pair of nodes

Floyd algorithm for finding the shortest path between each pair of nodes

Basic idea

Use an adjacency matrix adj to represent the weighted directed graph

Initialize adj(0) as the adjacency matrix

Perform n iterations starting from adj(0) to produce the matrices adj(1), adj(2), ..., adj(n)

After the k-th iteration, adj(k)[i,j] equals the length of the shortest path from node vi to node vj whose intermediate nodes all have sequence numbers not greater than k
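The iteration described above can be written as a recurrence. This is a sketch under the assumption that nodes are numbered 1 to n and that adj(0) holds ∞ where no edge exists:

$$
\mathrm{adj}^{(k)}[i,j] = \min\left(\mathrm{adj}^{(k-1)}[i,j],\ \mathrm{adj}^{(k-1)}[i,k] + \mathrm{adj}^{(k-1)}[k,j]\right), \quad 1 \le k \le n
$$

The first term keeps the best path that avoids node vk; the second is the best path that passes through vk.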

Analysis of Shortest Path Combination

    void Floyd(Graph& G, Dist**& D) {
        int i, j, v;
        D = new Dist*[G.VerticesNum()];               // Allocate space
        for (i = 0; i < G.VerticesNum(); i++)
            D[i] = new Dist[G.VerticesNum()];
        for (i = 0; i < G.VerticesNum(); i++)         // Initialize array D
            for (j = 0; j < G.VerticesNum(); j++) {
                if (i == j) {
                    D[i][j].length = 0;
                    D[i][j].pre = i;
                } else {
                    D[i][j].length = INFINITE;
                    D[i][j].pre = -1;
                }
            }
        for (v = 0; v < G.VerticesNum(); v++)         // Copy the edge weights into D
            for (Edge e = G.FirstEdge(v); G.IsEdge(e); e = G.NextEdge(e)) {
                D[v][G.ToVertex(e)].length = G.Weight(e);
                D[v][G.ToVertex(e)].pre = v;
            }
        // Allow v as an intermediate node and update every path it shortens
        // (assumes INFINITE is small enough that INFINITE + INFINITE does not overflow)
        for (v = 0; v < G.VerticesNum(); v++)
            for (i = 0; i < G.VerticesNum(); i++)
                for (j = 0; j < G.VerticesNum(); j++)
                    if (D[i][j].length > (D[i][v].length + D[v][j].length)) {
                        D[i][j].length = D[i][v].length + D[v][j].length;
                        D[i][j].pre = D[v][j].pre;
                    }
    }

Time complexity of the Floyd algorithm

The triple for loop costs O(n^3)

### 7.5 Minimum Spanning Tree

Concept

A spanning tree of graph G is a tree, made of edges of G, that contains all the vertices of G. The sum of the weights of all the edges in the tree is its cost. The spanning tree with the least cost among all the spanning trees of G is called the minimum spanning tree (MST) of G.

#### 7.5.1 Prim algorithm

Similar to the Dijkstra algorithm - also greedy

Start with any vertex in the graph (e.g. v0) and include it in the MST first: U=(V*,E*), with initial V*={v0} and E*={}. Then look for the least-weight edge (vp,vq) in which one endpoint vp is already in the MST and the other endpoint vq is not yet in the MST, and add the edge (vp,vq) and the vertex vq to the MST. Proceed in this way, adding one vertex and one least-weight edge to the MST each time, until all vertices are included. At the end of the algorithm, V*=V and E* contains n-1 edges of G.

    int minVertex(Graph& G, Dist*& D);            // Defined below

    void Prim(Graph& G, int s, Edge*& MST) {      // s is the start vertex, MST stores the edges
        int MSTtag = 0;
        MST = new Edge[G.VerticesNum() - 1];      // Allocate space for array MST
        Dist* D;
        D = new Dist[G.VerticesNum()];            // Allocate space for array D
        for (int i = 0; i < G.VerticesNum(); i++) {   // Initialization
            G.Mark[i] = UNVISITED;
            D[i].index = i;
            D[i].length = INFINITE;
            D[i].pre = s;
        }
        D[s].length = 0;
        G.Mark[s] = VISITED;
        int v = s;
        for (int i = 0; i < G.VerticesNum() - 1; i++) {
            // Refresh the distance of every unvisited neighbor of v
            for (Edge e = G.FirstEdge(v); G.IsEdge(e); e = G.NextEdge(e))
                if (G.Mark[G.ToVertex(e)] != VISITED &&
                    (D[G.ToVertex(e)].length > G.Weight(e))) {
                    D[G.ToVertex(e)].length = G.Weight(e);
                    D[G.ToVertex(e)].pre = v;
                }
            v = minVertex(G, D);                  // Unvisited vertex with the minimum value in D
            if (v == -1) return;                  // Graph is disconnected: unreachable vertices remain
            G.Mark[v] = VISITED;                  // Mark as visited
            Edge edge(D[v].pre, D[v].index, D[v].length);   // Build the edge
            AddEdgetoMST(edge, MST, MSTtag++);    // Add the edge to the MST
        }
    }

    // Find the unvisited vertex with the minimum distance in the Dist array
    int minVertex(Graph& G, Dist*& D) {
        int i, v = -1;
        int MinDist = INFINITE;
        for (i = 0; i < G.VerticesNum(); i++)
            if ((G.Mark[i] == UNVISITED) && (D[i].length < MinDist)) {
                v = i;                    // Save the closest vertex found so far
                MinDist = D[i].length;
            }
        return v;                         // -1 if no reachable unvisited vertex remains
    }

Time complexity of the Prim algorithm

The Prim algorithm has the same framework as the Dijkstra algorithm; the difference is that the distance values in Prim are not accumulated along a path, and the weight of the lightest edge is used directly

The algorithm finds the least-cost edge by comparing the elements of the D array directly, which takes O(n^2) time in total; updating the D array after each vertex is selected takes O(e) time in total, so the overall time is O(n^2). The algorithm is suitable for dense graphs; for sparse graphs, the distance values can be stored in a heap, as in the Dijkstra algorithm.
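For the sparse-graph case, a heap-based variant might look like the following. It is a sketch under assumptions, not the notes' code: it uses `std::priority_queue` with stale entries skipped on removal, and `AdjList` and `primHeap` are hypothetical names. The graph is assumed undirected, with each edge stored in both adjacency lists.

```cpp
#include <functional>
#include <queue>
#include <tuple>
#include <utility>
#include <vector>

// adj[u] holds (neighbor, weight) pairs.
using AdjList = std::vector<std::vector<std::pair<int, int>>>;

// Returns the MST edges as (from, to, weight); fewer than n-1 edges
// means the graph is not connected.
std::vector<std::tuple<int, int, int>> primHeap(const AdjList& adj, int s) {
    using Item = std::tuple<int, int, int>;   // (weight, from, to)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    std::vector<bool> inTree(adj.size(), false);
    std::vector<std::tuple<int, int, int>> mst;
    heap.push(Item(0, s, s));                 // dummy entry that admits the start vertex
    while (!heap.empty()) {
        Item top = heap.top();
        heap.pop();
        int w = std::get<0>(top), from = std::get<1>(top), to = std::get<2>(top);
        if (inTree[to]) continue;             // stale entry: vertex already in the tree
        inTree[to] = true;
        if (to != s) mst.emplace_back(from, to, w);
        for (const auto& edge : adj[to])
            if (!inTree[edge.first])
                heap.push(Item(edge.second, to, edge.first));
    }
    return mst;
}
```

Every edge is pushed at most twice, so this version costs O(e log e), which beats O(n^2) when e is much smaller than n^2.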

#### 7.5.2 Kruskal algorithm

First, treat the n vertices of G as n separate connected components, so the initial state is a forest with n vertices and no edges, written T=<V, {}>. Then select the least-cost edge in E. If its two endpoints lie in two different connected components, add it to T; otherwise discard it and choose the next least-cost edge. Continue in this way until all the vertices in T are in the same connected component, at which point T is a minimum spanning tree of G.

    void Kruskal(Graph& G, Edge*& MST) {          // MST stores the edges of the minimum spanning tree
        ParTree<int> A(G.VerticesNum());          // Equivalence classes
        MinHeap<Edge> H(G.EdgesNum());            // Min-heap of edges ordered by weight
        MST = new Edge[G.VerticesNum() - 1];      // Allocate space for array MST
        int MSTtag = 0;
        for (int v = 0; v < G.VerticesNum(); v++) // Insert all edges into the min-heap H
            for (Edge e = G.FirstEdge(v); G.IsEdge(e); e = G.NextEdge(e))
                if (G.FromVertex(e) < G.ToVertex(e))   // Avoid inserting an undirected edge twice
                    H.Insert(e);
        int EquNum = G.VerticesNum();             // Start with n single-vertex equivalence classes
        while (EquNum > 1) {                      // Merge while more than one class remains
            if (H.isEmpty()) {
                cout << "No minimum spanning tree exists." << endl;
                delete[] MST;
                MST = NULL;
                return;
            }
            Edge e = H.RemoveMin();               // Edge with the minimum weight
            int from = G.FromVertex(e);           // Record the end points of this edge
            int to = G.ToVertex(e);
            if (A.Different(from, to)) {          // End points are in different equivalence classes
                A.Union(from, to);                // Merge the two classes
                AddEdgetoMST(e, MST, MSTtag++);   // Add edge e to the MST
                EquNum--;                         // One equivalence class fewer
            }
        }
    }

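The time cost of the Kruskal algorithm is O(e log e) in the worst case; when only about n edges have to be removed from the heap before the tree is complete, it is close to O(n log e)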
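The equivalence-class structure (`ParTree` in the code above) is not defined in these notes. A minimal sketch of one possible stand-in is a disjoint-set forest with the same `Different` and `Union` operations. The class name `DisjointSet`, the `Find` helper, the `WEdge` struct, and the use of `std::sort` instead of a min-heap are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for ParTree: a disjoint-set forest
// with path halving and union by size.
class DisjointSet {
public:
    explicit DisjointSet(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);  // each vertex is its own root
    }
    int Find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // path halving shortens the chain
            x = parent_[x];
        }
        return x;
    }
    bool Different(int a, int b) { return Find(a) != Find(b); }
    void Union(int a, int b) {
        a = Find(a);
        b = Find(b);
        if (a == b) return;
        if (size_[a] < size_[b]) std::swap(a, b);
        parent_[b] = a;                        // attach the smaller tree under the larger
        size_[a] += size_[b];
    }
private:
    std::vector<int> parent_, size_;
};

struct WEdge { int from, to, weight; };

// Kruskal over a sorted edge array (swapped in for the min-heap of the notes).
// Returns the chosen edges; fewer than n-1 means no spanning tree exists.
std::vector<WEdge> kruskal(int n, std::vector<WEdge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const WEdge& a, const WEdge& b) { return a.weight < b.weight; });
    DisjointSet sets(n);
    std::vector<WEdge> mst;
    for (const WEdge& e : edges) {
        if (static_cast<int>(mst.size()) == n - 1) break;  // tree is complete
        if (sets.Different(e.from, e.to)) {
            sets.Union(e.from, e.to);
            mst.push_back(e);
        }
    }
    return mst;
}
```

Sorting up front costs O(e log e) regardless of how early the tree completes, whereas a heap lets the algorithm stop after removing only the edges it needs.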