Detailed C++ implementation of depth-first search

Posted by bibie on Fri, 07 Jan 2022 21:59:47 +0100

DFS

The full text is about 4000 words. If you are just starting to learn DFS, I believe it will be of great help to you. My ability is limited and many terms may not be professional enough; thank you for your understanding.

Depth first search of binary tree

The concept of a binary tree will not be discussed in detail here.

An array is used to store the binary tree. The root node is stored at subscript 1 (convenient for calculation). If the subscript of the parent node is n, then the subscript of the left child is 2n and the subscript of the right child is 2n+1.

       2
     /   \
    3     5
   / \    / \
  7   8  6    1
 /     \
4       9

The tree above can be expressed as an array (0 marks an empty position):

Array subscript: 0 1 2 3 4 5 6 7 8 9 10 11
          Value: 0 2 3 5 7 8 6 1 4 0  0  9
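The subscript arithmetic can be written down directly. This is only a minimal sketch: the names leftChild, rightChild, parentOf and exampleTree are made up for this illustration, and the parent formula n / 2 (integer division) is an addition here, being simply the inverse of the two child formulas.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helpers for the 1-based array layout described above.
std::size_t leftChild(std::size_t n)  { return 2 * n; }
std::size_t rightChild(std::size_t n) { return 2 * n + 1; }
std::size_t parentOf(std::size_t n)   { return n / 2; }  // integer division undoes both 2n and 2n+1

// The example tree; subscript 0 is unused and a value of 0 marks an empty position.
std::vector<int> exampleTree() {
    return {0, 2, 3, 5, 7, 8, 6, 1, 4, 0, 0, 9};
}
```

For example, the node with value 8 sits at subscript 5, so its right child is at subscript 11 (value 9) and its parent is at subscript 2 (value 3).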

Depth-first traversal, as the name says, goes as deep as possible first. Figuratively, it does not turn back until it hits the wall.

Depth-first traversal is based on a stack. Because we need to record the route, the steps are as follows (you might as well take paper and pen to simulate the process):

  1. First, prepare a stack;
  2. Then push the root node onto the stack. The program outputs the value of a node whenever it is pushed. As long as the current node has a left child, we keep going left until we reach 4, because 4 has no children. At this time, the elements in the stack are 2, 3, 7, 4 from the bottom to the top;
  3. The top of the stack is now 4. Because 4 has no children, we pop 4. The top is now 7; its only child, 4, has already been visited, so 7 is popped as well. The top is now 3, which has an unvisited right child, so we push 8; 8 in turn has a right child, so we push 9. At this time, the elements in the stack from bottom to top are 2, 3, 8, 9;
  4. 9, 8 and 3 have no unvisited children left, so they are popped in that order. The top of the stack is now 2, and the root still has an unvisited child on its right, so we push 5. By now you should understand the route that depth-first traversal takes through a binary tree; try to finish the rest yourself.

Therefore, the depth first traversal result of the above tree should be 2 3 7 4 8 9 5 6 1

In a word, go straight to the left, and go to the right when the left is finished or has passed
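The manual steps above can be sketched with an explicit std::stack. This is an illustration only: the function name dfsWithStack is made up, and it returns the values in push order instead of printing them so that the result is easy to check.

```cpp
#include <cstddef>
#include <stack>
#include <vector>

// Returns the values in the order the nodes are pushed (the output order of the walkthrough).
// tree uses the 1-based array layout from above; 0 marks an empty position.
std::vector<int> dfsWithStack(const std::vector<int>& tree) {
    std::vector<int> order;
    std::vector<bool> vis(tree.size(), false);

    // A subscript holds a node only if it is inside the array and its value is not 0.
    auto exists = [&tree](std::size_t i) { return i < tree.size() && tree[i] != 0; };

    if (!exists(1)) return order;

    std::stack<std::size_t> st;
    st.push(1);
    vis[1] = true;
    order.push_back(tree[1]);

    while (!st.empty()) {
        std::size_t u = st.top();
        std::size_t next = 0;  // 0 means "no unvisited child"
        if (exists(2 * u) && !vis[2 * u]) next = 2 * u;                  // left first
        else if (exists(2 * u + 1) && !vis[2 * u + 1]) next = 2 * u + 1; // then right

        if (next != 0) {
            vis[next] = true;
            order.push_back(tree[next]);
            st.push(next);
        } else {
            st.pop();  // nowhere left to go from u, so step back
        }
    }
    return order;
}
```

Called on the array {0, 2, 3, 5, 7, 8, 6, 1, 4, 0, 0, 9}, it returns 2 3 7 4 8 9 5 6 1, the same order as the walkthrough.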

But when we write the code, we don't need to maintain a stack by hand. The computer already keeps one internally, the call stack, so we can implement the process above through recursion. The process is exactly the same as above, so there is no need to dwell on it. The following is the recursive implementation.

Note:

  1. An empty node is marked by 0 (this is not rigorous, because sometimes we need to store 0; in that case we can replace 0 with a rarely used value, such as 0x3f3f3f);
  2. To avoid going out of bounds, the tree array is made somewhat larger than needed;
  3. The vis array records whether a node has been visited: true if visited, false if not.

#include <cstdio>
#include <cstring>

int tree[50];
bool vis[50];

void dfs(int u) {	// u represents the current point
  
	vis[u] = true;  // Mark the current node as visited
	printf("%d ", tree[u]); // output

	if(tree[2 * u] != 0 && !vis[2 * u]) {
		dfs(2 * u);
	} 

	if(tree[2 * u + 1] != 0 && !vis[2 * u + 1]) {
		dfs(2 * u + 1);
	} 
}

int main() {

	memset(vis, false, sizeof vis);

	tree[1] = 2;
	tree[2] = 3;
	tree[3] = 5;
	tree[4] = 7; 
	tree[5] = 8;
	tree[6] = 6;
	tree[7] = 1;
	tree[8] = 4;
	tree[11] = 9;

	dfs(1);
 	
	return 0;
}

Search

A search algorithm uses the high performance of a computer to purposefully enumerate some or all of the possible situations in the solution space of a problem, so as to find the solution.

A search algorithm is in fact a process of constructing a "solution tree" from the initial conditions and expansion rules and finding the nodes that satisfy the target state. From the perspective of the final implementation, every search algorithm can be divided into two parts: the control structure (the order in which nodes are expanded) and the generation system (how nodes are expanded). Almost all optimizations and improvements are made by modifying the control structure. In thinking this way, we have unconsciously abstracted a concrete problem into a graph-theory model, the tree. That is, the first step in using a search algorithm is to build the search tree.

Full permutation problem https://www.luogu.com.cn/problem/P1706

Let's first think about how to do it with for loops. Suppose we only want to permute three numbers:

for(int i = 1; i <= 3; ++i) {
	for(int j = 1; j <= 3; ++j) {
		for(int k = 1; k <= 3; ++k) {
			if(i != j && i != k && j != k) {
				printf("%d %d %d\n", i, j, k);
			}
		}
	}
}

This solves the problem for three numbers, but the data range goes up to 9, and we cannot write a separate nest of loops for every possible length.

At this point, let's think carefully. The number of digits we want to permute is the same as the number of nested for loops. When code contains this much repetition of the same shape (it doesn't matter if you didn't know this before; now you do), we can consider using recursion.

But before we begin to write the recursion, recall this sentence: the first step in using a search algorithm is to build the search tree.

What is a search tree, and how do we build one? Let's look at what the three nested for loops above actually do:

i = 1, j = 1, k = 1: numbers repeat, so we don't output it
i = 1, j = 1, k = 2: numbers repeat, so we don't output it
i = 1, j = 1, k = 3: numbers repeat, so we don't output it
i = 1, j = 2, k = 1: numbers repeat, so we don't output it
i = 1, j = 2, k = 2: numbers repeat, so we don't output it
i = 1, j = 2, k = 3: valid, so we output the answer 1 2 3
i = 1, j = 3, k = 1: numbers repeat, so we don't output it
i = 1, j = 3, k = 2: valid, so we output the answer 1 3 2
i = 1, j = 3, k = 3: numbers repeat, so we don't output it

For reasons of space, we only list the case i = 1. Let's try to draw the process above as a tree, which looks roughly like this:

i:            1
       /      |     \
j:    1       2       3
    / | \   / | \   / | \
k: 1  2  3 1  2  3 1  2  3

Each path from the root down to a leaf is then a candidate answer.

We can see that generating these answers is very similar to the depth-first search of the tree above. So the problem now is how to build this tree. We certainly won't fill it in by hand, one node at a time, as we did above. At this point we need to introduce another concept, called backtracking.

Backtracking

A backtracking algorithm is a search process similar to enumeration: it looks for the solution while trying candidates, and when it finds that the conditions for a solution are no longer met, it "backtracks" and tries another path. Backtracking is a selective search method: it searches forward according to some selection criterion in order to reach the goal, but when it reaches a step where the earlier choice turns out to be poor or cannot reach the goal, it goes back one step and chooses again. This technique of going back and trying again when a path fails is the backtracking method, and a state that satisfies the backtracking condition is called a "backtracking point". Many complex, large-scale problems can be solved with backtracking, which has earned it the name "general problem-solving method".

       2
     /   \
    3     5
   / \    / \
  7   8  6    1
 /     \
4       9
It's still this tree. When we reach 4, we find that there are no more nodes to go to, so we go back to 7. The step from 4 back to 7 is called "backtracking".
The backtracking here is carried out by frames being popped off the call stack as the recursive calls return.
Think about how we should handle backtracking in the full permutation problem.

We can think of it like this, without picturing the tree too concretely. First we need an array to store the answer being built:

vector<int> num;

We still use recursion. From the tree above, we know that the array should change like this:

1
1 1
1 1 1
1 1     (backtrack)
1 1 2
1 1     (backtrack)
1 1 3
1 1     (backtrack)
1       (backtrack)
1 2
1 2 1
1 2     (backtrack)
1 2 2
1 2     (backtrack)
1 2 3
1 2     (backtrack)
1       (backtrack)
1 3
1 3 1
1 3     (backtrack)
1 3 2
1 3     (backtrack)
1 3 3
1 3     (backtrack)
1       (backtrack)
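The trace above can be reproduced mechanically by recording the array after every push and every pop. The sketch below is only an illustration: traceDfs and joinNumbers are made-up names, and the array is passed by reference rather than kept as a global so the function is easy to call on its own.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins the current array into a string such as "1 2 3".
std::string joinNumbers(const std::vector<int>& num) {
    std::string s;
    for (std::size_t i = 0; i < num.size(); ++i) {
        if (i > 0) s += ' ';
        s += std::to_string(num[i]);
    }
    return s;
}

// Records the state of num after every push and after every pop.
void traceDfs(int len, std::vector<int>& num, std::vector<std::string>& log) {
    if (static_cast<int>(num.size()) == len) return;  // the array is full
    for (int i = 1; i <= len; ++i) {
        num.push_back(i);
        log.push_back(joinNumbers(num));
        traceDfs(len, num, log);
        num.pop_back();  // deleting the last element is the backtracking step
        log.push_back(joinNumbers(num) + " (backtrack)");
    }
}
```

With len = 3 and an empty array, the first 25 log entries are the 25 lines listed above; the next entry is the empty array after the leading 1 is removed, and the same pattern then repeats for 2 and 3.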

As we all know, recursion needs an exit. Here the exit is clearly when the size of the array reaches 3. At the exit we also need to output the elements of the array, so the exit code should look like this:

void dfs() {
  if(num.size() == len) {
		for (int i : num) {
			cout << i << " ";
		}
		cout << endl;
		return;
	}
}

The next step is to add elements. We need to try every number from 1 to 3, and there is probably no better way to do that than a loop:

void dfs() {
  if(num.size() == len) {
		for (int i : num) {
			cout << i << " ";
		}
		cout << endl;
		return;
	}
  
	for(int i = 1; i <= 3; ++i) {
		num.push_back(i);
		
	}
}

We forgot that our function has no parameters yet! Think about what parameters we need.

We need a length so that we know when to exit the recursion, so we adjust the code again:

void dfs(int len) {
  if(num.size() == len) {
		for (int i : num) {
			cout << i << " ";
		}
		cout << endl;
		return;
	}
  
	for(int i = 1; i <= len; ++i) {
		num.push_back(i);
		
	}
}


After this step we come to the most critical one. We want the next step to extend the arrangement toward 1 1 1, so this is clearly the place to recurse:

for(int i = 1; i <= len; ++i) {
	num.push_back(i);
	dfs(len);	// len is the length at which the recursion ends; just pass it through
}

If nothing unexpected happens, the program now outputs 1 1 1 (for len = 3), which is one of the results we expect, but it does not list all the cases. The reason is that we never backtrack: the array only grows, so its size goes past len, the exit condition is never met again and the recursion does not stop by itself. Looking back at the array changes listed above, we can see that backtracking is the process of deleting the last element of the array. So we need to add num.pop_back(), and the code should look like this:

void dfs(int len) {

	if(num.size() == len) {
		for (int i : num) {
			cout << i << " ";
		}
		cout << endl;
		return;
	}


	for(int i = 1; i <= len; ++i) {
		num.push_back(i);
		dfs(len);
		num.pop_back();
	}
}

This code outputs every sequence, including those with repeated numbers (for example 1 1 1), but we only want the ones the problem asks for, so we need to check each sequence before printing it. The complete code is as follows:

#include <iostream>
#include <vector>
using namespace std;

vector<int> num;
bool vis[10]; // Used to check whether num contains duplicate numbers

void dfs(int len) {

	if(num.size() == len) {

		for(int i = 1; i <= len; ++i) vis[i] = false; // Remember to reset the array before every check

		for(int i = 0; i < num.size(); ++i) {
			if(vis[num[i]] == true) // A repeated number means this is not a permutation, so discard it
				return ;
			else
				vis[num[i]] = true; // Otherwise mark this number as seen
		}
		// If we get here, the sequence is a valid answer
		for (int i : num) {
			cout << "    " << i;
		}
		cout << endl;
		return;
	}

	for(int i = 1; i <= len; ++i) {
		num.push_back(i);
		dfs(len);
		num.pop_back();
	}
}

int main() {

	int n;
	cin >> n;
	dfs(n);

	return 0;
}

As a result, one test case timed out. Let's think about how to optimize the algorithm.

Let's see what's wrong with the previous algorithm.

In fact, it is clear at a glance: we build every complete sequence and only then check it for repetition. Can we simplify this so that we abandon a branch as soon as it would contain a repetition?

Then the recursion tree can be simplified as follows:

     Choose nothing
   /    |    \
  1     2     3
 / \   / \   / \
2   3 1   3 1   2
|   | |   | |   |
3   2 3   1 2   1

As before, each path from the root to a leaf is an answer. In the previous algorithm a single first-level branch alone has 13 nodes (1 + 3 + 9), whereas the optimized tree has only 15 nodes in total below the root, which greatly reduces the running time. This step is called pruning. After adjusting the output details to match the problem, the implementation is as follows:

#include <iostream>
#include <vector>
using namespace std;

vector<int> num;
bool vis[10];

void dfs(int len) {

	if(num.size() == len) {

		for (int i : num) {
			cout << "    " << i;
		}
		cout << endl;
		return;
	}

	for(int i = 1; i <= len; ++i) {
		if(!vis[i]) { // Check whether this number has already been used
			vis[i] = true; // If not, mark it and append it to num
			num.push_back(i);
			dfs(len);
			vis[i] = false; // Backtracking: mark this number as unused again
			num.pop_back();
		}
	}
}

int main() {

	int n;
	cin >> n;
	dfs(n);

	return 0;
}

AC (applause)
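To put numbers on the effect of pruning, the sketch below counts how many times the recursive function is called with and without it. The function names countBruteForce and countPruned are made up for this illustration, and both counts include the initial call (the "Choose nothing" root).

```cpp
#include <vector>

// Counts calls of the unpruned search: every position tries every number.
long long countBruteForce(int len, int depth = 0) {
    long long calls = 1;  // this call
    if (depth == len) return calls;
    for (int i = 1; i <= len; ++i) calls += countBruteForce(len, depth + 1);
    return calls;
}

// Counts calls of the pruned search: a number that is already in use is skipped.
// used must have at least len + 1 elements, all false at the start.
long long countPruned(int len, std::vector<bool>& used, int depth = 0) {
    long long calls = 1;  // this call
    if (depth == len) return calls;
    for (int i = 1; i <= len; ++i) {
        if (used[i]) continue;  // prune
        used[i] = true;
        calls += countPruned(len, used, depth + 1);
        used[i] = false;        // backtrack
    }
    return calls;
}
```

For len = 3 this gives 40 calls without pruning and 16 with it (the root plus the 15 nodes below it). For len = 9 the pruned search makes 986410 calls.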

To summarize the steps we just took:

  1. Draw the search tree (recursion tree)
  2. Find the recursion exit
  3. Find the recursion condition
  4. Optimize by pruning

Code template

void dfs(int u) {
	if(......) { // Recursive exit
		......;
		return ;
	}
	
	for(int i = 1; i <= u; ++i) {
		if(check()) { // prune
			......; // Select branch
			dfs(u);
			......; // Backtrack
		}
	}
}
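As one possible way of filling in the template, here is the full permutation search again, this time collecting the results in a vector instead of printing them so that they can be checked. The names permute and allPermutations are illustrative, and the state is passed by reference instead of being global.

```cpp
#include <vector>

void permute(int n, std::vector<int>& cur, std::vector<bool>& used,
             std::vector<std::vector<int>>& out) {
    if (static_cast<int>(cur.size()) == n) {  // recursion exit
        out.push_back(cur);
        return;
    }
    for (int i = 1; i <= n; ++i) {
        if (used[i]) continue;  // prune: i is already in cur
        used[i] = true;         // select branch
        cur.push_back(i);
        permute(n, cur, used, out);
        cur.pop_back();         // backtrack
        used[i] = false;
    }
}

// Returns all permutations of 1..n in the order the search finds them.
std::vector<std::vector<int>> allPermutations(int n) {
    std::vector<int> cur;
    std::vector<bool> used(n + 1, false);
    std::vector<std::vector<int>> out;
    permute(n, cur, used, out);
    return out;
}
```

allPermutations(3) yields the six permutations in lexicographic order, from 1 2 3 to 3 2 1.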

Next, let's take on a harder problem

Eight queens https://leetcode-cn.com/problems/eight-queens-lcci/submissions/

Let's follow the steps above to solve this problem.

First, let's draw the case where the chessboard is 2 * 2, placing one queen per row:

 Choose nothing
 /       \
 Q.      .Q
/ \      / \
Q. Q.   .Q .Q
Q. .Q   Q. .Q

Clearly, none of the placements above is valid. Boards of size 2 and 3 have no solution at all, and solutions first appear at size 4. But at size 4 there are already 256 placements with one queen per row, so we only show the two that satisfy the problem:

.Q..
...Q
Q...
..Q.

..Q.  
Q...
...Q
.Q..
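The count of 256 placements and the two valid boards can be checked by brute force. The sketch below is for illustration only, with made-up names (isValidPlacement, bruteForceQueens): it tries every way of putting one queen in each row and keeps the boards in which no two queens attack each other.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// cols[r] is the column of the queen in row r. Two queens conflict if they share
// a column or a diagonal (sharing a row is impossible with one queen per row).
bool isValidPlacement(const std::vector<int>& cols) {
    for (std::size_t a = 0; a < cols.size(); ++a)
        for (std::size_t b = a + 1; b < cols.size(); ++b) {
            int dc = std::abs(cols[a] - cols[b]);
            if (dc == 0 || dc == static_cast<int>(b - a)) return false;
        }
    return true;
}

// Tries all n^n placements with one queen per row; tried receives how many were examined.
std::vector<std::vector<std::string>> bruteForceQueens(int n, long long& tried) {
    std::vector<std::vector<std::string>> boards;
    std::vector<int> cols(n, 0);
    tried = 0;
    while (true) {
        ++tried;
        if (isValidPlacement(cols)) {
            std::vector<std::string> board(n, std::string(n, '.'));
            for (int r = 0; r < n; ++r) board[r][cols[r]] = 'Q';
            boards.push_back(board);
        }
        // Advance cols like an n-digit counter in base n; the last row changes fastest.
        int r = n - 1;
        while (r >= 0 && ++cols[r] == n) cols[r--] = 0;
        if (r < 0) break;  // the counter wrapped around: every placement has been tried
    }
    return boards;
}
```

For n = 4 it examines 256 placements and returns exactly the two boards shown above; for n = 2 it examines 4 placements and returns none.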

Clearly, the recursion exit is when the number of rows filled reaches the size of the chessboard.

But this time we need to think carefully about pruning.

When the size of the chessboard reaches 9, placing one queen per row without any pruning gives a search tree of 9 + 9^2 + ... + 9^9 = 435848049 nodes. Together with the time spent on checking, this clearly cannot finish within 1 second.
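The figure 435848049 appears to be the number of nodes below the root of the unpruned search tree for a 9 * 9 board, that is, 9 + 9^2 + ... + 9^9. A small sketch to check the arithmetic (treeNodes is a made-up helper name):

```cpp
// Number of nodes below the root of a tree in which every node on the first
// `depth` levels has `branching` children: b + b^2 + ... + b^depth.
long long treeNodes(int branching, int depth) {
    long long total = 0;
    long long level = 1;
    for (int d = 1; d <= depth; ++d) {
        level *= branching;  // number of nodes on level d
        total += level;
    }
    return total;
}
```

treeNodes(9, 9) evaluates to 435848049; the last level alone contributes 9^9 = 387420489 complete placements.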

After careful consideration, it is not difficult to see where to prune: at conflicts between queens. Every time we place a queen, we check whether it conflicts with the queens already placed in earlier rows, and if it does, we skip everything below that choice. We only need to check straight up, the upper left and the upper right, because the queens in later rows have not been placed yet.

We use the two-dimensional array res to hold the final answers and board to represent the chessboard.

#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    vector<vector<string>> res;

    vector<vector<string>> solveNQueens(int n)
    {
        //initialization
        vector<string> board(n, string(n, '.'));
        //Start selection
        dfs(board, 0);
        return res;
    }

    void dfs(vector<string>& board, int row)
    {
        //Recursive exit
        if (row == board.size())
        {
            res.push_back(board);
            return;
        }

        int n = board[row].size();

        for (int col = 0; col < n; col++)
        {
            //Judge whether the current position can be attacked by other queens
            if (!isVal(board, col, row)) continue;

            //choice
            board[row][col] = 'Q';
            dfs(board, row + 1);
            //Undo selection
            board[row][col] = '.';
        }
    }

    //Rows below the current one hold no queens yet, and each row holds at most one queen, so there is no need to check below, left or right. We only need to check straight up, the upper left and the upper right
    bool isVal(vector<string>& board, int col, int row)
    {
        //upper
        for (int i = 0; i < row; i++)
        {
            if (board[i][col] == 'Q')
                return false;
        }
        //Upper left
        for (int i = row, j = col; i >= 0 && j >= 0; --i, --j)
        {
            if (board[i][j] == 'Q')
                return false;
        }
        //Upper right
        for (int i = row, j = col; i >= 0 && j < (int)board[row].size(); --i, ++j)
        {
            if (board[i][j] == 'Q')
                return false;
        }
        //No conflicts, so a queen can be placed here
        return true;
    }
};

AC again (applause)
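The check in isVal walks up the board every time it is called. A common variant, which is not the code above but a swapped-in technique, remembers the occupied columns and diagonals in three boolean arrays so that each check takes constant time. The sketch below only counts solutions, and the names placeQueens and countQueens are made up for this illustration.

```cpp
#include <vector>

// cols, diag1 and diag2 record which columns and diagonals already hold a queen.
int placeQueens(int n, int row, std::vector<bool>& cols,
                std::vector<bool>& diag1, std::vector<bool>& diag2) {
    if (row == n) return 1;  // recursion exit: every row has a queen
    int count = 0;
    for (int col = 0; col < n; ++col) {
        int d1 = row - col + n - 1;  // constant along one diagonal direction
        int d2 = row + col;          // constant along the other direction
        if (cols[col] || diag1[d1] || diag2[d2]) continue;  // prune
        cols[col] = true;            // choose
        diag1[d1] = true;
        diag2[d2] = true;
        count += placeQueens(n, row + 1, cols, diag1, diag2);
        cols[col] = false;           // undo the choice
        diag1[d1] = false;
        diag2[d2] = false;
    }
    return count;
}

// Returns the number of ways to place n non-attacking queens on an n * n board.
int countQueens(int n) {
    std::vector<bool> cols(n, false), diag1(2 * n, false), diag2(2 * n, false);
    return placeQueens(n, 0, cols, diag1, diag2);
}
```

countQueens(4) returns 2, matching the two boards shown earlier, and countQueens(8) returns 92.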

Finally, I hope you can implement the code above on your own. Only when you really write it yourself can you deeply appreciate the wonders of this algorithm.
