A thinking guide for learning algorithms and brushing problems

Posted by toovey on Sun, 20 Feb 2022 19:25:49 +0100

First of all, what we're talking about here are ordinary data structures. I'm not into algorithm competitions; I come from an ordinary, self-taught background and can only solve conventional problems. In addition, the following is a summary of my personal experience. No algorithm book will spell these things out, so please try to understand my perspective and don't dwell on the details, because this article is meant to build a framework-level understanding of data structures and algorithms.

Thinking from the whole to the details, from the top down, from the abstract to the concrete framework, is universal. It is efficient not only for learning data structures and algorithms, but for learning anything else as well.

1, Storage modes of data structures

There are only two ways to store data structures: array (sequential storage) and linked list (linked storage).

What about hash tables, stacks, queues, heaps, trees, graphs, and all the other data structures?

When we analyze problems, we must think recursively, from top to bottom, from abstract to concrete. Everything you just listed belongs to the "superstructure", while arrays and linked lists are the "structural foundation": all those diverse data structures are just special operations on linked lists or arrays, only with different APIs.

For example, both the "queue" and the "stack" can be implemented with either a linked list or an array. Implementing them with an array means handling capacity expansion and shrinking; a linked list has no such problem but needs extra memory to store the node pointers.
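
As a rough sketch of that trade-off, here is a toy integer stack built on each storage method (illustrative only: no error handling, and the class names are made up for this example):

class ArrayStack {
    private int[] data = new int[4];
    private int size = 0;

    void push(int x) {
        if (size == data.length) {
            // expansion: allocate a bigger array and copy everything over, O(N)
            data = java.util.Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = x;
    }

    int pop() { return data[--size]; }
}

class LinkedStack {
    private static class Node { int val; Node next; }
    private Node top = null;

    void push(int x) {
        // no resizing needed, but every element pays for an extra pointer
        Node n = new Node();
        n.val = x;
        n.next = top;
        top = n;
    }

    int pop() {
        int v = top.val;
        top = top.next;
        return v;
    }
}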

The adjacency list of a graph is a linked-list structure, and the adjacency matrix is a two-dimensional array. The adjacency matrix can check connectivity quickly and supports matrix operations to solve certain problems, but it wastes a lot of space if the graph is sparse. The adjacency list saves space, but many operations are certainly less efficient than with an adjacency matrix.
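
A minimal sketch of both representations for the same small directed graph with edges 0 -> 1, 0 -> 2 and 1 -> 2 (illustrative only; the class name is made up for this example):

import java.util.ArrayList;
import java.util.List;

class GraphStorageSketch {
    public static void main(String[] args) {
        int n = 3;

        // adjacency matrix: O(1) edge check, O(V^2) space even if the graph is sparse
        boolean[][] matrix = new boolean[n][n];
        matrix[0][1] = matrix[0][2] = matrix[1][2] = true;

        // adjacency list: one neighbor list per vertex, space proportional to the edges,
        // but checking an edge means scanning the source vertex's list
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        adj.get(0).add(1);
        adj.get(0).add(2);
        adj.get(1).add(2);

        System.out.println(matrix[0][2]);           // true
        System.out.println(adj.get(1).contains(0)); // false
    }
}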

"Hash table" is to map keys into a large array through hash function. Moreover, for the method of solving hash conflict, the zipper method needs the characteristics of linked list, which is simple to operate, but needs additional space to store pointers; Linear probing requires array characteristics for continuous addressing. It does not need the storage space of the pointer, but the operation is slightly more complex.

The "tree" implemented by array is the "heap", because the "heap" is a complete binary tree. Storing with array does not require node pointers, and the operation is relatively simple; Using linked list is a very common kind of "tree". Because it is not necessarily a complete binary tree, it is not suitable for array storage. Therefore, based on this linked list "tree" structure, various ingenious designs are derived, such as binary search tree, AVL tree, red black tree, interval tree, B tree and so on, to deal with different problems.

Friends who know the Redis database may also know that Redis offers several common data structures such as lists, strings, and sets, yet each of them has at least two underlying storage implementations, so that the appropriate one can be used depending on the actual data being stored.

To sum up, there are many kinds of data structures (you can even invent your own), but the underlying storage is nothing more than an array or a linked list. Their advantages and disadvantages are as follows:

Because an array is compact, contiguous storage, it supports random access: the corresponding element can be located quickly through its index, and storage is relatively economical. But precisely because of contiguous storage, the memory must be allocated in one go; if the array needs to grow, a larger block must be allocated and all the data copied over, with time complexity O(N). Likewise, inserting or deleting in the middle of the array means moving all the following elements every time to keep the storage contiguous, also O(N).

Because a linked list's elements are not contiguous but rely on a pointer to the next element, there is no array-expansion problem. If you know an element's predecessor and successor, manipulating pointers can delete that element or insert a new one in O(1) time. However, since the storage is not contiguous, you cannot compute an element's address from an index, so random access is impossible; and because every element must store pointers to its neighboring elements, it consumes relatively more storage space.
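
A minimal sketch of the two costs just described (illustrative only; it assumes the array still has spare capacity and that you already hold a reference to the list node you want to insert after):

class InsertCostSketch {
    // insert x at position pos in an array with n used slots:
    // every later element shifts one step to the right, O(N)
    static void arrayInsert(int[] arr, int n, int pos, int x) {
        for (int i = n; i > pos; i--) {
            arr[i] = arr[i - 1];
        }
        arr[pos] = x;
    }

    static class Node { int val; Node next; Node(int v) { val = v; } }

    // insert x right after a node you already hold:
    // just rewire two pointers, O(1)
    static void listInsertAfter(Node prev, int x) {
        Node n = new Node(x);
        n.next = prev.next;
        prev.next = n;
    }
}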

2, Basic operations on data structures

For any data structure, the basic operations are nothing more than traversal + access. More concretely: add, delete, search, and modify.

There are many kinds of data structures, but they all exist for one purpose: to add, delete, search, and modify as efficiently as possible in different application scenarios. Isn't that the entire mission of a data structure?

How do we traverse + access? At the highest level, the traversal + access of any data structure takes only two forms: linear and nonlinear.

Linear is represented by for/while iteration, and nonlinear is represented by recursion. To be more specific, there are the following frameworks:

Array traversal framework, typical linear iterative structure:

void traverse(int[] arr) {
    for (int i = 0; i < arr.length; i++) {
        // Iterative access arr[i]
    }
}

The linked list traversal framework has both iterative and recursive structures:

/* Basic single linked list node */
class ListNode {
    int val;
    ListNode next;
}

void traverse(ListNode head) {
    for (ListNode p = head; p != null; p = p.next) {
        // Iterative access to p.val
    }
}

void traverse(ListNode head) {
    if (head == null) return;
    // recursively access head.val
    traverse(head.next);
}

Binary tree traversal framework, typical nonlinear recursive traversal structure:

/* Basic binary tree node */
class TreeNode {
    int val;
    TreeNode left, right;
}

void traverse(TreeNode root) {
    if (root == null) return;
    traverse(root.left);
    traverse(root.right);
}

Do you think the recursive traversal of a binary tree looks like the recursive traversal of a linked list? And doesn't the binary tree node look like a singly linked list node with one extra pointer? Add a few more branches and you are traversing an N-ary tree.

The binary tree framework can be extended to a traversal framework for N-ary trees:

/* Basic N-ary tree node */
class TreeNode {
    int val;
    TreeNode[] children;
}

void traverse(TreeNode root) {
    if (root == null) return;
    for (TreeNode child : root.children)
        traverse(child);
}

The traversal of an N-ary tree can be extended further to the traversal of a graph, because a graph is just a combination of several N-ary trees. Worried that the graph might contain cycles? That's easy to handle: just mark nodes you have already seen with a boolean visited array.
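
Still, a minimal sketch of the idea may help. Assume a hypothetical Vertex node holding an id and a (possibly empty) neighbors array, analogous to the N-ary TreeNode above:

/* Hypothetical graph node, for illustration only */
class Vertex {
    int id;
    Vertex[] neighbors;
}

void traverse(Vertex v, boolean[] visited) {
    if (v == null || visited[v.id]) return;   // already seen: don't loop around a cycle
    visited[v.id] = true;
    // access v here
    for (Vertex next : v.neighbors) {
        traverse(next, visited);
    }
}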

The so-called framework is just the recurring pattern. Whether you are adding, deleting, searching, or modifying, the code never escapes these structures. You can take the structure as an outline and fill in code for the specific problem on top of the framework. Concrete examples are given below.

3, A guide to brushing algorithm problems

First of all, it should be clear that data structures are tools, and algorithms are methods for solving specific problems with the appropriate tools. In other words, before learning algorithms, you must at least understand the commonly used data structures, along with their characteristics and weaknesses.

So how should you go about brushing problems on LeetCode?

Brush the binary tree first, brush the binary tree first, brush the binary tree first!

Why binary trees first? Because the binary tree is the easiest structure for cultivating framework thinking, and most algorithmic techniques are, in essence, tree traversal.

You brush binary tree problems, look at a question, and have no idea? Judging from many readers' questions, it's usually not that you have no ideas at all, but that you haven't grasped what we call the "framework". Don't underestimate these few lines of code: almost every binary tree problem is a variation of this one framework.

void traverse(TreeNode root) {
    if (root == null) return;
    // preorder position
    traverse(root.left);
    // inorder position
    traverse(root.right);
    // postorder position
}

For example, let me casually pull up the solutions to a few problems. Never mind the specific code logic for now; just look at how the framework is applied.

LeetCode 124, difficulty Hard, asks you to find the maximum path sum in a binary tree. The main code is as follows:

int ans = INT_MIN;
int oneSideMax(TreeNode* root) {
    if (root == nullptr) return 0;
    int left = max(0, oneSideMax(root->left));
    int right = max(0, oneSideMax(root->right));
    ans = max(ans, left + right + root->val);
    return max(left, right) + root->val;
}

You see, this is a post order traversal.

LeetCode 105, difficulty Medium, asks you to reconstruct a binary tree from its preorder and inorder traversal results. It's a classic problem. The main code is as follows:

TreeNode buildTree(int[] preorder, int preStart, int preEnd, 
    int[] inorder, int inStart, int inEnd, Map<Integer, Integer> inMap) {

    if(preStart > preEnd || inStart > inEnd) return null;

    TreeNode root = new TreeNode(preorder[preStart]);
    int inRoot = inMap.get(root.val);
    int numsLeft = inRoot - inStart;

    root.left = buildTree(preorder, preStart + 1, preStart + numsLeft, 
                          inorder, inStart, inRoot - 1, inMap);
    root.right = buildTree(preorder, preStart + numsLeft + 1, preEnd, 
                          inorder, inRoot + 1, inEnd, inMap);
    return root;
}

Don't be put off by the many parameters of this function; they only exist to control the array indices. In essence, the algorithm is a preorder traversal.

LeetCode 99, difficulty Hard, asks you to restore a BST. The main code is as follows:

// prev, s and t are globals in the full solution: prev is the previously visited
// node in the inorder sequence, while s and t record the two swapped nodes
void traverse(TreeNode* node) {
    if (!node) return;
    traverse(node->left);
    if (node->val < prev->val) {
        s = (s == NULL) ? prev : s;
        t = node;
    }
    prev = node;
    traverse(node->right);
}

This is an inorder traversal; what an inorder traversal means for a BST shouldn't need further explanation.
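
If a quick refresher helps anyway: inorder traversal of a BST visits the values in ascending order, which is exactly what the comparison against prev above exploits. A minimal sketch, reusing the Java TreeNode defined earlier and java.util.List:

void inorderCollect(TreeNode root, List<Integer> out) {
    if (root == null) return;
    inorderCollect(root.left, out);
    out.add(root.val);   // for a BST, out ends up sorted in ascending order
    inorderCollect(root.right, out);
}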

You see? Even Hard problems follow the same pattern: write out the framework, then fill in the right things at the corresponding positions. That's the whole idea.

For someone who understands binary trees, a binary tree problem doesn't take long to solve. So if you don't know where to start or feel intimidated, you may as well start with binary trees: the first 10 problems may be a bit uncomfortable; combine them with the framework and do 20 more, and you will probably start forming your own understanding; finish the whole topic, then go do the backtracking, dynamic programming, and divide-and-conquer topics, and you will find that whenever recursion is involved, it is all a tree problem.

Let's take another example and look at a few problems from earlier articles.

The earlier detailed article on dynamic programming showed that the brute-force solution to the coin change problem is just the traversal of an N-ary tree:

from typing import List

def coinChange(coins: List[int], amount: int):

    def dp(n):
        if n == 0: return 0
        if n < 0: return -1

        res = float('INF')
        for coin in coins:
            subproblem = dp(n - coin)
            # Subproblem has no solution, skip
            if subproblem == -1: continue
            res = min(res, 1 + subproblem)
        return res if res != float('INF') else -1
    
    return dp(amount)

What if you can't digest that much code? Then extract the framework directly, and the core idea shows itself:

# It's just an N-ary tree traversal problem
def dp(n):
    for coin in coins:
        dp(n - coin)

In fact, many dynamic programming problems are just traversing a tree. If you are proficient with tree traversal, you at least know how to turn an idea into code, and how to extract the core idea from other people's solutions.

Now look at backtracking. The earlier detailed article on backtracking put it bluntly: a backtracking algorithm is just the pre- and post-order traversal of an N-ary tree, nothing more.

For example, here is the core of a typical backtracking solution (this one enumerates full permutations; the N-Queens problem follows the same skeleton):

void backtrack(int[] nums, LinkedList<Integer> track) {
    if (track.size() == nums.length) {
        res.add(new LinkedList(track));
        return;
    }

    for (int i = 0; i < nums.length; i++) {
        if (track.contains(nums[i]))
            continue;
        track.add(nums[i]);
        // enter the next level of the decision tree
        backtrack(nums, track);
        // undo the choice on the way back up
        track.removeLast();
    }
}

/* Extract the N-ary tree traversal framework */
void backtrack(int[] nums, LinkedList<Integer> track) {
    for (int i = 0; i < nums.length; i++)
        backtrack(nums, track);
}

There is the N-ary tree traversal framework again. So, do you see how important the tree structure is?

To sum up: if algorithms scare you, brush the tree-related problems first, and try to look at each problem from the framework rather than drowning in details.

What do I mean by details? Things like whether i should run to n or to n - 1, or whether the array size should be n or n + 1.

Looking at a problem from the framework means extracting and extending on top of the framework: it lets you quickly grasp the core logic when reading other people's solutions, and it helps you find the direction of attack when writing your own.

Of course, if the details are wrong you won't get the correct answer, but as long as the framework is there you can't go too far astray, because your direction is right.

However, if you have no framework in your head at all, you simply cannot solve the problem; even if you were handed the answer, you wouldn't recognize that it is just a tree traversal problem.

This way of thinking is very important. Take the procedure for finding the state transition equation summarized in the detailed article on dynamic programming: sometimes I write a solution just by following that procedure and, to be honest, I don't even know why it's correct; it just is...

This is the power of the framework: it ensures that you can still write a correct program even when you're half asleep; even if you can't do much else, it puts you a level above everyone else.

4, Summing up in a few sentences

The basic storage methods of data structures are linked (linked lists) and sequential (arrays). The basic operations are add, delete, search, and modify. The traversal methods are nothing but iteration and recursion.

I suggest starting your problem brushing from the "tree" category. Work through a few dozen of those problems with framework thinking, and your understanding of tree structures should be solid; at that point, going on to algorithms such as backtracking, dynamic programming, and divide and conquer, you will probably have a deeper feel for the ideas.

Topics: Algorithm data structure