Binary search and the quadratic problem in algorithm performance optimization

Posted by Gulsaes on Fri, 11 Feb 2022 13:07:29 +0100

Performance optimization cases

1. Binary search

Ordered array lookup

What is an ordered array?

If the values in an array are arranged in a certain order, we call it an ordered array

For example: array [2, 8, 15, 24, 66, 88, 100]

Now I want to write a function that finds out whether a number exists in the array. For example:

24 exists in the array, and its index is 3

1000 is not in the array

Linear search

Method 1: brute-force search. We traverse the entire array to determine whether the number exists

public static int orangeFind(int[] array, int aim) {
    for (int i=0; i<array.length; i++) {
        if (aim == array[i]) {
            return i;
        }
    }
    return -1;
}

The time complexity of method 1 is: O(N)
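As a quick check of method 1 against the example array, here is a minimal sketch. The class name LinearSearchDemo and the main method are only scaffolding added for illustration; orangeFind is the same function as above.

```java
final class LinearSearchDemo {
    // Same brute-force scan as method 1: check every element in order.
    static int orangeFind(int[] array, int aim) {
        for (int i = 0; i < array.length; i++) {
            if (aim == array[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] array = {2, 8, 15, 24, 66, 88, 100};
        System.out.println(orangeFind(array, 24));   // prints 3
        System.out.println(orangeFind(array, 1000)); // prints -1
    }
}
```

For 1000 the loop has to look at all 7 elements before giving up, which is where the O(N) comes from.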

Method 2: binary search

Because the array is ordered, we can first compare the target with the value at the middle index, and then discard the half of the array that cannot contain it

Taking the number 15 as an example, the steps are as follows:

  1. First select the value at index 3 (24). Since 15 < 24, 24 and the values to its right are discarded
  2. Select the value at index 1 (8) in the left part. Since 15 > 8, 8 and the values to its left are discarded
  3. Finally, only array[2] is left. Comparing it with 15 shows that this is what we want

As above, only three steps are needed to find the corresponding element. The code is as follows:

public static int orangeFind(int[] array, int aim) {
    // Initialize left = leftmost, right = rightmost
    int left = 0;
    int right = array.length - 1;
    
    // Keep searching until left > right, which means the search range is empty
    while (left <= right) {
        // Compute the middle index; this form avoids int overflow of left + right
        int middle = left + (right - left) / 2;
        int middleValue = array[middle];
        if (middleValue == aim) {
            // The target has been found
            return middle;
        } else if (middleValue < aim) {
            // The target is on the right side of the middle. Reset left
            left = middle + 1;
        } else {
            // The target is on the left side of the middle. Reset right
            right = middle - 1;
        }
    }
    return -1;
}
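To make the three steps for the number 15 visible, here is a small sketch that records which indexes the loop looks at. The class BinarySearchTrace and the method probes are hypothetical names introduced only for this illustration; the narrowing logic is meant to mirror the function above.

```java
import java.util.ArrayList;
import java.util.List;

final class BinarySearchTrace {
    // Returns the indexes inspected while searching a sorted array for aim.
    static List<Integer> probes(int[] array, int aim) {
        List<Integer> visited = new ArrayList<>();
        int left = 0;
        int right = array.length - 1;
        while (left <= right) {
            int middle = left + (right - left) / 2;
            visited.add(middle);
            if (array[middle] == aim) {
                break;                 // found, stop searching
            } else if (array[middle] < aim) {
                left = middle + 1;     // discard the left half
            } else {
                right = middle - 1;    // discard the right half
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        int[] array = {2, 8, 15, 24, 66, 88, 100};
        System.out.println(probes(array, 15)); // prints [3, 1, 2]
    }
}
```

The printed indexes 3, 1, 2 correspond to the values 24, 8 and 15 from the walkthrough.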

Next, let's take some larger arrays and compare how many comparisons each search needs:

We only consider the worst case. In the best case both methods find the target on the first comparison, so there is nothing to compare

                     Brute-force search   Binary search
Time complexity      O(N)                 O(log(N))
Array of 100         100                  6.64
Array of 10,000      10,000               13.29
Array of 1,000,000   1,000,000            19.93

The table shows that the larger the array, the more obvious the performance improvement: for large arrays the two methods differ by more than an order of magnitude
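The binary search figures in the table are values of log2(N). As a rough cross-check, the sketch below simulates only the index narrowing of binary search for a target larger than every element (one worst case) and counts the loop iterations. WorstCaseSteps and binarySearchSteps are hypothetical names, and no real array is allocated.

```java
final class WorstCaseSteps {
    // Counts loop iterations of binary search on a sorted array of the given
    // size when the target is larger than every element.
    static int binarySearchSteps(int size) {
        int left = 0;
        int right = size - 1;
        int steps = 0;
        while (left <= right) {
            steps++;
            int middle = left + (right - left) / 2;
            // The target is greater than any element, so always go right.
            left = middle + 1;
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int size : new int[] {100, 10_000, 1_000_000}) {
            System.out.println(size + " elements: linear search up to " + size
                    + " steps, binary search " + binarySearchSteps(size) + " steps");
        }
    }
}
```

Under these assumptions the counts come out as 7, 14 and 20, which are the log2(N) values rounded up to whole iterations.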

2. Quadratic problem

Problem: suppose you are given an array of numbers, and each number in the array is between 0 and 10. Please find the numbers that are repeated in it

For example:

The number array is: [0, 8, 6, 2, 5, 6, 8, 6, 10, 8]

The repeated numbers are: [8, 6]

Method 1: brute force

Take one element at a time and compare it with each of the elements after it in turn

In the first step, we select the first element 0 and judge whether it is the same as any of the following 9 elements

In the second step, we select the second element 8 and judge whether it is the same as any of the following 7 + 1 = 8 elements

...

The code is as follows:

public static ArrayList<Integer> repeat(int[] array) {
    ArrayList<Integer> result = new ArrayList<>();
    for (int i=0; i<array.length; i++) {
        // Judge whether the element at position i is equal to the element at position j
        for (int j=i+1; j<array.length; j++) {
            if (array[i] == array[j]) {
                // Judge whether the result array contains the element. If not, add the element
                if (!result.contains(array[i])) {
                    result.add(array[i]);
                }
            }
        }
    }
    return result;
}

What is the time complexity of method 1? There are two nested for loops, so it is O(N^2) (the cost of the contains method is not counted here)
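To see where the O(N^2) comes from, the sketch below counts how many times the inner comparison of method 1 would run for an array of a given length. PairCount and comparisons are hypothetical names used only for this illustration.

```java
final class PairCount {
    // Counts how often the comparison array[i] == array[j] would execute
    // in the nested loops of method 1 for an array of length n.
    static long comparisons(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(comparisons(10));  // prints 45
        System.out.println(comparisons(100)); // prints 4950
    }
}
```

The exact count is N * (N - 1) / 2, roughly half of N^2. The constant factor is dropped in big-O notation, so the growth is still quadratic.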

Method 2: labeling method

Since the problem statement says that each number is between 0 and 10, we can use an array of length 11 to mark how often each of the 11 numbers 0 to 10 has appeared. 1 means once, 2 means twice, and n means n times

Suppose the input array is [8, 6, 5]. We build an array of length 11, and the positions with index 8, 6 and 5 are marked as 1
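A minimal sketch of this marking idea for the input 8, 6 and 5. The names MarkDemo and mark are made up for illustration, and the sketch assumes every input value really lies between 0 and 10.

```java
import java.util.Arrays;

final class MarkDemo {
    // Builds the marker array: index i holds 1 if the number i was seen.
    // Assumes all numbers are in the range 0..10.
    static int[] mark(int[] numbers) {
        int[] exist = new int[11]; // indexes 0..10, all 0 by default
        for (int n : numbers) {
            exist[n] = 1;
        }
        return exist;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mark(new int[] {8, 6, 5})));
        // prints [0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0]
    }
}
```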

How to judge whether it is repeated?

If the value at the corresponding position is already >= 1 when we encounter a number, that number is repeated. In the end, every index whose value is > 1 is a repeated element

Detailed steps:

  1. Preparation: we first create an array named exist, which records whether each number has appeared. The length of the array is 11, and its values are 0 by default, which means that none of the 11 numbers from 0 to 10 has appeared yet

  2. In the first step, we scan the number 0, so we set the number with index value 0 in the exist array to 1

    exist[0] = 1;
    
  3. In the second step, we scan the number 8, so we set the number with the index value of 8 in the exist array to 1

    exist[8] = 1;
    
  4. ... and so on

  5. By step 6, the positions 0, 8, 6, 2 and 5 of the exist array have already been set to 1. The sixth element of the original array (index 5) is 6, and exist[6] is already 1, so we know that 6 is repeated. We write 6 into the result and change exist[6] to 2, indicating that 6 has occurred twice

  6. What if we meet 6 again?

    This time we must not add it to the result again. An element is added to the result only when its mark is exactly 1, that is, on its second occurrence

Complete code:

public static ArrayList<Integer> repeat(int[] array) {
    ArrayList<Integer> result = new ArrayList<>();
    int[] exist = new int[11];
    for (int i=0; i<array.length; i++) {
        int value = array[i];
        // If value has been seen exactly once before, this is its second
        // occurrence, so it is a repeat and is added to the result.
        // If the count is already > 1, it is not added again, so each
        // repeated value appears in the result only once
        if (exist[value] == 1) {
            result.add(value);
        }
        // Record the value into the corresponding index of the exist array
        exist[value]++;
    }
    return result;
}
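Running the marking method on the example array from the problem statement can be sketched as follows. The class RepeatDemo and the range check are additions for this illustration (the check is an assumption, not part of the code above); the counting logic is the same.

```java
import java.util.ArrayList;
import java.util.List;

final class RepeatDemo {
    // Marking method, plus a range check (an assumption of this sketch)
    // so that out-of-range input fails with a clear message.
    static List<Integer> repeat(int[] array) {
        List<Integer> result = new ArrayList<>();
        int[] exist = new int[11]; // counters for the numbers 0..10
        for (int value : array) {
            if (value < 0 || value > 10) {
                throw new IllegalArgumentException("value out of range 0..10: " + value);
            }
            // Exactly one earlier occurrence: this is the second time we see it
            if (exist[value] == 1) {
                result.add(value);
            }
            exist[value]++;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] numbers = {0, 8, 6, 2, 5, 6, 8, 6, 10, 8};
        System.out.println(repeat(numbers)); // prints [6, 8]
    }
}
```

Note that the order differs from method 1, which yields [8, 6]: here a number enters the result at its second occurrence, and the second 6 comes before the second 8.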

It is clear from the code that a single loop satisfies the problem requirements, so the time complexity is O(N). Some people may have doubts: since we allocate the extra exist array, doesn't the space complexity increase?

This is a classic case of trading space for time. Here the extra cost is small, because the exist array always has 11 entries no matter how long the input is. In programming, unless there are special circumstances, we usually give priority to time complexity over space complexity

The efficiency gap between the two methods:

                     Method 1            Method 2
Time complexity      O(N^2)              O(N)
Array of 100         10,000              100
Array of 10,000      100,000,000         10,000
Array of 1,000,000   1,000,000,000,000   1,000,000

From this, we can see that as the array grows, the performance gap becomes more and more obvious

Topics: Algorithm Optimize