Programmer Basic Skills Series 2 - Sorting Algorithms

Posted by Fatboy on Tue, 01 Feb 2022 17:48:00 +0100

1. Criteria for measuring sorting algorithms

In fact, almost any algorithm can be measured along several dimensions: execution efficiency, memory overhead, and stability.

Sorting algorithms are no exception. They are mainly judged by:

• time complexity, including the best case, worst case, and average case, as well as the number of comparisons and exchanges

• space complexity, for example whether the sort is done in place

• stability: a sort is called stable if elements with equal values keep their original relative order after sorting
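To make the stability criterion concrete, here is a minimal sketch. It assumes hypothetical labeled items such as "3a" and "3b" that are compared only by their leading digit, and it relies on the stable merge sort that the Java standard library documents for Arrays.sort on object arrays. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

public class StabilityDemo {
    // Sorts labeled items such as "3a" by their leading digit only.
    // Arrays.sort on object arrays is documented as stable, so items with
    // equal keys keep their original relative order.
    public static String[] sortByKey(String[] items) {
        String[] copy = items.clone();
        Arrays.sort(copy, Comparator.comparingInt((String s) -> s.charAt(0) - '0'));
        return copy;
    }

    public static void main(String[] args) {
        String[] items = {"3a", "1a", "3b", "1b", "2a"};
        // "1a" stays ahead of "1b" and "3a" stays ahead of "3b".
        System.out.println(Arrays.toString(sortByKey(items)));
        // prints [1a, 1b, 2a, 3a, 3b]
    }
}
```

An unstable sort would be free to output "1b" before "1a", because the comparison sees them as equal.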

Let's summarize the common sorting algorithms:

Algorithms	Time complexity	Space complexity	Comparison-based?	Stable?
Bubble sort, insertion sort, selection sort	O(n^2)	O(1)	yes	Bubble sort and insertion sort are stable; selection sort is unstable
Quick sort, merge sort	O(n log n)		yes
Bucket sort, counting sort, radix sort	O(n)		no

2. Bubble sort, insertion sort, and selection sort

2.1 Bubble sort

Take a simple example: sort the array [4,5,6,3,2,1]. On each pass, bubble sort compares adjacent elements and swaps them when they are out of order, so the largest remaining element "bubbles" to the end of the unsorted part.

Bubble sort involves two key operations: comparison and exchange. Based on this, we write the specific code:

// Bubble sort: a is the array, n is the number of elements
public void bubbleSort(int[] a, int n) {
  if (n <= 1) return;

  for (int i = 0; i < n; ++i) {
    // Flag for exiting early when a pass makes no exchange
    boolean flag = false;
    for (int j = 0; j < n - i - 1; ++j) {
      if (a[j] > a[j+1]) { // exchange
        int tmp = a[j];
        a[j] = a[j+1];
        a[j+1] = tmp;
        flag = true;  // an exchange happened in this pass
      }
    }
    if (!flag) break;  // no exchange: the array is already sorted, exit early
  }
}

Bubble sort is relatively simple. Note the exit flag: if a full pass makes no exchange, the remaining elements are already in order, so the later passes are skipped.

The best-case time complexity of bubble sort, when the array is already ordered (1, 2, 3, 4, 5, 6), is O(n). The worst-case time complexity, when the array is in reverse order (6, 5, 4, 3, 2, 1), is O(n^2). The average time complexity is O(n^2).

Bubble sort is an in-place sort with space complexity O(1).

Bubble sort is stable because no exchange happens when a[j] == a[j+1], so the relative order of equal elements does not change.
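The best and worst cases above can be checked by counting operations. The following is a sketch only: the class name and the counting helper are hypothetical additions, and the sorting logic is the same flagged bubble sort.

```java
public class BubbleSortCount {
    // Runs bubble sort with the early-exit flag and
    // returns {comparisons, swaps}.
    public static long[] sortAndCount(int[] a) {
        long comparisons = 0;
        long swaps = 0;
        int n = a.length;
        for (int i = 0; i < n; ++i) {
            boolean swapped = false;
            for (int j = 0; j < n - i - 1; ++j) {
                ++comparisons;
                if (a[j] > a[j + 1]) {
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                    ++swaps;
                    swapped = true;
                }
            }
            if (!swapped) break;
        }
        return new long[]{comparisons, swaps};
    }

    public static void main(String[] args) {
        long[] best = sortAndCount(new int[]{1, 2, 3, 4, 5, 6});
        long[] worst = sortAndCount(new int[]{6, 5, 4, 3, 2, 1});
        System.out.println("sorted:   " + best[0] + " comparisons, " + best[1] + " swaps");
        System.out.println("reversed: " + worst[0] + " comparisons, " + worst[1] + " swaps");
        // prints:
        // sorted:   5 comparisons, 0 swaps
        // reversed: 15 comparisons, 15 swaps
    }
}
```

For n = 6, the ordered input needs a single pass of n - 1 = 5 comparisons, while the reversed input needs n(n-1)/2 = 15 comparisons and as many swaps.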

2.2 Insertion sort

The idea of insertion sort: divide the array into two intervals, a sorted interval and an unsorted interval. Initially the sorted interval holds only one element, the first element of the array. Then take each element from the unsorted interval in turn, find its proper position in the sorted interval, and insert it there, so that the sorted interval always stays in order. Repeat this process until the unsorted interval is empty.

Take the array [4,5,6,3,2,1] again: 5 and 6 are already in the right place relative to 4, and then 3, 2, and 1 are each moved to the front of the sorted interval in turn.

Based on the above analysis, we write the specific code:

// Insertion sort: a is the array, n is the number of elements
public void insertionSort(int[] a, int n) {
  if (n <= 1) return;

  for (int i = 1; i < n; ++i) {
    int value = a[i];
    // j is declared outside the inner loop because it is still needed after the loop
    int j = i - 1;
    // Find where to insert
    for (; j >= 0; --j) {
      if (a[j] > value) {
        a[j+1] = a[j];  // shift the larger element one position right
      } else {
        break;
      }
    }
    // j now indexes the last element not greater than value (or is -1), so insert at j+1
    a[j+1] = value;
  }
}

The best-case time complexity of insertion sort, when the array is already ordered (1, 2, 3, 4, 5, 6), is O(n). The worst-case time complexity, when the array is in reverse order (6, 5, 4, 3, 2, 1), is O(n^2). The average time complexity is O(n^2).

Insertion sort is an in-place sort with space complexity O(1).

Insertion sort is stable because a[j] is not moved when a[j] == value, so the relative order of equal elements does not change.
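The insertion process for [4,5,6,3,2,1] can be traced step by step. This is an illustrative sketch: the class name and the trace helper are hypothetical, and it records the array after each element is inserted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InsertionSortTrace {
    // Sorts a in place and returns a snapshot of the array
    // after each insertion into the sorted interval.
    public static List<String> trace(int[] a) {
        List<String> steps = new ArrayList<>();
        for (int i = 1; i < a.length; ++i) {
            int value = a[i];
            int j = i - 1;
            // Shift larger elements right until the slot for value is found.
            while (j >= 0 && a[j] > value) {
                a[j + 1] = a[j];
                --j;
            }
            a[j + 1] = value;
            steps.add(Arrays.toString(a));
        }
        return steps;
    }

    public static void main(String[] args) {
        for (String step : trace(new int[]{4, 5, 6, 3, 2, 1})) {
            System.out.println(step);
        }
        // prints:
        // [4, 5, 6, 3, 2, 1]
        // [4, 5, 6, 3, 2, 1]
        // [3, 4, 5, 6, 2, 1]
        // [2, 3, 4, 5, 6, 1]
        // [1, 2, 3, 4, 5, 6]
    }
}
```

The first two lines are unchanged because 5 and 6 are already in position. Each later line shows one element moved to the front of the sorted interval.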

2.3 Selection sort

The idea behind selection sort is somewhat similar to insertion sort: the array is again divided into a sorted interval and an unsorted interval. However, on each pass selection sort finds the smallest element in the unsorted interval and puts it at the end of the sorted interval.

Based on the above analysis, we write the code:

// Selection sort: a is the array, n is the number of elements
public void selectSort(int[] a, int n) {
  if (n <= 1) return;

  for (int i = 0; i < n; ++i) {
    int min_idx = i;
    // Find the position of the smallest element in the unsorted interval
    for (int j = i + 1; j < n; ++j) {
      if (a[j] < a[min_idx]) {
        min_idx = j;
      }
    }
    // Swap the smallest unsorted element to the end of the sorted interval
    if (min_idx != i) {
      int tmp = a[i];
      a[i] = a[min_idx];
      a[min_idx] = tmp;
    }
  }
}

Selection sort scans the whole unsorted interval on every pass regardless of the input, so its best-case time complexity, when the array is already ordered (1, 2, 3, 4, 5, 6), its worst-case time complexity, when the array is in reverse order (6, 5, 4, 3, 2, 1), and its average time complexity are all O(n^2).

Selection sort is an in-place sort with space complexity O(1).

Selection sort is unstable because each pass swaps the smallest element of the unsorted interval with the element at the front of that interval, which can move an element past another element of equal value. For example, with [5,8,5,2,4], the first pass swaps the first 5 with 2, which places the first 5 after the second 5.
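The [5,8,5,2,4] example can be reproduced by tagging the two 5s. In this sketch the labels "5a" and "5b", the key helper, and the class name are all hypothetical; the items are compared only by their leading digit.

```java
import java.util.Arrays;

public class SelectionSortStability {
    // The sort key is the leading digit; the trailing letter is only a label.
    private static int key(String s) {
        return s.charAt(0) - '0';
    }

    // Selection sort on a copy of the input, comparing by key only.
    public static String[] selectionSort(String[] items) {
        String[] a = items.clone();
        for (int i = 0; i < a.length - 1; ++i) {
            int minIdx = i;
            for (int j = i + 1; j < a.length; ++j) {
                if (key(a[j]) < key(a[minIdx])) {
                    minIdx = j;
                }
            }
            if (minIdx != i) {
                String tmp = a[i];
                a[i] = a[minIdx];
                a[minIdx] = tmp;
            }
        }
        return a;
    }

    public static void main(String[] args) {
        String[] items = {"5a", "8", "5b", "2", "4"};
        System.out.println(Arrays.toString(selectionSort(items)));
        // prints [2, 4, 5b, 5a, 8]
    }
}
```

"5a" started ahead of "5b" but ends up behind it, because the first pass swapped "5a" with "2" across "5b".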

3. Quick sort and merge sort

Both quick sort and merge sort use the divide-and-conquer idea. They are suitable for sorting large data sets and are more commonly used than the three algorithms above.

3.1 Merge sort

The core idea of merge sort is to split the array into two halves at the middle, sort the two halves separately, and then merge the two sorted halves so that the whole array is in order.

Divide and conquer is implemented with recursion. Let's look at the following recurrence:

    merge_sort(p...r) = merge_sort(p...m) + merge_sort(m+1...r); termination condition: when p >= r, stop decomposing.

That completes the decomposition, but a merge step follows it: merge the sorted A[p...m] and A[m+1...r] into one sorted array and put the result back into A[p...r].

Logic of merging: allocate a temporary array tmp with the same size as A[p...r]. Use two cursors i and j that point to the first element of A[p...m] and A[m+1...r], respectively. Compare the two elements A[i] and A[j]. If A[i] <= A[j], put A[i] into tmp and move i forward by one position. Otherwise, put A[j] into tmp and move j forward by one position. Continue this comparison until all the data in one subarray has been put into tmp, and then append the remaining data of the other subarray to the end of tmp in order. At this point tmp holds the merged result of the two subarrays. Finally, copy the data in tmp back to the original array A[p...r].
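The O(n log n) time complexity listed for merge sort in the summary table follows from this structure. As a rough sketch, assume n is a power of two and that merging n elements costs cn for some constant c (both are simplifying assumptions, not something stated in the original post):

```latex
\begin{aligned}
T(1) &= c \\
T(n) &= 2\,T\!\left(\tfrac{n}{2}\right) + cn \\
     &= 4\,T\!\left(\tfrac{n}{4}\right) + 2cn \\
     &= 2^{k}\,T\!\left(\tfrac{n}{2^{k}}\right) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= cn + cn\log_2 n = O(n \log n)
\end{aligned}
```

The temporary array used by the merge step holds at most n elements at a time, so the extra space is O(n), which means merge sort is not an in-place sort.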

Based on the above analysis, we write the code of the whole decomposition and merging:

public class MergeSort {

    public static void main(String[] args) {
        int[] arr = {4, 5, 6, 3, 2, 1};
        int len = arr.length;
        sort(arr, 0, len - 1);
        for (int item : arr) {
            System.out.print(item);
        }
    }

    /**
     * Decompose: recursively sort arr[left...right]
     */
    private static void sort(int[] arr, int left, int right) {
        // Termination condition
        if (left >= right) return;

        // Same midpoint as (left + right) / 2, but cannot overflow int
        int mid = left + (right - left) / 2;

        sort(arr, left, mid);
        sort(arr, mid + 1, right);
        // If the two halves are already in order relative to each other, there is no need to merge
        if (arr[mid] <= arr[mid + 1]) {
            return;
        }
        // Merge the two sorted halves
        merge(arr, left, mid, right);
    }

    /**
     * Merge the sorted arr[left...mid] and arr[mid+1...right]
     */
    private static void merge(int[] arr, int left, int mid, int right) {
        // Create the temporary array
        int[] tmp = new int[right - left + 1];

        int i = left, j = mid + 1;
        for (int k = 0; k < tmp.length; k++) {
            if (i == mid + 1) {
                // The left half is exhausted: take from the right half
                tmp[k] = arr[j++];
            } else if (j == right + 1) {
                // The right half is exhausted: take from the left half
                tmp[k] = arr[i++];
            }
            // Removing the equal sign here would destroy the stability of the sort
            else if (arr[i] <= arr[j]) {
                tmp[k] = arr[i++];
            } else {
                tmp[k] = arr[j++];
            }
        }
        // Java's built-in array copy is used here; a hand-written loop would also work
        System.arraycopy(tmp, 0, arr, left, tmp.length);
    }
}

Topics: data structure

Programmer Think