Using Python to implement several common sorting algorithms

Posted by freephoneid on Sun, 02 Jan 2022 12:56:39 +0100

1, Overview of sorting algorithms

Insertion sort: direct insertion sort, binary insertion sort

Selection sort: direct selection sort, heap sort

Exchange sort: bubble sort, quick sort

Merge sort

2, Code implementation

1. Direct insertion sort

This is the simplest and most direct approach. The sequence is treated as a sorted part on the left and an unsorted part on the right. Each step takes the first element of the unsorted part, compares it with the elements of the sorted part from right to left, shifts the larger ones one position to the right, and finally places the element in its correct position. The algorithm needs no auxiliary sequence, so the space complexity is O(1); the average time complexity is O(n^2).

def insert_sort(lst):
    for i in range(1, len(lst)):
        x = lst[i]
        j = i
        while j > 0 and lst[j-1] > x:
            lst[j] = lst[j-1]
            j -= 1
        lst[j] = x
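To watch the sorted part grow, the following sketch repeats the same loop but records a snapshot of the list after each insertion. The name insertion_sort_trace is hypothetical and is used only for illustration.

```python
def insertion_sort_trace(lst):
    """Sort lst in place and return a snapshot of the list after each insertion."""
    snapshots = []
    for i in range(1, len(lst)):
        x = lst[i]
        j = i
        # Shift larger elements of the sorted part one slot to the right
        while j > 0 and lst[j-1] > x:
            lst[j] = lst[j-1]
            j -= 1
        lst[j] = x
        snapshots.append(list(lst))
    return snapshots


for step in insertion_sort_trace([5, 2, 4, 1]):
    print(step)
# [2, 5, 4, 1]
# [2, 4, 5, 1]
# [1, 2, 4, 5]
```

After the i-th snapshot, the first i+1 elements are in order and the rest are untouched.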

2. Binary insertion sort

Building on direct insertion sort, binary search can quickly find the position where the element should be inserted, which reduces the number of comparisons to O(log n) per element. The elements still have to be moved one by one, however, so the overall time complexity remains O(n^2), the same as direct insertion sort.

def bisect_insert_sort(lst):
    for i in range(1, len(lst)):
        x = lst[i]
        # Binary search in the sorted part lst[0:i] for the first element
        # greater than x; equal elements stay in front, so the sort is stable
        left, right = 0, i
        while left < right:
            mid = (left + right) // 2
            if lst[mid] > x:
                right = mid
            else:
                left = mid + 1
        # Shift lst[left:i] one slot to the right and put x at its final position
        for j in range(i, left, -1):
            lst[j] = lst[j-1]
        lst[left] = x
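The search step does not have to be written by hand. As a minimal sketch, assuming the standard library bisect module is acceptable here, bisect.bisect_right returns the same insertion point (the position after any equal elements in the sorted part). The function name bisect_insert_sort_stdlib is hypothetical.

```python
import bisect


def bisect_insert_sort_stdlib(lst):
    """Binary insertion sort, in place, using bisect for the search step."""
    for i in range(1, len(lst)):
        x = lst[i]
        # First index in the sorted part lst[0:i] whose element is greater than x
        index = bisect.bisect_right(lst, x, 0, i)
        # Shift lst[index:i] one slot to the right, then drop x into the gap
        lst[index+1:i+1] = lst[index:i]
        lst[index] = x


data = [3, 1, 2, 3, 0]
bisect_insert_sort_stdlib(data)
print(data)  # [0, 1, 2, 3, 3]
```

The slice assignment still moves up to i elements per step, so the time complexity stays O(n^2); only the comparisons are saved.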

3. Direct selection sort

Each pass selects the minimum (or maximum) element of the unsorted part and places it at the end of the sorted part. Time complexity O(n^2), space complexity O(1).

def select_sort(lst):
    for i in range(len(lst)-1):
        k = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[k]:
                k = j
        if i != k:
            lst[i], lst[k] = lst[k], lst[i]
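The O(n^2) cost comes from the comparisons, not from the swaps. The sketch below counts both; select_sort_counted is a hypothetical name, and the inner scan starts at i + 1 so that an element is never compared with itself.

```python
def select_sort_counted(lst):
    """Sort lst in place and return (comparisons, swaps)."""
    comparisons = swaps = 0
    n = len(lst)
    for i in range(n - 1):
        k = i
        # Scan the unsorted part for an element smaller than the current minimum
        for j in range(i + 1, n):
            comparisons += 1
            if lst[j] < lst[k]:
                k = j
        if i != k:
            lst[i], lst[k] = lst[k], lst[i]
            swaps += 1
    return comparisons, swaps


print(select_sort_counted([1, 2, 3, 4, 5, 6]))  # (15, 0)
print(select_sort_counted([6, 5, 4, 3, 2, 1]))  # (15, 3)
```

The number of comparisons is n(n-1)/2 regardless of the input order, while the number of swaps never exceeds n-1.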

4. Bubble sort

Each pass compares adjacent elements in pairs and swaps them if they are out of order. At the end of a pass the largest remaining element is in its final position, and other large elements have moved some distance toward theirs. Average time complexity O(n^2), space complexity O(1). The algorithm handles reverse-ordered input poorly: to sort a descending sequence into ascending order, every element has to move the maximum distance.

def bubble_sort(lst):
    n = len(lst)
    for i in range(n):
        found = False
        # Each pass moves the maximum of lst[0..n-i-1] to position n-i-1
        for j in range(0, n - i - 1):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                found = True
        # If no swap happened during a pass, the list is already sorted
        if not found:
            break
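The effect of the early exit, and of reverse-ordered input, can be made visible by counting passes. This is only an illustrative sketch; bubble_sort_passes is a hypothetical name.

```python
def bubble_sort_passes(lst):
    """Bubble sort lst in place and return the number of passes performed."""
    n = len(lst)
    passes = 0
    for i in range(n):
        passes += 1
        swapped = False
        for j in range(0, n - i - 1):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        # Stop as soon as a whole pass finds nothing out of order
        if not swapped:
            break
    return passes


print(bubble_sort_passes([1, 2, 3, 4, 5]))  # 1
print(bubble_sort_passes([2, 1, 3, 4, 5]))  # 2
print(bubble_sort_passes([5, 4, 3, 2, 1]))  # 5
```

An already sorted list is finished after a single pass, while a descending list of n elements needs all n passes.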

5. Quick sort

(1) Method 1: take one element as the pivot and partition the sequence so that the elements smaller than the pivot are on the left, the elements larger than it are on the right, and the pivot is in between. Then apply the same operation to the left and right parts. Efficiency is highest when the median can be chosen as the pivot every time, so that the two parts have the same size. Average time complexity O(nlogn). As for space complexity, quick sort uses no auxiliary sequence (only a few temporary variables), but the recursion itself costs stack space; how much depends on the concrete implementation.

def quick_sort(lst):
    base_sort(lst, 0, len(lst)-1)


def base_sort(lst, left, right):
    if left >= right:
        return
    i, j = left, right
    k = lst[i]
    while i < j:
        while i < j and lst[j] >= k:
            j -= 1
        if i < j:
            lst[i] = lst[j]
            i += 1
        while i < j and lst[i] <= k:
            i += 1
        if i < j:
            lst[j] = lst[i]
            j -= 1
    lst[i] = k
    base_sort(lst, left, i-1)
    base_sort(lst, i+1, right)
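The extra space mentioned above is the recursion stack. With the first element as the pivot, an already sorted input gives maximally unbalanced partitions, so the recursion depth grows linearly with the length of the list. One common way to limit this, sketched here as an assumption rather than as part of the code above, is to recurse only into the smaller part and loop over the larger one, which keeps the depth at O(log n). The names partition and quick_sort_bounded are hypothetical.

```python
def partition(lst, left, right):
    """Partition lst[left..right] around lst[left]; return the pivot's final index."""
    i, j = left, right
    k = lst[i]
    while i < j:
        while i < j and lst[j] >= k:
            j -= 1
        if i < j:
            lst[i] = lst[j]
            i += 1
        while i < j and lst[i] <= k:
            i += 1
        if i < j:
            lst[j] = lst[i]
            j -= 1
    lst[i] = k
    return i


def quick_sort_bounded(lst):
    """Sort lst in place and return the maximum recursion depth reached."""
    max_depth = 0

    def sort(left, right, depth):
        nonlocal max_depth
        while left < right:
            max_depth = max(max_depth, depth)
            p = partition(lst, left, right)
            # Recurse into the smaller part, keep looping on the larger part
            if p - left < right - p:
                sort(left, p - 1, depth + 1)
                left = p + 1
            else:
                sort(p + 1, right, depth + 1)
                right = p - 1

    sort(0, len(lst) - 1, 1)
    return max_depth


data = list(range(1000))
depth = quick_sort_bounded(data)
print(depth <= 10)  # True
```

This bounds only the stack depth. The running time on sorted input is still O(n^2) as long as the first element is used as the pivot.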

(2) Method 2: during partitioning, the sequence consists of the pivot, the small records (less than the pivot), the large records (greater than or equal to the pivot) and the unprocessed records, in that order. Each step takes the first unprocessed element. If it is greater than or equal to the pivot, the pointer simply moves on; if it is less than the pivot, it is swapped with the first element of the large records.

def quick_sort2(lst):
    def qsort(lst, begin, end):
        if begin >= end:
            return
        k = lst[begin]
        i = begin
        for j in range(begin+1, end+1):
            if lst[j] < k:
                i += 1
                lst[i], lst[j] = lst[j], lst[i]
        lst[begin], lst[i] = lst[i], lst[begin]
        qsort(lst, begin, i-1)
        # The pivot now sits at index i, so the right part starts at i+1
        qsort(lst, i+1, end)

    qsort(lst, 0, len(lst)-1)
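To make the regions concrete, here is a single partition step of this method in isolation. It is a sketch for illustration only; lomuto_partition is a hypothetical name (this scheme is commonly known as Lomuto partitioning).

```python
def lomuto_partition(lst, begin, end):
    """Partition lst[begin..end] around lst[begin]; return the pivot's final index."""
    k = lst[begin]
    i = begin  # lst[begin+1..i] holds the small records found so far
    for j in range(begin + 1, end + 1):
        if lst[j] < k:
            # Grow the small region by swapping with the first large record
            i += 1
            lst[i], lst[j] = lst[j], lst[i]
    # Move the pivot between the small and the large records
    lst[begin], lst[i] = lst[i], lst[begin]
    return i


data = [4, 7, 2, 9, 1, 5]
p = lomuto_partition(data, 0, len(data) - 1)
print(p, data)  # 2 [1, 2, 4, 9, 7, 5]
```

After the step, everything left of index p is smaller than the pivot and everything right of it is greater than or equal to it, so the two recursive calls cover begin..p-1 and p+1..end.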

6. Merge sort

At first, each element is regarded as an ordered segment of length one. Adjacent segments are merged in pairs into ordered segments of two elements, then four, and so on, until the segment length reaches the length of the sequence. The time complexity is O(nlogn). An auxiliary list of the same length as the original sequence is used, so the space complexity is O(n).

def merge_sort(lst):
    def merge(lfrom, lto, low, mid, high):
        """Merge sorted lfrom[low:mid] and lfrom[mid:high] into lto[low:high]."""
        i, j, k = low, mid, low
        while i < mid and j < high:
            if lfrom[i] <= lfrom[j]:
                lto[k] = lfrom[i]
                i += 1
            else:
                lto[k] = lfrom[j]
                j += 1
            k += 1
        # Copy the remaining records of the first segment
        while i < mid:
            lto[k] = lfrom[i]
            i += 1
            k += 1
        # Copy the remaining records of the second segment
        while j < high:
            lto[k] = lfrom[j]
            j += 1
            k += 1

    def merge_pass(lfrom, lto, llen, slen):
        """Perform one pass: merge adjacent segments of length slen in pairs."""
        i = 0
        while i + 2 * slen < llen:
            merge(lfrom, lto, i, i+slen, i+2*slen)
            i += 2*slen
        # Two segments remain, and the last one has at most slen elements
        if i + slen < llen:
            merge(lfrom, lto, i, i+slen, llen)
        # At most one segment remains; copy it directly
        else:
            for j in range(i, llen):
                lto[j] = lfrom[j]

    slen, llen = 1, len(lst)
    # An auxiliary list of the same length as the original list
    tmp_lst = [None] * llen
    while slen < llen:
        merge_pass(lst, tmp_lst, llen, slen)
        slen *= 2
        # Merge back and forth between the original list and the auxiliary list
        merge_pass(tmp_lst, lst, llen, slen)
        slen *= 2
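The doubling of the segment length is easier to follow on a small input. The sketch below is a simplified variant that builds a new list on every pass instead of alternating between two preallocated lists, so it is not the implementation above; merge_runs and merge_sort_passes are hypothetical names.

```python
def merge_runs(a, b):
    """Merge two sorted lists into a new sorted list (stable)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def merge_sort_passes(lst):
    """Return the list after each bottom-up pass; lst itself is not modified."""
    current = list(lst)
    passes = []
    slen = 1
    while slen < len(current):
        merged = []
        # Merge neighbouring segments of length slen in pairs
        for start in range(0, len(current), 2 * slen):
            left = current[start:start + slen]
            right = current[start + slen:start + 2 * slen]
            merged.extend(merge_runs(left, right))
        current = merged
        passes.append(list(current))
        slen *= 2
    return passes


for step in merge_sort_passes([6, 3, 8, 1, 7, 2, 5]):
    print(step)
# [3, 6, 1, 8, 2, 7, 5]
# [1, 3, 6, 8, 2, 5, 7]
# [1, 2, 3, 5, 6, 7, 8]
```

Seven elements need three passes (segment lengths 1, 2 and 4), which is where the log n factor in O(nlogn) comes from.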

7. Summary

Each sorting algorithm has its own application scenarios, and in practice more than one is often used together. For example, an implementation of quick sort or merge sort can switch to a simple, stable algorithm such as direct insertion sort for short subsequences.
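A rough sketch of such a combination follows. The threshold CUTOFF = 16, the choice of the middle element as pivot, and the names insertion_sort_range and hybrid_quick_sort are all assumptions made for this illustration; a suitable threshold would have to be found by measurement.

```python
CUTOFF = 16  # hypothetical threshold for switching to insertion sort


def insertion_sort_range(lst, left, right):
    """Direct insertion sort restricted to lst[left..right]."""
    for i in range(left + 1, right + 1):
        x = lst[i]
        j = i
        while j > left and lst[j-1] > x:
            lst[j] = lst[j-1]
            j -= 1
        lst[j] = x


def hybrid_quick_sort(lst):
    """Quick sort that hands short subsequences to insertion sort."""
    def sort(left, right):
        if right - left + 1 <= CUTOFF:
            insertion_sort_range(lst, left, right)
            return
        # Use the middle element as pivot by moving it to the front first
        mid = (left + right) // 2
        lst[left], lst[mid] = lst[mid], lst[left]
        k = lst[left]
        i = left
        for j in range(left + 1, right + 1):
            if lst[j] < k:
                i += 1
                lst[i], lst[j] = lst[j], lst[i]
        lst[left], lst[i] = lst[i], lst[left]
        sort(left, i - 1)
        sort(i + 1, right)

    sort(0, len(lst) - 1)


data = [(i * 37) % 101 for i in range(300)]
hybrid_quick_sort(data)
print(data == sorted(data))  # True
```

Quick sort does the coarse work on long ranges, and insertion sort finishes the short ones, where its low overhead matters more than its O(n^2) growth.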

Topics: Python Algorithm