Chapter 8 search algorithm

Posted by grimz on Wed, 24 Nov 2021 14:39:36 +0100

1. Introduction to search algorithms

  • Sequential (linear) search
  • Binary search (half search)
  • Interpolation search
  • Fibonacci search

2. Linear search

  • Write linear search algorithm code
public class SeqSearch {

	public static void main(String[] args) {
		int[] arr = { 1, 2, 3, 4, 5 }; // Linear search does not require a sorted array
		int index = seqSearch(arr, -11);
		if (index == -1) {
			System.out.println("Not found");
		} else {
			System.out.println("Found with subscript=" + index);
		}
	}

	/**
	 * Linear search: returns the index of the first element equal to value.
	 * 
	 * @param arr   array to search (does not need to be sorted)
	 * @param value value to look for
	 * @return index of the first match, or -1 if the value is not present
	 */
	public static int seqSearch(int[] arr, int value) {
		// Compare the elements one by one and return the index of the first match
		for (int i = 0; i < arr.length; i++) {
			if (arr[i] == value) {
				return i;
			}
		}
		return -1;
	}

}

  • Program running results
Not found

Linear search is simply a sequential traversal, the most common way to search. When the value is found, the index of its first occurrence is returned.

When the elements are objects rather than primitives, check for null before comparing. ArrayList's indexOf method is a good model to follow.
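
As a rough illustration of the null check mentioned above, here is a minimal sketch of a linear search over objects. The class name ObjectSeqSearch and the helper indexOf are made up for this example. It relies on java.util.Objects.equals, which treats two nulls as equal and never calls equals on a null reference.

```java
import java.util.Objects;

final class ObjectSeqSearch {

    // Hypothetical helper: index of the first element equal to target, or -1
    static <T> int indexOf(T[] arr, T target) {
        for (int i = 0; i < arr.length; i++) {
            // Objects.equals is null-safe, so neither arr[i] nor target
            // needs a separate null check here
            if (Objects.equals(arr[i], target)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] names = { "a", null, "b", "a" };
        System.out.println(indexOf(names, "a"));  // 0
        System.out.println(indexOf(names, null)); // 1
        System.out.println(indexOf(names, "z"));  // -1
    }
}
```

Comparing with == instead would only match the same object instance, and calling arr[i].equals(target) directly would throw a NullPointerException on the null element.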

3. Binary search

3.1. Binary search ideas

  • Premise of binary search: the array must be sorted
  • Analysis of the binary search algorithm (recursive version):
    • Define two auxiliary pointers, left and right. The element to be found lies between arr[left] and arr[right]
    • The initial value of left is 0, and the initial value of right is arr.length - 1
    • Split the array in half: int mid = (left + right) / 2, then compare the middle element arr[mid] with the target value findVal
      • If arr[mid] > findVal, the target is in the left half of the array
      • If arr[mid] < findVal, the target is in the right half of the array
      • If arr[mid] == findVal, the target has been found; return mid
    • When does the recursion terminate? There are two cases:
      • The target value is found: return its index mid and end the recursion
      • The target value is not found: left > right. Think of it this way: once the range has shrunk to a single element (left == right) and that element is not the target, the next recursive call moves one of the two pointers one step further, so left and right cross. At that point left > right; return -1 and end the recursion to indicate that the target value was not found

3.2 code implementation

3.2.1 binary search (single value)

  • Write a binary search algorithm: return when the target value is found
// Note: binary search requires a sorted array
public class BinarySearch {

    public static void main(String[] args) {

        int arr[] = {1, 8, 10, 89, 1000, 1234};
        int resIndex = binarySearch(arr, 0, arr.length - 1, 1000);
        System.out.println("resIndex=" + resIndex);

    }

    // Binary search algorithm (recursive)

    /**
     *
     * @param arr     sorted array
     * @param left    left index of the range to search
     * @param right   right index of the range to search
     * @param findVal value to find
     * @return the index if found, otherwise -1
     */
    public static int binarySearch(int[] arr, int left, int right, int findVal) {

        // When left > right, the whole range has been searched without finding the value
        if (left > right) {
            return -1;
        }
        int mid = left + (right - left) / 2; // Same as (left + right) / 2, but cannot overflow
        int midVal = arr[mid];

        if (findVal > midVal) { // Recursive right
            return binarySearch(arr, mid + 1, right, findVal);
        } else if (findVal < midVal) { // Left recursion
            return binarySearch(arr, left, mid - 1, findVal);
        } else {

            return mid;
        }

    }

}

  • Program running results
resIndex=4
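
The same idea can also be written with a loop instead of recursion. The following is a minimal sketch, not part of the original listing; the class name IterativeBinarySearch is made up for this example. It computes mid as left + (right - left) / 2, which gives the same result as (left + right) / 2 but cannot overflow for very large indexes.

```java
final class IterativeBinarySearch {

    // Returns the index of findVal in the sorted array arr, or -1 if absent
    static int binarySearch(int[] arr, int findVal) {
        int left = 0;
        int right = arr.length - 1;
        while (left <= right) { // left > right means the range is empty
            int mid = left + (right - left) / 2;
            if (findVal > arr[mid]) {
                left = mid + 1;  // continue in the right half
            } else if (findVal < arr[mid]) {
                right = mid - 1; // continue in the left half
            } else {
                return mid;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] arr = { 1, 8, 10, 89, 1000, 1234 };
        System.out.println(binarySearch(arr, 1000)); // 4
        System.out.println(binarySearch(arr, 7));    // -1
    }
}
```

The loop condition left <= right plays the role of the recursion's termination check.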

3.2.2 binary search (all values)

  • Write a binary search algorithm: find all target values. After finding the target value, spread the search to the left and right respectively
// Note: binary search requires a sorted array
import java.util.ArrayList;
import java.util.List;

public class BinarySearch {

    public static void main(String[] args) {

        int arr[] = {1, 8, 10, 89, 1000, 1000, 1000, 1234};
        List<Integer> resIndexList = binarySearch(arr, 0, arr.length - 1, 1000);
        System.out.println("resIndexList=" + resIndexList);

    }


    // Follow-up exercise:
    /*
     * Given a sorted array with duplicates, such as {1, 8, 10, 89, 1000, 1000, 1000, 1234},
     * how do we find the indexes of all occurrences of a value, here 1000?
     *
     * Approach:
     * 1. When the value at index mid matches, do not return immediately.
     * 2. Scan to the left of mid and add the index of every element equal to 1000 to an ArrayList.
     * 3. Scan to the right of mid and add the index of every element equal to 1000 to the ArrayList.
     * 4. Return the ArrayList.
     */

    public static List<Integer> binarySearch(int[] arr, int left, int right, int findVal) {

        // When left > right, the whole range has been searched without finding the value
        if (left > right) {
            return new ArrayList<Integer>();
        }
        int mid = (left + right) / 2;
        int midVal = arr[mid];

        if (findVal > midVal) { // Recursive right
            return binarySearch(arr, mid + 1, right, findVal);
        } else if (findVal < midVal) { // Left recursion
            return binarySearch(arr, left, mid - 1, findVal);
        } else {
            // Approach:
            // 1. When the value at index mid matches, do not return immediately
            // 2. Scan to the left of mid and collect the index of every element equal to findVal
            // 3. Scan to the right of mid and collect the index of every element equal to findVal
            // 4. Return the list

            List<Integer> resIndexlist = new ArrayList<Integer>();
            // Scan left of mid. The array is sorted, so equal values are adjacent
            // and the scan can stop at the first element that differs
            int temp = mid - 1;
            while (temp >= 0 && arr[temp] == findVal) {
                resIndexlist.add(temp);
                temp--;
            }
            resIndexlist.add(mid);

            // Scan right of mid, again stopping at the first element that differs
            temp = mid + 1;
            while (temp < arr.length && arr[temp] == findVal) {
                resIndexlist.add(temp);
                temp++;
            }

            return resIndexlist;
        }

    }
}

  • Program running results
resIndexList=[4, 5, 6]
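
Scanning outward from mid takes time proportional to the number of duplicates. A different technique, swapped in here only for comparison, is to run two binary searches that find the first and the last occurrence directly. The sketch below assumes a sorted array; the names BinarySearchRange, lowerBound, upperBound and findAll are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class BinarySearchRange {

    // Index of the first element >= value (arr.length if there is none)
    static int lowerBound(int[] arr, int value) {
        int lo = 0;
        int hi = arr.length;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (arr[mid] < value) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Index of the first element > value (arr.length if there is none)
    static int upperBound(int[] arr, int value) {
        int lo = 0;
        int hi = arr.length;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (arr[mid] <= value) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // All indexes holding findVal, in ascending order (empty if absent)
    static List<Integer> findAll(int[] arr, int findVal) {
        List<Integer> result = new ArrayList<>();
        for (int i = lowerBound(arr, findVal); i < upperBound(arr, findVal); i++) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] arr = { 1, 8, 10, 89, 1000, 1000, 1000, 1234 };
        System.out.println(findAll(arr, 1000)); // [4, 5, 6]
        System.out.println(findAll(arr, 7));    // []
    }
}
```

Building the list still costs one step per match, but locating the two ends of the run takes only O(log n) comparisons.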

4. Interpolation search

4.1 basic introduction to interpolation search

  • The interpolation search algorithm is similar to binary search, except that each step probes an adaptive mid position, estimated from the value being searched for, instead of the midpoint.

4.2 interpolation search diagram

  • The formula for computing the mid index, adapted from half search. low is the left index, high is the right index, and key is the findVal we talked about earlier

  • Formula in the figure: int mid = low + (high - low) * (key - arr[low]) / (arr[high] - arr[low]);

    Written with the variable names used in the earlier code:

    int mid = left + (right - left) * (findVal - arr[left]) / (arr[right] - arr[left])

  • The general idea is the same as binary search, with the following differences:

    • The formula for mid is different:

      int mid = left + (right - left) * (findVal - arr[left]) / (arr[right] - arr[left]);

    • Because findVal appears in the formula, findVal must not be too large or too small; otherwise mid becomes too large or too small and the array index goes out of bounds

      • Add a guard: return -1 when findVal < arr[left] or findVal > arr[right]
      • Why is that enough? When findVal == arr[left], mid = left; when findVal == arr[right], mid = right. For any findVal in between, mid stays within [left, right]
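
To make the formula concrete, here is a small sketch that only computes the probe position. The class name InterpolationMid and the method probe are made up for this example. It assumes arr[left] != arr[right] and arr[left] <= findVal <= arr[right], and it uses long arithmetic as a precaution against int overflow in the multiplication.

```java
final class InterpolationMid {

    // Probe position from the interpolation formula.
    // Assumes arr[left] != arr[right] and arr[left] <= findVal <= arr[right]
    static int probe(int[] arr, int left, int right, int findVal) {
        long offset = (long) (right - left) * ((long) findVal - arr[left])
                / ((long) arr[right] - arr[left]);
        return left + (int) offset;
    }

    public static void main(String[] args) {
        int[] uniform = new int[100];
        for (int i = 0; i < uniform.length; i++) {
            uniform[i] = i + 1; // 1, 2, ..., 100
        }
        System.out.println(probe(uniform, 0, 99, 1));   // 0
        System.out.println(probe(uniform, 0, 99, 50));  // 49
        System.out.println(probe(uniform, 0, 99, 100)); // 99

        int[] skewed = { 1, 2, 3, 4, 1000 };
        System.out.println(probe(skewed, 0, 4, 4));     // 0, although 4 is at index 3
    }
}
```

On evenly spaced values the first probe lands exactly on the target. On the skewed array the large last element drags the estimate toward the left end, so more steps are needed.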

4.3 code implementation

  • Write interpolation search algorithm
public class InsertValueSearch {

    public static void main(String[] args) {

        int[] arr = new int[100];
        for (int i = 0; i < 100; i++) {
            arr[i] = i + 1;
        }
        int index = insertValueSearch(arr, 0, arr.length - 1, 1);
        System.out.println("index = " + index);

    }

    // Interpolation search
    // Like binary search, it requires a sorted array

    /**
     *
     * @param arr     sorted array
     * @param left    left index
     * @param right   right index
     * @param findVal value to find
     * @return the index if found, otherwise -1
     */
    public static int insertValueSearch(int[] arr, int left, int right, int findVal) {

        System.out.println("Interpolation lookup times~~");

        // Note: the checks findVal < arr[left] and findVal > arr[right] are required,
        // otherwise the computed mid may be out of bounds
        // findVal < arr[left]: the value is smaller than the smallest element in the range
        // findVal > arr[right]: the value is larger than the largest element in the range
        if (left > right || findVal < arr[left] || findVal > arr[right]) {
            return -1;
        }

        // When arr[left] == arr[right], the guard above leaves only findVal == arr[left];
        // return here, because the formula below would otherwise divide by zero
        if (arr[left] == arr[right]) {
            return left;
        }

        // Compute the adaptive mid
        // When findVal == arr[left], mid = left
        // When findVal == arr[right], mid = right
        int mid = left + (right - left) * (findVal - arr[left]) / (arr[right] - arr[left]);
        int midVal = arr[mid];
        if (findVal > midVal) { // Search the right part recursively
            return insertValueSearch(arr, mid + 1, right, findVal);
        } else if (findVal < midVal) { // Search the left part recursively
            return insertValueSearch(arr, left, mid - 1, findVal);
        } else {
            return mid;
        }

    }
}

  • Program running results
Interpolation lookup times~~
index = 0

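Think of a straight line in a rectangular coordinate system: you know the coordinates of two points (x1, y1) and (x2, y2), you are given the ordinate k of a third point on the line, with k between y1 and y2, and you have to find its abscissa mid.

(k - y1) / (y2 - y1) = (mid - x1) / (x2 - x1)  ==>  solve for mid

mid = (k - y1) / (y2 - y1) * (x2 - x1) + x1

Here x1 = left, x2 = right, y1 = arr[left], y2 = arr[right] and k = findVal, which gives the mid formula used in the code.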

4.4 summary

  • For a lookup table with a large amount of data and evenly distributed keys (ideally close to linear), interpolation search is faster
  • When the keys are unevenly distributed, this method is not necessarily better than binary search
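
A small experiment can illustrate both points. The sketch below counts how many elements each method inspects; the class ProbeCount, its two methods, and the test arrays are made up for this example, and both searches are written as loops rather than recursion.

```java
final class ProbeCount {

    // Number of elements binary search inspects before it stops
    static int binaryProbes(int[] arr, int key) {
        int left = 0, right = arr.length - 1, probes = 0;
        while (left <= right) {
            probes++;
            int mid = left + (right - left) / 2;
            if (key > arr[mid]) {
                left = mid + 1;
            } else if (key < arr[mid]) {
                right = mid - 1;
            } else {
                break;
            }
        }
        return probes;
    }

    // Number of elements interpolation search inspects before it stops
    static int interpolationProbes(int[] arr, int key) {
        int left = 0, right = arr.length - 1, probes = 0;
        while (left <= right && key >= arr[left] && key <= arr[right]) {
            probes++;
            int mid = left; // equal end values: avoid dividing by zero
            if (arr[left] != arr[right]) {
                mid = left + (int) ((long) (right - left) * ((long) key - arr[left])
                        / ((long) arr[right] - arr[left]));
            }
            if (key > arr[mid]) {
                left = mid + 1;
            } else if (key < arr[mid]) {
                right = mid - 1;
            } else {
                break;
            }
        }
        return probes;
    }

    public static void main(String[] args) {
        int[] uniform = new int[100];
        for (int i = 0; i < uniform.length; i++) {
            uniform[i] = i + 1;  // evenly spaced keys 1..100
        }
        int[] skewed = new int[20];
        for (int i = 0; i < skewed.length; i++) {
            skewed[i] = 1 << i;  // powers of two: very uneven spacing
        }
        System.out.println(binaryProbes(uniform, 77) + " vs " + interpolationProbes(uniform, 77));
        // prints 7 vs 1
        System.out.println(binaryProbes(skewed, 1 << 18) + " vs " + interpolationProbes(skewed, 1 << 18));
        // prints 4 vs 5
    }
}
```

On the evenly spaced array interpolation search hits the target on the first probe, while on the powers of two it needs one probe more than binary search.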

5. Fibonacci search

5.1 Fibonacci series

  • The golden section point divides a line segment into two parts so that the ratio of the longer part to the total length equals the ratio of the shorter part to the longer part. Rounded to three digits, this ratio is 0.618. Shapes designed according to this proportion are considered very pleasing, which is why it is called the golden section, also known as the extreme and mean ratio. It is a remarkable number that brings unexpected results.
  • Fibonacci sequence {1, 1, 2, 3, 5, 8, 13, 21, 34, 55}: the ratio of two adjacent numbers of the Fibonacci sequence (the smaller divided by the larger) gets closer and closer to the golden section value 0.618
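
The convergence is easy to check numerically. This is a minimal sketch with made-up names (FibRatio, ratio); it uses the same indexing as the sequence above, F(0) = F(1) = 1.

```java
final class FibRatio {

    // Returns F(n) / F(n + 1), with F(0) = F(1) = 1
    static double ratio(int n) {
        long a = 1; // F(0)
        long b = 1; // F(1)
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return (double) a / b;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 8; n++) {
            System.out.println("F(" + n + ")/F(" + (n + 1) + ") = " + ratio(n));
        }
        // The last line is 34/55, roughly 0.61818
    }
}
```

The exact limit is (sqrt(5) - 1) / 2, approximately 0.6180339887.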

5.2 introduction to Fibonacci search

  • Then why must binary search split the range exactly in half? Could we split at the "golden section" instead, that is, mid = left + 0.618 * (right - left)? Of course, mid has to be an integer. If so, what is the time complexity? You could also program an experiment to compare the efficiency of bisection and the "golden section" method.

  • The Fibonacci search algorithm is also called the golden section search algorithm. Its principle is similar to the previous two; only the position of the middle node (mid) changes. mid is no longer the midpoint or an interpolated position but lies near the golden section point, that is, mid = low + F(k-1) - 1

  • Understanding F(k)-1

    • From the Fibonacci property F[k] = F[k-1] + F[k-2] we get
      (F[k]-1) = (F[k-1]-1) + (F[k-2]-1) + 1
    
  • Explanation of the formula: as long as the length of the sequence table is F[k]-1, the table can be divided into a left section of length F[k-1]-1 and a right section of length F[k-2]-1, with one element between them, as shown in the figure. The middle position is therefore mid = low + F(k-1) - 1. Each sub-section can be divided again in the same way

    • However, the length n of the sequence table is not necessarily equal to F[k]-1, so the table has to be extended to length F[k]-1. Here k is the smallest value for which F[k]-1 is greater than or equal to n
    • Why is the total length of the array F(k) - 1 rather than F(k)? Because one slot has to be set aside for the middle element. If the length were F(k) = F(k-1) + F(k-2), the two sections would use up every element and nothing would be left over for mid
    • Why is the left length F(k-1) - 1 and the right length F(k-2) - 1? Take the Fibonacci sequence {1, 1, 2, 3, 5, 8, 13, 21, 34, 55}: 54 = 33 + 20 + 1. With F(k-1) - 1 on the left and F(k-2) - 1 on the right, exactly one element is left over for the middle value, and both sections again have the form F(j) - 1
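
The split can be checked with a few lines of code. The sketch below follows the condition stated in the text, the smallest k with F[k]-1 >= n; the names FibSplit, fib and smallestK are made up for this example.

```java
final class FibSplit {

    // Fibonacci numbers with F[0] = F[1] = 1; size must be at least 2
    static int[] fib(int size) {
        int[] f = new int[size];
        f[0] = 1;
        f[1] = 1;
        for (int i = 2; i < size; i++) {
            f[i] = f[i - 1] + f[i - 2];
        }
        return f;
    }

    // Smallest k with F[k] - 1 >= n; assumes f is long enough for n
    static int smallestK(int[] f, int n) {
        int k = 0;
        while (f[k] - 1 < n) {
            k++;
        }
        return k;
    }

    public static void main(String[] args) {
        int[] f = fib(10); // {1, 1, 2, 3, 5, 8, 13, 21, 34, 55}
        for (int n : new int[] { 6, 54 }) {
            int k = smallestK(f, n);
            System.out.println("n=" + n + ", k=" + k + ": " + (f[k] - 1) + " = "
                    + (f[k - 1] - 1) + " + " + (f[k - 2] - 1) + " + 1");
        }
        // prints n=6, k=5: 7 = 4 + 2 + 1
        // prints n=54, k=9: 54 = 33 + 20 + 1
    }
}
```

For n = 6 the table is extended to length 7, which splits into 4 elements on the left, the middle element, and 2 elements on the right.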

5.3 Fibonacci search ideas

  • First, compute the k value of the Fibonacci sequence from the size of the original array
  • Expanding the array: increase k (the index starts from 0) until F[k]-1 is greater than or equal to high, the last index of the array. We then create a temporary array temp of length F[k]. The extra slots at the end of temp, which Arrays.copyOf fills with 0, are overwritten with the largest (last) element of the array
  • When does the Fibonacci search end?
    • The target value is found: return its index directly
    • The target value is not found: the low and high pointers meet or cross, that is, low >= high
  • Why does low == high have to be handled separately?
    • When low == high, exactly one element (a[low], which is also a[high]) has not yet been compared with the target value. At that point k may already be 0, so mid = low + f[k - 1] - 1 cannot be executed: the index k - 1 would be out of bounds
    • Solution: after the loop, compare a[low] with the target value separately. I found this while debugging the array-out-of-bounds exception. I have not fully worked out the reasoning yet, but if low == high is not handled separately the program throws. Something to think through another day
  • How is mid determined? mid = low + f[k - 1] - 1: the golden section point determines the value of mid
  • How do we choose between the left and the right side?
    • key < temp[mid]: the target value is to the left of the golden section point. According to the figure above, k -= 1
    • key > temp[mid]: the target value is to the right of the golden section point. According to the figure above, k -= 2
    • key == temp[mid]: the target value is found. Because the array was expanded, the trailing values are padding, so mid may be out of bounds relative to the original array
      • mid <= high: the mid index is inside the original array; return mid
      • mid > high: the mid index is past the end of the original array; return high
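
The expansion step on its own can be sketched as follows. The names FibPadding and pad are made up for this example, and the Fibonacci numbers are generated on the fly instead of being read from a precomputed table.

```java
import java.util.Arrays;

final class FibPadding {

    // Copies a into an array whose length is the smallest Fibonacci number
    // >= a.length and fills the extra slots with the last element of a
    static int[] pad(int[] a) {
        if (a.length == 0) {
            return new int[0]; // nothing to pad
        }
        int prev = 1;
        int cur = 1; // consecutive Fibonacci numbers
        while (cur < a.length) {
            int next = prev + cur;
            prev = cur;
            cur = next;
        }
        int[] temp = Arrays.copyOf(a, cur);                // extra slots start as 0
        Arrays.fill(temp, a.length, cur, a[a.length - 1]); // overwrite them with the maximum
        return temp;
    }

    public static void main(String[] args) {
        int[] a = { 1, 8, 10, 89, 1000, 1234 };
        System.out.println(Arrays.toString(pad(a)));
        // prints [1, 8, 10, 89, 1000, 1234, 1234, 1234]
    }
}
```

Filling with the last element keeps temp sorted, which padding with 0 would not.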

5.4 code implementation

  • Write Fibonacci search algorithm
import java.util.Arrays;

public class FibonacciSearch {

    public static int maxSize = 20;

    public static void main(String[] args) {

        int[] arr = {1, 2, 3, 4, 5};
        System.out.println("index=" + fibSearch(arr, 5));

    }

    // mid = low + F(k-1) - 1 uses the Fibonacci sequence, so build the sequence first
    // The sequence is generated iteratively (non-recursively)
    public static int[] fib() {
        int[] f = new int[maxSize];
        f[0] = 1;
        f[1] = 1;
        for (int i = 2; i < maxSize; i++) {
            f[i] = f[i - 1] + f[i - 2];
        }
        return f;
    }

    // Fibonacci search algorithm
    // Written in a non-recursive way

    /**
     *
     * @param a   sorted array
     * @param key value we need to find
     * @return the index of key, or -1 if it is not found
     */
    public static int fibSearch(int[] a, int key) {
        if (a.length == 0) { // Nothing to search; also avoids reading a[-1] below
            return -1;
        }
        int low = 0;
        int high = a.length - 1;
        int k = 0; // Index into the Fibonacci sequence
        int mid = 0; // Stores the mid index
        int f[] = fib(); // Get the Fibonacci sequence
        // Find the smallest k with f[k] - 1 >= high
        while (high > f[k] - 1) {
            k++;
        }
        // f[k] may be greater than the length of a, so use Arrays.copyOf to build
        // a new array temp[] of length f[k]; the extra slots are filled with 0
        int[] temp = Arrays.copyOf(a, f[k]);
        // The extra slots must hold the last element of a instead of 0
        // Example:
        // temp = {1, 8, 10, 89, 1000, 1234, 0, 0} => {1, 8, 10, 89, 1000, 1234, 1234, 1234}
        for (int i = high + 1; i < temp.length; i++) {
            temp[i] = a[high];
        }

        // Loop to search for key
        while (low < high) { // At least two candidates remain
            mid = low + f[k - 1] - 1;
            if (key < temp[mid]) { // Continue in the front (left) part of the array
                high = mid - 1;
                // Why k--?
                // The whole range: f[k] = f[k-1] + f[k-2]
                // The left part corresponds to f[k-1], and f[k-1] = f[k-2] + f[k-3],
                // so the next mid must be low + f[k-2] - 1
                // The loop always uses f[k-1], so k has to shrink by 1
                k--;
            } else if (key > temp[mid]) { // Continue in the back (right) part of the array
                low = mid + 1;
                // Why k -= 2?
                // The whole range: f[k] = f[k-1] + f[k-2]
                // The right part corresponds to f[k-2], and f[k-2] = f[k-3] + f[k-4],
                // so the next mid must be low + f[k-3] - 1
                // The loop always uses f[k-1], so k has to shrink by 2
                k -= 2;
            } else { // Found
                // mid may point into the padding, so decide which index to return
                if (mid <= high) {
                    return mid;
                } else {
                    return high;
                }
            }
        }
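        // low == high: one candidate has not been compared yet
        // low > high: the range is empty (low may even be past the end of a), so key is absent
        if (low <= high && a[low] == key) {
            return low;
        } else {
            return -1;
        }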
    }
}
  • Program running results
index=4

Topics: data structure