Sliding window algorithm examples

Posted by wgh on Sat, 18 Dec 2021 03:32:55 +0100

1. Longest substring without repeating characters

Description: given a string s (consisting of English letters, digits, symbols and spaces), find the length of the longest substring without repeating characters.

Example:

Example 1:
Input: s = "abcabcbb"
Output: 3
Explanation: the longest substring without repeating characters is "abc", so its length is 3.

Example 2:
Input: s = "bbbbb"
Output: 1
Explanation: the longest substring without repeating characters is "b", so its length is 1.

Example 3:
Input: s = "pwwkew"
Output: 3
Explanation: the longest substring without repeating characters is "wke", so its length is 3.
Note that the answer must be the length of a substring: "pwke" is a subsequence, not a substring.

Example 4:
Input: s = ""
Output: 0

Solution: sliding window

Ideas and algorithms:

Take the string abcabcbb from example 1. For each character, we find the longest substring that starts at that character and contains no repeated characters; the longest of these is the answer. The results are listed below, with the starting character and the resulting substring shown in parentheses:

Starting at (a)bcabcbb, the longest substring is (abc)abcbb;
Starting at a(b)cabcbb, the longest substring is a(bca)bcbb;
Starting at ab(c)abcbb, the longest substring is ab(cab)cbb;
Starting at abc(a)bcbb, the longest substring is abc(abc)bb;
Starting at abca(b)cbb, the longest substring is abca(bc)bb;
Starting at abcab(c)bb, the longest substring is abcab(cb)b;
Starting at abcabc(b)b, the longest substring is abcabc(b)b;
Starting at abcabcb(b), the longest substring is abcabcb(b).

What do we notice? As the starting position of the substring moves to the right, the ending position never moves to the left. Here is why. Suppose we choose the k-th character as the starting position and find that the longest substring without repeated characters ends at position rk. When we then choose the (k+1)-th character as the starting position, the characters from k+1 to rk are already known to be distinct. Because the original k-th character has left the substring, we can try to keep increasing rk until a repeated character appears on the right.
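The observation above can be checked with a small brute-force sketch. The helper name farthestEnds is hypothetical and not part of the original solution; it recomputes the end position from scratch for every start, so it is only meant to make the pattern visible, not to be efficient.

```javascript
// Hypothetical helper (not from the original post): for each start index k,
// find the last index r such that s[k..r] contains no repeated character.
function farthestEnds(s) {
    const ends = [];
    for (let k = 0; k < s.length; k++) {
        const seen = new Set();
        let r = k;
        // Extend to the right until a character repeats or the string ends
        while (r < s.length && !seen.has(s[r])) {
            seen.add(s[r]);
            r++;
        }
        ends.push(r - 1); // r stopped one past the last valid index
    }
    return ends;
}

console.log(farthestEnds("abcabcbb")); // [2, 3, 4, 5, 5, 6, 6, 7]
```

The end positions never decrease as the start moves right, which is what allows the right pointer to be reused instead of reset.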

This lets us solve the problem with a sliding window:

We use two pointers for the left and right boundaries of a substring (the window). The left pointer is the starting position being enumerated, and the right pointer is rk from the discussion above;

At each step we move the left pointer one position to the right, which means the next character becomes the starting position. We then keep moving the right pointer to the right for as long as the substring between the two pointers contains no duplicate characters. When the right pointer stops, the window is the longest substring without duplicate characters that starts at the left pointer, and we record its length;

After all starting positions have been enumerated, the longest length recorded is the answer.

How to detect repeated characters

The process above needs a data structure that tells us whether a character is already in the window. A hash set is the usual choice (std::unordered_set in C++, HashSet in Java, set in Python and Set in JavaScript). When the left pointer moves to the right, we remove a character from the hash set. When the right pointer moves to the right, we add a character to the hash set.
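As a minimal illustration of that bookkeeping, the sketch below tracks a window over the string "abca" by hand. The variable names (windowChars, blockedBefore, blockedAfter) are chosen for this example only.

```javascript
// Track the characters of the current window in a Set.
const s = "abca";
const windowChars = new Set();

// The right pointer takes in 'a', 'b' and 'c' one at a time
windowChars.add(s[0]);
windowChars.add(s[1]);
windowChars.add(s[2]);

// The next character is 'a', which is already in the window
const blockedBefore = windowChars.has(s[3]);
console.log(blockedBefore); // true

// The left pointer moves right and drops the first 'a'
windowChars.delete(s[0]);

// Now 'a' can enter the window again
const blockedAfter = windowChars.has(s[3]);
console.log(blockedAfter); // false
```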

This completes the solution.

Code:

// Solution: sliding window
var lengthOfLongestSubstring = function (s) {
    // Hash set holding the characters currently inside the window
    const occ = new Set();
    const n = s.length;
    // Right pointer; it starts at -1, i.e. just left of the string, because the window has not started moving yet
    let rk = -1, ans = 0;
    for (let i = 0; i < n; i++) {
        // i is the left pointer
        if (i !== 0) {
            // The left pointer moved one position to the right, so drop the character it left behind
            occ.delete(s.charAt(i - 1));
        }
        while (rk + 1 < n && !occ.has(s.charAt(rk + 1))) {
            // Keep moving the right pointer while the next character is not in the window
            occ.add(s.charAt(rk + 1));
            rk++;
        }
        // Characters i..rk form the longest substring without repeats that starts at i
        ans = Math.max(ans, rk - i + 1);
    }
    // Length of the longest substring without repeating characters
    return ans;
};
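For comparison, here is a sketch of a common variant that is not the method described above: instead of a set, it stores the last index of each character in a Map, so the left boundary can jump past a duplicate in one step. The function name lengthOfLongestSubstringJump is hypothetical. The loop at the end runs it on the four examples from the problem statement.

```javascript
// Variant sketch (an alternative technique, not the set-based solution above):
// remember the last index of each character so the left boundary can jump.
function lengthOfLongestSubstringJump(s) {
    const lastIndex = new Map(); // character -> most recent index where it was seen
    let left = 0, ans = 0;
    for (let right = 0; right < s.length; right++) {
        const ch = s[right];
        const prev = lastIndex.get(ch);
        // Only a previous occurrence inside the current window forces a jump
        if (prev !== undefined && prev >= left) {
            left = prev + 1;
        }
        lastIndex.set(ch, right);
        ans = Math.max(ans, right - left + 1);
    }
    return ans;
}

for (const example of ["abcabcbb", "bbbbb", "pwwkew", ""]) {
    console.log(JSON.stringify(example), lengthOfLongestSubstringJump(example));
}
// prints 3, 1, 3 and 0 for the four examples
```

Both versions run in O(N) time; the variant trades the set for a map and moves the left pointer fewer times.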

Complexity analysis:

Time complexity: O(N), where N is the length of the string. The left and right pointers each traverse the string at most once.
Space complexity: O(|Σ|), where Σ is the character set (the characters that can appear in the string) and |Σ| is its size. The problem does not specify the character set, so we can assume ASCII characters with codes in [0, 128), i.e. |Σ| = 128. The hash set stores at most |Σ| characters, so the space complexity is O(|Σ|).

2. Permutation in string

Description: given two strings s1 and s2 (both containing only lowercase letters), write a function that determines whether s2 contains a permutation of s1.
In other words, check whether some permutation of s1 is a substring of s2.

Example:

Example 1:
Input: s1 = "ab" s2 = "eidbaooo"
Output: true
Explanation: s2 contains one of the permutations of s1 ("ba").

Example 2:
Input: s1 = "ab" s2 = "eidboaoo"
Output: false

Solution: sliding window

Ideas and algorithms:
A permutation does not change how many times each character occurs in a string, so one string is a permutation of another if and only if every character occurs the same number of times in both.
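A minimal sketch of this property, assuming lowercase letters only as the problem states. The helper names letterCounts and isPermutation are hypothetical and introduced just for this illustration.

```javascript
// Hypothetical helper: count lowercase letters into a 26-slot array.
function letterCounts(str) {
    const counts = new Array(26).fill(0);
    for (const ch of str) {
        counts[ch.charCodeAt(0) - 97]++; // 97 is the character code of 'a'
    }
    return counts;
}

// Two strings are permutations of each other exactly when their counts match.
function isPermutation(a, b) {
    if (a.length !== b.length) return false;
    const countsA = letterCounts(a), countsB = letterCounts(b);
    return countsA.every((count, i) => count === countsB[i]);
}

console.log(isPermutation("ab", "ba")); // true
console.log(isPermutation("ab", "bo")); // false
```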

Given this property, let n be the length of s1. We can examine every substring of length n in s2 and check whether its character counts equal those of s1. If they do, that substring is a permutation of s1.

We use two arrays, cnt1 and cnt2. cnt1 holds the character counts of s1, and cnt2 holds the character counts of the substring currently being examined.

Because every substring we examine has length n, we can maintain cnt2 with a sliding window of fixed length n: each time the window slides one position to the right, we add one to the count of the character entering the window and subtract one from the count of the character leaving it. We then check whether cnt1 and cnt2 are equal. If they are, a permutation of s1 is a substring of s2.

Code:

var checkInclusion = function (s1, s2) {
    const n = s1.length, m = s2.length;
    if (n > m) {
        return false;
    }
    const base = 'a'.charCodeAt(0);
    // cnt1 holds the character counts of s1; cnt2 holds the counts of the current window in s2
    const cnt1 = new Array(26).fill(0);
    const cnt2 = new Array(26).fill(0);
    for (let i = 0; i < n; i++) {
        cnt1[s1.charCodeAt(i) - base]++;
        cnt2[s2.charCodeAt(i) - base]++;
    }
    // Comparing the joined counts is a simple way to compare the two arrays
    if (cnt1.toString() === cnt2.toString()) {
        return true;
    }
    for (let i = n; i < m; i++) {
        // Slide the window right: count the character entering and uncount the one leaving
        cnt2[s2.charCodeAt(i) - base]++;
        cnt2[s2.charCodeAt(i - n) - base]--;
        if (cnt1.toString() === cnt2.toString()) {
            return true;
        }
    }
    return false;
};

let s1 = "ab", s2 = "eidbaooo";
console.log(checkInclusion(s1, s2));  // true
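The solution above compares two 26-element arrays after every slide. A common refinement, sketched here as an alternative and not taken from the code above, keeps a single array of differences plus a counter of how many letters currently differ, so each slide costs O(1). The name checkInclusionDiff is hypothetical, and the sketch assumes lowercase input as in the problem statement.

```javascript
// Alternative sketch: track count differences instead of comparing arrays.
function checkInclusionDiff(s1, s2) {
    const n = s1.length, m = s2.length;
    if (n > m) return false;
    const base = 'a'.charCodeAt(0);
    // cnt[c] = (count of c in the window) - (count of c in s1)
    const cnt = new Array(26).fill(0);
    for (let i = 0; i < n; i++) {
        cnt[s1.charCodeAt(i) - base]--;
        cnt[s2.charCodeAt(i) - base]++;
    }
    // diff = number of letters whose counts currently differ
    let diff = cnt.filter((c) => c !== 0).length;
    if (diff === 0) return true;
    for (let i = n; i < m; i++) {
        const inIdx = s2.charCodeAt(i) - base;      // letter entering the window
        const outIdx = s2.charCodeAt(i - n) - base; // letter leaving the window
        if (inIdx === outIdx) continue; // counts are unchanged by this slide
        if (cnt[inIdx] === 0) diff++;   // this letter was balanced, now it is not
        cnt[inIdx]++;
        if (cnt[inIdx] === 0) diff--;   // this letter just became balanced
        if (cnt[outIdx] === 0) diff++;
        cnt[outIdx]--;
        if (cnt[outIdx] === 0) diff--;
        if (diff === 0) return true;
    }
    return false;
}

console.log(checkInclusionDiff("ab", "eidbaooo")); // true
console.log(checkInclusionDiff("ab", "eidboaoo")); // false
```

With this bookkeeping the total work is O(n + m + |Σ|): one pass to build the initial differences, one O(|Σ|) scan to initialise diff, and constant work per slide.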

Complexity analysis:

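Time complexity: O(n + (m − n)·|Σ|), where n is the length of s1, m is the length of s2 and Σ is the character set. Building the initial counts takes O(n), and each of the window positions requires an O(|Σ|) comparison of cnt1 and cnt2. The character set here is the lowercase letters, so |Σ| = 26 is a constant and the running time is linear in n + m in practice.
Space complexity: O(|Σ|).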

