LeetCode (Li Kou) practice diary

Posted by boske on Fri, 03 Dec 2021 20:04:49 +0100

Given two strings s and t, return the shortest substring of s that contains every character of t. If s has no such substring, return the empty string "".
If s has several qualifying substrings, any one of them may be returned.
Note: for a character repeated in t, the substring must contain at least as many copies of that character as t does.
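
The note about repeated characters can be made concrete with a small check. This is a minimal sketch, not part of the original post: `covers` and `coversExamples` are hypothetical names, and the code assumes single-byte characters.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical helper: true when `window` contains every character of `t`
// at least as many times as `t` does.
bool covers(const std::string& window, const std::string& t) {
    std::array<int, 256> need{};              // per-character counts, zero-initialised
    for (unsigned char c : t) ++need[c];      // copies required by t
    for (unsigned char c : window)
        if (need[c] > 0) --need[c];           // one required copy satisfied
    for (int n : need)
        if (n > 0) return false;              // a required copy is still missing
    return true;
}

// Usage on the examples from the problem statement.
void coversExamples() {
    assert(covers("BANC", "ABC"));
    assert(!covers("a", "aa"));               // t needs two copies of 'a'
}
```

The second assertion is the case from Example 3: a single 'a' cannot satisfy a t that holds two.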

Example 1:
Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Explanation: the shortest substring "BANC" contains the characters 'A', 'B' and 'C' of string t.

Example 2:
Input: s = "a", t = "a"
Output: "a"

Example 3:
Input: s = "a", t = "aa"
Output: ""
Explanation: both 'a' characters in t must be included in the substring of s, so there is no qualifying substring and the empty string is returned.

Constraints:
1 <= s.length, t.length <= 10^5
s and t consist of English letters

Follow-up: can you design an algorithm that solves this problem in O(n) time?

Idea 1: sliding window + hash table

The idea is as follows. First traverse t and record the count of each character in the hash table umt. Then traverse s with a window [left, right], recording the count of each character inside the window in the hash table ums, and keep a counter num of valid characters. A character entering the window is valid if, after it is counted, its count in ums does not exceed its count in umt; that is, it appears in t and the window does not yet hold more copies of it than t needs.

After each step, look at the character on the left boundary. If its count in ums is greater than its count in umt, it is redundant: either the window holds more copies of it than t needs, or it does not appear in t at all (its count in umt is 0). In both cases remove it and move the left boundary to the right to shorten the window. When num equals the length of t, the window contains all characters of t, and the answer is updated if this window is shorter than the best one found so far.

The code is as follows:

#include <string>
#include <unordered_map>
using namespace std;

class Solution {
public:
    string minWindow(string s, string t) {
        int size1 = s.size(), size2 = t.size();
        if(size1 < size2)
            return "";
        string ret = "";
        unordered_map<char, int> ums, umt; // counts inside the window / counts in t
        for(int i = 0; i < size2; ++i)
            umt[t[i]]++;
        int left = 0, right = 0, num = 0; // num: valid characters in the window
        for( ; right < size1; ++right){
            ums[s[right]]++;
            if(ums[s[right]] <= umt[s[right]]) // this copy is still needed by t
                num++;
            while(ums[s[left]] > umt[s[left]]) // drop redundant characters on the left
                ums[s[left++]]--;
            if(num == size2){ // the window covers t
                if(ret.empty() || right - left + 1 < ret.size())
                    ret = s.substr(left, right - left + 1);
            }
        }
        return ret;
    }
};

Time complexity: O(n); space complexity: O(1), since the hash tables hold at most the 52 upper- and lower-case letters.
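
To see how the window moves, the same loop can be instrumented to record every window that improves the answer. This is only a sketch: `improvingWindows` and `traceExample` are hypothetical names, and a `left <= right` guard is added for safety that the solution above does not have.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Same sliding-window loop as idea 1, but returns each window that was
// strictly shorter than the best one seen before it.
std::vector<std::string> improvingWindows(const std::string& s, const std::string& t) {
    std::vector<std::string> trace;
    std::unordered_map<char, int> ums, umt;   // counts in the window / counts in t
    for (char c : t) ++umt[c];
    std::size_t left = 0, num = 0, best = std::string::npos;
    for (std::size_t right = 0; right < s.size(); ++right) {
        if (++ums[s[right]] <= umt[s[right]]) ++num;   // a copy t still needs
        while (left <= right && ums[s[left]] > umt[s[left]])
            --ums[s[left++]];                          // shed redundant left characters
        if (num == t.size() && right - left + 1 < best) {
            best = right - left + 1;
            trace.push_back(s.substr(left, best));
        }
    }
    return trace;
}

// Usage on Example 1: the first covering window is "ADOBEC", later beaten by "BANC".
void traceExample() {
    std::vector<std::string> expected{"ADOBEC", "BANC"};
    assert(improvingWindows("ADOBECODEBANC", "ABC") == expected);
}
```

The window "CODEBA" is also reached along the way, but it has the same length as "ADOBEC" and so does not replace it.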

Idea 2: single counting array (following question 14)

This follows the optimization used in question 14: a single array stores all the counts. Each character of t decrements its entry and each character of the current window in s increments it, so t contributes negative counts and s positive ones. When every entry of the array is greater than or equal to 0, the window contains all characters of t.

The code is as follows:

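#include <climits>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    string minWindow(string s, string t) {
        int size1 = s.size(), size2 = t.size();
        if(size1 < size2)
            return "";
        string ret = "";
        vector<int> v(58, 0); // 'z' - 'A' + 1 = 58 slots
        int left = 0, len = INT_MAX;
        for(int i = 0; i < size2; ++i){
            --v[t[i] - 'A']; // 'A' has the smallest ASCII value of all letters
            ++v[s[i] - 'A']; // the first window is the first size2 characters of s
        }
        if(IsPositive(v)) // a covering window of length size2 cannot be beaten
            return s.substr(0, size2);
        for(int i = size2; i < size1; ++i){
            ++v[s[i] - 'A'];
            while(IsPositive(v)){ // shrink from the left while the window still covers t
                if(i - left + 1 < len){
                    len = i - left + 1;
                    ret = s.substr(left, len);
                }
                --v[s[left++] - 'A'];
            }
        }
        return ret;
    }

    // Returns true when no entry is negative, i.e. the window covers t.
    bool IsPositive(vector<int>& v){
        for(int& n : v){
            if(n < 0)
                return false;
        }
        return true;
    }
};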

Time complexity: O(n * 58), since each call to IsPositive scans the whole array; space complexity: O(1).
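
The sign convention and the array size of 58 can be checked in isolation. This is a small sketch assuming an ASCII character set and letters-only input; `balance`, `allNonNegative` and `balanceExamples` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// 'A' is 65 and 'z' is 122 in ASCII, so c - 'A' ranges over 0..57.
// The six slots between 'Z' and 'a' are simply never used.
static_assert('z' - 'A' + 1 == 58, "58 slots cover 'A' through 'z'");

// balance[c - 'A'] = (count of c in window) - (count of c in t).
std::vector<int> balance(const std::string& window, const std::string& t) {
    std::vector<int> v(58, 0);
    for (char c : t) --v[c - 'A'];            // t contributes negative counts
    for (char c : window) ++v[c - 'A'];       // the window contributes positive counts
    return v;
}

// True when no entry is negative, i.e. the window covers t.
bool allNonNegative(const std::vector<int>& v) {
    for (int n : v)
        if (n < 0) return false;
    return true;
}

void balanceExamples() {
    assert(allNonNegative(balance("BANC", "ABC")));
    assert(!allNonNegative(balance("ANC", "ABC")));   // the 'B' entry is -1
}
```

Characters outside t, such as 'N' here, only ever push their entry above zero, which is why they never block the check.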

Idea 2 calls IsPositive on every iteration to judge whether the window meets the requirements, which costs extra time. It can be optimized further with the valid-character count from idea 1: when the number of valid characters equals the length of t, the window covers t and the answer is updated.

The code is as follows:

#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    string minWindow(string s, string t) {
        int size1 = s.size(), size2 = t.size();
        if(size1 < size2)
            return "";
        string ret = "";
        vector<int> v(58, 0); // 'z' - 'A' + 1 = 58 slots
        for(int i = 0; i < size2; ++i)
            v[t[i] - 'A']--; // t contributes negative counts
        int left = 0, right = 0, num = 0;
        for( ; right < size1; ++right){
            if(v[s[right] - 'A'] < 0) // a negative entry means t still needs this character, so it is valid
                num++;
            v[s[right] - 'A']++; // count the character into the window
            if(num == size2){ // the number of valid characters meets the requirement
                while(v[s[left] - 'A'] > 0) // drop redundant characters on the left and move the left boundary right
                    v[s[left++] - 'A']--;
                if(ret.empty() || right - left + 1 < ret.size())
                    ret = s.substr(left, right - left + 1);
            }
        }
        return ret;
    }
};

Time complexity: O(n); space complexity: O(1), a fixed array of 58 entries that covers the 52 upper- and lower-case letters.
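
One way to gain confidence in the optimized version is to compare it with an exhaustive search on small inputs. The sketch below restates the solution as a free function and adds a brute-force reference. `bruteForce`, `covers` and `agree` are hypothetical names, the empty-t guard is an addition (the constraints already rule that case out), and only lengths are compared because several minimal windows may exist.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Counting-array sliding window, restated from the solution above.
std::string minWindow(const std::string& s, const std::string& t) {
    if (t.empty() || s.size() < t.size()) return "";  // guard added for safety
    std::vector<int> v(58, 0);
    for (char c : t) --v[c - 'A'];
    std::string ret;
    std::size_t left = 0, num = 0;
    for (std::size_t right = 0; right < s.size(); ++right) {
        if (v[s[right] - 'A'] < 0) ++num;             // t still needed this character
        ++v[s[right] - 'A'];
        if (num == t.size()) {
            while (v[s[left] - 'A'] > 0) --v[s[left++] - 'A'];
            if (ret.empty() || right - left + 1 < ret.size())
                ret = s.substr(left, right - left + 1);
        }
    }
    return ret;
}

// True when `window` holds every character of `t` with multiplicity.
bool covers(const std::string& window, const std::string& t) {
    std::vector<int> v(58, 0);
    for (char c : t) --v[c - 'A'];
    for (char c : window) ++v[c - 'A'];
    for (int n : v)
        if (n < 0) return false;
    return true;
}

// Reference answer: try every substring, shortest first. Only for small inputs.
std::string bruteForce(const std::string& s, const std::string& t) {
    for (std::size_t len = t.size(); len <= s.size(); ++len)
        for (std::size_t left = 0; left + len <= s.size(); ++left)
            if (covers(s.substr(left, len), t))
                return s.substr(left, len);
    return "";
}

// The two agree when the lengths match and any non-empty result covers t.
bool agree(const std::string& s, const std::string& t) {
    std::string fast = minWindow(s, t);
    return fast.size() == bruteForce(s, t).size() && (fast.empty() || covers(fast, t));
}
```

Calling `agree` on the three examples from the problem statement, or on short random strings of letters, exercises both the shrinking step and the no-answer case.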

Topics: Algorithm leetcode