Algorithm problem: regular expression matching (problem + idea + code + comments)

Posted by Robert Plank on Mon, 07 Mar 2022 19:20:41 +0100

problem

  1. Regular Expression Matching
    Given a string s and a pattern p, implement regular expression matching that supports '.' and '*'.

'.' matches any single character.
'*' matches zero or more of the preceding element.
A match must cover the entire string s, not just part of it.

Example 1:

Input: s = "aa" p = "a"
Output: false
Explanation: "a" cannot match the entire string "aa".
Example 2:

Input: s = "aa" p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, which here is 'a'. Therefore, the string "aa" can be regarded as 'a' repeated once.
Example 3:

Input: s = "ab" p = ".*"
Output: true
Explanation: ".*" means zero or more ('*') of any character ('.').
Example 4:

Input: s = "aab" p = "c*a*b"
Output: true
Explanation: because '*' means zero or more, 'c' occurs 0 times and 'a' is repeated once. Therefore, the pattern matches the string "aab".
Example 5:

Input: s = "mississippi" p = "mis*is*p*."
Output: false
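
As a sanity check on the examples above, the JDK's own regex engine can serve as a reference, because java.util.regex.Pattern.matches succeeds only when the whole input matches, which is the same full-match rule used in this problem. This is only a sketch: the class name ExampleCheck is made up for illustration, and the patterns are written with the '*' characters that the explanations describe.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of the original post): uses the JDK regex
// engine as a reference for the examples. Pattern.matches returns true only
// if the entire input matches the pattern.
final class ExampleCheck {
    static boolean fullMatch(String s, String p) {
        return Pattern.matches(p, s);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("aa", "a"));                    // false
        System.out.println(fullMatch("aa", "a*"));                   // true
        System.out.println(fullMatch("ab", ".*"));                   // true
        System.out.println(fullMatch("aab", "c*a*b"));               // true
        System.out.println(fullMatch("mississippi", "mis*is*p*."));  // false
    }
}
```

The agreement with java.util.regex holds only because the problem restricts patterns to lowercase letters, '.', and '*'; the full Java regex syntax has many more features.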

Constraints:

0 <= s.length <= 20
0 <= p.length <= 30
s may be empty and contains only lowercase letters a-z.
p may be empty and contains only lowercase letters a-z and the characters . and *.
It is guaranteed that every occurrence of * is preceded by a valid character.

Source: LeetCode
Link: https://leetcode-cn.com/problems/regular-expression-matching
The copyright belongs to LeetCode (Lingkou Network). For commercial reprints, please contact them for official authorization; for non-commercial reprints, please cite the source.

idea

s is the string to be matched and p is the regular expression string

  • Handle the special cases first and return the result directly, for example two empty strings. (A real regex engine searching for an empty pattern finds a match in any string, empty or not. This problem requires a full match, so an empty pattern matches only an empty string and fails on any non-empty string.)
            if (s == null || p == null){
                return false;
            }
            if (p.length() == 0){
                if (s.length() ==0){
                    return true;
                }else {
                    return false;
                }
            }
  • Regular expression matching is essentially a search that tries several matching possibilities in turn. We keep one index into the string and one index into the pattern to record how far matching has progressed. When both indices reach the end of their strings at the same time, the match has succeeded; we then set a flag to true so that all other recursive branches stop immediately.
  • The recursion splits the pattern into its smallest items (a single character, or a character followed by '*') and matches one item per call. The large problem of matching the whole pattern is thus reduced to the small problem of matching a single item, and the whole match is completed by recursion, with early exits as the optimization.
  • Each case that can arise during matching is handled separately: either the current pattern character is followed by '*' (and is itself '.' or a literal character), or it is a plain single-character comparison.
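
The item-by-item recursion described above can be sketched compactly as follows. This is a minimal illustration rather than the code used in this post: the class name RecursiveSketch and the helper match are hypothetical, and the sketch returns a boolean instead of setting a flag. It assumes the problem's guarantee that every '*' is preceded by a valid character.

```java
// Hypothetical sketch of the recursion: consume one pattern item per call.
final class RecursiveSketch {
    static boolean isMatch(String s, String p) {
        return match(s, 0, p, 0);
    }

    // i is the current index into s, j the current index into p.
    private static boolean match(String s, int i, String p, int j) {
        // Pattern used up: success only if the string is used up too.
        if (j == p.length()) {
            return i == s.length();
        }
        // Does the current pattern character match the current string character?
        boolean first = i < s.length()
                && (p.charAt(j) == '.' || p.charAt(j) == s.charAt(i));
        // Item of the form "x*": either skip it (zero occurrences),
        // or consume one character and stay on the same item.
        if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
            return match(s, i, p, j + 2) || (first && match(s, i + 1, p, j));
        }
        // Plain single-character item.
        return first && match(s, i + 1, p, j + 1);
    }
}
```

For example, RecursiveSketch.isMatch("aab", "c*a*b") skips the item "c*", lets "a*" consume both 'a' characters, and finally matches 'b'.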

code

Call the function isMatch(String s, String p) to match

public class Solution {
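    //Set to true as soon as any branch of the search finds a full match
    private volatile boolean finish = false;
    public boolean isMatch(String s, String p) {
        //Lock on this: a Boolean field that gets reassigned is not a reliable lock object
        synchronized (this){
            finish = false;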
            //Special case handling
            if (s == null || p == null){
                return false;
            }
            if (p.length() == 0){
                if (s.length() ==0){
                    return true;
                }else {
                    return false;
                }
            }
            //Start matching
            isMatch(s.toCharArray(),s.length(),0,p.toCharArray(),p.length(),0);
            return finish;
        }
    }

    private void isMatch(char[] str,int strLen,int strPosition,char[] regex,int regexLen,int regexPosition){
        //Stop as soon as a match has been found
        if (finish){
            return;
        }
        //Stop when the pattern is used up
        if (regexPosition == regexLen){
            //If both the pattern and the string are used up, the match succeeds
            if (strPosition == strLen){
                finish = true;
            }
            return;
        }

        //Take the current character from the pattern
        char now = regex[regexPosition];
        //Since '*' applies to the character before it, peek at the next pattern character
        char next = ' ';
        if (regexPosition < regexLen -1){
            next = regex[regexPosition+1];
        }
        //Handle each case in turn
        //There is a next character and it is '*'
        if (next == '*'){
            //Zero or more occurrences
            if (now == '.'){
                //Match 0 characters
                isMatch(str,strLen,strPosition,regex,regexLen,regexPosition+2);
                if (finish){
                    //A match was found, so stop here
                    return;
                }
                //Match 1 or more characters
                for (int i =1;i<=strLen-strPosition;i++){
                    isMatch(str,strLen,strPosition+i,regex,regexLen,regexPosition+2);
                }
            }else {
                //Match 0 now characters
                isMatch(str,strLen,strPosition,regex,regexLen,regexPosition+2);
                if (finish){
                    //A match was found, so stop here
                    return;
                }
                //Match 1 or more now characters
                for (int i =1;i<=strLen-strPosition;i++){
                    //The first i-1 characters were checked in earlier iterations, so only the newest one needs checking
                    if (str[strPosition+i-1] != now){
                        //If a shorter run does not match, a longer one cannot either. For example, if one a does not match, two a's cannot match
                        break;
                    }
                    //The run of i characters equals now, so try the rest of the pattern
                    isMatch(str,strLen,strPosition+i,regex,regexLen,regexPosition+2);
                }
            }
        }else {
            //Single character comparison
            //If the string is used up, there is nothing left to match
            if (strPosition == strLen){
                return;
            }
            //Match the current character
            char strNow = str[strPosition];
            if (now == strNow || now == '.'){
                //Recurse on the next positions (whether matching is complete is checked on entry to the function)
                isMatch(str,strLen,strPosition+1,regex,regexLen,regexPosition+1);
            }else {
                //The characters differ, so this branch fails
                return;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new Solution().isMatch("ab",".*c"));
    }
}
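
The recursive search above can revisit the same pair of positions many times. One common alternative, which is a different technique from the code in this post, is bottom-up dynamic programming over (string prefix, pattern prefix) pairs. A minimal sketch, assuming the input constraints from the problem statement; the class name DpSketch is hypothetical:

```java
// Hypothetical alternative: bottom-up dynamic programming.
// dp[i][j] is true when the first i characters of s match the first j characters of p.
final class DpSketch {
    static boolean isMatch(String s, String p) {
        int m = s.length();
        int n = p.length();
        boolean[][] dp = new boolean[m + 1][n + 1];
        // The empty pattern matches the empty string.
        dp[0][0] = true;
        // The empty string can still match prefixes such as "a*" or "a*b*".
        for (int j = 2; j <= n; j++) {
            if (p.charAt(j - 1) == '*') {
                dp[0][j] = dp[0][j - 2];
            }
        }
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                char pc = p.charAt(j - 1);
                if (pc == '*' && j >= 2) {
                    char prev = p.charAt(j - 2);
                    // Zero occurrences of prev, or one more occurrence of prev.
                    dp[i][j] = dp[i][j - 2]
                            || ((prev == '.' || prev == s.charAt(i - 1)) && dp[i - 1][j]);
                } else {
                    // Plain character or '.': both prefixes shrink by one.
                    dp[i][j] = (pc == '.' || pc == s.charAt(i - 1)) && dp[i - 1][j - 1];
                }
            }
        }
        return dp[m][n];
    }
}
```

The table has (m + 1) x (n + 1) cells and each cell is filled in constant time, so the running time is proportional to the product of the two lengths.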

Topics: Java Algorithm leetcode string regex