day16 regular expression & enumeration class

Posted by spartan789 on Tue, 25 Jan 2022 03:56:33 +0100

Overview of regular expressions

A regular expression is a single string that describes, and can be matched against, a whole family of strings that follow certain syntax rules. The usual workflow is:

1. Study a large number of sample strings and derive a rule (the pattern) from them

2. Use this rule to match new strings

3. If the match succeeds, perform the corresponding operation
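
The three steps above can be sketched in Java as follows. The rule used here (an account number of 5 to 11 digits that does not start with 0) is a made-up example, and the names RuleSketch and isAccount are hypothetical.

```java
import java.util.regex.Pattern;

class RuleSketch {
    // Step 1: a rule derived from sample strings
    // (hypothetical rule: 5 to 11 digits, first digit not 0).
    static final Pattern ACCOUNT = Pattern.compile("[1-9]\\d{4,10}");

    // Step 2: use the rule to test a new string.
    static boolean isAccount(String s) {
        return ACCOUNT.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Step 3: act on the result of the match.
        for (String s : new String[] {"12345", "012345", "12a45"}) {
            System.out.println(s + " -> " + (isAccount(s) ? "accepted" : "rejected"));
        }
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the rule for every input.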

Basic syntax of regular expressions

1. Literal character

A literal character matches itself. A character that has a special meaning, such as '.', becomes a literal when it is escaped with a backslash.

public class RegularDemo2 {
    public static void main(String[] args) {
        String str = "ab123342asdasqwe&;123.";
        //The String class has a replaceAll method, which replaces every substring that matches the rule
        //public String replaceAll(String regex,String replacement)
        // Replace each substring of this string that matches the given regular expression with the given replacement.
        String regex = "\\.";
        System.out.println(str.replaceAll(regex,"_"));//ab123342asdasqwe&;123_

        regex = "b";
        System.out.println(str.replaceAll(regex,"_"));//a_123342asdasqwe&;123.
    }
}

2. Metacharacter

Character    Description
\    Marks the next character as a special character, a literal character, a back-reference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline character. The sequence '\\' matches "\", and '\(' matches "(".
^    Matches the start of the input string. If multiline mode is set (the Multiline property of a RegExp object, or Pattern.MULTILINE in Java), ^ also matches the position after '\n' or '\r'.
$    Matches the end of the input string. If multiline mode is set, $ also matches the position before '\n' or '\r'.
*    Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+    Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?    Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.
{n}    n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the "o" in "Bob", but matches the two o's in "food".
{n,}    n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the "o" in "Bob", but matches all the o's in "fooood". 'o{1,}' is equivalent to 'o+', and 'o{0,}' is equivalent to 'o*'.
{n,m}    m and n are non-negative integers with n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
?    When this character follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match becomes non-greedy. A non-greedy pattern matches as little of the string as possible, while the default greedy pattern matches as much as possible. For example, on the string "oooo", 'o+?' matches a single "o", while 'o+' matches all of them.
.    Matches any single character except line breaks (\n, \r). To match any character including '\n', use a pattern such as '(.|\n)'.
(pattern)    Matches pattern and captures the match. The captured text can be retrieved later: from the SubMatches collection in VBScript, from the $0...$9 properties in JScript, or with Matcher.group(n) in Java. To match parenthesis characters, use '\(' or '\)'.
(?:pattern)    Matches pattern but does not capture the match, so it is not stored for later use. This is useful when combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more concise expression than 'industry|industries'.
(?=pattern)    Positive lookahead. Matches the search string at any position where a string matching pattern begins. This is a non-capturing match. For example, 'Windows(?=95|98|NT|2000)' matches "Windows" in "Windows2000", but not "Windows" in "Windows3.1". A lookahead does not consume characters: after a match, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)    Negative lookahead. Matches the search string at any position where a string not matching pattern begins. This is a non-capturing match. For example, 'Windows(?!95|98|NT|2000)' matches "Windows" in "Windows3.1", but not "Windows" in "Windows2000". A lookahead does not consume characters: after a match, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead.
(?<=pattern)    Positive lookbehind. Like positive lookahead, but in the opposite direction. For example, '(?<=95|98|NT|2000)Windows' matches "Windows" in "2000Windows", but not "Windows" in "3.1Windows".
(?<!pattern)    Negative lookbehind. Like negative lookahead, but in the opposite direction. For example, '(?<!95|98|NT|2000)Windows' matches "Windows" in "3.1Windows", but not "Windows" in "2000Windows".
x|y    Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]    Character set. Matches any one of the enclosed characters. For example, '[abc]' matches the "a" in "plain".
[^xyz]    Negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the "p", "l", "i" and "n" in "plain".
[a-z]    Character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from "a" to "z".
[^a-z]    Negated character range. Matches any character that is not in the specified range. For example, '[^a-z]' matches any character that is not in the range "a" to "z".
\b    Matches a word boundary, that is, the position between a word character and a non-word character such as a space. For example, 'er\b' matches the "er" in "never", but not the "er" in "verb".
\B    Matches a non-word boundary. 'er\B' matches the "er" in "verb", but not the "er" in "never".
\cx    Matches the control character indicated by x. For example, \cM matches Control-M, or carriage return. The value of x must be in A-Z or a-z. Otherwise, c is treated as a literal "c" character.
\d    Matches a digit character. Equivalent to [0-9].
\D    Matches a non-digit character. Equivalent to [^0-9].
\f    Matches a form feed. Equivalent to \x0c and \cL.
\n    Matches a newline character. Equivalent to \x0a and \cJ.
\r    Matches a carriage return. Equivalent to \x0d and \cM.
\s    Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
\S    Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t    Matches a tab. Equivalent to \x09 and \cI.
\v    Matches a vertical tab. Equivalent to \x0b and \cK.
\w    Matches a letter, digit, or underscore. Equivalent to '[A-Za-z0-9_]'.
\W    Matches any character that is not a letter, digit, or underscore. Equivalent to '[^A-Za-z0-9_]'.
\xn    Matches n, where n is a hexadecimal escape value, which must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num    Matches num, where num is a positive integer: a back-reference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n    Identifies an octal escape value or a back-reference. If \n is preceded by at least n captured subexpressions, n is a back-reference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm    Identifies an octal escape value or a back-reference. If \nm is preceded by at least nm captured subexpressions, nm is a back-reference. If \nm is preceded by at least n captures, n is a back-reference followed by the literal m. If neither condition holds and n and m are both octal digits (0-7), \nm matches the octal escape value nm.
\nml    If n is an octal digit (0-3) and m and l are both octal digits (0-7), matches the octal escape value nml.
\un    Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
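
Several rows of the table are easier to follow when run through java.util.regex. The sketch below tries greedy versus non-greedy quantifiers, the four lookaround forms, multiline mode, and a back-reference inside a pattern. The class name MetacharSketch and the helper findAll are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetacharSketch {
    // Collects every match of regex in input, in the order they are found.
    static List<String> findAll(String regex, int flags, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Greedy o+ takes the whole run; non-greedy o+? takes one "o" per match.
        System.out.println(findAll("o+", 0, "oooo"));
        System.out.println(findAll("o+?", 0, "oooo"));

        // Lookahead: "Windows" only if (not) followed by one of the listed versions.
        System.out.println(findAll("Windows(?=95|98|NT|2000)", 0, "Windows2000 Windows3.1"));
        System.out.println(findAll("Windows(?!95|98|NT|2000)", 0, "Windows2000 Windows3.1"));

        // Lookbehind: the same check, applied to the text before the match.
        System.out.println(findAll("(?<=95|98|NT|2000)Windows", 0, "2000Windows 3.1Windows"));
        System.out.println(findAll("(?<!95|98|NT|2000)Windows", 0, "2000Windows 3.1Windows"));

        // With Pattern.MULTILINE, ^ also matches right after a line break.
        System.out.println(findAll("^\\w+", Pattern.MULTILINE, "one two\nthree four"));

        // \1 inside the pattern repeats whatever group 1 captured.
        System.out.println(findAll("(.)\\1", 0, "hello book"));
    }
}
```

Each lookaround call finds exactly one "Windows" here, because the lookaround text is checked but never becomes part of the match.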

 

2.1 character class

[]

public class RegularDemo3 {
    public static void main(String[] args) {
        String s = "ab123342asdasqwe&;123.";
        //Format: []
        //[] defines a character class; it matches any single character listed inside the brackets
        //Every a, b or 2 in the string is matched
        String regex = "[ab2]";
        System.out.println(s.replaceAll(regex,"_"));//__1_334__sd_sqwe&;1_3.

        //Requirement: match and replace every character except a, b and 2
        //^ as the first character inside square brackets negates the class: it matches any character that is not a, b or 2
        regex = "[^ab2]";
        System.out.println(s.replaceAll(regex,"_"));//ab_2___2a__a_______2__
    }
}

2.2 range

A range extends a character class with a span of consecutive characters, written with a hyphen

public class RegularDemo4 {
    public static void main(String[] args) {
        String regex = "[ab]";
        String s = "abcdefghijklmnABCDTW1234DWFadqwr&;123=.";
        System.out.println("Before matching:" + s);//Before matching:abcdefghijklmnABCDTW1234DWFadqwr&;123=.
        System.out.println("=========================================");
        System.out.println(s.replaceAll(regex, "_"));//__cdefghijklmnABCDTW1234DWF_dqwr&;123=.

        //Requirement: matches all lowercase letters in the string
        //[a-z] indicates matching any lowercase letter from a to z
        regex = "[a-z]";
        System.out.println(s.replaceAll(regex, "_"));//______________ABCDTW1234DWF_____&;123=.

        //[A-Z] matches any uppercase letter from A to Z
        regex = "[A-Z]";
        System.out.println(s.replaceAll(regex, "_"));//abcdefghijklmn______1234___adqwr&;123=.

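        //Match both uppercase and lowercase letters
        //Note: "[A-z]" would also match [ \ ] ^ _ and `, which sit between 'Z' and 'a' in ASCII
        regex = "[a-zA-Z]";
        System.out.println(s.replaceAll(regex, "_"));//____________________1234________&;123=.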

        //Match number
        regex = "[0-9]";
        System.out.println(s.replaceAll(regex, "_"));//abcdefghijklmnABCDTW____DWFadqwr&;___=.

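        //[0-z] spans every ASCII character from '0' to 'z': digits, letters, and the punctuation in between (such as ; and =)
        //With & and . added to the class, every character of s matches
        regex = "[0-z&.]";
        System.out.println(s.replaceAll(regex, "_"));//_______________________________________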
    }
}

2.3 predefined classes:

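\d == [0-9] digit
\D == [^0-9] non-digit
\s == [ \t\n\x0B\f\r] whitespace character
\S == [^ \t\n\x0B\f\r] non-whitespace character
\w == [a-zA-Z_0-9] word character (letter, digit or underscore)
\W == [^a-zA-Z_0-9] non-word character
. == any character (by default, any character except a line terminator)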
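
A quick way to confirm what a predefined class covers in java.util.regex is to test single characters against it. This is a minimal sketch; the class name PredefinedClassSketch and the helper matchesChar are hypothetical.

```java
class PredefinedClassSketch {
    // True if the whole string s (here always one character) is matched by regex.
    static boolean matchesChar(String regex, String s) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        // The underscore belongs to \w, the hyphen does not.
        System.out.println(matchesChar("\\w", "_"));
        System.out.println(matchesChar("\\w", "-"));

        // Space and tab both count as whitespace.
        System.out.println(matchesChar("\\s", " "));
        System.out.println(matchesChar("\\s", "\t"));

        // By default the dot does not match a line terminator;
        // the embedded flag (?s) switches on DOTALL mode, where it does.
        System.out.println(matchesChar(".", "\n"));
        System.out.println(matchesChar("(?s).", "\n"));
    }
}
```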

public class RegularDemo5 {
    public static void main(String[] args) {
        String regex = "[0-9]";
        String s = "abcde fghijklmn ABCDTW12.....34D WFadq r&;1!!!!23=.";
        System.out.println("Before matching:" + s);//Before matching:abcde fghijklmn ABCDTW12.....34D WFadq r&;1!!!!23=.
        System.out.println("=========================================");
        System.out.println(s.replaceAll(regex, "_"));//abcde fghijklmn ABCDTW__.....__D WFadq r&;_!!!!__=.

        regex = "\\d"; //[0-9] number
        System.out.println(s.replaceAll(regex, "_"));//abcde fghijklmn ABCDTW__.....__D WFadq r&;_!!!!__=.

        regex = "\\D"; //Indicates that all non numeric characters are matched
        System.out.println(s.replaceAll(regex, "_"));//______________________12_____34___________1____23__

        regex = "\\s"; //Match all white space characters
        System.out.println(s.replaceAll(regex, "_"));//abcde_fghijklmn_ABCDTW12.....34D_WFadq_r&;1!!!!23=.

        regex = "\\S"; //Matches all characters except white space
        System.out.println(s.replaceAll(regex, "_"));//_____ _________ ________________ _____ ____________

        regex = "\\w"; //Matches letters, digits and the underscore
        System.out.println(s.replaceAll(regex, "_"));//_____ _________ ________.....___ _____ _&;_!!!!__=.

        regex = "\\W"; //Matches every character that is not a letter, digit or underscore
        System.out.println(s.replaceAll(regex, "_"));//abcde_fghijklmn_ABCDTW12_____34D_WFadq_r__1____23__

        regex = "."; // Represents matching any character
        System.out.println(s.replaceAll(regex, "_"));//___________________________________________________

        regex = "\\."; //Matches a literal '.' character
        System.out.println(s.replaceAll(regex, "_"));//abcde fghijklmn ABCDTW12_____34D WFadq r&;1!!!!23=_

    }
}

2.4 boundary characters

^: outside square brackets, anchors the match at the start of the input (starts with xxx)
$: anchors the match at the end of the input (ends with xxx)
\b: word boundary
\B: non-word boundary

public class RegularDemo6 {
    public static void main(String[] args) {
        //Outside square brackets, ^ means "starts with xxx"; here the string must start with abc
        String regex = "^abc";
        String s = "abcdefg";
        System.out.println("Before matching:" + s);//Before matching: abcdefg
        System.out.println("=========================================");
        System.out.println(s.replaceAll(regex, "_"));//_defg

        regex = "fg$";
        System.out.println(s.replaceAll(regex, "_"));//abcde_


        regex = "\\b";
        s = "hello worpd 888 1 2 & ; 0 a b c d";
        System.out.println("Before matching:" + s);//Before matching:hello worpd 888 1 2 & ; 0 a b c d
        System.out.println("===========================================");
        System.out.println(s.replaceAll(regex, "_"));//_hello_ _worpd_ _888_ _1_ _2_ & ; _0_ _a_ _b_ _c_ _d_

        regex = "\\B";
        System.out.println(s.replaceAll(regex, "_"));//h_e_l_l_o w_o_r_p_d 8_8_8 1 2 _&_ _;_ 0 a b c d

    }
}

2.5 quantifiers

?: zero or one occurrence

+: one or more occurrences

*: zero or more occurrences

{n}: exactly n occurrences

{n,m}: between n and m occurrences

{n,}: at least n occurrences

public class RegularDemo7 {
    public static void main(String[] args) {
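        //Match a at the start of the string, 0 or 1 times
        //s starts with b, so ^a? matches the empty string at position 0, which is replaced by _
        String regex = "^a?";
        String s = "baaabcdefaaaaaag";
        System.out.println("Before matching:" + s);//Before matching:baaabcdefaaaaaag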
        System.out.println("=======================================");
        System.out.println(s.replaceAll(regex, "_"));//_baaabcdefaaaaaag

        regex = "^a+";
        System.out.println(s.replaceAll(regex, "_"));//baaabcdefaaaaaag

        regex = "^a*";
        System.out.println(s.replaceAll(regex, "_"));//_baaabcdefaaaaaag

        //{n}: exactly n times
        //Requirement: match the character a occurring 6 times in a row
        regex = "a{6}"; // aaaaaa
        System.out.println(s.replaceAll(regex, "*"));//baaabcdef*g

        //{n,m}: n-m occurrences
        regex = "a{3,4}"; // Matches a run of 3 to 4 consecutive a characters
        //Quantifiers are greedy by default: they match as many occurrences as possible first
        System.out.println(s.replaceAll(regex, "*"));//b*bcdef*aag

        //{n,}: at least n occurrences
        regex = "a{6,}";
        System.out.println(s.replaceAll(regex, "*"));//baaabcdef*g
    }
}

2.6 grouping: ()

public class RegularDemo8 {
    public static void main(String[] args) {
        //Matches ab followed by 1-2 c characters: the quantifier applies only to c
        String regex = "abc{1,2}";
        String s = "abcccccABC123123ABCabcccccABC123123ABCabcccccABC123123ABCabcabcabc123";
        System.out.println("Before matching:" + s);//Before matching:abcccccABC123123ABCabcccccABC123123ABCabcccccABC123123ABCabcabcabc123
        System.out.println("===========================================================");
        System.out.println(s.replaceAll(regex, "_"));//_cccABC123123ABC_cccABC123123ABC_cccABC123123ABC___123

        //Parentheses indicate grouping
        //Here abc as a whole occurs 1-2 times
        regex = "(abc){1,2}";
        System.out.println(s.replaceAll(regex, "_"));//_ccccABC123123ABC_ccccABC123123ABC_ccccABC123123ABC__123

        regex = "ABC(abc){1,}";   //e.g. ABCabcabc
        System.out.println(s.replaceAll(regex, "_"));//abcccccABC123123_ccccABC123123_ccccABC123123_123

        //matches() tests the entire string against the pattern, not a substring
        System.out.println(s.matches(regex));//false
    }

}

2.7 back reference (used to get values)

$n: in a replacement string, inserts the text captured by group number n. Groups are numbered from 1, in the order of their opening parentheses

Requirement: 2022-01-23 --> 01/23/2022

public class RegularDemo9 {
    public static void main(String[] args) {
        //2022-01-23
        String regex = "(\\d{4})-(\\d{2})-(\\d{2})";
        String s = "2022-01-23  2022-02-24";
        System.out.println(s.replaceAll(regex,"$2/$3/$1"));//01/23/2022  02/24/2022

        //To keep a group from being captured and numbered, start it with ?:
        regex = "(\\d{4})-(?:\\d{2})-(\\d{2})";
        System.out.println(s.replaceAll(regex,"$2/$1"));//23/2022  24/2022
    }
}

3. Using regular expressions in Java

How are the common regular expression operations implemented in Java?

1. String lookup: the Pattern and Matcher classes

2. String matching: the matches() method of the String class

3. String replacement: the replaceAll() and replaceFirst() methods of the String class

4. String splitting: the split() method of the String class

package com.shujia.wyh.day16;

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegularDemo10 {
    public static void main(String[] args) {
        String regex = "\\w{3,}";
        String s = "abcd123";
        System.out.println(s.matches(regex));//true

        regex = "[a-z]{2,}";
        s = "abc defg hello111";
        System.out.println(s.replaceAll(regex, "_"));//_ _ _111
        System.out.println(s.replaceFirst(regex, "_"));//_ defg hello111

        s = "abc sbdf 123ab sa123bddss &";
        String[] s1 = s.split(" ");
        //Print the array with the Arrays utility class
        System.out.println(Arrays.toString(s1));//[abc, sbdf, 123ab, sa123bddss, &]

        s = "abc sbdf 123ab sa123bddss &";
        String[] s2 = s.split("a");
        //The string starts with the delimiter a, so the first element is an empty string
        System.out.println(Arrays.toString(s2));//[, bc sbdf 123, b s, 123bddss &]

        //Pattern and Matcher
        regex = "\\w{3,7}";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher("abcd123");
        System.out.println(matcher.matches());//true
    }
}
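
The demo above only calls matches(), which tests the whole input at once. For the lookup operation in item 1, Matcher.find() scans for successive matches, and group(n) returns the text caught by each capturing group. A minimal sketch, reusing the date pattern from section 2.7; the class name FindSketch, the method findDates and the sample text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSketch {
    // Returns every date found in text, rewritten from yyyy-MM-dd to MM/dd/yyyy.
    static List<String> findDates(String text) {
        Pattern pattern = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
        Matcher matcher = pattern.matcher(text);
        List<String> dates = new ArrayList<>();
        // find() moves to the next match each time it returns true.
        while (matcher.find()) {
            // group(0) is the whole match; groups 1 to 3 are year, month and day.
            dates.add(matcher.group(2) + "/" + matcher.group(3) + "/" + matcher.group(1));
        }
        return dates;
    }

    public static void main(String[] args) {
        System.out.println(findDates("start 2022-01-23, end 2022-02-24"));
        // prints [01/23/2022, 02/24/2022]
    }
}
```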

Example: turn the stuttered string "III...II wwaa..annntt ttt...oo lllee..aarrnn JJ..aavv...vaa" into "I want to learn Java"

public class RegularDemo11 {
    public static void main(String[] args) {
        //Sample input: letters repeated several times, with runs of dots mixed in
        //The target sentence contains no double letters, because step 2 would collapse those too
        String s = "III...II wwaa..annntt ttt...oo lllee..aarrnn JJ..aavv...vaa";
        //1. First remove the dots
        String regex = "\\.+";
        String s1 = s.replaceAll(regex, "");
        System.out.println(s1);//IIIII wwaaannntt tttoo llleeaarrnn JJaavvvaa

        //2. Collapse each run of a repeated character into a single character
        regex = "(.)\\1+";//\\1 refers back to group 1 inside the pattern; $1 inserts it in the replacement
        String s2 = s1.replaceAll(regex, "$1");
        System.out.println(s2);//I want to learn Java

    }
}

Enumeration type

1. When there are only a limited number of objects in a class, we can define this class as an enumeration class

2. Enumeration is strongly recommended when you need to define a set of constants

How to define an enumeration class?

The implementation differs depending on the JDK version

1. Before JDK 1.5: write a custom enumeration class

1. Make the constructor private, so that the number of objects of the class is limited

2. Declare the member variables as constants (private final)

3. Expose the objects of the enumeration class as public static final member variables

4. Provide only public get methods (no setters)

5. Override the toString() method

public class EnumDemo1 {
    public static void main(String[] args) {
        Season spring = Season.SPRING;
        System.out.println(spring);
        System.out.println(spring.getSEASON_NAME());
        System.out.println(spring.getSEASON_DESC());

    }
}

class Season{
    //2. The member variables of Season must be declared as constants
    private final String SEASON_NAME;
    private final String SEASON_DESC;


    //1. The constructor is private, so that the number of objects of the class is limited
    private Season(String SEASON_NAME,String SEASON_DESC){
        this.SEASON_NAME = SEASON_NAME;
        this.SEASON_DESC = SEASON_DESC;
    }

    //3. Provide public static member variables to the outside world to obtain the objects of enumeration classes
    public static final Season SPRING = new Season("spring","in the warm spring , flowers are coming out with a rush");
    public static final Season SUMMER = new Season("summer","Scorching sun");
    public static final Season AUTUMN = new Season("autumn","fresh autumn weather");
    public static final Season WINTER = new Season("winter","snow gleams white");

    //4. Only public get methods are provided
    public String getSEASON_NAME() {
        return SEASON_NAME;
    }

    public String getSEASON_DESC() {
        return SEASON_DESC;
    }

    //5. Override toString() method
    @Override
    public String toString() {
        return "Season{" +
                "SEASON_NAME='" + SEASON_NAME + '\'' +
                ", SEASON_DESC='" + SEASON_DESC + '\'' +
                '}';
    }
}

2. Since JDK 1.5: Java provides the enum keyword for creating enumeration classes

1. The constructor is private (implicitly so in an enum), which keeps the number of objects limited

2. Declare the member variables as constants (private final)

3. The enum constants are listed first in the enum body, separated by commas, with a semicolon after the last one

4. Provide only public get methods

public class EnumDemo2 {
    public static void main(String[] args) {
        Season2 spring = Season2.SPRING;
        System.out.println(spring);
        System.out.println(Season2.class.getSuperclass());//class java.lang.Enum: every enum implicitly extends java.lang.Enum
    }
}

enum Season2{

    //3. The enum constants are separated by commas, with a semicolon after the last one
    //They must be declared first in the enum body
    SPRING("spring", "Recovery of all things"),
    SUMMER("summer", "Scorching sun"),
    AUTUMN("autumn", "fresh autumn weather"),
    WINTER("winter", "snow gleams white");

    //2. Declare the properties of Season2 as constants
    private final String SEASON_NAME;
    private final String SEASON_DESC;


    //1. To keep the number of objects of the class limited,
    //the constructor must be private (in an enum it is private even without the modifier)
    private Season2(String SEASON_NAME,String SEASON_DESC){
        this.SEASON_NAME = SEASON_NAME;
        this.SEASON_DESC = SEASON_DESC;
    }


    //4. Provide get methods for SEASON_NAME and SEASON_DESC
    public String getSEASON_NAME() {
        return SEASON_NAME;
    }

    public String getSEASON_DESC() {
        return SEASON_DESC;
    }

}
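
Because every enum extends java.lang.Enum, it inherits a few methods without any extra code, and its constants can be used in a switch. A small sketch; the enum Season3 and the method describe are hypothetical stand-ins so that the block compiles on its own.

```java
class EnumMethodsSketch {
    // Hypothetical enum without fields, just to show the inherited methods.
    enum Season3 { SPRING, SUMMER, AUTUMN, WINTER }

    // Enum constants can be used directly as switch labels.
    static String describe(Season3 s) {
        switch (s) {
            case SPRING: return "warm";
            case SUMMER: return "hot";
            case AUTUMN: return "cool";
            default:     return "cold";
        }
    }

    public static void main(String[] args) {
        // values() returns all constants in declaration order.
        for (Season3 s : Season3.values()) {
            // ordinal() is the zero-based position, name() the declared identifier.
            System.out.println(s.ordinal() + " " + s.name() + " " + describe(s));
        }
        // valueOf() looks a constant up by its exact name.
        System.out.println(Season3.valueOf("WINTER") == Season3.WINTER);
    }
}
```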

Enumeration classes can implement interfaces, in one of two ways

1. Implement the abstract method of the interface once in the enum body, so that all constants share it

2. Implement the method separately in the body of each enum constant
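
Both ways can be sketched side by side. The interface Show, the enums SeasonA and SeasonB, and the returned strings are hypothetical names and values invented for this illustration.

```java
class EnumInterfaceSketch {
    interface Show {
        String show();
    }

    // Way 1: one implementation in the enum body, shared by all constants.
    enum SeasonA implements Show {
        SPRING, WINTER;

        @Override
        public String show() {
            return "This is " + name();
        }
    }

    // Way 2: each constant supplies its own implementation in a constant body.
    enum SeasonB implements Show {
        SPRING {
            @Override
            public String show() {
                return "flowers";
            }
        },
        WINTER {
            @Override
            public String show() {
                return "snow";
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(SeasonA.WINTER.show()); // This is WINTER
        System.out.println(SeasonB.WINTER.show()); // snow
    }
}
```

The second form is useful when the behavior really differs per constant; otherwise a single shared method is shorter.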

Topics: Java regex