Regular expression RegExp

Posted by Boxerman on Tue, 14 Dec 2021 02:26:21 +0100

1, Concept of regular expression

A regular expression is a pattern that describes a set of strings. It is built from ordinary characters plus a small set of predefined special characters, and the combination forms a "regular string". This "regular string" expresses a filtering rule that can be used to test, search, extract, or replace text in a string.

2, Composition of regular expressions

Regular expressions consist of metacharacters, character classes, and quantifiers. A typical pattern has this shape:

/^([character class]{quantifier})([character class]{quantifier})$/
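To make the shape above concrete, here is a minimal sketch. The pattern (three uppercase letters followed by four digits) and the name codePattern are made up for this illustration.

```javascript
// Hypothetical pattern: 3 uppercase letters followed by 4 digits.
// ^ and $ are metacharacters, [A-Z] and [0-9] are character classes,
// {3} and {4} are quantifiers.
const codePattern = /^([A-Z]{3})([0-9]{4})$/;

console.log(codePattern.test('ABC1234'));  // true
console.log(codePattern.test('AB1234'));   // false (only 2 letters)
console.log(codePattern.test('ABC12345')); // false (a 5th digit before the end)
```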

3, Regular object RegExp

  1. Declare a regular object with the new keyword
		//  The pattern can be a string or a regular expression literal;
		//  the modifier is an optional string of flags
		
		var patt = new RegExp(pattern,  modifier)
		var patt = new RegExp('a',  'g')
		var patt = new RegExp(/a/,  'g')
  2. Literal form
		var patt = /pattern/modifier

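The difference between the two methods

The literal form cannot contain variables: everything between the slashes is taken as the pattern itself. To build a pattern from the value of a variable, use the constructor with a string.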
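A small sketch of the difference, assuming the text to search for is only known at run time. The variable names are made up for this example.

```javascript
// Hypothetical input: the word to look for is stored in a variable.
const word = 'cat';

// The literal /word/ searches for the letters "word", not the variable's value.
const literal = /word/;
// The constructor builds the pattern from the string stored in the variable.
const dynamic = new RegExp(word, 'g');

console.log(literal.test('a cat sat'));     // false
console.log('a cat, a cat'.match(dynamic)); // [ 'cat', 'cat' ]
```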

4, Regular object method

  1. test(): performs a search to see if the regular expression matches the specified string. Returns true or false.

  2. exec(): performs a search for a match in the specified string. Returns a result array for one match (the matched text at index 0, followed by any captured groups) or null if there is no match. To retrieve all matches, use the g modifier and call exec() in a loop.
    With the g modifier, each call records the end position of its match in the lastIndex property, and the next call starts from the position recorded by lastIndex. When no further match is found, exec() returns null and lastIndex is reset to 0, so the search starts from the beginning again.

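		var patt = /o/g;
		var str = 'hello world oo';
		var res1 = patt.exec(str) // match at index 4, lastIndex becomes 5
		var res2 = patt.exec(str) // match at index 7, lastIndex becomes 8
		var res3 = patt.exec(str) // match at index 12, lastIndex becomes 13
		
		//  Get all the results with a loop
		patt.lastIndex = 0; // start over, otherwise the loop resumes after res3
		let res;
		while(res = patt.exec(str)){
		 console.log(res)
		}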
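As a further illustration of exec() with the g modifier, here is a minimal sketch that collects every match together with its captured groups and position. The sample string and the key=value pattern are assumptions made up for this example.

```javascript
// Hypothetical input: key=value pairs separated by semicolons.
const pairPattern = /(\w+)=(\d+)/g;
const text = 'a=1;bb=22;ccc=333';

const found = [];
let m;
// Each call resumes from pairPattern.lastIndex; a null result ends the loop.
while ((m = pairPattern.exec(text)) !== null) {
  found.push({ key: m[1], value: m[2], index: m.index });
}

console.log(found);                 // three entries, starting at index 0, 4 and 10
console.log(pairPattern.lastIndex); // 0, reset after the final failed match
```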

5, Modifier

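  • g: global matching (find all matches instead of stopping at the first)
  • i: case-insensitive matching
  • m: multiline matching (^ and $ also match at the start and end of each line)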
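The effect of each modifier can be seen in a short sketch. The three-line sample string is an assumption chosen for this example.

```javascript
// Hypothetical sample: the same word on three lines with different casing.
const sample = 'Apple\napple\nAPPLE';

console.log(sample.match(/apple/).index);  // 6, only the first exact match
console.log(sample.match(/apple/g));       // [ 'apple' ]
console.log(sample.match(/apple/gi));      // [ 'Apple', 'apple', 'APPLE' ]
// Without m, ^ and $ only match at the start and end of the whole string.
console.log(sample.match(/^apple$/gi));    // null
console.log(sample.match(/^apple$/gim));   // [ 'Apple', 'apple', 'APPLE' ]
```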

6, Metacharacter

  • ^: matches the start of the string
  • $: matches the end of the string
  • (): grouping and capturing (for back references)
  • |: or (alternation)
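A quick sketch of the four metacharacters in use. The sample strings and variable names are made up for this illustration.

```javascript
console.log(/^he/.test('hello'));   // true: the string starts with 'he'
console.log(/^he/.test('ohhello')); // false
console.log(/lo$/.test('hello'));   // true: the string ends with 'lo'
console.log(/lo$/.test('lower'));   // false

// Grouping limits the alternation to 'red' or 'black' between ^ and $.
const colour = /^(red|black)$/;
console.log(colour.test('black'));      // true
console.log(colour.test('blackboard')); // false

// Each pair of parentheses captures the text it matched.
const parts = /^(\w+)@(\w+)$/.exec('user@host');
console.log(parts[1], parts[2]); // user host
```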

7, Character class

  • [xyz]: a character set; matches any one of the characters in the square brackets
  • [^xyz]: a negated character set; matches any one character not contained in the square brackets
    In character classes, you can use - to represent a range
  • [0-9]: match digits 0 to 9
  • [a-z]: match lowercase letters
  • [A-Z]: match uppercase letters
  • [A-Za-z]: match uppercase and lowercase letters
  • [A-Za-z0-9]: match uppercase and lowercase letters and digits
  • . : by default matches any single character except line terminators such as the newline character.

For common matching rules, you can use the abbreviated equivalent form

  • \w: matches a word character. Equivalent to [A-Za-z0-9_]

  • \W: matches a non-word character. Equivalent to [^A-Za-z0-9_]

  • \d: matches a digit. Equivalent to [0-9]

  • \D: matches a non-digit character. Equivalent to [^0-9]

  • \s: matches any whitespace character (space, tab, newline and so on)

  • \S: matches any character other than whitespace
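The character classes and their shorthand forms can be compared side by side. The sample string below is an assumption chosen so that every class has something to match.

```javascript
// Hypothetical sample containing letters, digits, an underscore and a hyphen.
const s = 'a1_b2-c3';

console.log(s.match(/\w/g));       // [ 'a', '1', '_', 'b', '2', 'c', '3' ]
console.log(s.match(/\W/g));       // [ '-' ]
console.log(s.match(/\d/g));       // [ '1', '2', '3' ]
console.log(s.match(/[^a-z0-9]/g)); // [ '_', '-' ]
console.log('a b\tc'.split(/\s/)); // [ 'a', 'b', 'c' ]

// \w and the explicit class find the same characters.
console.log(s.match(/\w/g).join('') === s.match(/[A-Za-z0-9_]/g).join('')); // true
```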

8, Quantifier

Quantifiers limit the number of occurrences of the character class (or group) they follow; quantifiers are either greedy or lazy

  • *: match 0 or more times, equivalent form {0,}
  • +: match 1 or more times, equivalent form {1,}
  • ? : match 0 or 1 times, equivalent form {0,1}
  • {n} : match exactly n times
  • {n,}: match at least n times, with no upper limit
  • {n,m}: match at least n times and at most m times
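A short sketch of the quantifiers listed above. The patterns are anchored with ^ and $ so that the counts apply to the whole string; the sample values are made up for this example.

```javascript
const threeToFive = /^\d{3,5}$/;
console.log(threeToFive.test('12'));     // false (too few digits)
console.log(threeToFive.test('123'));    // true
console.log(threeToFive.test('12345'));  // true
console.log(threeToFive.test('123456')); // false (too many digits)

console.log(/^ab*c$/.test('ac'));        // true: * allows zero 'b'
console.log(/^ab+c$/.test('ac'));        // false: + needs at least one 'b'
console.log(/^colou?r$/.test('color'));  // true: ? makes 'u' optional
console.log(/^colou?r$/.test('colour')); // true
console.log(/^\d{4}$/.test('2021'));     // true: exactly 4 digits
```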

9, Escape character \

To match a special character literally (a character that has a meaning in regular expression syntax, such as . * + ? ( ) [ ] or \), put the escape character \ in front of it. When the pattern is written as a string, for example as an argument to new RegExp(), the backslash itself has to be escaped, so a double escape \\ is needed.
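The single and double escape can be compared in a small sketch. The decimal-number pattern and the variable names are assumptions made up for this example.

```javascript
// In a literal, one backslash escapes the dot so it matches only '.'.
const literalDot = /^\d+\.\d+$/;
// In a string, each backslash must itself be escaped.
const fromString = new RegExp('^\\d+\\.\\d+$');
// With single backslashes the string becomes '^d+.d+$' before RegExp sees it.
const unescaped = new RegExp('^\d+\.\d+$');

console.log(literalDot.test('3.14'));                  // true
console.log(literalDot.test('3x14'));                  // false
console.log(fromString.source === literalDot.source);  // true
console.log(unescaped.test('3.14'));                   // false
console.log(unescaped.test('dd.dd'));                  // true
```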

10, Grouping

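(): groups part of a pattern and captures the matched text so that it can be referenced later (back reference)
(?:): groups only, without capturing, so it cannot be referenced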

Back reference

A back reference refers to the text that the group matched, not to the pattern of the group
Back reference \n: the number n is the position of the group's left parenthesis, counted from the left

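//  Back reference
var patt = /^<(h1)>\w{1,}<\/\1>/
var str = '<h1>hello</h1>';
var res = patt.test(str) // true: \1 must repeat the text matched by (h1)

//  Group only, no reference
var patt = /^((?:(ECMA)Script)Hello)$/
var str = 'ECMAScriptHello';
var res = patt.test(str); // true
//  Back reference $n: n is the order of the left parenthesis, not counting (?:) groups.
//  Inside a regular expression use \n; as a property of the RegExp object use RegExp.$n.
//  RegExp.$1 to RegExp.$9 are legacy (deprecated) properties that hold the groups of the last successful match.

console.log(RegExp.$1); // 'ECMAScriptHello'
console.log(RegExp.$2); // 'ECMA'
console.log(RegExp.$3); // '' (there is no third capturing group)
console.log(RegExp.$4); // ''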

11, Or

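var patt = /hello(red)|(black)/ // | has the lowest precedence: matches 'hellored' or 'black'
var patt = /hello[a|b]/ // inside [] the | is a literal character: matches 'helloa', 'hellob' or 'hello|'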
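How grouping changes the scope of | can be checked with a short sketch. The grouped variants below are suggested alternatives and an assumption about the intended behaviour, not patterns from the original text.

```javascript
const ungrouped = /hello(red)|(black)/;
// Assumed intent: 'hello' followed by either colour.
const grouped = /hello(red|black)/;

console.log(ungrouped.test('black'));     // true: the right-hand alternative is just (black)
console.log(grouped.test('black'));       // false: 'hello' is required
console.log(grouped.test('helloblack'));  // true

// Inside a character class, | is an ordinary character.
console.log(/hello[a|b]/.test('hello|')); // true
console.log(/hello[ab]/.test('hello|'));  // false
console.log(/hello[ab]/.test('hellob'));  // true
```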

12, Assert

  • x(?=y): matches 'x' only if 'x' is followed by 'y'. This is called a positive lookahead assertion. The 'y' is not part of the match.
  • x(?!y): matches 'x' only if 'x' is not followed by 'y'. This is called a negative lookahead assertion.
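Both assertions can be seen in a minimal sketch. The sample sentence and the variable names are made up for this illustration.

```javascript
// Hypothetical sample text containing 'Java' in two different contexts.
const text = 'JavaScript and JavaBeans';

// Positive lookahead: 'Java' only when 'Script' follows.
const beforeScript = /Java(?=Script)/.exec(text);
console.log(beforeScript[0], beforeScript.index); // Java 0

// Negative lookahead: 'Java' only when 'Script' does not follow.
const notBeforeScript = /Java(?!Script)/.exec(text);
console.log(notBeforeScript[0], notBeforeScript.index); // Java 15
```

In both cases the matched text is only 'Java'; the asserted part is checked but not consumed.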

13, Matching pattern

There are two matching patterns for quantifiers: greedy pattern and lazy pattern

  • Greedy pattern: match as many as possible
  var str = '123456789012345678900';
  var res = str.match(/\d{3,6}/g); // ['123456', '789012', '345678', '900']
  • Lazy pattern: match as little as possible
    Adding ? after a greedy quantifier makes it lazy: +? *? {n,}? {n,m}? ??
var str = '123456789012345678900';
var res = str.match(/\d{3,6}?/g); // ['123', '456', '789', '012', '345', '678', '900']
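The difference matters most when a pattern has to stop at the first possible end point. A minimal sketch, assuming a made-up string with two tags:

```javascript
// Hypothetical sample with two <b> elements.
const html = '<b>one</b><b>two</b>';

// Greedy: .* runs to the last </b> in the string.
console.log(html.match(/<b>.*<\/b>/)[0]);  // <b>one</b><b>two</b>
// Lazy: .*? stops at the first </b> it can reach.
console.log(html.match(/<b>.*?<\/b>/)[0]); // <b>one</b>
console.log(html.match(/<b>.*?<\/b>/g));   // [ '<b>one</b>', '<b>two</b>' ]
```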

14, Regular correlation method in string

  • replace(): replaces the part of the string that matches the regular expression. By default only the first match is replaced; to replace all matches, use the global modifier g
'hello world'.replace(/o/, 'javascript')// Replace the first match only: 'helljavascript world'
'hello world'.replace(/o/g, 'javascript') // Replace all: 'helljavascript wjavascriptrld'
  • match(): finds the matches of a regular expression and returns them as an array, or null if there is no match
var str = 'hello world';
// let res = str.match(/o/g);  //    With the global modifier g, the result is an array of all matches: ['o', 'o']
let res = str.match(/o/);
console.log(res);
/* //  Result without the global modifier g:
    0: "o" // The matched text (element 0 of the array)
    groups: undefined // Named capture groups, if any
    index: 4 //  The index at which the match starts in the string
    input: "hello world"  // The string that was searched
    length: 1 // Length of the array
 */

  • search(): checks whether the string contains a match for the regular expression. Returns the index of the first match, or -1 if there is none.
    var str = 'hello';
    // var res = str.search(/e/) // 1
    var res = str.search(/a/)
    console.log(res); // -1
  • split(): splits the string at every match of the regular expression and returns the pieces as an array
    var str = 'haksdf2324hwehej2313jkfghwer5345634adf';
    var res = str.split(/\d{1,}/)
    console.log(res); // ['haksdf', 'hwehej', 'jkfghwer', 'adf']

Topics: Front-end Back-end regex