Inheritance: a subclass directly obtains all the attributes and methods of its parent class(es).
    class ClassName(Parent1, Parent2, ...):
        pass
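As a minimal sketch of the syntax above: the class names `Animal` and `Dog` and their methods are hypothetical and exist only for illustration. The subclass defines no `__init__` or `describe` of its own, yet it can use both because it inherits them.

```python
class Animal:
    """Hypothetical parent class used only for illustration."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an animal"


class Dog(Animal):
    """Subclass: inherits __init__ and describe, and adds bark."""

    def bark(self):
        return f"{self.name} says woof"


dog = Dog("Rex")
print(dog.describe())  # Rex is an animal
print(dog.bark())      # Rex says woof
```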
1. Regular expression
Regular expressions are a tool for simplifying complex string-processing problems
1) A regular expression is composed of various regular-expression symbols
2) Introduction to the re module
The re module is a standard-library module that Python provides for regular-expression operations
fullmatch(regular expression, string) - checks whether the regular expression matches the entire string. If it does, a match object is returned; if not, the result is None
    # Check whether the input is a valid mobile phone number
    from re import fullmatch

    def is_tel(tel_num: str) -> bool:
        # 1, then one digit from 3 to 9, then nine more digits
        return fullmatch(r'1[3-9]\d{9}', tel_num) is not None
1.1 matching symbols
1. Ordinary characters - characters that have no special function or special meaning in a regular expression
An ordinary character represents itself in a regular expression, for example a to z, A to Z, 0 to 9, Chinese characters
2. Special symbols
1) . - matches any single character except '\n' (one dot matches exactly one character)
    result = fullmatch(r'a.d', 'assd')
    result1 = fullmatch(r'a.d', 'asd')
    print(result)   # None
    print(result1)  # <re.Match object; span=(0, 3), match='asd'>
2) \d - matches any digit character
    result = fullmatch(r'a\dd', 'a8d')
    print(result)  # <re.Match object; span=(0, 3), match='a8d'>
3) \s - matches any whitespace character
Whitespace characters include the space, \t and \n
    result = fullmatch(r'a\sd', 'a d')
    print(result)  # <re.Match object; span=(0, 3), match='a d'>
4) \D and \S are the opposites of \d and \s
\D - matches any non-digit character
\S - matches any non-whitespace character
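The notes give no example for \D and \S, so here is a small sketch in the same style as the other examples. The variable names are illustrative only.

```python
from re import fullmatch

# \D accepts any single character that is not a digit
non_digit_ok = fullmatch(r'a\Dc', 'a-c')
non_digit_fail = fullmatch(r'a\Dc', 'a5c')

# \S accepts any single character that is not whitespace
non_space_ok = fullmatch(r'a\Sc', 'a5c')
non_space_fail = fullmatch(r'a\Sc', 'a c')

print(non_digit_ok)    # <re.Match object; span=(0, 3), match='a-c'>
print(non_digit_fail)  # None
print(non_space_ok)    # <re.Match object; span=(0, 3), match='a5c'>
print(non_space_fail)  # None
```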
5) [character set] - matches any one character in the character set
    result = fullmatch(r'a[xyz]c', 'abc')
    print(result)  # None
1) All ordinary characters
    # Matches any one of x, y, z: axc, ayc, azc
    result1 = fullmatch(r'a[xyz]c', 'ayc')
    print(result1)  # <re.Match object; span=(0, 3), match='ayc'>
2) Contains a matching symbol that begins with \. The matching symbol keeps its function inside the set; [mn\d] == [mn0123456789]
    result = fullmatch(r'a[\dxyz]c', 'a5c')
    print(result)  # <re.Match object; span=(0, 3), match='a5c'>
3) A minus sign between two characters means a range from the first to the second (from small to large)
    [a-z] - matches any lowercase letter
    [A-Z] - matches any uppercase letter
    [a-zA-Z] - matches any letter
    [\u4e00-\u9fa5] - matches any Chinese character
    result = fullmatch(r'a[a-z]c', 'azc')
    print(result)  # <re.Match object; span=(0, 3), match='azc'>
6) [^character set] - matches any one character that is not in the character set
    result = fullmatch(r'a[^xyz]c', 'amc')
    print(result)  # <re.Match object; span=(0, 3), match='amc'>
    result = fullmatch(r'a[^xyz]c', 'axc')
    print(result)  # None
1.2 matching times
Usage: a matching symbol followed by the number of times it may repeat
    a*b - b preceded by any number of a
    \d*b - b preceded by any number of digits
1) * - 0 or more times (any number)
    result = fullmatch(r'\d*c', 'c')
    result1 = fullmatch(r'\d*c', '566c')
    print(result)   # <re.Match object; span=(0, 1), match='c'>
    print(result1)  # <re.Match object; span=(0, 4), match='566c'>
2) + - 1 or more times (at least once)
    result = fullmatch(r'\d+c', 'c')
    result1 = fullmatch(r'\d+c', '566c')
    print(result)   # None
    print(result1)  # <re.Match object; span=(0, 4), match='566c'>
3) ? - 0 or 1 time
    # Exercise: write a regular expression that matches any positive integer (0 excluded)
    result1 = fullmatch(r'[+]?[1-9]+[0-9]*', '1')
    result = fullmatch(r'[+]?[1-9]\d*', '1')
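To see what the optional sign in the exercise accepts and rejects, the second pattern can be run against a few inputs. This is a sketch; the sample strings and the names `pattern`, `samples` and `results` are chosen here for illustration.

```python
from re import fullmatch

pattern = r'[+]?[1-9]\d*'  # optional '+', a non-zero digit, then any digits
samples = ['1', '+25', '007', '0', '-3', '12a']

# Map each sample to True/False depending on whether the whole string matches
results = {text: fullmatch(pattern, text) is not None for text in samples}

for text, ok in results.items():
    print(text, ok)
# 1 True
# +25 True
# 007 False
# 0 False
# -3 False
# 12a False
```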
4) {}
1) {M,N} - M to N times (M <= N)
2) {M,} - at least M times
3) {,N} - at most N times
4) {N} - exactly N times
    result = fullmatch(r'a{3}', 'aaa')
    result1 = fullmatch(r'a{3}', 'aa')
    print(result)   # <re.Match object; span=(0, 3), match='aaa'>
    print(result1)  # None
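The example above only covers {N}. The other three forms can be checked the same way; this is a brief sketch and the variable names are illustrative.

```python
from re import fullmatch

# {M,N}: between 2 and 4 repetitions
range_ok = fullmatch(r'a{2,4}', 'aaa') is not None
range_too_many = fullmatch(r'a{2,4}', 'aaaaa') is not None

# {M,}: at least 2 repetitions
at_least_ok = fullmatch(r'a{2,}', 'aaaaa') is not None
at_least_too_few = fullmatch(r'a{2,}', 'a') is not None

# {,N}: at most 2 repetitions (zero repetitions also counts)
at_most_ok = fullmatch(r'a{,2}', '') is not None
at_most_too_many = fullmatch(r'a{,2}', 'aaa') is not None

print(range_ok, range_too_many)       # True False
print(at_least_ok, at_least_too_few)  # True False
print(at_most_ok, at_most_too_many)   # True False
```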
1.3 greedy and non greedy
Note: the difference between greedy and non-greedy matters for every function except fullmatch, because fullmatch must always consume the whole string
When the number of matches is not fixed, matching is either greedy or non-greedy (greedy by default)
*, +, {M,N}, {M,}, {,N} - greedy: match as many times as possible
*?, +?, {M,N}?, {M,}?, {,N}? - non-greedy: match as few times as possible (add an ASCII ? after the quantifier)
    from re import search

    result = search('.+?b', 'try bshbsbj823')
    result1 = search('.+b', 'try bshbsbj823')
    print(result)   # <re.Match object; span=(0, 5), match='try b'>
    print(result1)  # <re.Match object; span=(0, 10), match='try bshbsb'>
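The same effect shows up with findall, and it helps explain why fullmatch is the exception. A small sketch follows; the sample string and variable names are illustrative.

```python
from re import findall, fullmatch

text = '<a><b><c>'

# Greedy: .+ runs to the last '>' so the whole string is one match
greedy = findall(r'<.+>', text)

# Non-greedy: .+? stops at the first '>' it can, giving three matches
lazy = findall(r'<.+?>', text)

# fullmatch has to consume the whole string, so both forms match all of it
full_lazy = fullmatch(r'<.+?>', text)

print(greedy)             # ['<a><b><c>']
print(lazy)               # ['<a>', '<b>', '<c>']
print(full_lazy.group())  # <a><b><c>
```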
1.4 grouping and branching
1) Grouping - ()
1) Application 1: enclose part of a regular expression in () so that it is treated as a whole
    result = fullmatch(r'([a-z]{3}\d{2}){2}', 'abc23dfg54')
2) Application 2: repetition - in a regular expression that contains groups, \N repeats the content matched by the Nth group that appears before it
    result = fullmatch(r'(\d)a\1', '9a9')        # <re.Match object; span=(0, 3), match='9a9'>
    result = fullmatch(r'(\d)a\1', '9a6')        # None
    result1 = fullmatch(r'(\d)(a)\2\1', '1aa1')  # <re.Match object; span=(0, 4), match='1aa1'>
3) Application 3: with findall, if the regular expression contains a group, only the content matched by the group is returned
    from re import findall

    # Take the number that follows a lowercase letter
    str1 = 'sj12ms55MMK15 Standby time 15'
    result = findall(r'[a-z](\d+)', str1)
    print(result)  # ['12', '55']
2) Branch - |
1) regular 1|regular 2 - the match succeeds as long as either regular 1 or regular 2 matches
    result = fullmatch(r'abc(\d{3}|[A-Z]{3})', 'abc123')
    result = fullmatch(r'abc(\d{3}|[A-Z]{3})', 'abcKSN')
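The parentheses in the example above matter because | splits everything on its left from everything on its right unless a group limits it. A sketch comparing the two forms; the names `grouped` and `ungrouped` are illustrative.

```python
from re import fullmatch

grouped = r'abc(\d{3}|[A-Z]{3})'  # abc followed by 3 digits or 3 capitals
ungrouped = r'abc\d{3}|[A-Z]{3}'  # (abc + 3 digits) or (3 capitals alone)

grouped_abc_caps = fullmatch(grouped, 'abcKSN') is not None
ungrouped_abc_caps = fullmatch(ungrouped, 'abcKSN') is not None
grouped_caps_only = fullmatch(grouped, 'KSN') is not None
ungrouped_caps_only = fullmatch(ungrouped, 'KSN') is not None

print(grouped_abc_caps, ungrouped_abc_caps)    # True False
print(grouped_caps_only, ungrouped_caps_only)  # False True
```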
1.5 others
1) Escape symbol - add \ before a symbol that has a special function or special meaning, so that it loses that function and becomes an ordinary character
    # Match the literal text a.b
    result = fullmatch(r'a\.b', 'anb')  # None
    result = fullmatch(r'a\.b', 'a.b')  # <re.Match object; span=(0, 3), match='a.b'>
Another way to remove a symbol's function: when a single symbol has a special function on its own, it can be placed inside [] to remove that function
    result = fullmatch(r'a[.]b', 'a.b')
2) Ignore case: put (?i) at the front of the regular expression
    result = fullmatch(r'(?i)a.b', 'ACB')  # <re.Match object; span=(0, 3), match='ACB'>
3) Single-line matching and multi-line matching
By default . cannot match '\n'
Single-line matching: add (?s) to the front of the regular expression so that . can also match '\n'
Multi-line matching, (?m), does not change what . matches; it makes ^ and $ match at the start and end of every line
    result = fullmatch(r'a.b', 'a\nb')      # None
    result = fullmatch(r'(?s)a.b', 'a\nb')  # <re.Match object; span=(0, 3), match='a\nb'>
    # Ignore case and let . match \n at the same time
    result = fullmatch(r'(?si)a.b', 'A\nb')  # <re.Match object; span=(0, 3), match='A\nb'>
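The inline markers have flag equivalents in the re module: `re.S` (also `re.DOTALL`) corresponds to (?s) and `re.M` (also `re.MULTILINE`) to (?m). The sketch below shows that (?s) and `re.S` behave the same, and that (?m) affects ^ rather than the dot. The sample strings and variable names are illustrative.

```python
import re

# The dot: default, inline (?s), and the equivalent re.S flag
dot_default = re.fullmatch(r'a.b', 'a\nb')
dot_inline = re.fullmatch(r'(?s)a.b', 'a\nb')
dot_flag = re.fullmatch(r'a.b', 'a\nb', flags=re.S)

print(dot_default is None)     # True
print(dot_inline is not None)  # True
print(dot_flag is not None)    # True

# (?m) changes where ^ may match: the start of every line, not just the string
text = 'one\ntwo'
first_only = re.findall(r'^\w+', text)
every_line = re.findall(r'(?m)^\w+', text)

print(first_only)  # ['one']
print(every_line)  # ['one', 'two']
```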
1.6 common functions in re module
1) (Commonly used) fullmatch(regular expression, string) - checks whether the whole string matches the regular expression (exact match). Returns a match object if the match succeeds and None if it fails
2) match(regular expression, string) - matches at the beginning of the string. Returns a match object if the match succeeds and None if it fails
3) search(regular expression, string) - finds the first substring that matches the regular expression. Returns the corresponding match object, or None if nothing is found
4) (Commonly used) findall(regular expression, string) - gets all substrings that match the regular expression and returns a list. The elements of the list are strings, or tuples when the regular expression contains more than one group
5) finditer(regular expression, string) - gets all substrings that match the regular expression and returns an iterator. The elements of the iterator are the match objects of the substrings
6) (Commonly used) split(regular expression, string) - uses every substring that matches the regular expression as a cutting point, cuts the string and returns a list of strings
7) (Commonly used) sub(regular expression, string 1, string 2) - replaces every substring of string 2 that matches the regular expression with string 1 and returns the new string
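The functions in the list above, other than fullmatch, can be compared on one input. This is a minimal sketch; the sample string `text` and the result names are chosen here for illustration.

```python
from re import match, search, findall, finditer, split, sub

text = 'ab12cd345ef6'

# match only looks at the beginning of the string
head = match(r'[a-z]+', text)
no_head = match(r'\d+', text)  # the string does not start with a digit

# search returns the first match found anywhere in the string
first_number = search(r'\d+', text)

# findall returns strings, or tuples when there are several groups
numbers = findall(r'\d+', text)
pairs = findall(r'([a-z]+)(\d+)', text)

# finditer yields match objects, so positions are available too
spans = [m.span() for m in finditer(r'\d+', text)]

# split cuts at every match; sub replaces every match
pieces = split(r'\d+', text)
masked = sub(r'\d+', '*', text)

print(head.group())          # ab
print(no_head)               # None
print(first_number.group())  # 12
print(numbers)               # ['12', '345', '6']
print(pairs)                 # [('ab', '12'), ('cd', '345'), ('ef', '6')]
print(spans)                 # [(2, 4), (6, 9), (11, 12)]
print(pieces)                # ['ab', 'cd', 'ef', '']
print(masked)                # ab*cd*ef*
```

Note the empty string at the end of `pieces`: the last match sits at the very end of the input, so split reports an empty piece after it.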