Shell_04_Regular expression RE
What is a regular expression
Simply put, a regular expression is a way to process strings. It works on strings line by line. With the help of a few special symbols, regular expressions make it easy to search for, delete, or replace a particular string.
A regular expression is essentially a notation; any tool that supports it can use it for string processing. For example, vi, grep, awk, and sed support regular expressions, so they can use the special regular-expression characters to process strings. Commands such as cp and ls do not support regular expressions, so only Bash's own wildcards can be used with them.
Regular expressions are one of the foundations of Linux; once you have learned them, they are a great help. It is like opening the Ren and Du meridians (Ren Du Er Mai) in Jin Yong's martial arts novels: once both meridians are open, one's martial skill doubles at once.
About locales
The sort order of uppercase and lowercase English letters differs between locales. Compare LANG=C with LANG=zh_TW (for example, zh_TW.big5):
With LANG=C: 0 1 2 3 4 ... A B C D ... Z a b c d ... z
With LANG=zh_TW: 0 1 2 3 4 ... a A b B c C d D ... z Z
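The ordering difference above can be checked with a small sketch. The four sample letters are arbitrary, and LC_ALL=C is used here (an assumption of this example) because it overrides LANG and every other locale variable for a single command.

```shell
# In the C locale characters are ordered by byte value,
# so every uppercase letter sorts before every lowercase letter.
c_order=$(printf '%s\n' b A a B | LC_ALL=C sort | paste -s -d ' ' -)
echo "$c_order"

# Forcing the C locale also keeps a range such as [a-z] strictly lowercase.
lower_only=$(printf '%s\n' A a | LC_ALL=C grep '[a-z]')
echo "$lower_only"
```

Prefixing a single command with LC_ALL=C is a common way to get predictable ranges without changing the locale of the whole session.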
In particular, remember:
- [:alnum:] stands for all uppercase and lowercase English letters and digits: 0-9, A-Z, a-z
- [:alpha:] stands for any uppercase or lowercase English letter: A-Z, a-z
- [:lower:] stands for lowercase letters: a-z
- [:upper:] stands for uppercase letters: A-Z
- [:digit:] stands for digits: 0-9
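A minimal sketch of how these classes are used with grep. Note that a class name only works inside a bracket expression, so it is written with double brackets, for example [[:digit:]]. The sample lines are made up for illustration.

```shell
sample=$(printf '%s\n' 'GNU is free' 'price 3183' 'apple' '# note')

# Count the lines that contain at least one digit.
digit_lines=$(printf '%s\n' "$sample" | grep -c '[[:digit:]]')

# Lines that contain an uppercase letter.
upper_lines=$(printf '%s\n' "$sample" | grep '[[:upper:]]')

# Lines whose first character is neither a letter nor a digit.
symbol_start=$(printf '%s\n' "$sample" | grep '^[^[:alnum:]]')

echo "$digit_lines"
echo "$upper_lines"
echo "$symbol_start"
```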
Exercise Sample File
Data from Brother Bird's private dishes
"Open Source" is a good mechanism to develop programs. apple is my favorite food. Football game is not use feet only. this dress doesn't fit me. However, this dress is about $ 3183 dollars. GNU is free air not free beer. Her hair is very beauty. I can't finish the test. Oh! The soup taste good. motorcycle is cheap than car. This window is clear. the symbol '*' is represented as start. Oh! My god! The gd software is a library for drafting programs. You are the best is mean you are the no. 1. The world <Happy> is the same with "glad". I like dog. google is the best tools for search keyword. goooooogle yes! go! go! Let's go. # I am VBird
Match example
Output:
Output:
Output:
Output:
Match English periods.
Output:
Output:
Output:
Output:
Output:
Match 2 consecutive a characters
Match more than two consecutive a characters
Match up to three consecutive characters a
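One possible way to write the three repetition matches listed above uses the basic-regex interval expression \{m,n\}. The -x option (match the whole line) and the sample input are assumptions of this sketch; they make the difference between the three counts visible.

```shell
lines=$(printf '%s\n' a aa aaa aaaa)

# Exactly two consecutive a characters.
exactly_two=$(printf '%s\n' "$lines" | grep -x 'a\{2\}')

# Two or more consecutive a characters.
two_or_more=$(printf '%s\n' "$lines" | grep -x 'a\{2,\}' | paste -s -d ' ' -)

# Between one and three consecutive a characters.
up_to_three=$(printf '%s\n' "$lines" | grep -x 'a\{1,3\}' | paste -s -d ' ' -)

echo "$exactly_two"
echo "$two_or_more"
echo "$up_to_three"
```

Without -x or anchors, 'a\{2\}' would also match inside aaa and aaaa, because grep only needs some part of the line to match.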
Advanced grep
-A n: also lists the n lines after each matching line. A is the first letter of "after".
-B n: also lists the n lines before each matching line. B is the first letter of "before".
Example:
Show the lines in /etc/passwd that contain mail, together with the 2 lines before and the 3 lines after each match
[root@e9818e4ea8b3 ~]# grep mail -B 2 -A3 /etc/passwd
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
Show the three lines before and the three lines after the matching line (here, the line containing mail)
grep mail -C 3 /etc/passwd
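The context options can be tried on a small numbered input instead of /etc/passwd, which differs from system to system. This is only a sketch; the ten-line input is an assumption chosen so the context window is easy to see.

```shell
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# Two lines before and three lines after the line that is exactly "5".
before_after=$(printf '%s\n' "$numbers" | grep -B 2 -A 3 '^5$' | paste -s -d ' ' -)

# One line of context on each side.
around=$(printf '%s\n' "$numbers" | grep -C 1 '^5$' | paste -s -d ' ' -)

echo "$before_after"
echo "$around"
```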
Show only the matched text
grep -o 'nologin' /etc/passwd
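Count the matching lines (with -c, grep prints the number of matching lines instead of the matches, so -o has no effect on the count)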
grep -o -c 'nologin' /etc/passwd
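A short sketch of the difference between counting matching lines and counting individual matches. The three-line temporary file is invented for this example; piping grep -o into wc -l is one common way to count every occurrence.

```shell
tmp=$(mktemp)
printf '%s\n' 'a:/sbin/nologin nologin' 'b:/sbin/nologin' 'c:/bin/bash' > "$tmp"

# -c counts lines that contain at least one match.
line_count=$(grep -c 'nologin' "$tmp")

# -o prints each match on its own line, so wc -l counts occurrences.
match_count=$(grep -o 'nologin' "$tmp" | wc -l | tr -d ' ')

rm -f "$tmp"
echo "lines: $line_count, matches: $match_count"
```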
Print only the names of the files that contain a match
grep -l 'nologin' /etc/passwd
Recursive search, that is, search every file under a directory
grep -r 'nologin' /etc/
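Because the contents of /etc vary, the following sketch combines -r and -l on a throwaway directory. The directory layout and file names are made up for illustration.

```shell
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'mail:/sbin/nologin\n' > "$dir/a.txt"
printf 'root:/bin/bash\n'     > "$dir/b.txt"
printf 'ftp:/sbin/nologin\n'  > "$dir/sub/c.txt"

# -r descends into subdirectories; -l prints each matching file name once.
files=$(cd "$dir" && grep -rl 'nologin' . | LC_ALL=C sort | paste -s -d ' ' -)

rm -rf "$dir"
echo "$files"
```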
Search for test or tast
grep -n 't[ae]st' regular_repress.txt
Search for oo that is not preceded by g
grep -n '[^g]oo' regular_repress.txt
Note: a line is printed as soon as any part of it satisfies the pattern, even if the same line also contains text that the pattern is meant to exclude. For example, the search above can still return lines such as:
3:tool is a good tool
8:goooooogle
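The behavior described in the note can be reproduced with a few made-up lines. In this sketch, "good" on its own is rejected, but a line containing both "good" and "tool" is still printed because "too" satisfies the pattern.

```shell
result=$(printf '%s\n' 'good' 'tool is a good tool' 'goooooogle' 'foo' \
  | grep -n '[^g]oo')
echo "$result"
```

The line goooooogle matches because the second o is followed by oo, and the character in front of that oo is an o rather than a g.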
Show lines with non-lowercase characters before oo
Method 1:
grep -n '[^a-z]oo' regular_repress.txt
Method 2:
grep -n '[^[:lower:]]oo' regular_repress.txt
Show lines that do not begin with an English letter
grep -n '^[^[:alpha:]]' regular_repress.txt
The symbol ^ negates the set when it is the first character inside [], and anchors the match to the beginning of the line when it is outside []
Show lines that do not begin with # or ;
grep '^[^#;]' regular_repress.txt
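The two roles of ^ can be seen side by side in a small sketch. The sample lines are invented for this example.

```shell
sample=$(printf '%s\n' 'apple' '# comment' ';note' '3183 dollars')

# Outer ^ anchors to the start of the line; inner ^ negates the class.
non_alpha_start=$(printf '%s\n' "$sample" | grep -n '^[^[:alpha:]]')

# Lines whose first character is neither # nor ;
not_comment=$(printf '%s\n' "$sample" | grep -n '^[^#;]')

echo "$non_alpha_start"
echo "$not_comment"
```

A bracket expression always consumes one character, so '^[^#;]' also skips empty lines.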
Find the lines that end with a period (.)
grep -n '\.$' regular_repress.txt
The period must be escaped with \, because an unescaped . matches any character
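A quick sketch of why the backslash matters, using three made-up lines of which only one ends in a literal period.

```shell
sample=$(printf '%s\n' 'the end.' 'go!' 'glad')

# Escaped: only lines whose last character is a literal period.
literal=$(printf '%s\n' "$sample" | grep -c '\.$')

# Unescaped: . matches any character, so every non-empty line matches.
any_char=$(printf '%s\n' "$sample" | grep -c '.$')

echo "escaped: $literal, unescaped: $any_char"
```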
Find strings that start with g and end with g, with any characters (or none) in between
grep -n 'g.*g' regular_repress.txt
. represents any single character
* repeats the character in front of it zero or more times
.* represents zero or more arbitrary characters
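The three forms can be compared on the same input. This is a sketch with invented sample words; each pattern selects a different subset.

```shell
sample=$(printf '%s\n' gg gog goog gxxg gd)

# g, then anything (or nothing), then g.
dot_star=$(printf '%s\n' "$sample" | grep 'g.*g' | paste -s -d ' ' -)

# g, then zero or more o, then g.
o_star=$(printf '%s\n' "$sample" | grep 'go*g' | paste -s -d ' ' -)

# g, then exactly one arbitrary character, then g.
one_dot=$(printf '%s\n' "$sample" | grep 'g.g' | paste -s -d ' ' -)

echo "$dot_star"
echo "$o_star"
echo "$one_dot"
```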
Find any file name starting with a
Method 1:
wildcard
ls -l a*
Method 2:
ls |grep -n '^a.*'
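The two methods look alike but use different syntaxes: a* is a shell wildcard expanded by Bash, while ^a.* is a regular expression interpreted by grep. A sketch on a temporary directory (the file names are assumptions for illustration) shows the difference.

```shell
dir=$(mktemp -d)
touch "$dir/a1" "$dir/ab" "$dir/ba"

# Wildcard: the shell expands a* to the names that start with a.
glob=$(cd "$dir" && ls a* | paste -s -d ' ' -)

# Regular expression: ^a anchors the a to the start of each name.
regex=$(cd "$dir" && ls | grep '^a' | paste -s -d ' ' -)

# As a regular expression, a* means "zero or more a", which matches every line.
every_line=$(cd "$dir" && ls | grep -c 'a*')

rm -rf "$dir"
echo "$glob"
echo "$regex"
echo "$every_line"
```

Since grep matches anywhere in the line, '^a' is enough; the trailing .* in '^a.*' does not change which lines are selected.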
List the symbolic links under the /etc directory
ls -l /etc |grep '^l'
Count how many there are
ls -l /etc |grep '^l' |wc -l
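The same idea can be tested on a throwaway directory where the number of links is known. In this sketch the file names are made up, and grep -c is used in place of piping into wc -l; both give the number of matching lines.

```shell
dir=$(mktemp -d)
touch "$dir/target"
ln -s target "$dir/link1"
ln -s target "$dir/link2"

# In ls -l output the first character of each entry is its type; l means symbolic link.
links=$(LC_ALL=C ls -l "$dir" | grep -c '^l')

rm -rf "$dir"
echo "$links"
```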
Extended regular expressions
Tools that support extended regular expressions
- grep -E
- egrep
- sed -E
- awk
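A small sketch of what the extended syntax adds: + (one or more), | (alternation), and () (grouping) work without backslashes. The sample words are invented; the last command shows that plain grep treats + as an ordinary character.

```shell
sample=$(printf '%s\n' gd god good dog)

# + means one or more of the preceding character.
ere_plus=$(printf '%s\n' "$sample" | grep -E 'go+d' | paste -s -d ' ' -)

# | separates alternatives; () groups them.
ere_alt=$(printf '%s\n' "$sample" | grep -E '^(gd|dog)$' | paste -s -d ' ' -)

# awk patterns are extended regular expressions as well.
awk_plus=$(printf '%s\n' "$sample" | awk '/go+d/' | paste -s -d ' ' -)

# Basic mode: + is literal, so nothing matches (grep exits non-zero, hence || true).
bre_plus=$(printf '%s\n' "$sample" | grep -c 'go+d' || true)

echo "$ere_plus"
echo "$ere_alt"
echo "$awk_plus"
echo "$bre_plus"
```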
================================================
Advanced regular expressions: greedy vs. non-greedy (extended)
Greedy matching means matching as much as possible
Non-greedy matching means matching as little as possible. It is written by placing ? right after a quantifier (a repetition count), for example: *? and +?
Non-greedy matching with grep
grep and egrep are greedy by default and do not support non-greedy matching.
To match non-greedily, use the -P option, which switches grep to Perl-compatible regular expressions.
In Perl regular expressions:
- \w denotes any uppercase or lowercase letter [a-zA-Z], the underscore _, or a digit [0-9]
- \d denotes any digit [0-9]
These rules also apply in most programming languages, such as Python, Java, JavaScript, Go, and PHP.
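A sketch of greedy versus non-greedy matching with grep -P. It assumes GNU grep built with Perl-compatible regex support (not every grep has -P), and the sample strings are made up for illustration.

```shell
text='<b>one</b><b>two</b>'

# Greedy: .* runs to the last </b>, so the whole string is one match.
greedy=$(printf '%s\n' "$text" | grep -oP '<b>.*</b>')

# Non-greedy: .*? stops at the first </b>, giving two separate matches.
lazy=$(printf '%s\n' "$text" | grep -oP '<b>.*?</b>' | paste -s -d ' ' -)

# \d+ extracts runs of digits; \w+ extracts runs of letters, digits, and underscores.
digits=$(printf 'id_42 x7\n' | grep -oP '\d+' | paste -s -d ' ' -)
words=$(printf 'id_42 x7\n' | grep -oP '\w+' | paste -s -d ' ' -)

echo "$greedy"
echo "$lazy"
echo "$digits"
echo "$words"
```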