Shell_04_Regular expression RE

Posted by geekette on Sat, 13 Jun 2020 03:06:51 +0200


What is a regular expression

Simply put, a regular expression is a way to process strings. It works on text one line at a time and, with the help of a few special symbols, makes it easy to search for, delete, or replace a particular string.

A regular expression is essentially a notation for describing string patterns, and it can be used for string processing in any tool that supports it. For example, vi, grep, awk, and sed support regular expressions, so they can use the special characters of regular expressions to process strings. Commands such as cp and ls do not support regular expressions, so only Bash's own wildcards can be used with them.

Regular expressions are one of the foundations of Linux, and learning them is a great help. It is like opening the Ren and Du meridians (Ren Du Er Mai) in Jin Yong's martial arts novels: once both meridians are open, a fighter's skill doubles immediately.

About locales

The collation order of English letters depends on the locale. For the C locale and zh_TW.big5, the order is as follows:

With LANG=C: 0 1 2 3 4 ... A B C D ... Z a b c d ... z
With LANG=zh_TW: 0 1 2 3 4 ... a A b B c C d D ... z Z
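The effect of the locale on ordering can be checked with sort. The following is a minimal sketch with made-up input. It only forces the C locale through LC_ALL, because other locales such as zh_TW.big5 may not be installed on a given system.

```shell
# Sort four letters in the C locale, where every uppercase letter
# comes before every lowercase letter (plain byte order).
sorted_c=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -sd ' ' -)
echo "$sorted_c"

# In the C locale, the range [a-z] therefore covers lowercase letters only.
lower_only=$(printf 'A\na\n' | LC_ALL=C grep '[a-z]')
echo "$lower_only"
```

In a locale that interleaves upper and lower case, a range such as [a-z] may also pick up uppercase letters, which is why ranges should be used with care.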

In particular, remember:

[:alnum:] stands for all uppercase and lowercase English letters and digits: A-Z a-z 0-9
[:alpha:] stands for any uppercase or lowercase English letter: A-Z a-z
[:lower:] stands for lowercase letters: a-z
[:upper:] stands for uppercase letters: A-Z
[:digit:] stands for digits: 0-9
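As a small sketch of how these classes are used with grep: a class name goes inside a bracket expression, so the digit class is written [[:digit:]]. The input below is made up for illustration and is not part of the exercise file.

```shell
# Hypothetical sample input.
sample=$(printf 'abc\nABC\n123\nabc123\n')

# Lines that contain at least one digit.
with_digit=$(printf '%s\n' "$sample" | grep '[[:digit:]]')
echo "$with_digit"

# Lines that consist of uppercase letters only.
only_upper=$(printf '%s\n' "$sample" | grep '^[[:upper:]][[:upper:]]*$')
echo "$only_upper"
```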

Exercise Sample File

The data comes from VBird's Linux tutorial (known in English as "Brother Bird's Linux Private Kitchen").

"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.
GNU is free air not free beer.
Her hair is very beauty.
I can't finish the test.
Oh! The soup taste good.
motorcycle is cheap than car.
This window is clear.
the symbol '*' is represented as start.
Oh! My god!
The gd software is a library for drafting programs.
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
goooooogle yes!
go! go! Let's go.
# I am VBird

Match example


Match a literal period (.)
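A minimal sketch of matching a literal period, using made-up input rather than the exercise file. An unescaped . matches any single character, so the period is escaped as \. here (writing it as [.] would work too).

```shell
# Hypothetical input: only the first line contains a real period.
input=$(printf 'end.\nend!\nend\n')

# Escaped period: only lines containing a literal "." after "end" match.
literal=$(printf '%s\n' "$input" | grep 'end\.')
echo "$literal"

# Unescaped period: "." stands for any one character, so "end!" matches too.
any_char=$(printf '%s\n' "$input" | grep 'end.')
echo "$any_char"
```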


Match exactly two consecutive a characters

Match two or more consecutive a characters

Match at most three consecutive a characters
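These three cases can be sketched with interval expressions, written \{m,n\} in basic regular expressions. The input is made up for illustration, and the surrounding b characters are an assumption added to pin down the length of each run, since a\{2\} on its own would also match inside a longer run of a characters.

```shell
# Hypothetical input with runs of one to four "a" characters.
input=$(printf 'bab\nbaab\nbaaab\nbaaaab\n')

# Exactly two consecutive a characters between the b characters.
exactly_two=$(printf '%s\n' "$input" | grep 'ba\{2\}b')
echo "$exactly_two"

# Two or more consecutive a characters.
two_or_more=$(printf '%s\n' "$input" | grep 'ba\{2,\}b')
echo "$two_or_more"

# At most three consecutive a characters (here: one to three).
up_to_three=$(printf '%s\n' "$input" | grep 'ba\{1,3\}b')
echo "$up_to_three"
```

With grep -E the same intervals are written without backslashes, for example ba{2}b.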

Advanced grep

-A n  Also lists the n lines after each matching line. A is the first letter of "after".

-B n  Also lists the n lines before each matching line. B is the first letter of "before".

Example:

Show the lines of /etc/passwd that contain mail, together with the 2 lines before and the 3 lines after each match

[root@e9818e4ea8b3 ~]# grep mail -B 2 -A3 /etc/passwd
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin

Show the three lines before and the three lines after the target line (here, the line containing mail)

grep mail -C  3 /etc/passwd
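Because the contents of /etc/passwd differ from system to system, the context options are easier to compare on a small fixed input. The following sketch uses five made-up lines with mail on the third one.

```shell
# Hypothetical five-line input; "mail" is on line 3.
input=$(printf 'one\ntwo\nmail\nfour\nfive\n')

# -A 1: the matching line plus 1 line after it.
after=$(printf '%s\n' "$input" | grep -A 1 'mail')
echo "$after"

# -B 1: 1 line before the match plus the matching line.
before=$(printf '%s\n' "$input" | grep -B 1 'mail')
echo "$before"

# -C 1: 1 line of context on each side of the match.
around=$(printf '%s\n' "$input" | grep -C 1 'mail')
echo "$around"
```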


Show only the matched text

grep -o 'nologin' /etc/passwd

Add a count (note that -c counts matching lines, not individual matches, even when combined with -o)

grep -o -c 'nologin' /etc/passwd
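The difference between counting lines and counting matches shows up when a line contains the pattern more than once. A minimal sketch with made-up input, assuming the usual grep behavior where -c reports the number of matching lines:

```shell
# Hypothetical input: the first line contains "nologin" twice.
input=$(printf 'nologin nologin\nnologin\nbash\n')

# -c counts matching lines, even when -o is also given.
line_count=$(printf '%s\n' "$input" | grep -o -c 'nologin')
echo "$line_count"

# grep -o prints one match per output line, so wc -l counts every occurrence.
match_count=$(printf '%s\n' "$input" | grep -o 'nologin' | wc -l)
echo "$match_count"
```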

Show only the names of files that contain a match

grep -l 'nologin' /etc/passwd

Recursive search: look through every file under a directory

grep  -r  'nologin' /etc/
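The -l and -r options can be tried safely on a throwaway directory instead of /etc. This is a sketch under that assumption; the directory layout and file names are invented for the example.

```shell
# Build a temporary directory tree (hypothetical file names).
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'daemon:/sbin/nologin\n' > "$dir/a.txt"
printf 'root:/bin/bash\n'       > "$dir/b.txt"
printf 'ftp:/sbin/nologin\n'    > "$dir/sub/c.txt"

# -l: print only the names of the files that contain a match.
names=$(cd "$dir" && grep -l 'nologin' a.txt b.txt)
echo "$names"

# -r descends into subdirectories; -l again reduces the output to file names.
recursive=$(cd "$dir" && grep -rl 'nologin' . | LC_ALL=C sort)
echo "$recursive"

# Remove the temporary directory.
rm -r "$dir"
```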

Search for test or tast

grep -n 't[ae]st'  regular_repress.txt

Search for oo that is not preceded by g

grep -n '[^g]oo'  regular_repress.txt

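Note: a line is printed as soon as any part of it matches the pattern, even if another part of the same line contains the excluded form. For example, the search above may still return lines such as the following. In the first, good is excluded but tool matches; in the second, an o followed by oo matches.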

3:tool is a good tool
8:goooooogle 
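Why goooooogle is still reported can be seen by printing only the matched part with -o. A small sketch with made-up input lines:

```shell
# Hypothetical input lines.
input=$(printf 'google\ngoooooogle\nfood\n')

# "google" is rejected because its only "oo" is preceded by g.
# "goooooogle" is kept because an o followed by "oo" satisfies [^g]oo.
lines=$(printf '%s\n' "$input" | grep -n '[^g]oo')
echo "$lines"

# -o shows which part of the line actually matched; keep the first match.
first_match=$(printf 'goooooogle\n' | grep -o '[^g]oo' | head -n 1)
echo "$first_match"
```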

Show lines where oo is preceded by a character that is not a lowercase letter
Method 1:

grep -n '[^a-z]oo' regular_repress.txt

Method 2:

grep -n  '[^[:lower:]]oo' regular_repress.txt

Show lines that do not begin with an English letter

grep -n  '^[^[:alpha:]]' regular_repress.txt

The ^ symbol means negation inside [] and start of line outside [].

Show lines that do not begin with # or ;

grep '^[^#;]' regular_repress.txt

Find lines that end with a period (.)

grep -n  '\.$' regular_repress.txt

The period must be escaped with \, because an unescaped . matches any single character.

Find strings that start with g and end with g, with any characters (or none) in between

grep -n   'g.*g' regular_repress.txt

. represents any single character
* repeats the preceding character zero or more times
.* represents zero or more arbitrary characters
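The three forms can be compared side by side. This is a minimal sketch on a made-up word list, not on the exercise file.

```shell
# Hypothetical input words.
input=$(printf 'gg\ngog\ngoog\ngold\ngood\n')

# "." is exactly one arbitrary character: g, any two characters, then d.
dot=$(printf '%s\n' "$input" | grep 'g..d')
echo "$dot"

# "*" repeats the preceding character (here o) zero or more times.
star=$(printf '%s\n' "$input" | grep 'go*g')
echo "$star"

# ".*" is any run of characters, including an empty one.
dot_star=$(printf '%s\n' "$input" | grep 'g.*d')
echo "$dot_star"
```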

Find any file name starting with a
Method 1:

Use the shell wildcard

ls -l a*

Method 2:

ls |grep -n '^a.*'
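The two methods give the same names but work differently: in Method 1 the shell expands a* before ls runs, while in Method 2 grep filters the output of ls with a regular expression. A sketch in a temporary directory with invented file names:

```shell
# Temporary directory with hypothetical file names.
dir=$(mktemp -d)
touch "$dir/apple.txt" "$dir/banana.txt" "$dir/avocado.txt"

# Wildcard: the shell replaces a* with the matching names; * means "any string".
by_glob=$(cd "$dir" && ls a*)
echo "$by_glob"

# Regular expression: ^a means "line starts with a"; no trailing .* is needed.
by_regex=$(cd "$dir" && ls | grep '^a')
echo "$by_regex"

# Remove the temporary directory.
rm -r "$dir"
```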

List the symbolic links under the /etc directory

ls -l /etc |grep '^l' 

Count how many there are

ls -l /etc |grep '^l' |wc -l

Extended regular expressions

Tools that support extended regular expressions

  • grep -E
  • egrep
  • sed -E (GNU sed also accepts -r)
  • awk
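As a small illustration of what the extended syntax adds, the sketch below uses + (one or more) and | (alternation) with grep -E and compares the result with a basic-syntax equivalent. The input words are made up.

```shell
# Hypothetical input words.
input=$(printf 'google\ngoooooogle\nglad\ndog\n')

# Extended syntax: + and | are operators and need no backslashes.
ere=$(printf '%s\n' "$input" | grep -E 'go+gle|glad')
echo "$ere"

# Basic syntax: "one or more o" written as an interval expression.
bre=$(printf '%s\n' "$input" | grep 'go\{1,\}gle')
echo "$bre"
```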

================================================

Advanced regular expressions: greedy vs. non-greedy (extended)

Greedy matching matches as much as possible.

Non-greedy matching matches as little as possible. It is requested by placing ? right after a quantifier (a repetition count), as in *? and +? (for example .*?).

Non-greedy matching with grep

grep and egrep are greedy by default and do not support non-greedy matching.
To match non-greedily, use the -P option, which makes grep interpret the pattern as a Perl-compatible regular expression.
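The difference is easiest to see with -o, which prints only what was matched. This sketch assumes a GNU grep built with Perl-compatible regex support, since -P is not available in every grep; the sample text is made up.

```shell
# Hypothetical sample text with two tag pairs on one line.
text='<b>one</b><b>two</b>'

# Greedy: .* extends to the last ">" on the line.
greedy=$(printf '%s\n' "$text" | grep -o '<.*>')
echo "$greedy"

# Non-greedy (needs -P): .*? stops at the first ">"; keep the first match.
lazy=$(printf '%s\n' "$text" | grep -oP '<.*?>' | head -n 1)
echo "$lazy"
```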


In Perl-style regular expressions:

  • \w denotes any letter [a-zA-Z], digit [0-9], or underscore _
  • \d denotes any digit [0-9]
    These shorthands are also available in the regular expressions of most programming languages, such as Python, Java, JavaScript, Go, and PHP.
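A short sketch of the two shorthands with grep -oP, again assuming a GNU grep with Perl-compatible regex support. The sample string is invented for the example.

```shell
# Hypothetical sample string.
text='user_01 paid $3183'

# \d+ matches each run of digits.
digits=$(printf '%s\n' "$text" | grep -oP '\d+')
echo "$digits"

# \w+ matches each run of letters, digits, and underscores.
words=$(printf '%s\n' "$text" | grep -oP '\w+')
echo "$words"
```

Without -P, the portable way to write \d is the character class [[:digit:]].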

