The Linux Three Swordsmen: grep, sed, and awk
1. Three Swordsmen application scenarios
command | characteristic | scenario |
---|---|---|
grep | Filtering | grep filters the fastest of the three |
sed | Replacing and modifying file content, extracting lines | When replacement or modification is required; extracting the content in a range (for example, from 11:00 to 12:00) |
awk | Extracting columns, statistics and calculation | Extracting columns; comparisons (> >= < <= == !=); statistics and calculation (awk arrays) |
2 Three Swordsmen: grep
option | meaning |
---|---|
-E | Same as egrep; supports extended regular expressions |
-A | After: -A5 displays the matching line and the 5 lines after it |
-B | Before: -B5 displays the matching line and the 5 lines before it |
-C | Context: -C5 displays the matching line and the 5 lines before and after it |
-c | Counts the matching lines, equivalent to piping the output to wc -l |
-v | Inverts the match, excluding the lines that match |
-w | Whole-word match: the match must not have a word character directly to its left or right |
[root@nn-01 ~]# seq 10 |grep 3 -A5
3
4
5
6
7
8
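The options in the table can be tried without any input file. The following is a minimal sketch that feeds `seq 10` to grep; the variable names are illustrative only and are not part of the original notes.

```shell
# Context options around the line that is exactly "5".
after=$(seq 10 | grep -A2 '^5$')     # 5 and the 2 lines after it
before=$(seq 10 | grep -B2 '^5$')    # the 2 lines before 5, then 5
context=$(seq 10 | grep -C1 '^5$')   # 1 line on each side of 5

# -c counts matching lines; combined with -v it counts non-matching lines.
count=$(seq 10 | grep -c 1)          # lines containing "1": 1 and 10
without=$(seq 10 | grep -vc 1)       # lines that do not contain "1"

# -w matches whole words only, so "1" no longer matches inside "10".
whole=$(seq 10 | grep -w 1)

# -E enables extended regular expressions such as alternation.
either=$(seq 10 | grep -E '^(2|4)$')

printf '%s\n' "$after"
```

Note that `-c` counts lines, not individual occurrences: a line that contains the pattern twice still counts once.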
3 Three Swordsmen: sed
3.1 Features and format
- sed is a stream editor: it treats the content being processed (a file) as a stream and processes it continuously until the end of the file.
- sed format:
command | option | sed command: function (s) and modifier (g) | parameter (file) |
---|---|---|---|
sed | -r | 's#oldboy#oldgirl#g' | oldboy.txt |
- Core functions of sed: add, delete, modify, and query
function | meaning |
---|---|
s | substitute (replace) |
p | print (display) |
d | delete |
c/a/i | change / append / insert (add) |
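3.2 Execution process of the sed command
sed reads the file into memory one line at a time, like water flowing through a pipe. For each line it checks whether the line meets the condition and, if so, executes the corresponding command, then moves on to the next line.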
3.3 Core uses of the sed command
1) sed: find with p
Find format | explanation |
---|---|
'2p' | Print the line with the given line number |
'1,5p' | Print the given range of line numbers |
'/lidao/p' | Filter like grep; a regular expression can be written between the slashes |
'/10:00/,/11:00/p' | Print a range, from the line matching the first pattern to the line matching the second |
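A minimal sketch of the four find formats follows. The sample lines and variable names are made up for illustration. The `-n` option is added because sed prints every line by default, so without it the lines selected by `p` would appear twice.

```shell
# Hypothetical sample input: five lines, each ending in a time stamp.
input=$(printf '%s\n' 'a 09:00' 'b 10:00' 'c 10:30' 'd 11:00' 'e 12:00')

# -n suppresses automatic printing; only lines selected by p are shown.
second=$(printf '%s\n' "$input" | sed -n '2p')
first_three=$(printf '%s\n' "$input" | sed -n '1,3p')
by_regex=$(printf '%s\n' "$input" | sed -n '/12:00/p')
by_range=$(printf '%s\n' "$input" | sed -n '/10:00/,/11:00/p')

printf '%s\n' "$by_range"
```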
2) sed: delete with d
Delete format | explanation |
---|---|
'2d' | Delete the line with the given line number |
'1,5d' | Delete the given range of line numbers |
'/lidao/d' | Delete the lines containing lidao; a regular expression can be written between the slashes |
'/10:00/,/11:00/d' | Delete the lines in the given range. If the end pattern never matches, deletion continues to the last line, which is equivalent to '/10:00/,$d' |
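The delete formats, including the case where the end pattern never matches, can be checked with a short sketch. The sample lines and the unmatched pattern `99:99` are illustrative assumptions.

```shell
# Hypothetical sample input: five lines, each ending in a time stamp.
input=$(printf '%s\n' 'a 09:00' 'b 10:00' 'c 10:30' 'd 11:00' 'e 12:00')

del_line=$(printf '%s\n' "$input" | sed '2d')
del_range=$(printf '%s\n' "$input" | sed '/10:00/,/11:00/d')

# The end pattern 99:99 never matches, so deletion runs to the last line.
del_to_end=$(printf '%s\n' "$input" | sed '/10:00/,/99:99/d')
# Same result with an explicit "last line" address.
same_as=$(printf '%s\n' "$input" | sed '/10:00/,$d')

printf '%s\n' "$del_range"
```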
3) sed: add with c, a, i
command | explanation |
---|---|
c | change: replace the specified line |
a | append: add content after the specified line |
i | insert: add content before the specified line |
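A small sketch of the three commands, assuming GNU sed, which accepts the text on the same line as the command. Other sed implementations may require the `a\` form followed by a newline. The input and variable names are illustrative.

```shell
input=$(printf '%s\n' one two three)

changed=$(printf '%s\n' "$input" | sed '2c TWO')       # replace line 2
appended=$(printf '%s\n' "$input" | sed '2a after')    # new line after line 2
inserted=$(printf '%s\n' "$input" | sed '2i before')   # new line before line 2

printf '%s\n' "$appended"
```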
4) sed: replace with s
- s: substitute
- g: global modifier. By default, sed replaces only the first match on each line; with g it replaces every match.
- If the replacement is empty, the substitution is equivalent to deleting the matched content.
- Back reference: 's#(...)#\1#g'. Enclose the content you want to reference in parentheses, then refer to it with \1, \2, and so on.
format |
---|
's###g' |
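The effect of the g modifier, an empty replacement, and back references can be seen in a short sketch. It assumes GNU sed for the `-r` option, which lets parentheses group without backslashes; the sample strings are made up.

```shell
line='oldboy likes oldboy'

first_only=$(printf '%s\n' "$line" | sed 's#oldboy#oldgirl#')   # first match only
all=$(printf '%s\n' "$line" | sed 's#oldboy#oldgirl#g')         # every match
removed=$(printf '%s\n' "$line" | sed 's#oldboy##g')            # empty replacement deletes

# \1 and \2 refer to the first and second parenthesized groups.
swapped=$(printf '%s\n' 'lidao_oldboy' | sed -r 's#(.*)_(.*)#\2_\1#')

printf '%s\n' "$first_only" "$all" "$swapped"
```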
4 Three Swordsmen: awk
4.1 awk execution process
awk -F, 'BEGIN{print"begin of file"}{print $2}END{print "end of file"}'
stage | what runs | example |
---|---|---|
Before reading the file | Command-line options and variable assignments | -F, |
Before reading the file | The BEGIN block | BEGIN{print"begin of file"} |
While reading the file | The main block; like sed, awk reads and processes the file line by line | {print $2} |
After reading the file | The END block | END{print "end of file"} |
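The order of the stages can be observed by running the command above on a small input. This sketch supplies two comma-separated lines through a pipe; the data is made up for illustration.

```shell
# BEGIN runs once before any input, the main block once per line, END once at the end.
out=$(printf '%s\n' 'a,b,c' 'd,e,f' |
  awk -F, 'BEGIN{print "begin of file"} {print $2} END{print "end of file"}')

printf '%s\n' "$out"
```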
4.2 Rows and columns
term | name in awk | notes |
---|---|---|
row | record | By default, records are separated by a newline |
column | field | By default, fields are separated by whitespace |
Note: the row and column separators in awk can be changed.
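1) Extracting rows
awk | meaning |
---|---|
NR==1 | Take out the first row |
NR>=1 && NR<=5 | Take out rows 1 through 5 |
/oldboy/ | Take out the rows that match oldboy |
/101/,/105/ | Take out the range from the row matching 101 to the row matching 105 |
Comparison operators | > < >= <= == != |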
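The row conditions can be tried on the output of `seq`. This is a minimal sketch with illustrative variable names; when a condition has no action, awk prints the whole row.

```shell
first=$(seq 200 | awk 'NR==1')                # the first row
range=$(seq 200 | awk 'NR>=3 && NR<=5')       # rows 3 through 5
matched=$(seq 12 | awk '/^1/')                # rows that start with 1
between=$(seq 200 | awk '/101/,/105/')        # from the row matching 101 to the row matching 105

printf '%s\n' "$between"
```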
2) Extracting columns
- -F specifies the field separator, that is, the end marker of each column (the default is a space, consecutive spaces, or a tab).
- $number takes out a column. Note: in awk, $ means "take out a column".
- $0 represents the content of the whole row, and $NF represents the last column.
[hadoop@db48 ~]$ #Take out the ip address of the first network card by selecting a specific row and column.
[hadoop@db48 ~]$ ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1454 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:5b:87:da brd ff:ff:ff:ff:ff:ff
    inet 10.10.110.33/16 brd 10.10.255.255 scope global eth0
       valid_lft forever preferred_lft forever
[hadoop@db48 ~]$ ip a s eth0 |awk 'NR==3'|awk -F"[ /]+" '{print $3}'
10.10.110.33
[hadoop@db48 ~]$ ip a s eth0 |awk -F"[ /]+" 'NR==3{print $3}'
10.10.110.33
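For readers without an eth0 interface, the same column rules can be reproduced on fixed strings. This sketch uses a made-up passwd-style line and a copy of the inet line shown above; the variable names are illustrative.

```shell
line='root:x:0:0:root:/root:/bin/bash'

user=$(printf '%s\n' "$line" | awk -F: '{print $1}')          # first column
last=$(printf '%s\n' "$line" | awk -F: '{print $NF}')         # last column
whole=$(printf '%s\n' "$line" | awk -F: '{print $0}')         # the entire row

# [ /]+ treats any run of spaces or slashes as one separator.
# The leading spaces produce an empty $1, so the address is $3.
ip=$(printf '%s\n' '    inet 10.10.110.33/16 brd 10.10.255.255' |
  awk -F'[ /]+' '{print $3}')

printf '%s\n' "$user" "$last" "$ip"
```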
4.3 Patterns
- What can serve as an awk condition:
- Comparison operators: > < >= <= == !=
- Regular expressions
- Range expressions
- The special conditions BEGIN and END
awk | -F"[ /]+" | 'NR==3{print $3}' |
---|---|---|
command | option | 'condition{action}' |
 | | 'pattern{action}' |
1) Comparison expressions: see the row-extraction section above
2) Regular expressions
- // supports extended regular expressions
- awk can match against a specific column, testing whether that column contains or does not contain a pattern
- ~ contains
- !~ does not contain
regular expression | awk usage |
---|---|
^ means begins with | A column begins with: $3~/^oldboy/ |
$ means ends with | A column ends with: $4~/lidao$/ |
#Find the rows in /etc/passwd whose third column starts with 2, and display the first, third and last columns
[hadoop@db48 ~]$ tail -10 /etc/passwd
tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
zsw:x:1000:1000::/home/zsw:/bin/bash
lxq:x:1001:1001::/home/lxq:/bin/bash
hadoop:x:1002:1002::/home/hadoop:/bin/bash
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/false
gj:x:1003:1003::/home/gj:/bin/bash
tmh:x:1004:1004::/home/tmh:/bin/bash
zzl:x:1005:1005::/home/zzl:/bin/bash
phy:x:1006:1006::/home/phy:/bin/bash
wbq:x:1007:1007::/home/wbq:/bin/bash
[hadoop@db48 ~]$ awk -F':' '$3~/^2/{print $1,$3,$NF}' /etc/passwd
daemon 2 /sbin/nologin
nscd 28 /sbin/nologin
mysql 27 /bin/false
3) Range expressions
- /start/,/end/ (commonly used)
- NR==1,NR==5 starts on the first line and ends on the fifth line, similar to sed -n '1,5p'
#Display the ip addresses within the specified time range (11:02:00 to 11:02:30)
awk '/11:02:00/,/11:02:30/{print $1}' access.log
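Both range forms can be tried on a few inline lines instead of a real access.log. The log layout here (ip, time, request) is a simplified assumption for illustration and not the format of any particular web server.

```shell
# Hypothetical log: ip address, time, request.
log=$(printf '%s\n' \
  '10.0.0.1 11:01:59 GET /' \
  '10.0.0.2 11:02:00 GET /a' \
  '10.0.0.3 11:02:15 GET /b' \
  '10.0.0.4 11:02:30 GET /c' \
  '10.0.0.5 11:02:31 GET /d')

# Range by pattern: from the row matching 11:02:00 to the row matching 11:02:30.
in_window=$(printf '%s\n' "$log" | awk '/11:02:00/,/11:02:30/{print $1}')
# Range by row number: rows 1 through 3.
by_number=$(printf '%s\n' "$log" | awk 'NR==1,NR==3{print $1}')

printf '%s\n' "$in_window"
```

A pattern range depends on both time stamps actually appearing in the log; if the end pattern is absent, the range runs to the last row.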
4) Special patterns BEGIN{} and END{}
pattern | meaning | application scenario |
---|---|---|
BEGIN{} | Its contents are executed before awk reads the file | 1) Simple statistics and calculation without reading any file (commonly used) 2) Adding a header before processing the file (good to know) 3) Defining awk variables (rarely used, because -v can do this) |
END{} | Its contents are executed after awk has read the file | 1) Statistics: the usual process is to calculate while reading and output the result in END (common) 2) Outputting the results collected in awk arrays (common) |
- Statistics and calculation with END
- Statistical methods:
statistical method | abbreviated form | application scenario |
---|---|---|
i=i+1 | i++ | Counting occurrences |
sum=sum+??? | sum+=??? | Summation |
array[]=array[]+1 | array[]++ | Classifying and counting with arrays |
Note: i and sum are variables.
#Count the number of empty lines in /etc/services
[hadoop@db48 ~]$ awk '/^$/' /etc/services |wc -l
17
[hadoop@db48 ~]$ awk '/^$/{i++}END{print i}' /etc/services
17
#Sum the output of seq 100 (1+2+3+...+100) with awk
[hadoop@db48 ~]$ seq 100 |awk '{sum=sum+$1}END{print sum}'
5050
#To watch the running total at each step
[hadoop@db48 ~]$ seq 100 |awk '{sum=sum+$1;print sum}END{print sum}'
4.4 awk arrays
- Log statistics:
- Counting occurrences: the number of times each ip appears, the number of times each status code appears, the number of times each system user is attacked, and the number of times each attacker's ip appears
- Cumulative summation: the traffic consumed by each ip
 | shell array | awk array | notes |
---|---|---|---|
form | array[0]=oldboy array[1]=lidao | array[0]="oldboy" array[1]="lidao" | |
use | echo ${array[0]} | print array[0] | |
Batch output of array contents | for i in ${array[*]}; do echo $i; done | for(i in array) print array[i] | awk's special array loop: the variable i receives each index of the array, so use array[i] to get the contents |
[hadoop@db48 ~]$ awk 'BEGIN{a[0]=oldboy;a[1]=lidao;print a[0],a[1]}'
#Bare words in awk are treated as variables. To use a literal string, wrap it in double quotation marks ""
[hadoop@db48 ~]$ awk 'BEGIN{a[0]="oldboy";a[1]="lidao";print a[0],a[1]}'
oldboy lidao
[hadoop@db48 ~]$ awk 'BEGIN{a[0]="oldboy";a[1]="lidao";for(i in a) print i,a[i]}'
0 oldboy
1 lidao
#Count the occurrences of each domain name and sort them in descending order
[hadoop@db48 tmp]$ cat url.txt
://jingyan.baidu.com/article/6079ad0e7744e869fe86db18.html
https://www.baidu.com/s?ie=UTF-8&wd=typoro%E8%B7%B3%E5%87%BA%E5%88%97%E8%A1%A8
https://www.baidu.com/s?ie=UTF-8&wd=ll%20%E6%96%87%E4%BB%B6%E9%A2%9C%E8%89%B2
https://blog.csdn.net/qq_29242127/article/details/77141485
https://blog.csdn.net/qq_29242127/article/details/77141485
https://blog.csdn.net/qq_29242127/article/details/77141485
#sort -r (reverse) -n (numeric) -k2 (second column)
[hadoop@db48 tmp]$ awk -F"[/]+" '{array[$2]++}END{for(i in array) print i,array[i]}' url.txt |sort -rnk2
blog.csdn.net 3
www.baidu.com 2
jingyan.baidu.com 1
4.5 awk loops and conditionals
shell C-style for loop | awk for loop | notes |
---|---|---|
for((i=1;i<=10;i++)); do echo $i; done | for(i=1;i<=10;i++) print i | The awk for loop is often used to loop through each field |
#Sum 1 to 100
[hadoop@db48 tmp]$ awk "BEGIN{for(i=1;i<=100;i++)sum+=i;print sum}"
5050
[hadoop@db48 tmp]$ awk "BEGIN{ for(i=1;i<=100;i++) sum+=i; print sum }"
5050
#With braces, print is part of the loop body, so every running total is printed; the last line is 5050
[hadoop@db48 tmp]$ awk "BEGIN{ for(i=1;i<=100;i++) {sum+=i; print sum} }"
...
5050
shell if statement | awk if statement | |
---|---|---|
if [ "$oldhuang" -eq 18 ]; then echo "take to dbj"; fi | if (condition) print "dbj" | |
if [ "$oldhuang" -eq 18 ]; then echo "take to dbj"; else echo "rest"; fi | if (condition) print "dbj"; else print "rest" | |
#Find the disks with utilization greater than 70%, and print Filesystem, Size, and Use%
[hadoop@db48 ~]$ df -h|awk -F"[ %]+" 'NR>1{if($5>70) print "Disk is not enough\t" $1,$2,$5}'
Disk is not enough /dev/vda1 20G 77
#Note: when awk has several conditions, the first can go in 'condition{action}', and the second generally uses if
#Interview question: find the words with fewer than 6 characters in the following sentence and display them
#I am oldboy teacher welcom to oldboy teacher class
[hadoop@db48 ~]$ echo I am oldboy teacher welcom to oldboy teacher class|awk '{ for(i=1;i<=NF;i++) if(length($i)<6) print $i }'
I
am
to
class
4.6 awk built-in variables
built-in variable | meaning |
---|---|
NR | Number of Records: the number of the current record (row) |
NF | Number of Fields: how many fields (columns) the current row has; $NF represents the last column |
FS | Field Separator: the end marker of each field; -F: is the same as -v FS=: |
OFS | Output Field Separator: what separates the columns when awk prints them (the default is a space) |
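The four variables can be seen together in one command. This is a minimal sketch on made-up colon-separated input; it sets FS with `-v` instead of `-F` to show that the two are equivalent.

```shell
# For each row print: row number, field count, first field, last field.
# OFS is inserted wherever print arguments are separated by commas.
out=$(printf '%s\n' 'a:b:c' 'd:e' |
  awk -v FS=: -v OFS='-' '{print NR, NF, $1, $NF}')

printf '%s\n' "$out"
```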
4.7 Summary
- gawk: GNU awk
- awk options: -F, -v
- awk execution process
- awk row and column extraction
- awk patterns: comparison, regular expression, range, special patterns
- awk arrays: statistical analysis of logs
- awk for loops and if conditionals
- Objectives:
  - access.log: count the number of occurrences of each ip and of each status code
  - secure: count the number of times each system user is attacked and the number of times each attacker's ip appears
  - Cumulative summation: count the traffic consumed by each ip
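The access.log objectives can be sketched with awk arrays as follows. The three-column layout (ip, status code, bytes sent) is a simplified assumption; in a real log the column numbers depend on the server's log format and must be adjusted. The secure log follows the same counting pattern once the user and ip columns are identified.

```shell
# Hypothetical log lines: ip address, status code, bytes sent.
log=$(printf '%s\n' \
  '10.0.0.1 200 512' \
  '10.0.0.2 404 128' \
  '10.0.0.1 200 1024' \
  '10.0.0.3 500 256' \
  '10.0.0.2 404 64' \
  '10.0.0.1 200 100')

# Counting: occurrences of each ip (column 1), most frequent first.
per_ip=$(printf '%s\n' "$log" |
  awk '{count[$1]++} END{for (ip in count) print ip, count[ip]}' | sort -rnk2)

# Counting: occurrences of each status code (column 2).
per_status=$(printf '%s\n' "$log" |
  awk '{count[$2]++} END{for (code in count) print code, count[code]}' | sort -rnk2)

# Cumulative summation: bytes (column 3) consumed by each ip.
traffic=$(printf '%s\n' "$log" |
  awk '{bytes[$1]+=$3} END{for (ip in bytes) print ip, bytes[ip]}' | sort -rnk2)

printf '%s\n' "$per_ip" "$per_status" "$traffic"
```

The output is piped through sort because the order in which `for (ip in count)` visits the indexes is not defined.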