awk concept and usage

Posted by blockage on Tue, 28 Dec 2021 12:32:55 +0100

1, awk overview

1. How awk works

awk reads text line by line, splits each line into fields using spaces or tabs as the default separator, saves each field in a built-in variable, and executes the editing commands according to the pattern or condition.
sed usually processes a whole line at a time, whereas awk tends to divide a line into multiple "fields" before processing. awk also reads input line by line, and the results are displayed with the print statement. In an awk command you can use the logical operators "&&" (and), "||" (or) and "!" (not). It can also perform simple arithmetic: +, -, *, /, % and ^ stand for addition, subtraction, multiplication, division, remainder and power respectively.
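
As a small illustration of the operators listed above, the following sketch runs on three made-up lines of input (the names and numbers are sample data, not from the original article). It selects lines with a combined condition, negates that condition, and evaluates a few arithmetic expressions in a BEGIN block.

```shell
# Hypothetical sample data: a name and an age separated by a space
data='tom 18
amy 25
bob 42'

# && combines two conditions; ! negates a condition
between=$(printf '%s\n' "$data" | awk '($2 >= 20) && ($2 <= 40) {print $1}')
outside=$(printf '%s\n' "$data" | awk '!(($2 >= 20) && ($2 <= 40)) {print $1}')

# Arithmetic: remainder, power, addition, multiplication
math=$(awk 'BEGIN {print 7 % 3, 2 ^ 10, 7 + 3, 7 * 3}')

printf '%s\n' "$between" "$outside" "$math"
```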

2. awk usage format

awk [options] 'pattern or condition {action}' file...
awk -f script-file file...

3. Common built-in variables of awk (can be used directly)

Built-in variable   Function
FS                  Field separator. Specifies the separator between the fields of each line of text; the default is spaces or tabs. Same effect as the "-F" option
NF                  Number of fields in the line currently being processed
NR                  Line number (ordinal number) of the line currently being processed
$0                  Entire content of the line currently being processed
$n                  The nth field (column n) of the line currently being processed
FILENAME            Name of the file being processed
RS                  Record (line) separator. When awk reads data, it cuts the input into records according to the definition of RS and processes one record at a time. The default is "\n"
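
To make the variables above concrete, here is a minimal sketch that prints several of them for each line of a two-line sample (the input is made up for illustration). $NF is the last field, because NF holds the number of fields.

```shell
# Hypothetical sample input with a different number of fields per line
data='a b c
d e'

# NR = line number, NF = number of fields, $1 = first field, $NF = last field
vars=$(printf '%s\n' "$data" | awk '{print NR, NF, $1, $NF}')
printf '%s\n' "$vars"
```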

2, awk usage demo

1. Output by line

awk '{print}' file name    or    awk '{print $0}' file name
Output all content
[root@localhost ~]# cat 1.txt		#View the file
1
2
3
4
[root@localhost ~]# awk '{print}' 1.txt 		#Output file content
1
2
3
4
[root@localhost ~]# awk '{print $0}' 1.txt 		#Output file content
1
2
3
4

2. Output the contents of specified lines

awk 'NR==n,NR==m {print}' file name
Output the contents of lines n to m
[root@localhost ~]# awk 'NR==2,NR==3{print}' 1.txt 
2
3

awk '(NR>=n)&&(NR<=m) {print}' file name
Output the contents of lines n to m

[root@localhost ~]# awk '(NR>=2)&&(NR<=3){print}' 1.txt 
2
3

awk 'NR==n || NR==m {print}' file name
Output the contents of lines n and m

[root@localhost ~]# awk 'NR==2 || NR==3{print}' 1.txt 
2
3

3. Output specific content (combined with regular expressions)

Output even or odd lines (set the remainder to 0 for even lines, or to 1 for odd lines)
[root@localhost ~]# awk '(NR%2)==0{print}' 1.txt
2
4
[root@localhost ~]# awk '(NR%2)==1{print}' 1.txt
1
3

Output lines that begin or end with a specific string
[root@localhost ~]# awk '/^2/ {print}' 1.txt 		#Output content starting with 2
2
[root@localhost ~]# awk '/3$/ {print}' 1.txt		#Output content ending in 3 
3
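
The {print} action in the commands above can be left out: a pattern with no action prints the matching line. A minimal sketch on made-up input:

```shell
# Hypothetical sample input
data='10
21
32
43'

# A pattern without an action behaves like pattern {print}
even_lines=$(printf '%s\n' "$data" | awk 'NR % 2 == 0')
starts_with_2=$(printf '%s\n' "$data" | awk '/^2/')
ends_with_3=$(printf '%s\n' "$data" | awk '/3$/')

printf '%s\n' "$even_lines" "$starts_with_2" "$ends_with_3"
```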

4. BEGIN and END

awk 'BEGIN {x=0}; /^a/ {x++}; END {print x}' file name
Count the number of lines in the file that begin with the string a
The BEGIN pattern specifies actions that are executed before awk processes the given text. awk then processes the text, and finally executes the actions specified in the END pattern. Statements such as printing results are usually placed in the END{} block.

Example:

[root@localhost ~]# awk 'BEGIN{x=0}; /^2/ {x++}; END{print x}' 1.txt 	#Count the number of lines starting with 2
1
[root@localhost ~]# awk 'BEGIN{x=0}; /3$/ {x++}; END{print x}' 1.txt 	#Count the number of lines ending in 3
1
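
The same BEGIN / main / END structure works for any running total. The sketch below, on made-up numbers, initialises a sum in BEGIN, adds the first field of every line, and prints the result in END together with NR, which by then holds the total number of lines read.

```shell
# Hypothetical sample data: one number per line
data='3
5
10'

# BEGIN runs before any input is read, END runs after the last line
summary=$(printf '%s\n' "$data" | awk 'BEGIN {sum = 0} {sum += $1} END {print "lines=" NR, "sum=" sum}')
printf '%s\n' "$summary"
```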

5. awk FS / -F

1. awk -F ":" '{print $n}' file name
Using : as the separator, output the nth field of each line
[root@localhost ~]# cat 2.txt 
1:2:3:4:5
a,b,c,d,e

[root@localhost ~]# awk -F ':' '{print $2}' 2.txt 		#Print the second field, using : as the separator
2
[root@localhost ~]# awk -F ',' '{print $2}' 2.txt 		#Print the second field, using , as the separator

b

awk -F ":" '{print $n,$m}' file name
Using : as the separator, output the nth and mth fields of each line

[root@localhost ~]# awk -F ':' '{print $2,$4}' 2.txt 		#Output the 2nd and 4th fields with: as the separator
2 4
 

awk -F ":" '$n < m {print $n}' file name
Using : as the separator, output the nth field when it is less than m

[root@localhost ~]# awk -F ':' '$2<4 {print $2}' 2.txt 
2

[root@localhost ~]# awk -F ':' '$2>4 {print $2}' 2.txt 		#The second field (2) is not greater than 4, so nothing is printed
[root@localhost ~]# 

awk -F ":" '!($n < m) {print}' file name
The exclamation point "!" negates the condition

[root@localhost ~]# awk -F ':' '!($2>4) {print $2}' 2.txt 		#Using : as the separator, print the second field when it is not greater than 4
2

[root@localhost ~]# awk -F ':' '!($2>4) {print}' 2.txt 			#Replace {print $2} with {print} to output the whole line
1:2:3:4:5
a,b,c,d,e
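
In the 2.txt examples above, the second line contains no colon, so with -F ':' it is a single field and $2 is empty. A minimal sketch of one way to skip such lines is to test NF first; it also shows that setting FS in a BEGIN block behaves like -F. The NF >= 2 guard is an addition for illustration and is not part of the original commands.

```shell
# Same two-line shape as the 2.txt example: only the first line contains colons
data='1:2:3:4:5
a,b,c,d,e'

# -F ':' and BEGIN {FS = ":"} set the same field separator
with_flag=$(printf '%s\n' "$data" | awk -F ':' 'NF >= 2 {print $2}')
with_begin=$(printf '%s\n' "$data" | awk 'BEGIN {FS = ":"} NF >= 2 {print $2}')

printf '%s\n' "$with_flag" "$with_begin"
```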

6. Ternary operator

awk -F ":" '{max=($n>=$m) ? $n : $m; {print max}}' file name
Ternary operator. Using : as the separator, if the value of the nth field is greater than or equal to the value of the mth field, the nth field is assigned to max; otherwise the mth field is assigned to max.

In short, the larger of the two values is assigned to max.

[root@localhost ~]# awk -F ':' '{max=($2>$4)? $2:$4; {print max}}' 2.txt 		#Assign the larger of $2 and $4 to max
4
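
For comparison, the sketch below applies the ternary form to a few made-up lines and then repeats the same logic with an explicit if/else, which gives the same result. The sample data is hypothetical.

```shell
# Hypothetical sample data: two colon-separated numbers per line
data='3:9
8:2
5:5'

# condition ? value_if_true : value_if_false
ternary=$(printf '%s\n' "$data" | awk -F ':' '{max = ($1 >= $2) ? $1 : $2; print max}')

# The same decision written with if/else
ifelse=$(printf '%s\n' "$data" | awk -F ':' '{if ($1 >= $2) max = $1; else max = $2; print max}')

printf '%s\n' "$ternary" "$ifelse"
```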

7. Specific fields contain specific characters

1. awk -F ":" '$n~ "a" {print $m}' file name
Using : as the separator, output the mth field of the lines whose nth field contains the string a
[root@localhost ~]# awk -F ':' '$7~ "/bash" {print $0}' /etc/passwd			#The string to match should be in double quotation marks, such as "/bash"
root:x:0:0:root:/root:/bin/bash
qiao:x:1000:1000:qiao:/home/qiao:/bin/bash

2. awk -F ":" '($n~ "a") && (NF==m) {print}' file name
Using : as the separator, output the lines whose nth field contains the string a and that have m fields

If the seventh field of a line (with : as the separator) contains "/bash" and the line has 7 fields, the first field of the line is output


[root@localhost ~]# awk -F ':' '($7~ "/bash") && (NF==7) {print $1}' /etc/passwd
root
qiao

3. awk -F ":" '($n != "a") && ($m != "b") {print}' file name
Using : as the separator, output the lines whose nth field is not the string a and whose mth field is not the string b
[root@localhost ~]# awk -F ':' '($7!="/bin/bash") && ($7!="/sbin/nologin") {print}' /etc/passwd
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
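
The three kinds of field test used in this section can be put side by side. The sketch below uses a made-up passwd-style sample rather than the real /etc/passwd: ~ checks whether the field contains a match, !~ is its negation, and == (like != above) compares the whole field.

```shell
# Hypothetical passwd-style sample data (not read from /etc/passwd)
data='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync'

# ~ : field contains a match for the pattern
bash_users=$(printf '%s\n' "$data" | awk -F ':' '$7 ~ "/bash" {print $1}')
# !~ : field does not match (here: shell path does not start with /bin/)
not_bin=$(printf '%s\n' "$data" | awk -F ':' '$7 !~ "^/bin/" {print $1}')
# == : the whole field equals the string
exact=$(printf '%s\n' "$data" | awk -F ':' '$7 == "/bin/sync" {print $1}')

printf '%s\n' "$bash_users" "$not_bin" "$exact"
```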

8. Combining with the pipe symbol |

1. Count the total number of fields using a pipe

[root@localhost ~]# echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

[root@localhost ~]# echo $PATH | awk 'BEGIN{RS=":"}; END{print NR}'	#Set RS to : so that each colon ends a record, then print the number of records read, which equals the number of colon-separated fields in the original line
5
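
A minimal sketch of the same RS technique on a fixed, made-up PATH-like string (used instead of the real $PATH so the result is predictable). Besides counting the records, it picks out one of them by its record number.

```shell
# Hypothetical PATH-like value, used instead of the real $PATH
sample_path='/usr/local/bin:/usr/bin:/bin'

# With RS=":" every colon ends a record, so each directory is one record
count=$(printf '%s\n' "$sample_path" | awk 'BEGIN {RS = ":"} END {print NR}')
second=$(printf '%s\n' "$sample_path" | awk 'BEGIN {RS = ":"} NR == 2 {print $0}')

printf '%s\n' "$count" "$second"
```

Note that the last record still carries the trailing newline of the input line; printing $1 instead of $0 avoids it, because the default field splitting discards surrounding whitespace.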

2. View the current memory usage percentage
Select the line containing Mem:, whose fields are separated by spaces or tabs by default. The percentage is calculated as used / (used + free).

[root@localhost ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1984         783         382          10         818         958
Swap:          4095           0        4095
[root@localhost ~]# free -m | awk '/Mem:/ {print int($3/($3+$4)*100)"%"}'
67%
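
The calculation can be checked without running free by feeding awk the sample output shown above. The sketch also shows printf as an alternative to int() when a decimal place is wanted; the printf variant is an illustration and not part of the original command.

```shell
# Sample text copied from the free -m output shown above
sample='              total        used        free      shared  buff/cache   available
Mem:           1984         783         382          10         818         958
Swap:          4095           0        4095'

# On the Mem: line, $3 is "used" and $4 is "free"
pct_int=$(printf '%s\n' "$sample" | awk '/^Mem:/ {print int($3 / ($3 + $4) * 100) "%"}')
pct_dec=$(printf '%s\n' "$sample" | awk '/^Mem:/ {printf "%.1f%%\n", $3 / ($3 + $4) * 100}')

printf '%s\n' "$pct_int" "$pct_dec"
```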

9. awk and array

1. Build an array with awk
awk 'BEGIN {a[0]=10; a[1]=20; print a[0]}'
awk 'BEGIN {a[0]=10; a[1]=20; print a[1]}'
[root@localhost ~]# awk 'BEGIN {a[0]=10; a[1]=20; print a[0]}'
10
[root@localhost ~]# awk 'BEGIN {a[0]=10; a[1]=20; print a[1]}'
20

awk 'BEGIN {a["ab"]=10; a["xyz"]=20; print a["ab"]}'

[root@localhost ~]# awk 'BEGIN {a["abc"]=10; a["xyz"]=20; print a["abc"]}'
10

awk 'BEGIN {a[0]=10; a[1]=20; a[2]=30; for (i in a) {print i,a[i]}}'

[root@localhost html]# awk 'BEGIN {a[0]=10; a[1]=20; a[2]=30; for (i in a) {print i,a[i]}}'
0 10
1 20
2 30

The commands in BEGIN are executed only once.
Besides numbers, the subscripts of an awk array can also be strings. Strings must be enclosed in double quotes.
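
Two further array operations fit naturally here: testing whether a subscript exists with the in operator, and removing an element with delete. The following is a small sketch under the assumption that only standard awk features are wanted; the subscripts are the same kind of quoted strings as above.

```shell
# Build an array with string subscripts, test membership, then delete a key
result=$(awk 'BEGIN {
    a["ab"] = 10; a["xyz"] = 20
    if ("ab" in a) print "ab is set to " a["ab"]
    delete a["ab"]
    if (!("ab" in a)) print "ab was deleted"
    n = 0
    for (k in a) n++          # count the remaining subscripts
    print "keys left: " n
}')
printf '%s\n' "$result"
```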

2. awk loop traversal

[root@localhost ~]# cat 1.txt 
1
2
3
4
[root@localhost ~]# awk '{a[1]++} END{for(i in a) {print a[i]}}' 1.txt		#a[1] starts at 0 and a[1]++ is executed once for every line read, so in END a[1] equals the number of lines in the file
4

Use awk to find the repeated lines in a file and the number of times each one occurs

[root@localhost ~]# cat 1.txt 
1
2
1
2
3
4
4
[root@localhost ~]# awk '{a[$1]++} END{for(i in a) {print a[i],i}}' 1.txt
2 4
2 1
2 2
1 3

Note: each element a[...] starts at 0 and becomes 1 after the first a[...]++. The final value of each element is determined by how many lines of the text have that value as their first field. The commands in END are executed only after the text has been read line by line.
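
Building on the counting command above, the sketch below keeps only the lines that occur more than once. Because for (i in a) visits subscripts in no guaranteed order, the result is passed through sort to make it stable; the a[i] > 1 filter and the sort step are additions for illustration.

```shell
# Same sample lines as the 1.txt example above
data='1
2
1
2
3
4
4'

# Count every distinct line, then print only those seen more than once
dups=$(printf '%s\n' "$data" | awk '{a[$1]++} END {for (i in a) if (a[i] > 1) print a[i], i}' | sort -k2,2n)
printf '%s\n' "$dups"
```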