21 Linux shell programming: the awk command; BEGIN and END; built-in variables NF, NR, FS; logical operations in awk; the difference between print and printf

Posted by filteredhigh on Wed, 05 Jan 2022 05:06:06 +0100

awk is also a great data processing tool.
awk is used to extract the columns that match a condition.
awk is much more powerful than cut; it is a small programming language in its own right.

awk command

Format:

awk 'Condition 1{Action 1} Condition 2{Action 2} ...' filename

Meaning:

awk is followed by a program in single quotation marks, made up of one or more condition{action} pairs. Much like a series of if statements in Java, action 1 is executed if condition 1 is met, action 2 if condition 2 is met, and so on.
awk can process the files named after the program, or read the standard output of a previous command through a pipe.
awk mainly processes the data in the fields of each row, and the default field separator is whitespace: one or more spaces or tabs.
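
As a minimal sketch of the condition{action} format, the following pipes three made-up lines into awk. The sample data (a name and a score per line) and the variable name result are assumptions for illustration only.

```shell
# Hypothetical input: a name and a score on each line.
# Each condition is tested against every line; its action runs when it holds.
result=$(printf 'alice 91\nbob 58\ncarol 77\n' |
  awk '$2 >= 60 {print $1, "pass"} $2 < 60 {print $1, "fail"}')
echo "$result"
```

Both pairs are checked for every line, so the two conditions here are written to be mutually exclusive.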

[userwin@MiWiFi-R3L-srv ~]$ df -h
 file system                 Capacity used available used% Mount point
/dev/mapper/centos-root   18G  2.0G   16G   12% /
devtmpfs                 479M     0  479M    0% /dev
tmpfs                    489M     0  489M    0% /dev/shm
tmpfs                    489M  6.7M  483M    2% /run
tmpfs                    489M     0  489M    0% /sys/fs/cgroup
/dev/sda1                497M  107M  391M   22% /boot
tmpfs                     98M     0   98M    0% /run/user/1000
# Try to get the second column of the output above with cut; the result is not what you might expect:
[userwin@MiWiFi-R3L-srv ~]$ df -h | cut -d " " -f2







# The file system column is followed by several spaces. cut treats every single space as a delimiter, so field 2 is empty. awk solves this problem.
[userwin@MiWiFi-R3L-srv ~]$ df -h | awk '{print $2}'
Capacity
18G
479M
489M
489M
489M
497M
98M
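
The difference can be reproduced on a single line without running df. This sketch reuses the /dev/sda1 row from the output above; the variable names and the tr -s workaround are additions for illustration, not part of the original walkthrough.

```shell
# One row copied from the df output above, with runs of spaces between columns.
line='/dev/sda1                497M  107M  391M   22% /boot'

# cut treats every single space as a delimiter, so field 2 is empty.
with_cut=$(printf '%s\n' "$line" | cut -d ' ' -f2)

# Squeezing repeated spaces first lets cut find the real second column.
with_tr=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f2)

# awk's default splitting treats any run of blanks as one separator.
with_awk=$(printf '%s\n' "$line" | awk '{print $2}')

echo "cut: [$with_cut] tr+cut: [$with_tr] awk: [$with_awk]"
```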


Get the second and sixth columns: df -h | awk '{print $2 "\t" $6}'
Here the opening quotation mark is followed directly by {print ...}. The condition is omitted, so it defaults to true and the action runs for every row.
print is the most commonly used action in awk: it outputs the listed field data. The input fields are separated by spaces or tabs.

[userwin@MiWiFi-R3L-srv ~]$ df -h | awk '{print $2 "\t" $6}'
Capacity	Mount point
18G	/
479M	/dev
489M	/dev/shm
489M	/run
489M	/sys/fs/cgroup
497M	/boot
98M	/run/user/1000

The difference between print and printf

In awk, print appends a newline after its output automatically. printf does not, so the \n has to be added manually.
Let's look at the following example:

[userwin@MiWiFi-R3L-srv ~]$ df -h | awk '{print $2 "\t" $6}'
Capacity	Mount point
18G	/
479M	/dev
489M	/dev/shm
489M	/run
489M	/sys/fs/cgroup
497M	/boot
98M	/run/user/1000
[userwin@MiWiFi-R3L-srv ~]$ df -h | awk '{printf $2 "\t" $6}'
Capacity	Mount point18G	/479M	/dev489M	/dev/shm489M	/run489M	/sys/fs/cgroup497M	/boot98M	/run/user/1000
[userwin@MiWiFi-R3L-srv ~]$ df -h | awk '{printf $2 "\t" $6 "\n"}'
Capacity	Mount point
18G	/
479M	/dev
489M	/dev/shm
489M	/run
489M	/sys/fs/cgroup
497M	/boot
98M	/run/user/1000
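
One caveat about the printf examples above: the first argument of printf is a format string, so field data passed there would be misread if it ever contained a % character. A commonly used safer form passes an explicit format and the fields as separate arguments. The sketch below assumes two shortened rows modeled on the df output above; the variable name result is illustrative.

```shell
# Shortened sample rows: capacity, used, available, used%, mount point.
# The shell's printf needs %% to emit a literal % in the sample data.
result=$(printf '18G 2.0G 16G 12%% /\n497M 107M 391M 22%% /boot\n' |
  awk '{printf "%s\t%s\n", $4, $5}')
echo "$result"
```

With the format written out, the % in the data is printed as ordinary text.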

Get the number before the % sign in column 5 of the sda1 row

# View the disk usage
[userwin@MiWiFi-R3L-srv ~]$ df -h 
file system                 Capacity used available used% Mount point
/dev/mapper/centos-root   18G  2.0G   16G   12% /
devtmpfs                 479M     0  479M    0% /dev
tmpfs                    489M     0  489M    0% /dev/shm
tmpfs                    489M  6.7M  483M    2% /run
tmpfs                    489M     0  489M    0% /sys/fs/cgroup
/dev/sda1                497M  107M  391M   22% /boot
tmpfs                     98M     0   98M    0% /run/user/1000
# Get sda1 row
[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" 
/dev/sda1                497M  107M  391M   22% /boot
# Get column 5
[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" | awk '{print $5}'
22%
# Use cut to extract the value before the % sign
[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" | awk '{print $5}' |cut -d '%' -f1
22
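
The grep, awk, and cut steps above can also be folded into a single awk call. This is only one possible alternative, sketched on the sda1 row copied from the output above; in real use the input would come from df -h. The variable name usage is an assumption.

```shell
# /sda1/ is a regular-expression condition: the action runs only on matching rows.
# sub() removes the first % from field 5 before it is printed.
usage=$(printf '/dev/sda1  497M  107M  391M  22%% /boot\n' |
  awk '/sda1/ {sub(/%/, "", $5); print $5}')
echo "$usage"
```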

Built-in variables NF, NR, FS in awk

In awk, each field of a row has a variable name: $1, $2, and so on. Each field corresponds to a column; $1 refers to the first column.
$0 means the whole row of data.

Variable name	Meaning
NF	Total number of fields in the current row ($0)
NR	Number of the row that awk is currently processing
FS	The current field separator; whitespace (spaces or tabs) by default
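
A short sketch of the three variables together, run on the two passwd lines used in this article. $NF (the last field of the row) is standard awk but is not discussed in the original text; the variable name result is illustrative.

```shell
# FS is set to ":" before any row is read.
# For each row, print its number (NR), its field count (NF), and its last field ($NF).
result=$(printf 'root:x:0:0:root:/root:/bin/bash\nuserwin:x:1000:1000:userwin:/home/userwin:/bin/bash\n' |
  awk 'BEGIN {FS=":"} {print NR, NF, $NF}')
echo "$result"
```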

# View the lines containing /bin/bash in the passwd file
[userwin@MiWiFi-R3L-srv ~]$ cat /etc/passwd | grep "/bin/bash"  
root:x:0:0:root:/root:/bin/bash
userwin:x:1000:1000:userwin:/home/userwin:/bin/bash
# Use a custom separator in awk to get columns 1 and 3
[userwin@MiWiFi-R3L-srv ~]$ cat /etc/passwd | grep "/bin/bash" | awk '{FS=":"} {print $1 "\t" $3}'
root:x:0:0:root:/root:/bin/bash	
userwin	1000

Why is the first row not split?

Look at the execution sequence of awk. It first reads in the first row and splits it with the default separator, and only then executes the action FS=":". The new separator therefore takes effect from the second row on, which is why the first row is printed whole.

To fix this, set FS in a BEGIN block, which runs before the first row is read, as follows

[userwin@MiWiFi-R3L-srv ~]$ cat /etc/passwd | grep "/bin/bash" |\
 awk 'BEGIN{FS=":"} {print $1 "\t" $3}'
root	0
userwin	1000
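
The same separator can also be given on the command line with the standard -F option, which likewise takes effect before the first row is read. The sketch below uses the two passwd lines shown above as fixed input; the variable name result is illustrative.

```shell
# -F: sets the field separator to ":" before any input is processed.
result=$(printf 'root:x:0:0:root:/root:/bin/bash\nuserwin:x:1000:1000:userwin:/home/userwin:/bin/bash\n' |
  awk -F: '{print $1 "\t" $3}')
echo "$result"
```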

Execution sequence of awk

1. Read in the first row and fill its data into the variables $0, $1, $2, and so on;
2. Check each "condition" to decide whether its "action" needs to be executed;
3. Execute all the actions whose conditions are met;
4. If there are more rows, repeat steps 1 to 3 until all data has been read.
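
The sequence can be traced with a two-rule program. The input lines x and y and the variable name trace are made up for illustration.

```shell
# The first rule has no condition, so it runs for every row.
# The second rule runs only when the current row number is 2.
trace=$(printf 'x\ny\n' |
  awk '{print NR ": read " $0} NR == 2 {print NR ": second rule also matched"}')
echo "$trace"
```

Row 1 triggers only the first rule; row 2 triggers both, in the order they are written.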

BEGIN in awk

The BEGIN action runs once, before any data is processed, so it can add content ahead of the output

[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" | awk 'BEGIN{print "The utilization of sda1 is:"}{print $5}' 
The utilization of sda1 is:
22%
[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" | awk 'BEGIN{printf "The utilization of sda1 is:"}{print $5}' 
The utilization of sda1 is:22%

END in awk

The END action runs once, after all data has been processed, so it can add content after the output

[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" | awk 'BEGIN{print "The utilization of sda1 is:"} END{print "Command execution completed!!"}{print $5}' 
The utilization of sda1 is:
22%
Command execution completed!!
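
BEGIN and END are often combined with a per-row action to print a header and a summary. A minimal sketch, assuming three capacity values taken from the df output above as input; the counter name count and the variable name report are illustrative.

```shell
# BEGIN prints a header once, the middle action runs per row, END prints the total.
report=$(printf '18G\n479M\n497M\n' |
  awk 'BEGIN {print "Capacity"} {count++; print "  " $1} END {print count " rows"}')
echo "$report"
```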

Logical operations in awk

Operator	Meaning
>	Greater than
<	Less than
>=	Greater than or equal to
<=	Less than or equal to
==	Equal to
!=	Not equal to

[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" | awk '{print $5}' | awk 'BEGIN{FS="%"} $1>=20{print "Disk utilization over 20%"}'
Disk utilization over 20%
[userwin@MiWiFi-R3L-srv ~]$ df -h | grep "sda1" | awk '{print $5}' | awk 'BEGIN{FS="%"} $1>=80{print "Disk utilization over 80%"} $1<80{print "Disk utilization does not exceed 80%"}'
Disk utilization does not exceed 80%
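
The comparison can also be applied to every file system in one pass. This sketch relies on awk converting a string such as 22% to the number 22 when 0 is added to it. The three rows are shortened copies of rows from the df output above; the threshold of 20 and the names use and alerts are assumptions for illustration.

```shell
# $5 + 0 forces a numeric value, so the comparison is numeric rather than textual.
alerts=$(printf '%s\n' \
  '/dev/mapper/centos-root 18G 2.0G 16G 12% /' \
  'tmpfs 489M 6.7M 483M 2% /run' \
  '/dev/sda1 497M 107M 391M 22% /boot' |
  awk '{use = $5 + 0} use >= 20 {print $6 " is at " use "%"}')
echo "$alerts"
```

Comparing the raw field (for example 9% against 80) would fall back to a string comparison and could give the wrong answer, which is why the value is converted first.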

Topics: shell awk