awk common syntax

Posted by deemurphy on Thu, 21 Oct 2021 21:53:17 +0200

The awk command prints, summarizes, and otherwise processes the records in a file that match a pattern.
awk command syntax:
awk 'pattern { action }' file

While awk reads the file, the variable $0 holds the entire line, the variable $1 holds the contents of the first field, $2 holds the contents of the second field,....

The variable NF holds the number of fields in the current line, so $NF holds the contents of the last field, $(NF-1) holds the contents of the second-to-last field, and so on. The parentheses matter: $NF-1 means the value of the last field minus 1.

The variable NR holds the line number.
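The field variables described above can be tried on a small sample. This is a minimal sketch: the helper name describe_fields and the sample input are made up for illustration and are not part of the original article.

```shell
# Hypothetical helper: report NR, NF, and selected fields for each line on stdin.
describe_fields() {
    awk '{ print "line " NR ": " NF " fields, first=" $1 ", last=" $NF ", second-to-last=" $(NF-1) }'
}

printf 'a b c\nd e f g\n' | describe_fields

# $NF-1 is arithmetic on the last field; $(NF-1) selects the second-to-last field.
echo '10 20 30' | awk '{ print $NF-1, $(NF-1) }'
```

The first command prints one summary line per input line; the second prints 29 20, which shows why the parentheses are needed.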

1. Print Fields

1) Print fields 2, 1, and 3 in that order.

awk '/^[^#]/{print $2, $1, $3}' /etc/fstab
/ /dev/mapper/rhel-root xfs
/boot UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b xfs
/u01 /dev/mapper/rhel-home xfs
swap /dev/mapper/rhel-swap swap
/dev/shm tmpfs tmpfs

2) Print the number of fields per line.

awk '/^[^#]/{print "Record:", NR, "has", NF, "fields."}' /etc/fstab
Record: 9 has 6 fields.
Record: 10 has 6 fields.
Record: 11 has 6 fields.
Record: 12 has 6 fields.
Record: 13 has 6 fields.

3) Print the length of each field.

awk -F: '/oracle/{for(i=1;i<=NF;i++) print "Field",$i, "has", length($i), "letters."}' /etc/passwd
# Write the results to the file /tmp/file.txt
awk -F: '{print "Field 1", "has", length($1), "letters." > "/tmp/file.txt"}' /etc/passwd

2. Print fields after matching a pattern

1) Print selected fields of the lines that match swap.

awk '/swap/{print $2, $1, $3}' /etc/fstab
swap /dev/mapper/rhel-swap swap

2) Print lines that match a defined case-number format.

echo "1-2345598" | awk '/^[0-9]+$/ || /^[0-9]-[0-9]+$/ {print}'

3) Print lines that match any of several alternatives joined with | (or).

 netstat -an  |awk '/CLOSE_WAIT|ESTABLISHED|TIME_WAIT|LISTEN/ {print $0}'

4) Match and print logs between two dates.

awk '/Sep 22/{a=1}a{b=b?b"\n"$0:$0;if(b~/Sep 23/){print b;b=""}}' messages-20210926

# To reference shell variables, close the single quotes and place the whole /pattern/ inside double quotes
date1='Sep 22'
date2='Sep 23'
awk ''"/$date1/"'{a=1}a{b=b?b"\n"$0:$0;if(b~'"/$date2/"'){print b;b=""}}' messages-20210926

Or close and reopen the single quotes directly around the double-quoted variable ('"$var"') to reference it.

awk '/'"$date1"'/{a=1}a{b=b?b"\n"$0:$0;if(b~/'"$date2"'/){print b;b=""}}' messages-20210926

The sed command has similar functionality.

sed -n '/'"$date1"'/{p;:1;n;:2;/'"$date2"'/{p;b1};N;b2}' messages-20210926
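A different technique for the same task is awk's range pattern (start, end), combined with -v to pass the shell variables. This is only a sketch: the helper name between and the sample log lines are hypothetical. Unlike the buffered version above, a range pattern stops at the first line that matches the end pattern, so later lines with the same end date are not printed.

```shell
# Hypothetical helper: print lines from the first match of $1
# through the next line that matches $2 (inclusive).
between() {
    awk -v start="$1" -v end="$2" '$0 ~ start, $0 ~ end'
}

printf 'Sep 21 a\nSep 22 b\nSep 22 c\nSep 23 d\nSep 24 e\n' | between 'Sep 22' 'Sep 23'
```

With the sample input this prints the two Sep 22 lines and the Sep 23 line.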

3. BEGIN and END statements

A BEGIN statement holds actions to be performed before the file is read and processed, typically printing a report title or changing the value of a built-in variable.
An END statement holds actions to be performed after all records of the file have been read and processed, usually printing a summary or a sum of values.

awk 'BEGIN {print "Begin..."} ;/#[H-L]o/ {print $0}; /#P[e-o]/ {print $0} ;END{ print "The End"}' /etc/ssh/sshd_config
Begin...
#Port 22
#HostKey /etc/ssh/ssh_host_dsa_key
#LogLevel INFO
#LoginGraceTime 2m
#HostbasedAuthentication no
#PermitEmptyPasswords no
#PermitTTY yes
#PermitUserEnvironment no
#PidFile /var/run/sshd.pid
#PermitTunnel no
The End
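The typical use described above (a title in BEGIN, a total in END) might look like the following sketch. The helper name report_total and the sample data are assumptions made for illustration.

```shell
# Hypothetical helper: print a header, every name/amount pair, and the sum of column 2.
report_total() {
    awk 'BEGIN { print "name amount" }      # runs once, before any input is read
         { sum += $2; print $1, $2 }        # runs for every input line
         END { print "total", sum }'        # runs once, after the last line
}

printf 'a 10\nb 20\nc 12\n' | report_total
```

The last line of output is total 42.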

4. Separators

1) Input Separator

The variable FS holds the input field delimiter; by default, fields are separated by spaces or tabs. Use the -F option of the awk command or set the FS variable in a BEGIN statement to change the input field delimiter.

If the input uses several field delimiters at once, either list them inside brackets [] within double quotation marks ("[ :]"), or join them with the | symbol within double quotation marks (" |:").

# Get the highest user ID among the system users

awk -F: '{print $3}' /etc/passwd |sort -n |tail -1

Examples of multiple input delimiters:

cat file2
root /:root:/bin/sh
grid /u01/app:oinstall:/bin/bash
oracle /u01/app/oracle:dba:/bin/ksh

awk -F"[ :]" '{print $1, $3}' file2
root root
grid oinstall
oracle dba

awk 'BEGIN{FS=" |:"};{print $1, $3}' file2
root root
grid oinstall
oracle dba

awk 'BEGIN{FS="[ :]"};{print $1, $3}' file2
root root
grid oinstall
oracle dba

2) Output Delimiter

The variable OFS holds the output field delimiter, which defaults to a space.

No Output Delimiter Example:

awk 'BEGIN{FS="[ :]"};{print $1 $2 $3 $4}' file2
root/root/bin/sh
grid/u01/appoinstall/bin/bash
oracle/u01/app/oracledba/bin/ksh

Example output delimiter with space as default:

awk 'BEGIN{FS="[ :]"};{print $1,$2,$3,$4}' file2
root / root /bin/sh
grid /u01/app oinstall /bin/bash
oracle /u01/app/oracle dba /bin/ksh

Example output delimiter with Tab:

awk 'BEGIN{FS="[ :]"};{print $1"\t"$2"\t"$3"\t"$4}' file2
root    /       root    /bin/sh
grid    /u01/app        oinstall        /bin/bash
oracle  /u01/app/oracle dba     /bin/ksh

awk 'BEGIN{FS="[ :]";OFS="\t"};{print $1,$2,$3,$4}' file2
root    /       root    /bin/sh
grid    /u01/app        oinstall        /bin/bash
oracle  /u01/app/oracle dba     /bin/ksh
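One detail worth knowing: OFS is applied when fields are printed separated by commas, or when $0 is rebuilt after a field assignment. Printing an unmodified $0 keeps the original delimiters. The sketch below uses a hypothetical helper name, retab, and a made-up input line.

```shell
# Hypothetical helper: convert colon-separated input to tab-separated output.
# Assigning $1 to itself forces awk to rebuild $0 using OFS.
retab() {
    awk 'BEGIN { FS = ":"; OFS = "\t" } { $1 = $1; print }'
}

printf 'root:x:0\n' | retab

# Without the assignment, $0 is printed unchanged (still colon-separated).
printf 'root:x:0\n' | awk 'BEGIN { FS = ":"; OFS = "\t" } { print }'
```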

5. Variable NR

The variable NR counts the number of rows read in from the file, and the value of the variable NR is updated for each row read.

Display the number of lines contained in the file after reading it:

grep -E -v "^#|^[ ]*$" /etc/fstab |awk '{print $2, $1} END {print "The number of lines of fstab is " NR}'
/ /dev/mapper/rhel-root
/boot UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b
/u01 /dev/mapper/rhel-home
swap /dev/mapper/rhel-swap
/dev/shm tmpfs
The number of lines of fstab is 5

Print fields of the rows after line 8:

awk 'NR>8 {print $2, $1, $3}' /etc/fstab
/ /dev/mapper/rhel-root xfs
/boot UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b xfs
/u01 /dev/mapper/rhel-home xfs
swap /dev/mapper/rhel-swap swap
/dev/shm tmpfs tmpfs

Count the number of fields in each row after line 8:

awk 'NR>8 {print NF}' /etc/fstab
6
6
6
6
6

6. Use external variables and define internal variables

As shown in Section 2, closing the single quotation marks around a double-quoted variable ('"$var"') lets an awk command reference external variables defined in the shell or in a script. The reverse nesting, double quotation marks around single quotation marks, does not work on some Linux versions and is therefore not recommended.

Example of referencing the external variable var:

var="priv1"
awk '/'"$var"'/{print $3}' /etc/hosts
rac1-priv1
rac2-priv1
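An alternative that avoids splicing quotes is awk's standard -v option, which assigns a shell value to an awk variable before the program runs. The following is a sketch under assumptions: the helper name third_field_matching and the sample hosts lines are invented for illustration. Note that awk processes backslash escape sequences in a -v value, so patterns containing backslashes need extra care.

```shell
# Hypothetical helper: print field 3 of the lines that match the pattern given as $1.
third_field_matching() {
    awk -v pat="$1" '$0 ~ pat { print $3 }'
}

var="priv1"
printf '10.0.0.1 a rac1-priv1\n10.0.0.2 b rac2-pub\n' | third_field_matching "$var"
```

Here the awk program stays in one pair of single quotes, and the dynamic regular expression is applied with the ~ operator.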

Define an internal variable counter example:

 awk '/^[^#]/{counter=counter+1;print} END {print "The number of lines of fstab is " counter "."}' /etc/fstab
/dev/mapper/rhel-root   /                       xfs     defaults        0 0
UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b /boot                   xfs     defaults        0 0
/dev/mapper/rhel-home   /u01                 xfs     defaults        0 0
/dev/mapper/rhel-swap   swap                    swap    defaults        0 0
tmpfs   /dev/shm        tmpfs   defaults,size=21G      0       0
The number of lines of fstab is 5.

Example of accumulating values for a column:

awk '{total=total+$2;print "Field 2 = " $2} END {print "Total=" total}' file3
Field 2 = 100000
Field 2 = 120000
Field 2 = 30000
Field 2 = 20000
Field 2 = 50000
Field 2 = 80000
Field 2 = 30000
Field 2 = 20000
Total=450000

Count the records that match each category:

awk 'BEGIN {Ncount=0;Scount=0}
     BEGIN {printf "%s %10s %10s \n", "Record#", "Region", "Price"}
     /north/ {Ncount++; printf "%6d %10s %10s \n",NR,$1,$2}
     /south/ {Scount++; printf "%6d %10s %10s \n",NR,$1,$2}
     END {print "Total of North region is:", Ncount, "."}
     END {print "Total of South region is:", Scount, "."}' file3

Record#     Region      Price 
     2      north     120000 
     3  northwest      30000 
     4  northeast      20000 
     6      south      80000 
     7  southeast      30000 
     8  southwest      20000 
Total of North region is: 3 .
Total of South region is: 3 .

7. printf function

printf outputs strings and numbers in a defined format, with the following syntax:
printf("format string" [, values])

The format string can include %d (integer), %f (floating point), %c (character), and %s (string).
To set the width of a field, add a number to the format specifier, such as %15s for a string padded to 15 characters.
Output is right-aligned by default; for left alignment, add - after the %.

Example printf output:

awk '/^[^#]/{printf("%-45s %-19s %-9s %-20s %-1d %1d \n", $1,$2,$3,$4,$5,$6)}' /etc/fstab
/dev/mapper/rhel-root                         /                   xfs       defaults             0 0 
UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b     /boot               xfs       defaults             0 0 
/dev/mapper/rhel-home                         /u01                xfs       defaults             0 0 
/dev/mapper/rhel-swap                         swap                swap      defaults             0 0 
tmpfs                                         /dev/shm            tmpfs     defaults,size=21G    0 0
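The width and alignment rules can be seen more easily on a short line. This is a small sketch: the helper name format_row and the input values are made up, and the | characters are only there to make the column edges visible.

```shell
# Hypothetical helper: left-align field 1 in 10 columns, right-align field 2 in 8,
# print field 3 as an integer in 5 columns and field 4 with two decimals in 8.
format_row() {
    awk '{ printf("%-10s|%8s|%5d|%8.2f\n", $1, $2, $3, $4) }'
}

printf 'north east 42 3.14159\n' | format_row
```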

8. awk if statement

In an awk statement, if tests whether a field meets a defined rule and acts on the result.

In an awk statement, if is used in the following format:
if (condition) {Execute Code Block 1}
else {Execute Code Block 2}

if (condition 1) {
if (condition 2) {Execute Code Block 1}
else {Execute Code Block 2}
}
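The if / else forms above can be sketched as follows. The helper name classify, the thresholds, and the sample rows are assumptions chosen for illustration.

```shell
# Hypothetical helper: label each row by the size of its second field.
classify() {
    awk '{
        if ($2 >= 100000) {
            print $1, "high"
        } else if ($2 >= 50000) {
            print $1, "medium"
        } else {
            print $1, "low"
        }
    }'
}

printf 'east 100000\nwest 50000\nsouthwest 20000\n' | classify
```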

1) Compare numbers

Relational operator	Meaning
==	Equal to
!=	Not equal to
>	Greater than
<	Less than
>=	Greater than or equal to
<=	Less than or equal to

Example:

awk '{num= $2 / $3; if(num>5) print $1,num}' file3
east 10
north 13.3333
west 7.14286
south 10

2) Compare strings

Relational operator	Meaning
==	Equal to
!=	Not equal to
~	Matches the regular expression
!~	Does not match the regular expression

Example:

awk '{if($1 ~ /north/) print $1, $3}' file3
north 9000
northwest 6000
northeast 5000

awk '{if($1 == "north") print $1, $3}' file3
north 9000

username="grid"
awk -F: '{if($4 ~/\<'"$username"'\>/) print $0}' /etc/group
dba:x:502:oracle,grid
asmadmin:x:504:grid
asmdba:x:506:grid,oracle
asmoper:x:507:grid

awk -F: '{if($4 ~/\<'"$username"'\>/) print $1}' /etc/group
dba
asmadmin
asmdba
asmoper

3) Logical operations

Logical operator	Meaning
&&	Logical AND
||	Logical OR
!	Logical NOT

Example:

awk '{if($1 ~ /north/ && $3 > 5000) print $1, $3}' file3
north 9000
northwest 6000

4) Mathematical operations

Mathematical operator	Meaning
+	Add
-	Subtract
*	Multiply
/	Divide
%	Remainder
^	Exponentiation
++	Increment the variable by 1
--	Decrement the variable by 1

Example:

awk '/grid/{counter++;print} END {print "#The number of groups to which the user grid belongs is " counter "."}' /etc/group
dba:x:502:oracle,grid
asmadmin:x:504:grid
asmdba:x:506:grid,oracle
asmoper:x:507:grid
#The number of groups to which the user grid belongs is 4.

9. awk while statement

Syntax:
while (condition) statement

while (condition) {
Execute Code Block
}

do {
Execute Code Block
} while (condition)

Example that prints every field on its own line:

awk '{i=1}; {while (i <= NF) {print $i; i++}}' file4
sdb
sdc
sde
sdf
sdg
sdi
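The do ... while form differs from while in that its body runs once before the condition is tested. The sketch below counts loop iterations to show this; the helper name first_pass and the input are hypothetical.

```shell
# Hypothetical helper: for each line, print NF and the number of times
# the do-while body ran when counting down from NF.
first_pass() {
    awk '{
        i = NF; n = 0
        do { n++; i-- } while (i > 0)   # the body runs before the first test
        print NF, n
    }'
}

# Second input line is empty, so NF is 0 there.
printf 'a b c\n\n' | first_pass
```

For the empty line the output is 0 1: the body still ran once even though the condition was never true.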

10. awk for statement

Syntax:
for (initialization; condition; increment)
{Execute Code Block}

for (index in array)
{Execute Code Block}

Example that prints every field on its own line:

awk '{for(i=1;i<=NF;i++)print $i}' file4
sdb
sdc
sde
sdf
sdg
sdi

Print all lines of the file in reverse order:

cat file3
east 100000 10000
north 120000 9000
northwest 30000 6000
northeast 20000 5000
west 50000 7000
south 80000 8000
southeast 30000 6000
southwest 20000 4000

awk '{line[NR]=$0}; END{for(c=NR;c>0;c--)print line[c]}' file3
southwest 20000 4000
southeast 30000 6000
south 80000 8000
west 50000 7000
northeast 20000 5000
northwest 30000 6000
north 120000 9000
east 100000 10000

11. Arrays

In awk, arrays can be used without being declared first. The index does not have to be a number: any string associated with the stored value can serve as the index.

The values stored in an array can be traversed with the following loop:
for (index in array)
{Execute Code Block}
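As a sketch of a string-indexed array, the following sums the second column per key. The helper name sum_by_key and the sample rows are made up. The order in which for (index in array) visits the elements is not defined, so the output is piped through sort to make it predictable.

```shell
# Hypothetical helper: add up field 2 for each distinct value of field 1.
sum_by_key() {
    awk '{ total[$1] += $2 }
         END { for (k in total) print k, total[k] }' | sort
}

printf 'north 10\nsouth 5\nnorth 7\n' | sum_by_key
```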

Count how many times each value occurs in a column:

# cat file3
east 100000 10000
north 120000 9000
northwest 30000 6000
northeast 20000 5000
west 50000 7000
south 80000 8000
southeast 30000 6000
southwest 20000 4000
center  25000   5000

awk '{a[$3]++} END{for(i in a)print i,a[i]}' file3
10000 1
7000 1
8000 1
4000 1
9000 1
5000 2
6000 2

Sort processes by resource usage:

ps -eo 'user s pri pid ppid pcpu pmem vsz rss stime time nlwp psr args'|awk 'NR==1{print} NR!=1{r[++n]=$0} END{for(i in r)print r[i]|"sort -rn -k6 -k7"} '

USER     S PRI   PID  PPID %CPU %MEM    VSZ   RSS STIME     TIME NLWP PSR COMMAND
root     S  19  2271     1  0.2  1.1 4700804 290364 Oct19 00:07:52 38   1 /u01/app/19.0.0/grid/jdk/jre/bin/java -server -Xms128m -Xmx256m -Djava.awt.headless=true -Ddisable.checkForUpdate=true -XX:ParallelGCThreads=5 oracle.rat.tfa.TFAMain /u01/app/19.0.0/grid/tfa/rac1/tfa_home
root     S  19  1432     1  0.1  0.0 114156  2280 Oct19 00:03:58    1   2 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
gdm      S  19  2338  2211  0.0  0.5 3773248 141564 Oct19 00:00:43 33   2 /usr/bin/gnome-shell
rtkit    S  18   985     1  0.0  0.0 196848  1732 Oct19 00:00:01    3   7 /usr/libexec/rtkit-daemon
rpc      S  19   963     1  0.0  0.0  69220  1056 Oct19 00:00:00    1   2 /sbin/rpcbind -w
root     S  90   376     2  0.0  0.0      0     0 Oct19 00:00:00    1   7 [irq/16-vmwgfx]
root     S  39   931     2  0.0  0.0      0     0 Oct19 00:00:00    1   5 [xprtiod]
root     S  39   930     2  0.0  0.0      0     0 Oct19 00:00:00    1   7 [rpciod]
root     S  39    92     2  0.0  0.0      0     0 Oct19 00:00:00    1   5 [deferwq]
root     S  39   905     2  0.0  0.0      0     0 Oct19 00:00:00    1   1 [xfs-eofblocks/d]
root     S  39   904     2  0.0  0.0      0     0 Oct19 00:00:00    1   1 [xfs-log/dm-2]

12. awk built-in functions

1) String function

Function	Description
gsub(r,s[,t])	Replaces every substring of t that matches the regular expression r with the string s and returns the number of substitutions. If t is omitted, $0 is used.
sub(r,s[,t])	Replaces the first substring of t that matches the regular expression r with the string s. Returns 1 if a substitution was made, otherwise 0. If t is omitted, $0 is used.
index(str,substr)	Returns the position of the first occurrence of substr in str, or 0 if it is not found.
length(str)	Returns the length of the string.
match(s,r)	Returns the position in s where the regular expression r first matches, or 0 if there is no match.
split(string,array[,sep])	Splits string into array and returns the number of elements. If sep is omitted, FS is used as the separator.
substr(string,m[,n])	Returns the substring of string that starts at position m and is n characters long. If n is omitted, the substring runs to the end of the string.
tolower(string)	Returns a copy of string with all uppercase letters converted to lowercase.
toupper(string)	Returns a copy of string with all lowercase letters converted to uppercase.
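Several of the string functions above can be exercised on one input line. This is a sketch with a hypothetical helper name, string_demo, and a made-up input string. It also uses RSTART and RLENGTH, the built-in variables that match() sets to the position and length of the match.

```shell
# Hypothetical helper: apply gsub, index, length, match, split and toupper to each line.
string_demo() {
    awk '{
        s = $0
        n = gsub(/o/, "0", s)              # replace every "o" in the copy s
        print n, s
        print index($0, "bar"), length($0)
        if (match($0, /[0-9]+/))           # sets RSTART and RLENGTH
            print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        k = split($0, parts, "-")          # split on "-" into parts[1..k]
        print k, toupper(parts[1])
    }'
}

printf 'foo-bar-2021\n' | string_demo
```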

Convert a substring of a column to uppercase (string positions start at 1):

awk '{print toupper(substr($1,1,5))}' file3
EAST
NORTH
NORTH
NORTH
WEST
SOUTH
SOUTH
SOUTH
CENTE

cat file2
root /:root:/bin/sh
grid /u01/app:oinstall:/bin/bash
oracle /u01/app/oracle:dba:/bin/ksh

Replace a string within a column:

awk -F "[ :]" '{sub("/bin/","",$4); print $4}' file2
sh
bash
ksh

Take a substring of a column, then split it:

awk -F "[ :]" '{split(substr($4,match($4,"/bin/")),a,"/"); print a[3]}' file2
sh
bash
ksh

2) IO function

Function	Description
close(filename-expr) or close(command-expr)	Closes the file or pipe.
getline [var] [<file] or command | getline [var]	Reads the next line from the input, a file, or a pipe.
next	Stops processing the current record, reads the next input line, and starts again from the first pattern and action.
delete array[element]	Removes an element from the array.
print [args] [destination]	Prints args to the destination. If no args are given, $0 is printed. If no destination is given, output goes to standard output (stdout).
printf(format [,expression(s)]) [destination]	Prints according to the format, as detailed in the printf section above.
sprintf(format [,expression(s)])	Returns the formatted string without printing it.
system(command)	Executes a system command and returns its exit status.
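Reading from a pipe with getline, closing it, and checking a command's status might look like this sketch. The helper name read_command_output and the commands it runs (two echo calls and true) are assumptions for illustration.

```shell
# Hypothetical helper: read a command's output line by line, then report
# the line count, the last line read, and the status returned by system().
read_command_output() {
    awk 'BEGIN {
        cmd = "echo one; echo two"
        while ((cmd | getline line) > 0)   # getline returns 1 per line, 0 at end of input
            n++
        close(cmd)                          # release the pipe when done
        status = system("true")             # exit status of the command
        print n, line, status
    }'
}

read_command_output
```

The parentheses around cmd | getline line keep the comparison with 0 from being parsed as part of the getline expression.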

Invoke system commands for all lines in the file:

awk -F: '{cmd="grep "$1" /etc/group";system(cmd)}' oracleuser|sort -u
asmadmin:x:504:grid
asmdba:x:506:grid,oracle
asmoper:x:507:grid
dba:x:502:oracle,grid
oracle:x:1000:

3) Mathematical Functions

Function	Description
atan2(y,x)	Returns the arctangent of y/x, in radians
cos(x)	Returns the cosine of x (x in radians)
exp(arg)	Returns e raised to the power arg (e^arg)
int(arg)	Returns the integer part of arg
log(arg)	Returns the natural logarithm of arg
rand()	Returns a random number r with 0 <= r < 1. The same sequence is produced on every run unless srand() is used to set the seed
sin(x)	Returns the sine of x (x in radians)
sqrt(arg)	Returns the square root of arg
srand(expr)	Sets the seed for the random number generator; if expr is omitted, the current time is used as the seed

Example:

awk '{print atan2($2,$1)}' file5
1.10715
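A few more of the mathematical functions and operators, as a sketch; the helper name math_demo and the chosen values are made up for illustration. Because the value of rand() depends on the awk implementation, only its range is checked here.

```shell
# Hypothetical helper: show int, sqrt, ^, %, atan2 and a seeded rand().
math_demo() {
    awk 'BEGIN {
        print int(7.9), int(-7.9)            # int() truncates toward zero
        print sqrt(16), 2 ^ 10, 17 % 5
        printf "%.4f\n", atan2(1, 1) * 4     # atan2(1, 1) is pi/4
        srand(42); r = rand()
        print (r >= 0 && r < 1)              # 1 when r lies in [0, 1)
    }'
}

math_demo
```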

Topics: Linux shell bash
