awk: a Linux command every programmer should know

Posted by stuartbrown20 on Thu, 16 May 2019 23:12:17 +0200

Preface

 

For a professional programmer, Linux knowledge is essential. Text processing is one of the most common tasks: formatting output that arrives from text files or through pipes, or counting how often a value occurs and how many records there are in total. That makes awk well worth learning.

 

Main text

 

In Linux, awk, sed, and grep are known as the "three swordsmen". All three work on text. What distinguishes them?

 

grep: Suitable for simple lookup and matching.
sed: Suitable for modifying matched text.
awk: Suitable for complex formatting of text.

 

So awk is a programming language built for text processing. It scans the input one line at a time; if a line matches the current pattern, awk executes the corresponding action. If the line does not match, or once its action has run, awk moves on to the next line until all input has been read.
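
As a minimal sketch of that match-then-act loop (the three sample lines are made up for illustration), only the line that matches the pattern triggers the action:

```shell
# Hypothetical input: three lines, one of which contains "error".
printf 'ok start\nerror disk full\nok done\n' |
awk '/error/ {print "matched:", $0}'
```

The other two lines are read and skipped, because no pattern in the program matches them.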

 

Basic Usage

 

awk basic syntax

 

awk [option] 'pattern{action}' files

//awk: the command itself
//[option]: optional parameters, which can be omitted
//'pattern{action}': pattern is the matching condition and can be omitted; action is the operation executed on matching lines.
//files: the file or files to process; several files can be given.

  

Typical use of awk

 

 

awk '
    BEGIN{action ...}   # runs once, before any input is read
    {action...}         # runs for each line of input
    END{action...}      # runs once, after all input has been read
'
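
A small sketch of the three parts working together, assuming a made-up list of items and prices: BEGIN prints a header, the main block accumulates a total, and END reports it.

```shell
# Hypothetical data: one item name and one price per line.
printf 'apple 3\npear 5\nplum 4\n' |
awk '
    BEGIN { print "item price" }      # runs once, before any input
    { total += $2; print $1, $2 }     # runs for every input line
    END   { print "total", total }    # runs once, after the last line
'
```

Variables such as total need no declaration; awk treats an unset variable as 0 in arithmetic.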

  


awk built-in variables

 

Variable    Effect
FS          Input field separator; defaults to whitespace
OFS         Output field separator; defaults to a single space
RS          Input record (line) separator; defaults to a newline
ORS         Output record (line) separator; defaults to a newline
NF          Number of fields in the current line
NR          Current line number, starting from 1; keeps counting across multiple files
FNR         Current line number within the current file; unlike NR, it restarts at 1 for each file
FILENAME    Name of the current file
$0          The entire current line
$1 ... $n   The 1st through nth fields of the current line
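
The difference between NR and FNR is easiest to see with more than one input file. The sketch below assumes two small throwaway files (the names one.txt and two.txt are made up) created in a temporary directory:

```shell
# Two hypothetical files created in a temporary directory for the demo.
dir=$(mktemp -d)
printf 'a b\nc d e\n' > "$dir/one.txt"
printf 'f\n' > "$dir/two.txt"

# Run from inside the directory so FILENAME prints as a short name.
(cd "$dir" && awk '{print FILENAME, "NR=" NR, "FNR=" FNR, "NF=" NF}' one.txt two.txt)

rm -r "$dir"
```

On the single line of two.txt, NR has reached 3 while FNR is back at 1.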

 

Example:

 

[root@wangzh awkdemo]# cat /etc/passwd

root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
...

  

//Set the input field separator with FS, then print the line number and fields 1 and 7
[root@wangzh awkdemo]# awk 'BEGIN{FS=":"} {print NR,$1,$7}'  /etc/passwd
1 root /bin/bash
2 bin /sbin/nologin
3 daemon /sbin/nologin
4 adm /sbin/nologin
...

  

//Unlike the previous example, this prints a title first and changes the output field separator to '-'
[root@wangzh awkdemo]# awk 'BEGIN{FS=":";print "Result Title"} {print NR,$1}' OFS="-"  /etc/passwd
Result Title
1-root
2-bin
3-daemon
4-adm
...
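
OFS only affects output that awk assembles itself: items separated by commas in print, or $0 after a field has been assigned. The sketch below uses made-up input and the -v option, which is another way to set a variable before the program starts:

```shell
# Hypothetical colon-separated input.
# Assigning $1 to itself forces awk to rebuild $0 using OFS.
printf 'root:x:0\nbin:x:1\n' |
awk -F: -v OFS='-' '{$1 = $1; print}'
```

Without the $1 = $1 assignment, print would output each line unchanged, colons included.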

  


Operators and Regular Expressions

 

These work much as they do in most programming languages, so readers who are new to awk can compare them with what they already know.

Relational operators: ==, !=, >, <, >=, <=

Arithmetic operators: +, -, *, /, %

Logical operators: &&, ||, !

Regular expressions:

  • /regex/ Execute the action if the line matches the regular expression

  • !/regex/ Execute the action if the line does not match the regular expression

  • $1 ~ /regex/ Execute the action if the first field matches the regular expression

  • $1 !~ /regex/ Execute the action if the first field does not match the regular expression
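
Comparisons, logical operators, and regular expression matches can be combined in a single pattern. A sketch, assuming a made-up passwd-style sample so that the result does not depend on the local system:

```shell
# Hypothetical passwd-style input (the alice entry is invented).
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'bin:x:1:1:bin:/bin:/sbin/nologin' \
    'alice:x:1000:1000::/home/alice:/bin/bash' |
awk -F: '$3 >= 1 && $7 !~ /nologin/ {print $1, $3}'
```

Only the alice line passes both tests: its third field is at least 1 and its seventh field does not contain "nologin".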

 

Case:

 

//'-F:' is another way to set the input field separator; this prints the lines whose first field contains 'root'
[root@wangzh awkdemo]# awk -F: '$1 ~ /root/ {print}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
dockerroot:x:994:991:Docker User:/var/lib/docker:/sbin/nologin

  

//Print the first through third lines
[root@wangzh awkdemo]# ip addr | awk 'NR>=1 && NR<=3 {print}'
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo

  

if/for/while

 

if(condi1){action1} else if(condi2){action2} else{action3}

---

for(i=1;i<=NR;i++){
    action1;
    action2;
    ...
}

---

while(condi){
    action1;
    ...
}
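
The templates above can be combined in one small program. This sketch uses made-up input and loops over the fields of each line with NF, the usual bound for a per-line loop:

```shell
# Hypothetical input: a few numbers on each line.
printf '1 2 3\n10 20\n' |
awk '{
    sum = 0
    for (i = 1; i <= NF; i++) {      # NF is the number of fields on this line
        sum += $i
    }

    rev = ""
    n = NF
    while (n >= 1) {                 # walk the fields from last to first
        if (rev == "") { rev = $n } else { rev = rev " " $n }
        n--
    }

    print "sum=" sum, "reversed=" rev
}'
```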
 

//As you can see, the syntax is almost identical to that of many programming languages. A simple example follows.

 

[root@wangzh awkdemo]# cat t1.log
1 aa
3 bb
10 cc
9 dd
5 ee

---

//Check whether the first field of each line is a number between 3 and 9, and print the result
[root@wangzh awkdemo]# awk '{if($1>=3 && $1<=9){print "yes"} else {print "no"}}' t1.log
no
yes
no
yes
yes
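
One caveat when testing numeric ranges: a regular expression such as /[3-9]/ only checks whether the field contains one of those digits, so a value like 13 also matches. A numeric comparison avoids this. A small sketch with made-up values that shows both tests side by side:

```shell
# Hypothetical values; 13 and 30 contain the digit 3 but are outside the range 3..9.
printf '3\n13\n9\n30\n' |
awk '{
    regex   = ($1 ~ /[3-9]/)       ? "yes" : "no"   # contains a digit from 3 to 9
    numeric = ($1 >= 3 && $1 <= 9) ? "yes" : "no"   # value lies between 3 and 9
    print $1, "regex=" regex, "numeric=" numeric
}'
```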

  

Built-in functions

 

awk also has many built-in functions that wrap common string operations, mathematical operations, and so on. Consult the manual for the details of each one. Here is how to use one of the more common ones, sub().

 

Reference: http://www.cnblogs.com/chengmo/archive/2010/10/08/1845913.html

 

sub( Ere, Repl, [ string ] )

Replaces the first substring that matches the regular expression Ere with the string Repl.
The string argument is the string to process; it defaults to $0, the current line.

 

[root@wangzh awkdemo]# awk 'BEGIN{info="this is a test2019test!";sub(/[0-9]+/,"!",info);print info}'
this is a test!test!
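
sub() stops after the first match. Its companion gsub() takes the same arguments and replaces every match. A short sketch on a made-up string:

```shell
awk 'BEGIN {
    first = "a1b22c333"
    all   = first
    sub(/[0-9]+/, "#", first)     # replaces only the first run of digits
    gsub(/[0-9]+/, "#", all)      # replaces every run of digits
    print first
    print all
}'
```

The first print shows a#b22c333 and the second shows a#b#c#.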

  

Practical example

 

Count how often each keyword occurs in a text

 

[root@wangzh awkdemo]# cat data.txt
ID NAME
1 xiaom
2 zsan
3 lisi
4 lisi
5 lisi
6 xiaom
7 lisi
8 xiaom
9 xiaoh
10 zsan


[root@wangzh awkdemo]# awk 'BEGIN{print "Statistics Result >>>>>"} {if(FNR>1){result[$2]+=1}} END{for(i in result){print i,"count:"result[i]} {print "over >>>>"}}' data.txt
Statistics Result >>>>>
xiaoh count:1
xiaom count:3
zsan count:2
lisi count:4
over >>>>
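
The order in which a for (i in result) loop visits the keys is not specified by awk, so the lines may come out in a different order on another system. If a stable order matters, one option is to pipe the output through sort. A sketch with a shortened, made-up data set:

```shell
# Hypothetical data in the same ID/NAME layout; the header line is skipped with FNR > 1.
printf 'ID NAME\n1 xiaom\n2 zsan\n3 lisi\n4 lisi\n5 xiaom\n6 lisi\n' |
awk 'FNR > 1 {count[$2]++} END {for (name in count) print count[name], name}' |
sort -k1,1nr -k2,2    # highest count first, then by name
```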

  

Epilogue

 

The purpose of this article is to give readers who have not worked with awk an intuitive feel for text processing. You cannot master awk by reading alone; you have to practice it yourself. When you run into problems, check the manual. I believe you will soon be a master of text processing.

 


 
