1, sort command
1. The role of sort
Sorts the contents of a file line by line, and can sort according to different data types
2. Syntax format
    sort [option] parameter
    cat file | sort [option]
3. Common options
Common options | effect |
---|---|
-f | Ignore case and convert all lowercase letters to uppercase letters for comparison |
-b | Ignore spaces before each line |
-n | Sort by number |
-r | Reverse sort |
-u | Output only the first of each run of equal lines, similar to piping the result through uniq |
-t | Specify the field separator; by default fields are separated by the transition from non-blank to blank characters |
-k | Specify sort field |
-o | <output file> Write the sorted results to the specified file |
4. Use examples
1. sort command
By default, sort compares the first non-blank character of each line. In the example below the resulting order is: empty lines, then numbers, then letters (lowercase before uppercase). If the first characters are the same, the second characters are compared, and so on. The exact order depends on the system locale.
    [root@localhost ~]# cat 1.txt
    e
    d
    c
    b
    a
    A
    B
    C
    D
    E
    11
    1
    22
    2
    33
    44
    55
    [root@localhost ~]# sort 1.txt
    1
    11
    2
    22
    33
    44
    55
    a
    A
    b
    B
    c
    C
    d
    D
    e
    E
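The order shown above is the result of the locale's collation rules. As a minimal sketch (the sample lines are made up for illustration), forcing the C locale with LC_ALL=C makes sort compare raw byte values, so digits come first, then uppercase letters, then lowercase letters:

```shell
# Hypothetical sample data, one item per line.
sample=$(printf 'b\nA\n11\na\n2\nB\n')

# LC_ALL=C compares raw bytes: digits < uppercase < lowercase.
c_sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

Setting LC_ALL=C is a common way to get the same order on every machine, at the cost of ignoring language-specific collation.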
2. -f option
The -f option ignores case by treating lowercase letters as uppercase for comparison. In this example each uppercase letter is listed before its lowercase counterpart.
    [root@localhost ~]# sort -f 1.txt
    44
    1
    11
    2
    22
    33
    55
    A
    a
    B
    b
    C
    c
    D
    d
    E
    e
3. -n option
Because sort compares lines character by character by default, numbers are not sorted by value (for example, 11 comes before 2). To sort by numeric value, use the -n option. Lines that do not start with a number are treated as 0 and therefore come first.
    [root@localhost ~]# sort -n 1.txt
    a
    A
    b
    B
    c
    C
    d
    D
    e
    E
    1
    2
    11
    22
    33
    44
    55
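The difference between character order and numeric order can be seen side by side. The following is a small sketch with made-up numbers, not taken from 1.txt:

```shell
# Hypothetical input: four numbers, one per line.
nums=$(printf '10\n9\n2\n100\n')

# Default comparison is character by character: "10" sorts before "2".
lexical=$(printf '%s\n' "$nums" | LC_ALL=C sort)

# -n compares by numeric value.
numeric=$(printf '%s\n' "$nums" | LC_ALL=C sort -n)

printf 'lexical:\n%s\nnumeric:\n%s\n' "$lexical" "$numeric"
```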
4. -r option
-r sorts in reverse order
    [root@localhost ~]# sort -r 1.txt
    E
    e
    D
    d
    C
    c
    B
    b
    A
    a
    55
    44
    33
    22
    2
    11
    1
5. -u option
-u displays each set of duplicate lines only once
    [root@localhost ~]# cat 2.txt
    11
    11
    22
    33
    33
    aa
    aa
    bb
    c
    c
    [root@localhost ~]# sort -u 2.txt
    11
    22
    33
    aa
    bb
    c
6. -t, -k options
The -t option specifies the field separator. The -k option specifies the field to sort by.
Using ':' as the separator, sort /etc/passwd numerically by the third column
    [root@localhost ~]# sort -t ':' -k3 -n /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
    operator:x:11:0:operator:/root:/sbin/nologin
    games:x:12:100:games:/usr/games:/sbin/nologin
    ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
    rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
    rpc:x:32:32:Rpcbind Daemon:/var/lib/rpcbind:/sbin/nologin
    ntp:x:38:38::/etc/ntp:/sbin/nologin
    gdm:x:42:42::/var/lib/gdm:/sbin/nologin
    tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
    avahi:x:70:70:Avahi mDNS/DNS-SD Stack:/var/run/avahi-daemon:/sbin/nologin
    tcpdump:x:72:72::/:/sbin/nologin
    sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
    radvd:x:75:75:radvd user:/:/sbin/nologin
    dbus:x:81:81:System message bus:/:/sbin/nologin
    postfix:x:89:89::/var/spool/postfix:/sbin/nologin
    nobody:x:99:99:Nobody:/:/sbin/nologin
    qemu:x:107:107:qemu user:/:/sbin/nologin
    usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
    pulse:x:171:171:PulseAudio System Daemon:/var/run/pulse:/sbin/nologin
    rtkit:x:172:172:RealtimeKit:/proc:/sbin/nologin
    abrt:x:173:173::/etc/abrt:/sbin/nologin
    dhcpd:x:177:177:DHCP server:/:/sbin/nologin
    systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
    gnome-initial-setup:x:991:986::/run/gnome-initial-setup/:/sbin/nologin
    sssd:x:992:987:User for sssd:/:/sbin/nologin
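One detail worth knowing: -k3 on its own defines a key that runs from field 3 to the end of the line, while -k3,3 limits the key to field 3 only. The sketch below uses made-up passwd-style records (the names and UIDs are assumptions for illustration) and attaches the n modifier directly to the key:

```shell
# Made-up records in the form name:password:uid.
records=$(printf 'carol:x:1002\nalice:x:1000\nbob:x:999\n')

# -t ':' sets the separator; -k3,3n sorts numerically on field 3 only.
by_uid=$(printf '%s\n' "$records" | sort -t ':' -k3,3n)
printf '%s\n' "$by_uid"
```

With -n the two forms usually give the same result, because numeric parsing stops at the first non-numeric character, but restricting the key avoids surprises when sorting on text fields.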
7. -o option
The -o option writes the sorted result to the specified file, overwriting its previous contents
Collect the sizes of the files under the /var directory, sort them by size in descending order, and write the result to var.txt, overwriting it.
    [root@localhost ~]# du -a /var | sort -nr -o var.txt
    [root@localhost ~]# vim var.txt
    4953252 /var
    4417332 /var/ftp/centos7
    4417332 /var/ftp
    3919212 /var/ftp/centos7/Packages
    359940 /var/ftp/centos7/LiveOS
    359936 /var/ftp/centos7/LiveOS/squashfs.img
    353064 /var/cache
    350440 /var/cache/yum/x86_64/7
    350440 /var/cache/yum/x86_64
    350440 /var/cache/yum
    192592 /var/cache/yum/x86_64/7/updates
    173508 /var/lib
    153580 /var/cache/yum/x86_64/7/base
    109892 /var/cache/yum/x86_64/7/updates/gen
    103108 /var/lib/rpm
    98576 /var/cache/yum/x86_64/7/base/gen
    92036 /var/lib/rpm/Packages
    85152 /var/ftp/centos7/Packages/firefox-52.2.0-2.el7.centos.x86_64.rpm
    74936 /var/ftp/centos7/Packages/libreoffice-core-5.0.6.2-14.el7.x86_64.rpm
    66636 /var/cache/yum/x86_64/7/updates/packages
    63972 /var/ftp/centos7/Packages/texlive-cm-super-svn15878.0-38.el7.noarch.rpm
    62036 /var/ftp/centos7/images
    53428 /var/ftp/centos7/isolinux
    53368 /var/cache/yum/x86_64/7/updates/packages/linux-firmware-20200421-80.git78c0348.el7_9.noarch.rpm
    53072 /var/lib/tftpboot
    53044 /var/ftp/centos7/images/pxeboot
    52316 /var/cache/yum/x86_64/7/updates/gen/primary_db.sqlite
    49820 /var/cache/yum/x86_64/7/updates/gen/filelists_db.sqlite
    48056 /var/cache/yum/x86_64/7/base/gen/filelists_db.sqlite
    47300 /var/lib/tftpboot/initrd.img
    47300 /var/ftp/centos7/isolinux/initrd.img
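A practical use of -o is sorting a file in place, since the output file may be the same as the input file. The following is a minimal sketch using a temporary file created only for this illustration:

```shell
# Create a throwaway file with three unsorted numbers.
f=$(mktemp)
printf '3\n1\n2\n' > "$f"

# -o may name the input file itself; sort reads all input before writing.
sort -n -o "$f" "$f"

in_place=$(cat "$f")
printf '%s\n' "$in_place"
rm -f "$f"
```

A plain redirection such as `sort file > file` would not work here, because the shell truncates the file before sort gets to read it.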
2, uniq command
1. Role of uniq
Reports or omits consecutive duplicate lines in a file; it is often used together with the sort command
2. Syntax format
    uniq [option] parameter
    cat file | uniq [option]
3. Common options
Common options | explain |
---|---|
-c | Prefix each line with the number of times it occurs, collapsing consecutive duplicates into one line |
-d | Show only lines that are repeated consecutively |
-u | Show only lines that are not repeated consecutively |
4. Use examples
1. uniq command
uniq removes only consecutive duplicate lines; duplicates that are not adjacent are kept.
    [root@localhost ~]# cat 2.txt
    11
    11
    22
    33
    33
    11
    aa
    aa
    bb
    33
    c
    c
    [root@localhost ~]# uniq 2.txt
    11
    22
    33
    11
    aa
    bb
    33
    c
To remove all duplicates, sort first and then remove duplicates.
    [root@localhost ~]# sort -n 2.txt | uniq
    aa
    bb
    c
    11
    22
    33
2. -c option
Count how many times each line occurs and collapse the duplicates
    [root@localhost ~]# sort -n 2.txt | uniq -c
    2 aa
    1 bb
    2 c
    3 11
    1 22
    3 33
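A common follow-up is to rank the lines by how often they occur. The sketch below is one way to do it, assuming made-up input; the final awk step only normalizes the leading padding that uniq -c adds before each count:

```shell
# Hypothetical input with repeated lines.
words=$(printf 'b\na\nb\nc\nb\na\n')

# sort groups equal lines, uniq -c counts them, sort -rn ranks by count.
freq=$(printf '%s\n' "$words" | LC_ALL=C sort | uniq -c | sort -rn | awk '{print $1, $2}')
printf '%s\n' "$freq"
```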
3. -d option
Use the -d option to display only lines that are repeated consecutively (one copy of each)
    [root@localhost ~]# uniq -d 2.txt
    11
    33
    aa
    c
    [root@localhost ~]# cat 2.txt
    11
    11
    22
    33
    33
    11
    aa
    aa
    bb
    33
    c
    c
To display every line that occurs more than once anywhere in the file, sort first with "sort -n" and then apply "uniq -d".
    [root@localhost ~]# cat 2.txt
    11
    11
    22
    33
    33
    11
    aa
    bb
    aa
    bb
    33
    c
    c
    [root@localhost ~]# sort -n 2.txt | uniq -d
    aa
    bb
    c
    11
    33
4. -u option
Use the -u option to display only lines that are not repeated consecutively (lines that are repeated consecutively are not displayed)
    [root@localhost ~]# uniq -u 2.txt
    22
    11
    aa
    bb
    aa
    bb
    33
To show only the lines that occur exactly once in the whole file, sort first
    [root@localhost ~]# sort -n 2.txt | uniq -u
    22
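Taken together, -d and -u split sorted input into two groups: lines that occur more than once and lines that occur exactly once. A small sketch with made-up input:

```shell
# Hypothetical input: "a" occurs twice, "b" and "c" once each.
lines=$(printf 'a\nb\na\nc\n')
sorted=$(printf '%s\n' "$lines" | LC_ALL=C sort)

repeated=$(printf '%s\n' "$sorted" | uniq -d)   # lines occurring more than once
singles=$(printf '%s\n' "$sorted" | uniq -u)    # lines occurring exactly once

printf 'repeated: %s\n' "$repeated"
printf 'singles:\n%s\n' "$singles"
```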
3, tr command
1. The role of tr
It is often used to replace, compress and delete characters from standard input
2. Syntax format
    tr [option] [parameter]
The parameters are the character sets to operate on:
Character set 1: specifies the original character set to be converted or deleted. When performing a conversion, you must also specify the target character set with the parameter "character set 2". Character set 2 is not required for deletion.
Character set 2: specifies the target character set to convert to.
3. Common options
Common options | explain |
---|---|
-c | Use the complement of character set 1: keep the characters of character set 1 and replace all other characters (including the newline character \n) with character set 2 |
-d | Delete all characters belonging to character set 1 |
-s | Squeeze each run of a repeated character into a single character; if character set 2 is given, first replace character set 1 with character set 2 |
-t | Truncate character set 1 to the length of character set 2 before replacing; when both sets have the same length the result is the same as without options |
4. Use example
1. tr command
The tr command replaces each character in character set 1 with the character at the same position in character set 2. The mapping is one-to-one, so the two sets should normally contain the same number of characters (GNU tr pads a shorter character set 2 by repeating its last character).
    [root@localhost ~]# echo 'ABC' | tr "A-Z" "a-z"
    abc
    [root@localhost ~]# echo 'aBb' | tr "A-Z" "a-z"
    abb
    [root@localhost ~]# echo 'ABC' | tr "AB" "CD"
    CDC
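Besides explicit ranges such as "A-Z", tr also accepts POSIX character classes, which do not depend on how ranges are ordered in the current locale. A short sketch (the input string is made up):

```shell
# Convert lowercase to uppercase using character classes instead of ranges.
upper=$(printf 'Hello, World\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$upper"
```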
2. -c option
With the -c option the characters in character set 1 are kept, and every character not in character set 1 (including newline characters) is replaced with the character in character set 2. In the example "\n" is included in character set 1 so that the line breaks are preserved.
    [root@localhost ~]# echo -e "abc\nabc" | tr -c "b\n" "0"
    0b0
    0b0
3. -d option
Use the -d option to delete all characters belonging to character set 1.
    [root@localhost ~]# echo -e "abc\nabc" | tr -d "a"
    bc
    bc
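The -c and -d options can be combined to delete everything except a given set, which is a compact way to keep only certain characters. A minimal sketch with a made-up input string:

```shell
# Delete every character that is NOT a digit or a newline.
digits=$(printf 'tel: 555-1234\n' | tr -cd '[:digit:]\n')
printf '%s\n' "$digits"
```

The "\n" is listed in the set so that the line break itself is not deleted.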
4. -s option
The -s option squeezes each run of a repeated character into a single character. If character set 2 is given, the characters of character set 1 are first replaced with character set 2 and the result is then squeezed.
    [root@localhost ~]# echo 'aaaaaabbbbc' | tr -s "ab"
    abc
    [root@localhost ~]# echo 'aaaaaabbbbc' | tr -s "ab" "0"
    0c
The tr -s "\n" command squeezes consecutive newlines, which removes empty lines
    [root@localhost ~]# echo -e 'a\n\n\nb' | tr -s "\n"
    a
    b
5. Delete the "^M" character caused by Windows files
Linux ends a line with the line feed character "\n" alone. A carriage return "\r" in a file is not treated as part of the line ending there; it is only shown as the control character "^M". Windows ends each line with carriage return + line feed, "\r\n", so a file created on Windows carries an extra "\r" at the end of every line when it is used on Linux. On Windows, if one of the two control characters is missing or they are in the wrong order, the line break is not handled correctly.
The "^M" character is normally invisible; it can be displayed with the "cat -v" command.
The "^M" problem can also be fixed with the dos2unix tool
    [root@localhost ~]# cat abc.txt
    aa

    bb
    cc[root@localhost ~]#
    [root@localhost ~]# cat -v abc.txt
    aa^M
    ^M
    bb^M
    cc[root@localhost ~]#
Method 1:
Use the tr command directly to replace "\r" with a space " ". After conversion there will be a space at the end of each line.
    [root@localhost ~]# cat abc.txt | tr "\r" " " > ABC.txt
    [root@localhost ~]# cat -v ABC.txt
    aa

    bb
    cc[root@localhost ~]#
Method 2:
Use the "tr -s" command to replace "\r" with a space " ", as above.
You can also replace "\r" with "\n". Since consecutive "\n" characters are squeezed into one "\n", an empty line produces only a single line break, so no empty lines remain after conversion.
    [root@localhost ~]# cat abc.txt | tr -s "\r" " " > ABC.txt
    [root@localhost ~]# cat -v ABC.txt
    aa

    bb
    cc[root@localhost ~]#
    [root@localhost ~]# cat abc.txt | tr -s "\r" "\n" > ABC.txt
    [root@localhost ~]# cat -v ABC.txt
    aa
    bb
    cc[root@localhost ~]#
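Both methods above change the text in a way that may be unwanted: the first leaves trailing spaces and the second can drop empty lines. As an alternative sketch, deleting "\r" outright with tr -d keeps the empty lines and adds nothing. The temporary file below is made up to mirror the layout of abc.txt:

```shell
# Build a throwaway file with Windows line endings (CR LF).
f=$(mktemp)
printf 'aa\r\n\r\nbb\r\ncc' > "$f"

# Delete every carriage return; line feeds are left untouched.
cleaned=$(tr -d '\r' < "$f")

# cat -v would show ^M if any carriage return were left.
printf '%s\n' "$cleaned" | cat -v
rm -f "$f"
```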
6. Sort the array with a=(2 4 6 3 1 5) from small to large
Idea: print the array, replace each space with a newline, then sort numerically with sort -n
    [root@localhost ~]# a=(2 4 6 3 1 5)
    [root@localhost ~]# echo ${a[@]}
    2 4 6 3 1 5
    [root@localhost ~]# echo ${a[@]} | tr " " "\n" | sort -n
    1
    2
    3
    4
    5
    6
4, cut command
1. Function of cut
Extracts the specified part of each line (bytes, characters or fields) and prints it; the rest of the line is left out of the output. The file itself is not modified.
2. Syntax format
    cut [option] parameter
    cat file | cut [option]
3. Common options
Common options | explain |
---|---|
-b | Split in bytes |
-c | Split in characters |
-f | Select the fields to extract. The cut command uses Tab as the default field separator |
-d | Specify the field separator to use instead of the default Tab |
--complement | Exclude the specified fields and output everything else |
--output-delimiter | Change the separator used in the output |
4. Use example
1. -d and -f options
Extract the first field, using ':' as the separator
    [root@localhost ~]# cut -d ':' -f 1 /etc/passwd
    root
    bin
    daemon
    adm
    lp
    sync
    shutdown
    halt
    mail
    operator
    games
    ftp
    nobody
    systemd-network
    dbus
    polkitd
    abrt
    libstoragemgmt
    rpc
    colord
    saslauth
    setroubleshoot
    rtkit
    pulse
    qemu
    ntp
    radvd
    chrony
    tss
    usbmuxd
    geoclue
    sssd
    gdm
    rpcuser
    nfsnobody
    gnome-initial-setup
    avahi
    postfix
    sshd
    tcpdump
    qiao
    dhcpd
Extract fields 1-3, 6 and 7 from the lines of /etc/passwd that match the given condition
    [root@localhost ~]# grep 'bin/bash' /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    qiao:x:1000:1000:qiao:/home/qiao:/bin/bash
    [root@localhost ~]# grep 'bin/bash' /etc/passwd | cut -d ':' -f 1-3,6,7
    root:x:0:/root:/bin/bash
    qiao:x:1000:/home/qiao:/bin/bash
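The field list accepts single numbers, closed ranges such as 1-3 and open ranges such as 6-. The sketch below reuses the qiao record shown above and also illustrates that cut always prints fields in their original order, whatever order they are listed in:

```shell
# One passwd-style record, taken from the example output above.
line='qiao:x:1000:1000:qiao:/home/qiao:/bin/bash'

name_shell=$(printf '%s\n' "$line" | cut -d ':' -f 1,7)   # fields 1 and 7
from_sixth=$(printf '%s\n' "$line" | cut -d ':' -f 6-)    # field 6 to the end
reordered=$(printf '%s\n' "$line" | cut -d ':' -f 7,1)    # still printed as 1 then 7

printf '%s\n' "$name_shell" "$from_sixth" "$reordered"
```

To actually reorder fields, a tool such as awk is usually needed instead of cut.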
2. --complement -f option
Exclude the specified fields
    [root@localhost ~]# grep 'bin/bash' /etc/passwd | cut -d ':' --complement -f 2
    root:0:0:root:/root:/bin/bash
    qiao:1000:1000:qiao:/home/qiao:/bin/bash
3. --output-delimiter option
Change the delimiter used between fields in the output (here it is set to an empty string, so the fields are joined directly)
    [root@localhost ~]# grep 'bin/bash' /etc/passwd | cut -d ':' -f 1,7 --output-delimiter=''
    root/bin/bash
    qiao/bin/bash
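4. -b option
Extract the specified bytes. cut -b 2-4 counts bytes from 1, so it returns bytes 2 to 4. For comparison, ${j:2:4} takes 4 characters starting at offset 2 (offsets count from 0), and expr substr $j 3 4 takes 4 characters starting at position 3 (positions count from 1).
    [root@localhost ~]# j=123456789
    [root@localhost ~]# echo $j | cut -b 2-4
    234
    [root@localhost ~]# echo ${j:2:4}
    3456
    [root@localhost ~]# expr substr $j 3 4
    3456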
5, eval command
1. Role of eval
When eval is placed in front of a command, the shell scans the command line twice before executing it. On the first pass all substitutions on the command line are performed; the resulting command is then scanned again and executed. eval is useful for variables whose final value cannot be obtained in a single scan.
2. Use example
    [root@localhost ~]# echo 'cutomorry' > file
    [root@localhost ~]# myfile="cat file"
    [root@localhost ~]# echo $myfile
    cat file
    [root@localhost ~]# eval $myfile
    cutomorry
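The two scans are easier to see with an indirect variable reference, where one variable holds the name of another. The variable names below are made up for illustration:

```shell
fruit=apple
name=fruit          # holds the NAME of another variable

# First scan: \$ becomes a literal $ and $name becomes fruit,
# giving the string  value=$fruit
# Second scan (done by eval): $fruit is expanded and the assignment runs.
eval "value=\$$name"
printf '%s\n' "$value"
```

Because eval executes whatever text it is given, it should only be used on strings whose content is fully under the script's control.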
6, Regular expression
1. The difference between regular expressions and wildcards
Regular expressions are usually used to match text, while wildcards are usually used to match file names
2. Composition of regular expressions
Regular expressions are composed of ordinary characters and metacharacters. Ordinary characters include upper- and lowercase letters, digits, punctuation marks and some other symbols. A metacharacter is a character with a special meaning in a regular expression; it can be used to specify how the preceding character or expression may occur in the target text.
3. Common metacharacters in basic regular expressions
Common metacharacters | explain |
---|---|
\ | Escape character, used to cancel the special meaning of a symbol, for example: \n, \$, etc. |
^ | Match the start of the string, for example: ^a, ^the, ^#, ^[a-z] |
$ | Match the end of the string, for example: word$; ^$ matches blank lines |
. | Match any single character except \n, for example: go.d, g..d |
* | Match the preceding subexpression 0 or more times, for example: goo*d, go.*d |
[list] | Match one character in the list, for example: go[ola]d, [abc], [A-Z], [a-z0-9]; [0-9] matches any single digit |
[^list] | Match one character that is not in the list, for example: [^0-9], [^A-Z0-9]; [^a-z] matches any character that is not a lowercase letter |
\{n\} | Match the preceding subexpression exactly n times, for example: go\{2\}d; '[0-9]\{2\}' matches two digits |
\{n,\} | Match the preceding subexpression at least n times, for example: go\{2,\}d; '[0-9]\{2,\}' matches two or more digits |
\{n,m\} | Match the preceding subexpression n to m times, for example: go\{2,3\}d; '[0-9]\{2,3\}' matches two to three digits |
Supported tools include: grep, egrep, sed, awk
Note: egrep and awk write the interval forms as {n}, {n,} and {n,m}; do not put "\" before the braces.
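The difference in brace syntax can be checked directly with grep, which uses basic regular expressions by default and extended ones with -E. A minimal sketch on a made-up word list:

```shell
# Hypothetical word list.
words=$(printf 'god\ngood\ngoood\n')

# Basic regular expression: the braces need backslashes.
bre=$(printf '%s\n' "$words" | grep 'go\{2\}d')

# Extended regular expression (grep -E, same as egrep): no backslashes.
ere=$(printf '%s\n' "$words" | grep -E 'go{2,}d')

printf 'BRE exactly two o:\n%s\nERE two or more o:\n%s\n' "$bre" "$ere"
```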
4. Extended regular expression metacharacter
Common metacharacters | explain |
---|---|
+ | Match the preceding subexpression 1 or more times, for example: go+d matches at least one o, such as god, good, goood |
? | Match the preceding subexpression 0 or 1 times, for example: go?d matches gd or god |
() | Treat the string in parentheses as a whole, for example: g(oo)+d matches oo as a unit one or more times, such as good, gooood |
| (vertical bar) | Match either of two alternatives ("or"), for example: g(oo|la)d matches good or glad |
Supported tools include: egrep and awk
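Each of the extended metacharacters can be tried with grep -E. The sketch below uses a made-up word list and anchors every pattern with ^ and $ so that the whole line has to match:

```shell
# Hypothetical word list.
words=$(printf 'gd\ngod\ngood\ngooood\nglad\n')

plus=$(printf '%s\n' "$words" | grep -E '^go+d$')         # one or more o
optional=$(printf '%s\n' "$words" | grep -E '^go?d$')     # zero or one o
group=$(printf '%s\n' "$words" | grep -E '^g(oo)+d$')     # "oo" repeated as a unit
either=$(printf '%s\n' "$words" | grep -E '^g(oo|la)d$')  # "oo" or "la"

printf '%s\n' "plus: $plus" "optional: $optional" "group: $group" "either: $either"
```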
5. Application examples
Match email addresses that meet the following requirements: the user name before @ is at least 6 characters long, begins with a letter or _, and may contain the symbols . - # _ after the first character; the subdomain may contain upper- and lowercase letters, digits and the symbols . - _; the top-level domain is 2 to 5 letters long.
    [root@localhost ~]# vim mailadd.txt
    zhangsan1234.@qq.com
    qiao_666@sina.com.cn
    luoxiang@163.com
    zhao liu@wo.cn
    sun@qi.com
    [root@localhost ~]# egrep '^([a-zA-Z_][a-zA-Z0-9_#\-\.]{5,})@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}$)' mailadd.txt
    zhangsan1234.@qq.com
    qiao_666@sina.com.cn
    luoxiang@163.com
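Inside a bracket expression, POSIX treats a backslash as an ordinary character, so the "\-" and "\." in the pattern above are not escapes. The following is a hedged variant of the same pattern, not the author's original: it places "-" last in each bracket expression, drops the backslashes and the unused groups, and is run against the same five sample addresses.

```shell
# Variant pattern (an assumption for illustration): "-" goes last inside [...].
pattern='^[a-zA-Z_][a-zA-Z0-9_#.-]{5,}@[a-zA-Z0-9_.-]+\.[a-zA-Z]{2,5}$'

# The five sample addresses from mailadd.txt, one per line.
addresses=$(printf '%s\n' \
  'zhangsan1234.@qq.com' \
  'qiao_666@sina.com.cn' \
  'luoxiang@163.com' \
  'zhao liu@wo.cn' \
  'sun@qi.com')

# LC_ALL=C keeps ranges such as a-z limited to ASCII letters.
matched=$(printf '%s\n' "$addresses" | LC_ALL=C grep -E "$pattern")
printf '%s\n' "$matched"
```

The address containing a space and the one whose user name is shorter than 6 characters are rejected, as with the original pattern.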