Hook course - Re-learning Operating Systems - Introduction to Linux Commands

Posted by Will Poulson on Thu, 10 Mar 2022 08:41:48 +0100

1. What is a process? A process is a running instance of an application. The application's executable file sits in the file system; launching it creates a copy of the program inside the operating system (specifically, in memory), and that copy is the process.

2. The job of a Linux pipe is to pass data between commands: the output of one command becomes the input of the next. The "commands" here are processes, so more precisely, pipes pass data between processes.

3. Each process has its own standard input stream, standard output stream, and standard error stream.

  • The standard input stream (file descriptor 0) supplies data to the process as it runs.
  • Anything written to the standard output stream (file descriptor 1) is printed to the screen by default.
  • If an error occurs while the process runs, the error message is written to the standard error stream (file descriptor 2).

4. Redirection: the > symbol is called overwrite redirection and >> is called append redirection. > truncates the target file every time, while >> appends to it (ls -l > out). You can also redirect the standard error stream into the standard output stream and then into a file (ls1 &> out, or ls1 > out 2>&1).
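The three forms above can be sketched with a couple of throwaway files (out.txt, all.txt, and the nonexistent file name no_such_file are assumptions for the demo):

```shell
echo "first" > out.txt                   # > truncates: out.txt now holds one line
echo "second" >> out.txt                 # >> appends: out.txt now holds two lines
ls no_such_file > all.txt 2>&1 || true   # fd 2 (stderr) follows fd 1 into all.txt
wc -l < out.txt                          # stdin redirection: reads out.txt
```

Note the order in the third line: stdout is pointed at the file first, then stderr is duplicated onto stdout, so both end up in all.txt.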

5. Pipes are very similar to redirection, but a pipe connects one process to another for further processing, while redirection sends the contents of a stream to a file. The two are often used together. Pipes in Linux are also files, and there are two kinds:

  • An anonymous pipe also lives in the file system, but only as a storage node that belongs to no directory; to put it bluntly, it has no path.
  • A named pipe is a file with its own path. The mkfifo command creates a named pipe (mkfifo pipe1).
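A quick check that a named pipe really is a file with its own path (pipe1 here is just a scratch name):

```shell
rm -f pipe1
mkfifo pipe1                  # create the named pipe
if [ -p pipe1 ]; then         # -p tests specifically for the FIFO file type
  kind="named pipe"
else
  kind="other"
fi
echo "$kind"
```

`ls -l pipe1` would show the same thing: the first character of the mode string is `p` for a FIFO.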

6. The uniq command can be used for deduplication: it finds adjacent duplicate lines in a file and collapses them. Because only adjacent duplicates are detected, the input is usually sorted first.
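A minimal sketch of the sort-then-uniq pattern (letters.txt and deduped.txt are scratch files):

```shell
printf 'b\na\nb\na\n' > letters.txt
sort letters.txt | uniq > deduped.txt   # sort brings the duplicates together first
cat deduped.txt                         # only two lines remain: a, b
```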

7. grep -v inverts the match, keeping only lines that do not match. For example, to find paths that contain Spring but not MyBatis:

find ./ | grep Spring | grep -v MyBatis

^    # Anchors the start of the line; e.g. '^grep' matches all lines beginning with grep.
$    # Anchors the end of the line; e.g. 'grep$' matches all lines ending in grep.
.    # Matches any single character except newline; e.g. 'gr.p' matches gr, then any one character, then p.
*    # Matches zero or more of the preceding character; e.g. ' *grep' matches zero or more spaces followed by grep.
--color=auto # Highlights the matched text in color.
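The anchors above can be checked against a scratch file (demo.txt is an assumption for the demo):

```shell
printf 'grep tool\nuse grep\ngrip\n' > demo.txt
grep '^grep' demo.txt    # matches "grep tool" only
grep 'grep$' demo.txt    # matches "use grep" only
grep 'gr.p' demo.txt     # matches all three lines
```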

8. wc -l counts lines. For example: how many lines are in a Java file? (wc -l Client.java). How many files are in the current directory? (ls | wc -l).

# Use the nginx access_log to count the site's PV (Page Views); each page a user visits counts as one PV
wc -l access.log

9. The tee command reads data from the standard input stream, writes it to the standard output stream, and saves an intermediate copy along the way. For example: find all Java files under the current directory that contain the keyword Spring. tee does not affect the pipeline's execution, but it saves the output of find to the file JavaList.

find ./ -iname "*.java" | tee JavaList | grep Spring

10. The xargs command builds and executes commands from the standard input stream. xargs reads strings from its input, splits them on whitespace (spaces, newlines, and so on), constructs a command from the pieces, and executes it. For example: count the lines of all Java files in the directory.

find ./ -iname "*.java" | xargs wc -l

11. A & appended after cat pipe1 means the command runs in the background, so it does not block the user from typing further commands.

cat pipe1 &
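Putting points 5 and 11 together, a background reader on a named pipe is unblocked by a foreground writer (demo.fifo and result.txt are scratch names for this sketch):

```shell
rm -f demo.fifo
mkfifo demo.fifo
cat demo.fifo > result.txt &   # background reader: blocks until a writer opens the pipe
echo "hello" > demo.fifo       # writer side: the data flows through the FIFO
wait                           # wait for the background cat to finish
cat result.txt
```

Opening a FIFO blocks until both a reader and a writer exist, so the two sides rendezvous here rather than racing.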

12. What initial permissions does a newly created file get? A new file usually gets rw-rw-r--: neither the owner nor the group can execute it, and all users can read it. The file's owner is set to the user who created it, and the group is the user's current group, which by default is the group with the same name as the user. The exact bits are controlled by the process's umask.
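The initial mode is the base mode (666 for regular files) minus the bits cleared by umask; a sketch assuming GNU stat (newfile is a scratch name, and umask 022 is a hypothetical value chosen for the demo):

```shell
umask 022              # hypothetical umask: group and other lose the write bit
rm -f newfile
touch newfile
stat -c %a newfile     # GNU stat prints the octal mode: 666 & ~022 = 644 (rw-r--r--)
```

With the more common default umask of 002 the result would instead be 664, i.e. the rw-rw-r-- mentioned above.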

13. Commands that every user needs to run, such as ls: how are their permissions assigned? The owner can read, write, and execute, while the group and all other users can read and execute. You may now wonder: what happens if a file is executable but not readable? For an interpreted script it cannot run, because the interpreter must read the file; a compiled binary, however, can still be executed with execute permission alone, since the kernel, not the user, reads it during loading.

[root@apm-0001 ~]# ls -l /usr/bin/ls
-rwxr-xr-x. 1 root root 117680 Oct 31  2018 /usr/bin/ls

14. When the user types a file name without giving a full path, Linux searches a set of directories for the executable. You can see which directories are searched with echo $PATH.
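For example, command -v reports which PATH directory a command was resolved from:

```shell
echo "$PATH"     # colon-separated list of directories, searched in order
command -v ls    # full path of the ls that would actually run, e.g. /usr/bin/ls
```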

15. The kernel is the core of the operating system: it connects to the hardware, provides operations on disks, memory paging, processes, and so on, and has permission to access all memory directly. The kernel therefore cannot expose all of its capabilities to users, nor let users invoke them freely through shell commands. Under Linux, the kernel exposes the system calls that processes need in the form of a C language API.

16. The main goal of a good permission architecture is to keep the system safe and stable while restricting and isolating users and programs from one another. This requires that the division of permissions is clear enough and that the cost of granting a permission is low enough. An excellent architecture therefore follows the Least Privilege principle.

17. Briefly, what principles guide Linux permission design? Linux follows the principle of least privilege:

  • The permissions of each user and each group should be as small as possible. In production it is best to split administrator privileges so that admins check one another and problems are contained.
  • Each application should have as little access as possible. Ideally each application runs in its own container (such as Docker), so applications cannot interfere with one another; even if one is compromised, the attacker still has to break out of the container.
  • Use root as little as possible. If a user needs root capability, the elevation should be tightly scoped: raise privileges right before the operation (for example with sudo) and drop them immediately afterwards.
  • Permissions are protected in layers at the system level: privileges are divided into rings, and when an outer ring calls into an inner ring, the inner ring verifies the caller's permissions.

18. Can everyone just log in as root and use only the root account? Of course not! Suppose you run a MySQL process under the root account. If a hacker compromises your MySQL service and gains the ability to execute SQL, your whole system is exposed to the hacker, which can have very serious consequences.

Hackers can then use a command like COPY FROM PROGRAM to do whatever they want, such as backing up your key files, deleting them, and extorting money from you through a specified account. If the principle of least privilege is enforced, then even if a hacker breaks into our MySQL service, they obtain only minimal privileges. A hacker with MySQL privileges is still bad, but the loss is far smaller than handing over all privileges.

19. The ifconfig command is used to configure and display the network parameters of the network interface in the Linux kernel.

ifconfig                # Show active network interfaces
ifconfig -a             # Show all configured interfaces, whether active or not
ifconfig eth0           # Show information for interface eth0
ifconfig eth0 mtu 1500  # Set the maximum transmission unit to 1500 bytes
ifconfig eth0 arp       # Enable ARP on eth0
ifconfig eth0 -arp      # Disable ARP on eth0
ifconfig eth0 up        # Bring the interface up
ifconfig eth0 down      # Bring the interface down

20. The netstat command is used to print the status information of the network system in Linux, which can let you know the network situation of the Linux system.

# -a or --all: show all sockets, including listening ones;
# -n or --numeric: show numeric addresses instead of resolving names;
# -l or --listening: show only listening server sockets;
# -r or --route: show the routing table;
# -t or --tcp: show TCP connections;
# -u or --udp: show UDP connections;
# -p or --programs: show the PID and name of the program using each socket;
# -i or --interfaces: show the network interface table;
netstat -ap | grep java # Find the port a program is using
netstat -anp | grep 8081 | grep LISTEN | awk '{print $7}' | cut -d/ -f1 # Find the process ID listening on a port
netstat -ntu | grep :80 | awk '{print $5}' | cut -d: -f1 | awk '{++ip[$1]} END {for(i in ip) print ip[i],"\t",i}' | sort -nr # Find the IPs with the most connections to a service port
netstat -nt | grep -e 127.0.0.1 -e 0.0.0.0 -e ::: -v | awk '/^tcp/ {++state[$NF]} END {for(i in state) print i,"\t",state[i]}' # Count TCP connections by state
netstat -an | tail -n +3 | grep TIME_WAIT | wc -l # Count connections in TIME_WAIT state (netstat prints two header lines, which tail filters out)

21. ss (socket statistics) is better at socket statistics than netstat. It is another tool shipped in the iproute2 package and lets you query socket statistics.

When the number of socket connections on a server becomes very large, both the netstat command and a direct cat /proc/net/tcp become very slow. You may not notice on a small machine, but please believe me: when a server holds tens of thousands of connections, using netstat is a waste of life, while using ss saves time.

The secret of ss is that it uses tcp_diag in the TCP stack. tcp_diag is a kernel module for analysis and statistics that obtains first-hand information from the Linux kernel, which is what makes ss fast and efficient. If your system lacks tcp_diag, ss still works normally, just somewhat more slowly.

# -a, --all: show all sockets
# -n, --numeric: do not resolve service names
# -l, --listening: show sockets in the listening state
# -t, --tcp: show only TCP sockets
# -u, --udp: show only UDP sockets
# -p, --processes: show the processes using each socket
ss -s       # Show a summary of socket statistics
ss -l       # List all listening sockets
ss -pl      # Show which process uses each listening socket
ss -tan | awk 'NR>1{++S[$1]}END{for (a in S) print a,S[a]}' # Count TCP connections by state

22. awk is a Domain Specific Language (DSL) for processing text. What is a Domain Specific Language? It is a language designed specifically for one problem domain. For example, awk is a DSL for analyzing and processing text, HTML is a DSL for describing web pages, and SQL is a DSL for querying data.

# Group the nginx access log by day to count PV (Page Views)
awk '{print substr($4, 2, 11)}' access.log | sort | uniq -c
# Analyze UV (Unique Visitors) in the nginx access log: count distinct visitor IPs
awk '{print $1}' access.log | sort | uniq | wc -l
# Group the nginx access log by day to analyze UV per day
awk '{print substr($4,2,11) " " $1}' access.log | sort | uniq | awk '{uv[$1]++}END{for (day in uv) print day, uv[day]}'

# Group the nginx access log by user agent to see which clients visited the site
awk -F\" '{print $6}' access.log | sort | uniq -c | sort -nr
# Analyze the nginx access log for the top three most-visited pages
awk '{print $7}' access.log | sort | uniq -c | sort -nr | head -n 3