Learning notes on Linux performance optimization

Posted by jdorsch on Tue, 01 Feb 2022 18:12:21 +0100

1. What is the average load?

uptime command

[root@b0b5a9371ce4 /]# uptime
 09:59:49 up 11 days, 14:50,  0 users,  load average: 0.16, 0.07, 0.02
 
 
09:59:49              //Current time
up 11 days, 14:50     //System uptime
0 users               //Number of logged-in users
//load average is the average load over the past 1, 5 and 15 minutes
load average: 0.16, 0.07, 0.02

Load average is the average number of processes in the runnable or uninterruptible state per unit of time, that is, the average number of active processes. It is not directly tied to CPU utilization: processes waiting on I/O in the uninterruptible state raise the load without using the CPU.
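As a rough way to read these numbers, the sketch below divides a load average by the number of CPUs. A result near or above 1.0 suggests there are more active processes than CPUs. The function name load_per_cpu is made up for this example, the 4-CPU machine is an assumption, and the live usage assumes /proc/loadavg and nproc are available.

```shell
# load_per_cpu LOAD NCPU: print the load average divided by the CPU count.
# (hypothetical helper, not a standard tool)
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# 1-minute value from the uptime output above, on an assumed 4-CPU machine:
load_per_cpu 0.16 4

# On a live Linux system:
# load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

With the sample values this prints 0.04, so the machine is almost idle.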

2. How to view CPU context switches?

vmstat 5

Parameter interpretation:

cs (context switch) is the number of context switches per second.
in (interrupt) is the number of interrupts per second.
r (Running or Runnable) is the length of the ready queue, that is, the number of processes running and waiting for CPU.
b (Blocked) is the number of processes in uninterruptible sleep.

To see per-process details, use pidstat. With the -w option it reports the context switches of each process.

pidstat -w 5  // Output a set of data every 5 seconds
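pidstat -w reports voluntary and non-voluntary context switches per second. When pidstat is not installed, the cumulative counters can also be read from /proc/<pid>/status. The following is a minimal sketch; the function name ctxt_switches is an assumption for this example, not a standard tool.

```shell
# ctxt_switches: read /proc/<pid>/status text on stdin and print
# "<voluntary> <nonvoluntary>" cumulative context switch counts.
# (hypothetical helper name)
ctxt_switches() {
    awk '/^voluntary_ctxt_switches:/    { v = $2 }
         /^nonvoluntary_ctxt_switches:/ { n = $2 }
         END { print v + 0, n + 0 }'
}

# On a live Linux system, for the current shell:
# ctxt_switches < /proc/$$/status
```

Sampling the counters twice and subtracting gives the number of switches during the interval.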

3. High CPU utilization, but no application can be found?

pstree can display the relationship between all processes in tree form:

# pstree -apnh  // Displays the relationship between processes

perf can be used to analyze CPU performance events, which makes it a good fit here. Run perf record -g in the first terminal, wait a while (for example, 15 seconds), then press Ctrl+C to exit. Then run perf report to view the report:

# Record performance events, wait about 15 seconds and press Ctrl+C to exit
$ perf record -g

# View report
$ perf report

Monitoring the case above with execsnoop directly shows the parent PID and command-line arguments of the stress processes, and reveals that a large number of stress processes are constantly being started.

execsnoop is built on ftrace, a commonly used dynamic tracing technology that is generally used to analyze the runtime behavior of the Linux kernel. It is covered in more detail in later lessons of the course.

4. What to do about a large number of zombie and uninterruptible processes in the system?

Process status, use ps aux or top -c to view:

[push@cp-pre Log]$ ps aux | grep Swoole
push      9783  0.0  0.0 112828   980 pts/1    S+   10:25   0:00 grep --color=auto Swoole
work     13303  0.8  0.9 630268 78060 ?        Sl   08:23   1:02 EasySwoole.TaskWorker.2
work     14816  0.8  0.9 1269480 72984 ?       Sl   08:29   1:00 EasySwoole.TaskWorker.1
work     15659  0.8  0.9 1322792 74372 ?       Sl   08:32   0:58 EasySwoole.TaskWorker.0
work     19629  0.8  0.8 1433160 70596 ?       Sl   08:49   0:49 EasySwoole.TaskWorker.3
work     26367  1.2  0.7 1219700 59412 ?       Ssl  Nov03  12:24 EasySwoole
work     26368  0.0  0.3 492740 25000 ?        S    Nov03   0:00 EasySwoole
work     26372  0.7  0.6 590724 49600 ?        Sl   Nov03   7:14 EasySwoole.Worker.0
work     26373  0.7  0.6 615132 48104 ?        Sl   Nov03   7:19 EasySwoole.Worker.1
work     26374  0.7  0.6 593148 50568 ?        Sl   Nov03   7:14 EasySwoole.Worker.2
work     26375  0.7  0.6 591100 49444 ?        Sl   Nov03   7:17 EasySwoole.Worker.3
work     26376  0.7  0.6 593116 49728 ?        Sl   Nov03   7:16 EasySwoole.Worker.4
work     26377  0.7  0.6 601700 48700 ?        Sl   Nov03   7:12 EasySwoole.Worker.5
work     26378  0.7  0.6 591128 49316 ?        Sl   Nov03   7:17 EasySwoole.Worker.6
work     26379  0.7  0.6 612000 48740 ?        Sl   Nov03   7:18 EasySwoole.Worker.7
work     26380  0.0  0.3 496888 26372 ?        S    Nov03   0:44 EasySwoole
work     26381  0.0  0.3 496888 27228 ?        S    Nov03   0:04 EasySwoole.Crontab
work     26386  0.0  0.3 498936 25600 ?        S    Nov03   0:03 EasySwoole.Bridge

R is short for Running or Runnable: the process is running on a CPU or waiting in the CPU's ready queue.
D is short for Disk Sleep, that is, uninterruptible sleep. It generally means the process is interacting with hardware, and that interaction must not be interrupted by other processes or interrupts.
Z is short for Zombie. If you have played "Plants vs. Zombies" you know the idea: the process has actually ended, but its parent has not yet reclaimed its resources (such as the process descriptor and PID).
S is short for Interruptible Sleep: the process has been suspended by the system while it waits for an event. When that event occurs, the process is woken up and enters the R state.
I is short for Idle, and is used for kernel threads in uninterruptible sleep. As noted above, uninterruptible sleep caused by hardware interaction is shown as D, but some kernel threads in that state carry no actual load, and Idle marks that case. Note that processes in state D raise the load average, while processes in state I do not.
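Since a zombie only goes away once its parent reclaims it, the useful thing to find is the parent. The sketch below groups zombie processes by parent PID. The function name zombie_parents is made up for this example, and the input is assumed to be rows of "PID PPID STAT COMMAND".

```shell
# zombie_parents: read "PID PPID STAT COMMAND" rows on stdin and print
# "<ppid> <zombie count>" for every parent that has zombie (Z) children.
# (hypothetical helper name)
zombie_parents() {
    awk '$3 ~ /^Z/ { z[$2]++ } END { for (p in z) print p, z[p] }' | sort -n
}

# On a live system (one -o per column so that the headers are suppressed):
# ps -e -o pid= -o ppid= -o stat= -o comm= | zombie_parents
```

A parent with a steadily growing count is the process to inspect.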

Session and process group: s indicates that the process is a session leader, while + indicates that it belongs to the foreground process group.

Process group refers to a group of interrelated processes. For example, each child process is a member of the parent process's group;
Session refers to one or more process groups sharing the same control terminal.

With the familiar ps or top you can view the state of a process: running (R), idle (I), uninterruptible sleep (D), interruptible sleep (S), zombie (Z) and stopped (T).
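To get a quick overview of how many processes are in each state, the first letter of the STAT column can be counted. This is only a sketch; the function name count_states is an assumption for this example.

```shell
# count_states: read STAT values (one per line) on stdin and print
# "<state letter> <count>", sorted by state letter.
# (hypothetical helper name)
count_states() {
    awk '{ c[substr($1, 1, 1)]++ } END { for (s in c) print s, c[s] }' | LC_ALL=C sort
}

# On a live system:
# ps -e -o stat= | count_states
```

Only the first letter is used, so modifiers such as s, l and + are ignored.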

Solution steps:

1. Run dstat command in the terminal to observe the usage of CPU and I/O:

dstat 1 10

Use pidstat, but this time remove the process number and simply observe the I/O usage of all processes.

# Output multiple groups of data at an interval of 1 second (here are 20 groups)
$ pidstat -d 1 20

5. How to approach CPU performance optimization?

Take a web application as an example:

From the application's point of view, throughput and request latency measure how well the application performs.
From the system resource point of view, CPU utilization measures how heavily the system's CPUs are used.
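CPU utilization itself can be derived from two samples of the first line of /proc/stat. The sketch below is one way to do it, not the method used by any particular tool: it treats idle and iowait time as "not busy" (a choice made for this example) and sums the fields from user through steal. The function name cpu_util is hypothetical.

```shell
# cpu_util "LINE1" "LINE2": percent of CPU time spent busy between two
# samples of the aggregate "cpu ..." line from /proc/stat.
# (hypothetical helper name)
cpu_util() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            idle = $5 + $6                                     # idle + iowait
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i    # user .. steal
            if (NR == 1) { idle0 = idle; total0 = total }
        }
        END {
            dt = total - total0
            if (dt > 0) printf "%.1f\n", (1 - (idle - idle0) / dt) * 100
            else print "0.0"
        }'
}

# On a live Linux system:
# a=$(head -n 1 /proc/stat); sleep 1; b=$(head -n 1 /proc/stat); cpu_util "$a" "$b"
```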

6. How does Linux memory work?

Memory mapping: the Linux kernel gives each process its own independent virtual address space, and this address space is contiguous. This makes it easy for the process to access memory, or more precisely, virtual memory.

Memory mapping is actually mapping the virtual memory address to the physical memory address. In order to complete the memory mapping, the kernel maintains a page table for each process to record the mapping relationship between virtual address and physical address.

Virtual memory space layout:

1. Read-only segment, including code and constants.
2. Data segment, including global variables.
3. Heap, including dynamically allocated memory; it grows upward from low addresses.
4. File mapping segment, including dynamic libraries and shared memory; it grows downward from high addresses.
5. Stack, including local variables and the context of function calls. The stack size is fixed, usually 8 MB.

Of these five segments, heap and file mapping memory is allocated dynamically. For example, the C standard library's malloc() and mmap() allocate memory in the heap and in the file mapping segment, respectively.
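The layout of a running process can be inspected in /proc/<pid>/maps, where the heap and stack mappings are labelled [heap] and [stack]. The sketch below prints the size of such a labelled mapping. The function name segment_kib is made up for this example, and it assumes a shell whose arithmetic accepts 0x hexadecimal constants.

```shell
# segment_kib NAME: read /proc/<pid>/maps text on stdin and print the size
# in KiB of each mapping labelled [NAME], for example heap or stack.
# (hypothetical helper name)
segment_kib() {
    # Each line starts with "start-end perms ...", addresses in hexadecimal.
    while IFS=' -' read -r start end rest; do
        case $rest in
            *"[$1]") echo $(( (0x$end - 0x$start) / 1024 )) ;;
        esac
    done
}

# On a live Linux system, for the current shell:
# segment_kib heap < /proc/$$/maps
# segment_kib stack < /proc/$$/maps
```

Note that [stack] shows the stack region currently mapped, which is normally much smaller than the size limit.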

How to view memory usage:

free

In the first column, total is the total memory size;
In the second column, used is the size of used memory, including shared memory;
In the third column, free is the size of unused memory;
In the fourth column, shared is the size of shared memory;
In the fifth column, buff/cache is the size of cache and buffer;
In the last column, available is the amount of memory available for the new process.
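The available column is usually the most useful one for judging memory pressure. As a small sketch, the same value can be read from the MemAvailable field of /proc/meminfo and expressed as a percentage of total memory. The function name mem_available_pct is an assumption for this example.

```shell
# mem_available_pct: read /proc/meminfo text on stdin and print
# MemAvailable as a percentage of MemTotal.
# (hypothetical helper name)
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", avail * 100 / total }'
}

# On a live Linux system:
# mem_available_pct < /proc/meminfo
```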

Buffer and Cache: Buffer is the cache for disk data, while Cache is the cache for file data. Both are used for read requests as well as write requests.

Memory reclamation: stack memory is allocated and managed automatically by the system. Once execution leaves the scope of a local variable, its stack memory is reclaimed automatically, so it cannot leak. Heap memory is allocated and managed by the application itself, so it leaks if the application never frees it.

7. Linux cache hit rate

The cache hit rate is the number of requests served directly from the cache, as a percentage of all data requests.

cachestat and cachetop are tools for checking system cache hits. (They must be installed manually.)
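The hit rate definition above is simple arithmetic. Here is a worked sketch with made-up numbers (950 hits and 50 misses); the function name hit_rate is hypothetical.

```shell
# hit_rate HITS MISSES: print hits as a percentage of all requests.
# (hypothetical helper name)
hit_rate() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "n/a"; exit }
        printf "%.1f\n", h * 100 / total
    }'
}

# 950 requests served from the cache, 50 that had to go to disk:
hit_rate 950 50
```

With these sample values the result is 95.0, meaning 95% of requests were served from the cache.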

8. Linux file system structure

The Linux file system assigns two data structures to each file: the index node and the directory entry.

The index node, inode for short, records the file's metadata, such as the inode number, file size, access permissions, modification time and data location. Inodes map one-to-one to files. Like file contents, they are persisted to disk, so remember that inodes also take up disk space.
The directory entry, dentry for short, records the file's name, a pointer to its inode, and its relationships to other directory entries. Linked directory entries make up the directory structure of the file system. Unlike inodes, however, directory entries are an in-memory data structure maintained by the kernel, which is why they are often called the directory entry cache.

Directory entries, index nodes, logical blocks and super blocks constitute the four basic elements of Linux file system.

VFS manages files through data structures such as directory entries, index nodes, logical blocks and super blocks.

Directory entry: records the file's name and its directory relationships with other directory entries.
Index node: records the file's metadata.
Logical block: the smallest read/write unit, made up of contiguous disk sectors, used to store file data.
Super block: records the overall state of the file system, such as the usage of index nodes and logical blocks.

Of these, directory entries are an in-memory cache, while super blocks, index nodes and logical blocks are persistent data stored on disk.

Capacity view (the -i option reports inode usage instead of block usage):

df -hi
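Because inodes can run out while free space remains, it helps to flag file systems whose inode usage is high. The sketch below filters df -i output; the function name inode_hogs and the default threshold of 80 are assumptions for this example, and mount points containing spaces are not handled.

```shell
# inode_hogs [LIMIT]: read `df -i` output on stdin and print
# "<mount point> <IUse%>" for file systems at or above LIMIT percent.
# (hypothetical helper name; LIMIT defaults to 80)
inode_hogs() {
    awk -v limit="${1:-80}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # strip the percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# On a live system:
# df -i | inode_hogs 90
```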

9. How does Linux disk I/O work?

Disk performance is usually measured with five common indicators: utilization, saturation, IOPS, throughput and response time.

- Utilization is the percentage of time the disk spends processing I/O. High utilization (for example, over 80%) usually means the disk has an I/O performance bottleneck.
- Saturation is how busy the disk is processing I/O. High saturation means the disk has a serious performance bottleneck. At 100% saturation the disk cannot accept new I/O requests.
- IOPS (Input/Output Per Second) is the number of I/O requests per second.
- Throughput is the amount of I/O data transferred per second.
- Response time is the interval between sending an I/O request and receiving the response.
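IOPS and throughput are both rates, so they can be computed from counter differences over an interval. The worked sketch below uses made-up numbers and assumes the counts come from /proc/diskstats, where sectors are counted in 512-byte units. The function name disk_rates is hypothetical.

```shell
# disk_rates IOS SECTORS SECONDS: print IOPS and throughput for an interval,
# given the number of completed I/Os and transferred 512-byte sectors.
# (hypothetical helper name)
disk_rates() {
    awk -v ios="$1" -v sectors="$2" -v secs="$3" 'BEGIN {
        printf "%.1f IOPS, %.1f MiB/s\n", ios / secs, sectors * 512 / secs / 1048576
    }'
}

# 1000 I/Os and 204800 sectors transferred in 10 seconds:
disk_rates 1000 204800 10
```

With these sample values the result is 100.0 IOPS at 10.0 MiB/s.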

10. View socket information

Socket information:

# head -n 3 indicates that only the first three lines are displayed
# -l indicates that only listening sockets are displayed
# -n shows the numeric address and port (not the name)
# -p means to display process information
$ netstat -nlp | head -n 3
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      840/systemd-resolve

# -l indicates that only listening sockets are displayed
# -t indicates that only TCP sockets are displayed
# -n shows the numeric address and port (not the name)
# -p means to display process information
$ ss -ltnp | head -n 3
State    Recv-Q    Send-Q        Local Address:Port        Peer Address:Port
LISTEN   0         128           127.0.0.53%lo:53               0.0.0.0:*        users:(("systemd-resolve",pid=840,fd=13))
LISTEN   0         128                 0.0.0.0:22               0.0.0.0:*        users:(("sshd",pid=1459,fd=3))
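For scripting, the listening ports can be pulled out of the ss output. This is only a sketch: the function name listen_ports is made up, and it assumes the column layout shown above, with the local address and port in the fourth column.

```shell
# listen_ports: read `ss -ltn` (or `ss -ltnp`) output on stdin and print
# the unique listening port numbers in ascending order.
# (hypothetical helper name)
listen_ports() {
    # The port is whatever follows the last ":" of the local address,
    # which also works for IPv6 addresses such as [::]:22.
    awk '$1 == "LISTEN" { n = split($4, a, ":"); print a[n] }' | sort -n -u
}

# On a live system:
# ss -ltn | listen_ports
```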

Since the low-level protocol is the basis of the high-level protocol, generally speaking, the network optimization we call actually includes the optimization of all layers of the whole network protocol stack.

11. Use tcpdump/Wireshark to analyze network traffic?

tcpdump and Wireshark are the most commonly used network packet capture and analysis tools, and they are also essential tools for analyzing network performance.

12. Practice

Command to view system OOM (out of memory) events:

dmesg

Locating packet loss at the link layer:

netstat -i

RX-OK, RX-ERR, RX-DRP and RX-OVR in the output are, respectively, the total number of packets received, the total number of receive errors, the number of packets dropped after entering the Ring Buffer for other reasons (such as insufficient memory), and the number of packets dropped because the Ring Buffer overflowed.

TX-OK, TX-ERR, TX-DRP and TX-OVR have the same meanings for the sending side.
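To spot interfaces that are actually dropping received packets, the RX-DRP and RX-OVR columns can be checked for non-zero values. The sketch below locates the columns by header name, since the exact column layout can differ between netstat versions. The function name rx_drops is an assumption for this example.

```shell
# rx_drops: read `netstat -i` output on stdin and print
# "<iface> <RX-DRP> <RX-OVR>" for interfaces with receive drops or overruns.
# (hypothetical helper name)
rx_drops() {
    awk '$1 == "Iface" { for (i = 1; i <= NF; i++) col[$i] = i; next }
         ("RX-DRP" in col) && ("RX-OVR" in col) {
             drp = $(col["RX-DRP"]); ovr = $(col["RX-OVR"])
             if (drp + ovr > 0) print $1, drp, ovr
         }'
}

# On a live system:
# netstat -i | rx_drops
```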

Locating packet loss at the network and transport layers:

netstat -s

Mainly observe the data of IpExt.

iptables rules are managed in a set of tables: filter (for filtering), nat (for network address translation), mangle (for modifying packet data) and raw (for raw packets). Each table contains a series of chains that group and manage iptables rules.
