[TCP/IP] [debugging] one of a series of verification methods for packet loss, blocked flow, wrong packet and other problems - log mode

Posted by charlesg on Sat, 23 Oct 2021 19:23:00 +0200

preface

  • In the process of TCP/IP related development, it is often necessary to check the problems such as packet loss, wrong packet and blocked flow. Because the TCP/IP forwarding surface involves software and hardware, and there are many functions and branches in the software process, it is difficult to add printing or analyze the process.
  • This series of articles will briefly summarize and explain the commonly used positioning methods. You can choose one or combine them according to the situation of the problem.

Introduction to log mode

  • Although printing is relatively Low, in some cases it is the most direct and effective way, and in some cases it is the killer mace of developers.

Direct print message

  • Call print directly where needed, similar to the following_ SKB function.
void print_skb(struct sk_buff *skb)
{
    if (skb) {
        char *buf = skb->data;
        int len = skb->len;
        int i = 0;

        printk("[%s:%d]Packet length = %#4x\n", __FUNCTION__, __LINE__, len);
        for (i = 0; i < len; i++){
            if (i % 16 == 0) printk("%#4.4x", i);
            if (i % 2 == 0) printk(" ");
            printk("%2.2x", ((unsigned char *)buf)[i]);
            if (i % 16 == 15) printk("\n");
        }
        printk("\n\n");
    }
}

It is worth mentioning that in some kernel versions, the above code is not printed completely, and SKB - > data needs to be replaced with SKB - > Mac_ header.
In addition, if there are too many messages in this printing method, the screen will be swiped, which may bring the system down.

Conditional printing

  • To print according to the specific fields in the message, a field needs to be added in the skb structure:
struct sk_buff {
  unsigned int dbg_flag;
}

For a specific location, first judge whether a skb needs to be printed. For example, to track the DHCP message, identify the DHCP message at the location of an entry or point and put the flag on the location. For details, refer to the following code segment:

struct ethhdr *mh = (struct ethhdr *)(skb->mac_header);
if(mh->h_proto != htons(ETH_P_IP)) {
    return;
}

struct iphdr *iph = (struct iphdr *)ip_hdr(skb);
if(iph && iph->protocol != IPPROTO_UDP) {
    return ;
}
    // ip header --> iph->ihl*4;
struct udphdr *udph = (struct udphdr *) ((char *)iph + iph->ihl*4);
if(udph && udph->source == htons(68) && udph->dest == htons(67)) {
    skb->dbg_flag = 1;
}

Then, at other subsequent points, you can directly control printing according to this flag:

if (1 == skb->dbg_flag) {
    print_skb(skb);
}

Switch controlled printing

  • Modify a kernel variable through a proc file, and then control printing according to the variable.
    echo 1>/proc/net/logctl //Assume that the logctl file corresponds to g_logctl variable

Specific controls in the code:

    if (1 == g_logctl) {
        print_skb(skb);
    }

ratelimit speed limit printing

  • The speed limit judgment function can use net_ratelimit or printk_ratelimit;
 if (net_ratelimit()) {
     print_skb(skb);
 }
 
  if (printk_ratelimit()) {
     print_skb(skb);
 }

There is a defect in this printing, which may be suppressed by the speed limit and cannot be printed. At this time, the following printing will appear:

__ratelimit: 250 callbacks suppressed

The specific speed limit parameters can be modified in the following ways:

[qxhgd@localhost ~]$ cat /proc/sys/kernel/printk_ratelimit
5
[qxhgd@localhost ~]$ cat /proc/sys/kernel/printk_ratelimit_burst
10
# printk_ratelimit allows up to 10 messages to be printed in 5s by default: / proc/sys/kernel/printk_ratelimit (how long) and / proc/sys/kernel/printk_ratelimit_burst (maximum number of messages allowed in the above time period)

Custom speed limit printing

  • The system time can also be used to control printing from the line as required. For example, the following code segment can control printing of up to one message per second:
unsigned long g_lastTime = 0; #global variable
if (jiffies - g_lastTime > 1 * HZ)  {
    print_skb(skb);
    g_lastTime = jiffies;
}

above.

If this article is helpful to you, you are welcome to reward:
Alipay and WeChat

Topics: TCPIP