Analysis of Flannel's host-gw mode in Kubernetes networking

Posted by mchip on Thu, 20 Jan 2022 10:36:51 +0100

1. Flannel's host-gw mode

Flannel's host-gw mode is a pure layer 3 networking scheme: Pods on different nodes reach each other through ordinary IP routing. Cross-node traffic is forwarded according to the routing table on each node, with the destination node's IP address as the next hop, so the hosts on both sides of the communication must be able to reach each other directly. This means that in this mode all nodes in the cluster must be on the same layer 2 network. In a public cloud environment, a corresponding plug-in, such as Alibaba Cloud CCM, needs to be installed.
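The idea can be modelled in a few lines of Python. This is only a sketch of the routes that end up on a node, not Flannel's actual implementation. The node names are hypothetical, and the node IPs and Pod subnets are borrowed from the routing tables shown later in this post.

```python
import ipaddress

# Hypothetical node names; the node IPs and Pod subnets are the ones that
# appear in the routing tables later in this post.
NODES = {
    "node-a": {"ip": "192.168.0.3", "pod_cidr": "10.10.2.0/24"},
    "node-b": {"ip": "192.168.0.4", "pod_cidr": "10.10.3.0/24"},
    "node-c": {"ip": "192.168.0.5", "pod_cidr": "10.10.4.0/24"},
}


def host_gw_routes(local, nodes, dev="eth0"):
    """Return the routes a node needs to reach every other node's Pod subnet."""
    routes = []
    for name, node in sorted(nodes.items()):
        if name == local:
            continue  # the local Pod subnet is reached through cni0, not a gateway
        # Validate the CIDR and the gateway address before formatting the route.
        subnet = ipaddress.ip_network(node["pod_cidr"])
        gateway = ipaddress.ip_address(node["ip"])
        routes.append(f"{subnet} via {gateway} dev {dev}")
    return routes


if __name__ == "__main__":
    for route in host_gw_routes("node-a", NODES):
        print(route)
```

Each remote Pod subnet becomes one route whose next hop is the IP of the node that owns it, which is why the nodes have to be directly reachable from one another.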

2. Networking in host-gw mode

The main network components involved in Flannel's host-gw mode are:

  • veth pair
  • cni0 bridge
  • Physical network card
  • Routing table

We start from Pod A to explore how the network is wired. First, enter the network namespace of Pod A and view its routing table:

[root@cn-beijing ~]# ip route
default via 10.10.2.1 dev eth0
10.10.0.0/16 via 10.10.2.1 dev eth0
10.10.2.0/24 dev eth0 proto kernel scope link src 10.10.2.238

Packets from Pod A to destinations outside its own subnet are sent to 10.10.2.1 through eth0. This raises two questions:

  1. Which device owns the IP 10.10.2.1?
  2. How is eth0 connected to the device that owns 10.10.2.1?

Exit the network namespace of Pod A and list the IP addresses of the devices on the host:

[root@cn-beijing ~]# ip a
...
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 2e:5b:e9:3d:88:51 brd ff:ff:ff:ff:ff:ff
    inet 10.10.2.1/24 brd 10.10.2.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::2c5b:e9ff:fe3d:8851/64 scope link
       valid_lft forever preferred_lft forever
...

10.10.2.1 turns out to be the address of the cni0 bridge. So how is Pod A connected to the cni0 bridge? Let's first look at the interfaces attached to the bridge:

[root@cn-beijing ~]# bridge link
2270: vethef2d98bc state UP @docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master cni0 state forwarding priority 32 cost 2

You can see that the device vethef2d98bc is attached to the cni0 bridge (master cni0).

Enter the network namespace of Pod A again and check the type of its eth0 interface:

[root@cn-beijing ~]# ip -d link show eth0
3: eth0@if2270: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 86:41:d0:e3:0c:65 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0
    veth addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

eth0 is one end of a veth pair. Next, find the interface index of its peer:

[root@cn-beijing ~]# ip address show dev eth0
3: eth0@if2270: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 86:41:d0:e3:0c:65 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.2.238/24 brd 10.10.2.255 scope global eth0
       valid_lft forever preferred_lft forever

The output shows that the interface index of eth0's veth peer is 2270 (@if2270).
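Reading the peer index off the `@ifN` suffix by eye is easy to get wrong on a busy node, so here is a minimal Python sketch that extracts it. The regular expression is an assumption based on the output format shown above, not a formal grammar of iproute2 output.

```python
import re

# First line of `ip address show dev eth0` inside Pod A, copied from above.
LINE = ("3: eth0@if2270: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 "
        "qdisc noqueue state UP group default")

# "<index>: <name>@if<peer index>:" is the form in which the veth end is
# printed when its peer is not visible by name in the current namespace.
PATTERN = re.compile(r"^(\d+):\s+([^@:\s]+)@if(\d+):")


def parse_link(line):
    """Return (ifindex, name, peer_ifindex), or None if there is no @ifN part."""
    m = PATTERN.match(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2), int(m.group(3))


print(parse_link(LINE))
```

The third element of the result is the index to look for in the host namespace.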

Go back to the host network namespace of Node A and find the network device with index 2270:

[root@cn-beijing ~]# ip a
...
2270: vethef2d98bc@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
    link/ether 5e:14:e7:9e:ca:fd brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::5c14:e7ff:fe9e:cafd/64 scope link
       valid_lft forever preferred_lft forever
...

The device with index 2270 is vethef2d98bc, and it is attached to the cni0 bridge (master cni0). This answers the question of how Pod A is connected to 10.10.2.1.
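The host-side lookup can be sketched the same way: scan the header lines of the `ip a` output for a given index and report the device name and the bridge it is enslaved to. The excerpt below keeps only the two header lines from the output above, and the parsing is a simplification that assumes this exact layout.

```python
import re

# Header lines of `ip a` on Node A, copied from the outputs above.
IP_A_OUTPUT = """\
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
2270: vethef2d98bc@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
"""

# index, name, optional "@peer" suffix, flags in <...>, then the attributes.
HEADER = re.compile(r"^(\d+):\s+([^@:\s]+)(?:@\S+)?:\s+<[^>]*>(.*)$")


def find_by_index(output, ifindex):
    """Return (name, master) for the interface with the given index, or None."""
    for line in output.splitlines():
        m = HEADER.match(line)
        if m is None or int(m.group(1)) != ifindex:
            continue
        master = re.search(r"\bmaster (\S+)", m.group(3))
        return m.group(2), (master.group(1) if master else None)
    return None


print(find_by_index(IP_A_OUTPUT, 2270))
```

A master of `None` means the device is not attached to any bridge, as is the case for cni0 itself.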

In host-gw mode, each node owns an independent Pod subnet, whose gateway address is configured on the node's cni0 bridge. As shown in the figure, the Pod subnet of Node A is 10.10.2.0/24, and the IP of its cni0 bridge is 10.10.2.1.
The next question is what happens once a packet from Pod A reaches cni0. Let's follow an ICMP packet to find out.
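The relationship between the bridge address and the node's Pod subnet can be checked with Python's standard `ipaddress` module. The addresses are the ones seen so far in this post; the variable names are only illustrative.

```python
import ipaddress

# cni0's address on Node A, as shown by `ip a` above.
cni0 = ipaddress.ip_interface("10.10.2.1/24")
pod_a = ipaddress.ip_address("10.10.2.238")
pod_b = ipaddress.ip_address("10.10.3.4")

print(cni0.network)           # the Pod subnet owned by Node A
print(pod_a in cni0.network)  # Pod A is local to this node
print(pod_b in cni0.network)  # 10.10.3.4 belongs to another node's subnet
```

Because 10.10.3.4 is outside Node A's Pod subnet, a packet addressed to it cannot be delivered on the bridge and has to be routed by the host.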

3. Illustrated packet flow in host-gw mode

As the previous section showed, ICMP packets from Pod A reach the cni0 bridge through the Pod's routing table and the veth pair. What happens to a packet after the bridge receives it?
To capture the flow, we enable the TRACE target of iptables. Run the following commands on both Node A and Node B:

[root@cn-beijing ~]# iptables -t raw -A OUTPUT -p icmp -j TRACE
[root@cn-beijing ~]# iptables -t raw -A PREROUTING -p icmp -j TRACE

With TRACE enabled, the path each packet takes through the iptables chains is logged to the /var/log/messages file.
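Each TRACE entry names the table, chain, entry type and rule number, followed by `KEY=value` fields describing the packet. The following Python sketch splits one such line into its parts. The sample is the first log line from section 3.1; the parser is an illustration that assumes this layout, not a general syslog parser.

```python
import re

# First TRACE line from section 3.1, copied verbatim.
SAMPLE = ("Jan 19 17:19:47 cn-beijing kernel: TRACE: raw:PREROUTING:policy:2 "
          "IN=cni0 OUT= PHYSIN=vethef2d98bc "
          "MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 "
          "SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=64 "
          "ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0")

# table:chain:type:rulenum, then the rest of the line.
TRACE = re.compile(r"TRACE: (\w+):([\w-]+):(\w+):(\d+) (.*)$")


def parse_trace(line):
    """Split a TRACE log line into its rule position and key=value fields."""
    m = TRACE.search(line)
    if m is None:
        return None
    table, chain, kind, num, rest = m.groups()
    fields = {}
    for token in rest.split():
        key, sep, value = token.partition("=")
        # Skip bare flags such as DF, and keep only the first ID= (IP header).
        if sep and key not in fields:
            fields[key] = value
    return {"table": table, "chain": chain, "type": kind,
            "rule": int(num), "fields": fields}


entry = parse_trace(SAMPLE)
print(entry["table"], entry["chain"], entry["fields"]["IN"], entry["fields"]["TTL"])
```

The `IN`, `OUT` and `TTL` fields are the ones that matter most when following the packet from hook to hook.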
Now enter the network namespace of Pod A and ping 10.10.3.4, a Pod on Node B (to keep the analysis simple, we ping only once):

[root@cn-beijing ~]# ping 10.10.3.4 -c 1
PING 10.10.3.4 (10.10.3.4) 56(84) bytes of data.
64 bytes from 10.10.3.4: icmp_seq=1 ttl=62 time=0.747 ms

--- 10.10.3.4 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.747/0.747/0.747/0.000 ms

3.1. ICMP request flow - send

The packet flow is recorded in the messages file on Node A. The ICMP request is traced as follows:

Jan 19 17:19:47 cn-beijing kernel: TRACE: raw:PREROUTING:policy:2 IN=cni0 OUT= PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: mangle:PREROUTING:policy:1 IN=cni0 OUT= PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:PREROUTING:rule:1 IN=cni0 OUT= PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:KUBE-SERVICES:return:11 IN=cni0 OUT= PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:PREROUTING:policy:3 IN=cni0 OUT= PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: mangle:FORWARD:policy:1 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:1 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:KUBE-FORWARD:return:4 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:2 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:KUBE-SERVICES:return:1 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:3 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:KUBE-EXTERNAL-SERVICES:return:1 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:4 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:DOCKER-USER:return:1 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:5 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:DOCKER-ISOLATION-STAGE-1:return:2 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:10 IN=cni0 OUT=eth0 PHYSIN=vethef2d98bc MAC=2e:5b:e9:3d:88:51:86:41:d0:e3:0c:65:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: mangle:POSTROUTING:policy:1 IN= OUT=eth0 PHYSIN=vethef2d98bc SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:POSTROUTING:rule:1 IN= OUT=eth0 PHYSIN=vethef2d98bc SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:KUBE-POSTROUTING:rule:1 IN= OUT=eth0 PHYSIN=vethef2d98bc SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:POSTROUTING:rule:4 IN= OUT=eth0 PHYSIN=vethef2d98bc SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:POSTROUTING:policy:8 IN= OUT=eth0 PHYSIN=vethef2d98bc SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
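The log is long, so it helps to condense it. The Python sketch below uses a hand-transcribed summary of the trace (the first entry of each built-in chain, with its `OUT=` interface and TTL) and finds the point at which the output interface first appears. The tuple layout and function names are illustrative choices.

```python
# (table, chain, OUT interface, TTL) for the first TRACE line of each built-in
# chain on Node A, transcribed by hand from the log above.
NODE_A_TRACE = [
    ("raw", "PREROUTING", "", 64),
    ("mangle", "PREROUTING", "", 64),
    ("nat", "PREROUTING", "", 64),
    ("mangle", "FORWARD", "eth0", 63),
    ("filter", "FORWARD", "eth0", 63),
    ("mangle", "POSTROUTING", "eth0", 63),
    ("nat", "POSTROUTING", "eth0", 63),
]


def routing_decision_point(trace):
    """Return the pair of hooks between which the output interface first appears."""
    for prev, cur in zip(trace, trace[1:]):
        if prev[2] == "" and cur[2] != "":
            return f"{prev[0]}:{prev[1]}", f"{cur[0]}:{cur[1]}"
    return None


def hook_order(trace):
    """Collapse the trace into the ordered list of distinct built-in chains."""
    order = []
    for _, chain, _, _ in trace:
        if not order or order[-1] != chain:
            order.append(chain)
    return order


print(hook_order(NODE_A_TRACE))
print(routing_decision_point(NODE_A_TRACE))
```

The output interface is unknown throughout PREROUTING and set from FORWARD onwards, and the TTL drops from 64 to 63 at the same point, which is consistent with the packet being routed and forwarded rather than delivered locally.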

From the trace, we can see that the ICMP packet arriving on cni0 passes through the PREROUTING chain and then enters the FORWARD chain, and is forwarded out through the eth0 network card of Node A (OUT=eth0). In the iptables packet flow, the routing decision takes place between PREROUTING and FORWARD. Let's take a look at the host routing table:

[root@cn-beijing ~]# ip route
default via 192.168.0.13 dev eth0
10.10.0.0/24 via 192.168.0.1 dev eth0
10.10.1.0/24 via 192.168.0.2 dev eth0
10.10.2.0/24 dev cni0 proto kernel scope link src 10.10.2.1
10.10.3.0/24 via 192.168.0.4 dev eth0
10.10.4.0/24 via 192.168.0.5 dev eth0
10.10.5.0/24 via 192.168.0.6 dev eth0
169.254.0.0/16 dev eth0 scope link metric 1002
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.0.0/28 dev eth0 proto kernel scope link src 192.168.0.3

The routing table contains the entry:

10.10.3.0/24 via 192.168.0.4 dev eth0

A packet addressed to 10.10.3.4 is sent out through eth0 with 192.168.0.4 as the next hop, which is the IP address of Node B.
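The lookup can be reproduced offline. The Python sketch below parses Node A's routing table as printed above, applies a longest-prefix match for 10.10.3.4, and then checks that the resulting next hop is itself on-link. The parser handles only the fields that occur in this table and is not a complete `ip route` parser.

```python
import ipaddress

# Node A's routing table, copied from the `ip route` output above.
NODE_A_ROUTES = """\
default via 192.168.0.13 dev eth0
10.10.0.0/24 via 192.168.0.1 dev eth0
10.10.1.0/24 via 192.168.0.2 dev eth0
10.10.2.0/24 dev cni0 proto kernel scope link src 10.10.2.1
10.10.3.0/24 via 192.168.0.4 dev eth0
10.10.4.0/24 via 192.168.0.5 dev eth0
10.10.5.0/24 via 192.168.0.6 dev eth0
169.254.0.0/16 dev eth0 scope link metric 1002
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.0.0/28 dev eth0 proto kernel scope link src 192.168.0.3
"""


def parse_routes(text):
    """Turn `ip route` lines into (network, gateway or None, device) tuples."""
    routes = []
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        prefix = "0.0.0.0/0" if words[0] == "default" else words[0]
        gateway = words[words.index("via") + 1] if "via" in words else None
        device = words[words.index("dev") + 1]
        routes.append((ipaddress.ip_network(prefix), gateway, device))
    return routes


def lookup(dst, routes):
    """Longest-prefix match over the parsed routes."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in r[0]]
    return max(matches, key=lambda r: r[0].prefixlen)


routes = parse_routes(NODE_A_ROUTES)
net, gateway, device = lookup("10.10.3.4", routes)
# The gateway must resolve to an on-link route (no further `via`); otherwise
# Node A could not hand the packet straight to Node B.
gw_net, gw_via, gw_dev = lookup(gateway, routes)
print(net, gateway, device)
print(gw_net, gw_via, gw_dev)
```

The second lookup matches the directly connected 192.168.0.0/28 network on eth0, which is the direct reachability between nodes that host-gw mode depends on.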

3.2. ICMP request flow - receive

The packet flow is recorded in the messages file on Node B. The ICMP request is traced as follows:

Jan 19 17:19:47 cn-beijing kernel: TRACE: raw:PREROUTING:policy:2 IN=eth0 OUT= MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: mangle:PREROUTING:policy:1 IN=eth0 OUT= MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:PREROUTING:rule:1 IN=eth0 OUT= MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:KUBE-SERVICES:return:11 IN=eth0 OUT= MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:PREROUTING:policy:3 IN=eth0 OUT= MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: mangle:FORWARD:policy:1 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:1 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:KUBE-FORWARD:return:4 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:2 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:KUBE-SERVICES:return:1 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:3 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:KUBE-EXTERNAL-SERVICES:return:1 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:4 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:DOCKER-USER:return:1 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:5 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:DOCKER-ISOLATION-STAGE-1:return:2 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: filter:FORWARD:rule:10 IN=eth0 OUT=cni0 MAC=00:16:3e:2e:80:3b:ee:ff:ff:ff:ff:ff:08:00 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: mangle:POSTROUTING:policy:1 IN= OUT=cni0 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:POSTROUTING:rule:1 IN= OUT=cni0 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:KUBE-POSTROUTING:rule:1 IN= OUT=cni0 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:POSTROUTING:rule:4 IN= OUT=cni0 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 
Jan 19 17:19:47 cn-beijing kernel: TRACE: nat:POSTROUTING:policy:8 IN= OUT=cni0 SRC=10.10.2.238 DST=10.10.3.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=63186 DF PROTO=ICMP TYPE=8 CODE=0 ID=8704 SEQ=0 

From the trace, we can see that the ICMP packet arriving on eth0 passes through the PREROUTING chain and then enters the FORWARD chain, and is forwarded to the cni0 bridge (OUT=cni0). Again, the routing decision takes place between PREROUTING and FORWARD. Let's take a look at the host routing table of Node B:

[root@cn-beijing ~]# ip route
default via 192.168.0.13 dev eth0
10.10.0.0/24 via 192.168.0.1 dev eth0
10.10.1.0/24 via 192.168.0.2 dev eth0
10.10.2.0/24 via 192.168.0.3 dev eth0
10.10.3.0/24 dev cni0 proto kernel scope link src 10.10.3.1
10.10.4.0/24 via 192.168.0.5 dev eth0
10.10.5.0/24 via 192.168.0.6 dev eth0
169.254.0.0/16 dev eth0 scope link metric 1002
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.0.0/28 dev eth0 proto kernel scope link src 192.168.0.4

The routing table contains the entry:

10.10.3.0/24 dev cni0 proto kernel scope link src 10.10.3.1

A packet addressed to 10.10.3.4 matches this directly connected route and is handed to the cni0 bridge, which delivers it to the destination Pod.
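Putting the two routing tables together, the request path can be walked hop by hop. The Python sketch below is a simplified model: it keeps only the two routes per node that matter here, keys each table by the node's IP, and decrements the TTL once per forwarding node. The function and variable names are illustrative.

```python
import ipaddress

# Only the routes relevant to this walk, copied from the two tables above.
# Keys are node IPs; a gateway of None means the destination is on-link.
# The prefixes do not overlap, so the first match is also the longest match.
TABLES = {
    "192.168.0.3": [  # Node A
        ("10.10.2.0/24", None, "cni0"),
        ("10.10.3.0/24", "192.168.0.4", "eth0"),
    ],
    "192.168.0.4": [  # Node B
        ("10.10.2.0/24", "192.168.0.3", "eth0"),
        ("10.10.3.0/24", None, "cni0"),
    ],
}


def walk(dst, first_node, ttl=64):
    """Follow next hops from node to node until the destination is on-link."""
    addr = ipaddress.ip_address(dst)
    node, hops = first_node, []
    while ttl > 1:
        ttl -= 1  # every forwarding node decrements the TTL once
        for prefix, gateway, dev in TABLES[node]:
            if addr in ipaddress.ip_network(prefix):
                break
        else:
            raise LookupError(f"{node} has no route to {dst}")
        hops.append((node, dev, gateway, ttl))
        if gateway is None:
            return hops  # delivered through the local bridge
        node = gateway
    raise RuntimeError("TTL expired before reaching the destination")


for hop in walk("10.10.3.4", "192.168.0.3"):
    print(hop)
```

The TTL values produced by the walk, 63 after Node A and 62 after Node B, match the `TTL=` fields seen in the FORWARD entries of the two traces.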

4. Summary

The analysis above shows that host-gw mode relies only on routing to connect Pods across nodes and does not use overlay technologies such as VXLAN.
