Virtual networking of Docker containers

Posted by powah on Fri, 03 Dec 2021 14:56:59 +0100

1. Virtualized networks

  • The Linux kernel supports six namespaces. As long as the corresponding user-space tools exist, each namespace can be operated on independently.

    • UTS: hostname and domain name
    • User: user and group IDs
    • Mount: filesystem mount points
    • IPC: inter-process communication
    • PID: process IDs
    • Net: network devices and protocol stack
  • The network namespace is one of the six namespaces a Docker container relies on, and it is essential. Support for it has been in the Linux kernel since version 2.6.

  • The network namespace mainly isolates network devices and the protocol stack.

  • The Linux kernel can emulate both layer-2 and layer-3 devices in software. The host's docker0 is a virtual layer-2 device, a switch implemented in software. Network card devices inside Docker containers come in pairs, like the two ends of a network cable: one end sits inside the container, the other is plugged into the docker0 bridge. The bridge side can be inspected with the brctl tool.

  • As shown in the figure above, suppose our physical machine has four physical network cards and we create four namespaces. Each card can be associated with exactly one namespace: the first card goes to the first namespace, the second to the second, the third to the third, and the fourth to the fourth. A device can belong to only one namespace, so the other namespaces cannot see it.

  • In this way, each namespace can be configured with its own IP address and, because it owns a physical network card, can communicate directly with the external network.

  • But what if we have more namespaces than physical network cards? In that case we can emulate a group of devices purely in software. The Linux kernel supports emulating two kinds of devices: layer-2 devices and layer-3 devices.
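
For example, a minimal sketch with the iproute2 tools (the names ns1, veth0 and veth1 are arbitrary choices for this illustration):

ip netns add ns1                              # create a new network namespace
ip link add veth0 type veth peer name veth1   # a veth pair: two ends of a software "cable"
ip link set veth1 netns ns1                   # move one end into the namespace
ip netns exec ns1 ip link show                # inside ns1, only lo and veth1 are visible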

Layer-2 devices (link layer)

  • At the link layer, devices forward frames. Using the kernel's layer-2 emulation, we create a virtual network interface that comes in pairs, simulating the two ends of a network cable: one end plugs into the host (or a namespace), the other into a switch.

  • The kernel also natively supports layer-2 virtual bridge devices, i.e. switches built in software, managed for example with brctl from the bridge-utils package.

  • With a software switch and software-created namespaces, we can simulate hosts connected to a switch: two namespaces behave like two hosts plugged into the same switch, as the sketch below shows.
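
For instance, a software switch can be created with brctl; sw0 is an assumed name, and veth0 is the host-side veth end from the previous sketch:

brctl addbr sw0          # create a software bridge, i.e. a virtual switch
ip link set sw0 up
brctl addif sw0 veth0    # plug the host-side veth end into the switch
brctl show               # list bridges and their attached interfaces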

Layer-3 devices (software switches)

  • OVS (Open vSwitch) is an open-source virtual switch that can emulate advanced network functions such as VLAN, VxLAN and GRE. It is not a module of the Linux kernel itself, so it must be installed separately. It is developed with the backing of network-equipment vendors such as Cisco, and it is very powerful.

  • SDN (Software-Defined Networking) requires virtualization support at the hardware level and builds complex virtual networks on each host to run multiple virtual machines or containers.

The Linux kernel natively supports the layer-2 virtual bridge device, i.e. the software virtual switch, as shown in the figure below.

  • Now suppose there is a second namespace. It gets its own pair of virtual network cards: one end inside the namespace, the other plugged into the virtual switch. The two namespaces are then effectively connected to the same switch, and if their network cards are configured with addresses in the same subnet, they can obviously communicate with each other, as shown in the figure below:

    When everything from the physical communication devices down to the network cards is implemented purely in software, it is called a virtual network.

2. Communication between containers on a single node

If two namespaces on the same host need to communicate, establish a virtual switch on the host and create a pair of software network cards for each namespace, one half in the container and the other half on the switch. As long as the containers are configured with IP addresses in the same subnet, they can communicate, as the sketch below shows.
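
A hedged end-to-end sketch of this setup (the names and the 10.0.0.0/24 subnet are made up for illustration):

ip netns add c1
ip netns add c2
brctl addbr sw0 && ip link set sw0 up            # the software switch
ip link add c1-eth0 type veth peer name c1-sw    # a "cable" for each container
ip link add c2-eth0 type veth peer name c2-sw
ip link set c1-eth0 netns c1                     # one end inside the container...
ip link set c2-eth0 netns c2
brctl addif sw0 c1-sw && ip link set c1-sw up    # ...the other end on the switch
brctl addif sw0 c2-sw && ip link set c2-sw up
ip netns exec c1 ip addr add 10.0.0.1/24 dev c1-eth0
ip netns exec c2 ip addr add 10.0.0.2/24 dev c2-eth0
ip netns exec c1 ip link set c1-eth0 up
ip netns exec c2 ip link set c2-eth0 up
ip netns exec c1 ping -c 2 10.0.0.2              # same subnet, same switch: they communicate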

This is how two containers on a single node communicate. Even on a single node things can get more complicated; for example, we may expect containers to communicate across switches.

Suppose we build two virtual switches, each connected to different containers, as shown in the figure above. How can C1 and C3 communicate? We can create another veth pair, plug one end into SW1 and the other into SW2, so that the two switches are linked and C1 and C3 can reach each other across them. But there is another problem: what if C1 and C3 are in different networks? If they are not in the same network, routing is required, which means adding a router between the two switches. The Linux kernel itself supports packet forwarding; we only need to enable it. So we can start another container (namespace), enable forwarding inside it, and let it act as the router, as sketched below.
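
A sketch of that router, assuming namespaces c1 (10.0.1.1/24 on bridge sw1) and c3 (10.0.2.1/24 on bridge sw2) already exist; all names and addresses are illustrative:

ip netns add router                                     # the "router container"
ip link add r-leg1 type veth peer name sw1-r            # one leg towards each switch
ip link add r-leg2 type veth peer name sw2-r
ip link set r-leg1 netns router
ip link set r-leg2 netns router
brctl addif sw1 sw1-r && ip link set sw1-r up
brctl addif sw2 sw2-r && ip link set sw2-r up
ip netns exec router ip addr add 10.0.1.254/24 dev r-leg1
ip netns exec router ip addr add 10.0.2.254/24 dev r-leg2
ip netns exec router ip link set r-leg1 up
ip netns exec router ip link set r-leg2 up
ip netns exec router sysctl -w net.ipv4.ip_forward=1    # enable routing in the router namespace
ip netns exec c1 ip route add default via 10.0.1.254    # each container points at the router
ip netns exec c3 ip route add default via 10.0.2.254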

3. Communication between containers on different nodes


As shown in the figure above, how can C1 communicate with C5? If we simply bridge the physical network, broadcast storms arise easily; at the scale of many virtual machines or containers, bridging is therefore self-defeating, so we should not use it for this.

Since we cannot bridge but still need to communicate with the outside, we need one of the following approaches.

  • NAT

    To communicate externally, use NAT technology instead of bridging.

    Suppose C3 and C5 want to communicate. C3 points its gateway at S1, and kernel forwarding is enabled on the physical machine. The packet travels: C3 -> S1 -> (routing table, forwarding) -> external network. But the reply cannot come back, because C3 has a private address. Therefore, before C3's packet leaves the host, its source IP is rewritten to the address of host S1; this is source address translation (SNAT).

    C5 can then reply directly to S1. The NAT table inside S1 knows the packet actually belongs to C3, so it is automatically forwarded to C3.

    This communication requires NAT at both ends, because C5 may itself sit behind NAT; S1 cannot see C5 unless C5 is published via DNAT.

    So, for example, if C5 is published on an address and port of S2, S2 automatically translates requests on that port to C5.

    With both SNAT and DNAT translations in the path, efficiency is not high, and the two communicating sides never see each other's real address. The advantage is that the network is easy to manage. The sketch below shows the two rules.
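
    A hedged sketch of the two iptables rules involved (all addresses and ports are invented for illustration):

    # On S1: SNAT (masquerade) the container subnet's source address on the way out
    iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
    # On S2: publish container C5 (172.17.0.5:80) on the host's port 8080 via DNAT
    iptables -t nat -A PREROUTING -p tcp --dport 8080 -j DNAT --to-destination 172.17.0.5:80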

  • Overlay Network

    This network model neither fully exposes nor fully hides the host. The approach is as follows:

    • On multiple hosts, create a virtual bridge and connect the VMs/containers to it.
    • Build a tunnel between the virtual bridges so that C3 can see C5 directly.
    • The physical machines can communicate directly, and C3 and C5 are in the same address range. C3 first sends the packet to its bridge; the bridge knows C5 is not local, so it sends the packet out through the physical network card. Before the packet C3|C5 enters the tunnel, an outer IP header H1|H2 is wrapped around it.
    • H2 strips the outer header, sees C3|C5, and forwards the packet to C5.

This is two levels of layer-3 encapsulation: one IP packet carries another, which is tunneling technology. In this way C3 and C5 can communicate directly, as the VXLAN sketch below illustrates.
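
As an illustration, such a tunnel can be built with the kernel's VXLAN support; the VNI 42, the device names and the peer address are assumptions, and mirror-image commands would run on H2:

# On H1: create a VXLAN tunnel endpoint towards H2 and attach it to the local bridge
# (docker0 is used here only as an example of a local bridge)
ip link add vxlan0 type vxlan id 42 dev ens33 remote 192.168.25.149 dstport 4789
ip link set vxlan0 up
brctl addif docker0 vxlan0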


An Overlay Network thus tunnels the packet: an extra IP header is added before sending, shown as parts 1.1 and 1.2 in the figure above, where 1.1 is the source and 1.2 is the destination. After receiving the packet, host 2 unwraps it, finds that the target container is C2, and forwards the packet to C2.

4. The Docker container network

After installation, Docker automatically provides three networks, which can be listed with the docker network ls command:

[root@Docker ~]# docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
ee8d79d4ac78   bridge    bridge    local
f07e7613bacb   host      host      local
d951c3cc12d5   none      null      local

Docker uses Linux bridging: a container bridge named docker0 is virtualized on the host. When Docker starts a container, it assigns the container an IP address, the container IP, from the bridge's subnet, and the docker0 bridge acts as every container's default gateway. Because containers on the same host attach to the same bridge, they can communicate directly via their container IPs.
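
To see this on a live container, docker inspect can print the assigned container IP and its gateway; "nginx" here stands for whatever container name you used:

docker inspect -f '{{.NetworkSettings.IPAddress}} via {{.NetworkSettings.Gateway}}' nginx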

5. The four network modes of Docker

Network mode   Configuration                    Explanation
host           --network host                   The container shares the network namespace with the host
container      --network container:NAME_OR_ID   The container shares the network namespace with another container
none           --network none                   The container has an independent network namespace but without any network setup (no veth pair, bridge attachment, IP, etc.)
bridge         --network bridge                 The default mode

When creating a container, the --network option selects the network model; the default is bridge (docker0).

  • Closed container: only the loopback interface, i.e. the none type

  • Bridged container: the bridge network type; the container network attaches to the docker0 bridge

  • Joined container: the container network type; two containers keep part of their namespaces isolated (User, Mount and PID) while sharing the same network interfaces and protocol stack

  • Open container: open network; it directly shares three namespaces of the physical machine (UTS, IPC and Net), communicates through the physical host's network card, and gives the container the privilege to manage the physical host's network. This is the host network type

5.1 host mode

  • Host mode does not create an isolated network environment for the container. It is called host mode because a container in this mode shares its network namespace with the host, so the container can use the host's eth0 to communicate with the outside world just as the host does. In other words, the container's IP address is the IP address of the host's eth0. Its features include:
    • Containers in this mode have no isolated network namespace
    • The container's IP address is the same as the Docker host's
    • Note that service ports inside the container must not conflict with ports already in use on the Docker host
    • Host mode can coexist with the other modes
  • A host-mode container can communicate externally using the host's IP address directly, and service ports inside the container can use host ports without NAT. The biggest advantage of host mode is good network performance, but ports already used on the Docker host cannot be reused, and network isolation is poor.
//Example
[root@Docker ~]# docker images
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
zhaojie10/nginx   v1.20.2   6f5b02e61ad4   18 hours ago   549MB
centos            latest    5d0da3dc9764   2 months ago   231MB
[root@Docker ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:1b:44:be brd ff:ff:ff:ff:ff:ff
    inet 192.168.25.148/24 brd 192.168.25.255 scope global dynamic noprefixroute ens33
       valid_lft 1163sec preferred_lft 1163sec
    inet6 fe80::2e0f:34ad:7328:bbf9/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:30:6d:78:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

[root@Docker ~]# docker run -it --network host --rm --name nginx 6f5b02e61ad4 /bin/sh
sh-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:1b:44:be brd ff:ff:ff:ff:ff:ff
    inet 192.168.25.148/24 brd 192.168.25.255 scope global dynamic noprefixroute ens33
       valid_lft 1796sec preferred_lft 1796sec
    inet6 fe80::2e0f:34ad:7328:bbf9/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:30:6d:78:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

5.2 container mode

  • This mode makes a newly created container share a Network Namespace with an existing container, rather than with the host. The new container does not create its own network card or configure its own IP; it shares the IP and port range of the specified container. Apart from the network, the two containers remain isolated from each other in everything else, such as the file system and the process list. Processes in the two containers can communicate through the lo device.
  • Container mode is a special network mode in Docker: containers in this mode share another container's network environment. There is thus no network isolation between these two containers, while both remain network-isolated from the host and from other containers.

//Example
[root@Docker ~]# docker images
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
zhaojie10/nginx   v1.20.2   6f5b02e61ad4   18 hours ago   549MB
centos            latest    5d0da3dc9764   2 months ago   231MB

[root@Docker ~]# docker run -it --rm --name nginx 6f5b02e61ad4 /bin/sh
sh-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
4: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

[root@Docker ~]# docker ps        #In a second terminal
CONTAINER ID   IMAGE          COMMAND     CREATED          STATUS          PORTS     NAMES
d964b9adf5fe   6f5b02e61ad4   "/bin/sh"   39 seconds ago   Up 37 seconds             nginx

[root@Docker ~]# docker run -it --rm --name centos --network  container:d964b9adf5fe 5d0da3dc9764        #Create a new container whose network uses container mode

[root@d964b9adf5fe /]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
4: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

5.3 none mode

In none mode, the Docker container has its own Network Namespace, but no network configuration is performed in it. In other words, the container has no network card, IP, routes, etc.; any network card and IP configuration must be added manually.

In this mode the container has only the lo loopback device and no other network card. None mode is selected with --network none when the container is created. Such a container cannot reach the network at all; this closed network is a good guarantee of container security.

Application scenario:

  • Start a container to process data, such as converting data formats
  • Some background computing and processing tasks
//Example
[root@Docker ~]# docker images
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
zhaojie10/nginx   v1.20.2   6f5b02e61ad4   19 hours ago   549MB
centos            latest    5d0da3dc9764   2 months ago   231MB

[root@Docker ~]# docker run -it --rm --name nginx --network none 6f5b02e61ad4 /bin/bash
[root@2c3fe7cd5a0b /]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever

5.4 bridge mode

  • Bridge mode is Docker's default network mode: it is used whenever the --network option is omitted. When you run docker run -p, Docker actually installs DNAT rules in iptables to implement port forwarding; they can be inspected with iptables -t nat -vnL.

  • When the Docker daemon starts, it creates a virtual bridge named docker0 on the host, and containers started on this host connect to it. The virtual bridge works much like a physical switch, so all containers on the host are joined into a layer-2 network through it.

  • Docker assigns the container an IP from the docker0 subnet and sets docker0's IP address as the container's default gateway. It creates a veth pair on the host, places one end in the newly created container as eth0 (the container's network card), leaves the other end on the host with a name like vethxxx, and adds that device to the docker0 bridge. This can be viewed with the brctl show command.

The bridge mode is shown in the figure below

The Docker bridge is virtualized by the host and is not a real network device. It is not addressable from the external network, which means external machines cannot reach a container directly via its container IP. For a container to be reachable from outside, map a container port onto the host (port mapping) by passing -p or -P to docker run when creating the container; the service is then accessed at [host IP]:[host port].

//Example
[root@Docker ~]# docker images
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
zhaojie10/nginx   v1.20.2   6f5b02e61ad4   19 hours ago   549MB
centos            latest    5d0da3dc9764   2 months ago   231MB

[root@Docker ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:1b:44:be brd ff:ff:ff:ff:ff:ff
    inet 192.168.25.148/24 brd 192.168.25.255 scope global dynamic noprefixroute ens33
       valid_lft 1309sec preferred_lft 1309sec
    inet6 fe80::2e0f:34ad:7328:bbf9/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:30:6d:78:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:30ff:fe6d:7803/64 scope link 
       valid_lft forever preferred_lft forever


[root@Docker ~]# docker run -it --rm --name nginx  6f5b02e61ad4 /bin/bash
[root@a65e5aa2362c /]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
6: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

[root@Docker ~]# docker run -it --rm --name centos -p 8080:80 5d0da3dc9764
[root@2ddc151d2030 /]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
10: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

[root@Docker ~]# ss -antl
State   Recv-Q  Send-Q     Local Address:Port     Peer Address:Port  Process  
LISTEN  0       128              0.0.0.0:8080          0.0.0.0:*              
LISTEN  0       128              0.0.0.0:22            0.0.0.0:*              
LISTEN  0       128                 [::]:8080             [::]:*              
LISTEN  0       128                 [::]:22               [::]:*      
    
[root@Docker ~]# iptables -t nat -vnL
......
Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0           
    0     0 DNAT       tcp  --  !docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080 to:172.17.0.3:80
    

[root@Docker ~]# docker network inspect bridge        #View the detailed configuration of the bridge network
[
    {
        "Name": "bridge",
        "Id": "ee8d79d4ac782949ebfcd7c19baa710bcfa7587c662a82b97a772463b5ba3590",
        "Created": "2021-12-03T04:03:23.448080844-05:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "2ddc151d20308ee1a6f41ea2811e18708503596924db8c5185bd3f6f020b8afe": {
                "Name": "centos",
                "EndpointID": "0f82b0f327236a5f4a763de8c653a0b6bb971d0b0deb12f67e9b24261d78c07e",
                "MacAddress": "02:42:ac:11:00:03",
                "IPv4Address": "172.17.0.3/16",
                "IPv6Address": ""
            },
            "a65e5aa2362cb6cf2b22d1fd756855832057b27061a0a3f769cfd5903d8dd506": {
                "Name": "nginx",
                "EndpointID": "7899c74a02bfc07866268ea3b4a49d2d7984b3eff1e9f62c1ac1c0303de71c87",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
 
[root@Docker ~]# yum -y install epel-release
[root@Docker ~]# yum install bridge-utils
[root@Docker ~]# brctl show
bridge name	bridge id		STP enabled	interfaces
docker0		8000.0242306d7803	no		veth3c0bcb5
							veth5a42188


Topics: Docker, network, Container