IPVS matching extension of Netfilter

Posted by Design on Thu, 24 Oct 2019 15:29:31 +0200

The iptables command below prints the help text for the ipvs match extension. The fields that can be matched are the protocol, the virtual address, the virtual port, the flow direction, the forwarding method, and the port of the controlling connection. For FTP-like services, the control connection uses port 21 and the active-mode data connection uses port 20.

$ iptables -m ipvs --help    
iptables v1.6.0


IPVS match options:
[!] --ipvs                      packet belongs to an IPVS connection

Any of the following options implies --ipvs (even negated)
[!] --vproto protocol           VIP protocol to match; by number or name, e.g. "tcp"
[!] --vaddr address[/mask]      VIP address to match
[!] --vport port                VIP port to match; by number or name, e.g. "http"
    --vdir {ORIGINAL|REPLY}     flow direction of packet
[!] --vmethod {GATE|IPIP|MASQ}  IPVS forwarding method used
[!] --vportctl port             VIP port of the controlling connection to match, e.g. 21 for FTP

IPVS matching module initialization

The function ipvs_mt_init registers the IPVS match extension definition xt_ipvs_mt_reg. The match name is "ipvs". The structure also names the match function ipvs_mt and the check function ipvs_mt_check.

static struct xt_match xt_ipvs_mt_reg __read_mostly = {
    .name       = "ipvs",
    .revision   = 0,
    .family     = NFPROTO_UNSPEC,
    .match      = ipvs_mt,
    .checkentry = ipvs_mt_check,
    .matchsize  = XT_ALIGN(sizeof(struct xt_ipvs_mtinfo)),
};

static int __init ipvs_mt_init(void)
{
    return xt_register_match(&xt_ipvs_mt_reg);
}

The matchsize field of the match structure gives the length of the data structure xt_ipvs_mtinfo, whose definition contains all the fields that can be matched. Its last member, bitmask, holds one or more of the XT_IPVS_* flags. A set flag indicates that the corresponding field was specified; otherwise that field is not processed. For example, setting XT_IPVS_METHOD in bitmask indicates that fwd_method, the forwarding method field, was specified.

The invert field uses the same flag values as bitmask, but a set bit in invert negates the corresponding comparison. For example, if the XT_IPVS_METHOD bit is set in invert, the rule matches any forwarding method other than the one specified in fwd_method.

enum {
    XT_IPVS_IPVS_PROPERTY = 1 << 0, /* all other options imply this one */
    XT_IPVS_PROTO =     1 << 1,
    XT_IPVS_VADDR =     1 << 2,
    XT_IPVS_VPORT =     1 << 3,
    XT_IPVS_DIR =       1 << 4,
    XT_IPVS_METHOD =    1 << 5,
    XT_IPVS_VPORTCTL =  1 << 6,
};

struct xt_ipvs_mtinfo {
    union nf_inet_addr  vaddr, vmask;
    __be16          vport;
    __u8            l4proto;
    __u8            fwd_method;
    __be16          vportctl;
    
    __u8            invert;
    __u8            bitmask;
};
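
As a rough illustration of how bitmask and invert work together, the sketch below fills in both fields for a rule such as "--vport 80 ! --vmethod MASQ". The flag values are copied from the enum above. The reduced struct ipvs_rule and the helpers rule_set_flag and example_rule are hypothetical names used only for this example; they are not part of the kernel or of iptables.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the enum in the article. */
enum {
    XT_IPVS_IPVS_PROPERTY = 1 << 0,
    XT_IPVS_PROTO         = 1 << 1,
    XT_IPVS_VADDR         = 1 << 2,
    XT_IPVS_VPORT         = 1 << 3,
    XT_IPVS_DIR           = 1 << 4,
    XT_IPVS_METHOD        = 1 << 5,
    XT_IPVS_VPORTCTL      = 1 << 6,
};

/* Hypothetical, reduced stand-in for struct xt_ipvs_mtinfo. */
struct ipvs_rule {
    uint8_t invert;
    uint8_t bitmask;
};

/* Hypothetical helper: record that an option was given, optionally negated. */
static void rule_set_flag(struct ipvs_rule *r, uint8_t flag, bool negated)
{
    r->bitmask |= flag;        /* the field was specified */
    if (negated)
        r->invert |= flag;     /* the comparison is inverted */
}

/* Builds the flags for: --vport 80 ! --vmethod MASQ */
static struct ipvs_rule example_rule(void)
{
    struct ipvs_rule r = { 0, 0 };

    rule_set_flag(&r, XT_IPVS_VPORT, false);
    rule_set_flag(&r, XT_IPVS_METHOD, true);
    return r;
}
```

After example_rule() returns, bitmask has the XT_IPVS_VPORT and XT_IPVS_METHOD bits set, while invert has only XT_IPVS_METHOD set, so only the forwarding method comparison is negated.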

Match check

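According to the registered function ipvs_mt_check, the IPVS match module supports only the IPv4 and IPv6 protocol families, and IPv6 only when CONFIG_IP_VS_IPV6 is enabled. This function runs when a rule is installed, before the match function ipvs_mt is called for any packet.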

static int ipvs_mt_check(const struct xt_mtchk_param *par)
{
    if (par->family != NFPROTO_IPV4
#ifdef CONFIG_IP_VS_IPV6
        && par->family != NFPROTO_IPV6
#endif
        ) {
        pr_info("protocol family %u not supported\n", par->family);
        return -EINVAL;
    }
    return 0;
}

Execution matching

XT_IPVS_IPVS_PROPERTY is the basic property, and every other option implies it. If bitmask is exactly equal to XT_IPVS_IPVS_PROPERTY, only this flag is set, which corresponds to the iptables command-line option (--ipvs). The result then depends on invert. If the packet has been processed by IPVS, the ipvs_property member of the skb is 1, and the rule matches when the invert bit is 0. With the invert bit set, the rule matches only packets that IPVS has not processed.

If flags other than XT_IPVS_IPVS_PROPERTY are set but the packet has not been processed by IPVS (ipvs_property is 0), the match fails and no further processing is needed.

static bool ipvs_mt(const struct sk_buff *skb, struct xt_action_param *par)
{
    const struct xt_ipvs_mtinfo *data = par->matchinfo;
    struct netns_ipvs *ipvs = net_ipvs(xt_net(par));    
    /* ipvs_mt_check ensures that family is only NFPROTO_IPV[46]. */
    const u_int8_t family = xt_family(par);
    struct ip_vs_iphdr iph;
    struct ip_vs_protocol *pp;
    struct ip_vs_conn *cp;
    bool match = true;

    if (data->bitmask == XT_IPVS_IPVS_PROPERTY) {
        match = skb->ipvs_property ^ !!(data->invert & XT_IPVS_IPVS_PROPERTY);
        goto out;
    }

    /* other flags than XT_IPVS_IPVS_PROPERTY are set */
    if (!skb->ipvs_property) {
        match = false;
        goto out;
    }
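
The XOR in the property-only branch is easy to misread, so here is a small user-space sketch of the same expression. The helper name property_only_match is hypothetical; only the expression inside it is taken from the code above.

```c
#include <stdbool.h>

#define XT_IPVS_IPVS_PROPERTY (1 << 0)

/* Hypothetical helper mirroring the expression in ipvs_mt:
 *   match = skb->ipvs_property ^ !!(data->invert & XT_IPVS_IPVS_PROPERTY);
 * ipvs_property is 1 when IPVS has handled the packet, 0 otherwise. */
static bool property_only_match(unsigned int ipvs_property, unsigned int invert)
{
    return ipvs_property ^ !!(invert & XT_IPVS_IPVS_PROPERTY);
}
```

With the invert bit clear, the helper returns true only when ipvs_property is 1 (plain "--ipvs"). With the invert bit set, it returns true only when ipvs_property is 0 ("! --ipvs").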

The function ip_vs_fill_iph_skb extracts the IP header information from the packet and stores it in the variable iph, of type struct ip_vs_iphdr. The match flag XT_IPVS_PROTO corresponds to the iptables command option (--vproto). Note that the comparison here is written differently from the XT_IPVS_IPVS_PROPERTY test above: the condition describes failure rather than success. If the protocol is equal and the invert bit is set, the match fails; likewise, if the protocol differs and the invert bit is clear, the match fails. Otherwise processing continues, and match keeps its initial value of true.

Then, based on the protocol value, the IPVS protocol handler structure (pp) is looked up, and through it the IPVS connection structure (cp) that the packet belongs to.

    ip_vs_fill_iph_skb(family, skb, true, &iph);

    if (data->bitmask & XT_IPVS_PROTO)
        if ((iph.protocol == data->l4proto) ^ !(data->invert & XT_IPVS_PROTO)) {
            match = false;
            goto out;
        }
    pp = ip_vs_proto_get(iph.protocol);
    if (unlikely(!pp)) {
        match = false;
        goto out;
    }
    /* Check if the packet belongs to an existing entry */
    cp = pp->conn_out_get(ipvs, family, skb, &iph);
    if (unlikely(cp == NULL)) {
        match = false;
        goto out;
    }
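
To make the failure condition concrete, the sketch below isolates the protocol test in a user-space helper. The name proto_check_fails is hypothetical; the expression is the one used for XT_IPVS_PROTO above, and the same shape is used for the other fields.

```c
#include <stdbool.h>
#include <stdint.h>

#define XT_IPVS_PROTO (1 << 1)

/* Hypothetical helper: returns true when the rule must be rejected.
 * Mirrors: (iph.protocol == data->l4proto) ^ !(data->invert & XT_IPVS_PROTO)
 * Note the single '!': the right-hand side is 1 when the field is NOT inverted. */
static bool proto_check_fails(uint8_t pkt_proto, uint8_t rule_proto, uint8_t invert)
{
    return (pkt_proto == rule_proto) ^ !(invert & XT_IPVS_PROTO);
}
```

For a rule "--vproto tcp" (protocol 6, invert clear), a TCP packet does not fail and a UDP packet (protocol 17) does. For "! --vproto tcp" the outcome is reversed.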

The following code compares the rule against the information in the IPVS connection structure cp, using the same logic as the XT_IPVS_PROTO test above. The flags handled are XT_IPVS_VPORT, XT_IPVS_VPORTCTL, XT_IPVS_DIR, XT_IPVS_METHOD and XT_IPVS_VADDR.

For the XT_IPVS_VPORTCTL flag, the code follows cp->control to the IPVS controlling connection and compares that connection's virtual service port, because that is the control port. A connection without a controlling connection never equals the requested port.

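For the direction flag XT_IPVS_DIR, the code uses ctinfo, the connection state reported by the conntrack system. This test uses !! instead of !, so with the XT_IPVS_DIR bit clear in invert only packets in the original direction pass, and with the bit set only reply packets pass. A packet with no conntrack entry never matches.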

    if (data->bitmask & XT_IPVS_VPORT)
        if ((cp->vport == data->vport) ^ !(data->invert & XT_IPVS_VPORT)) {
            match = false; goto out_put_cp;
        }
    if (data->bitmask & XT_IPVS_VPORTCTL)
        if ((cp->control != NULL && cp->control->vport == data->vportctl) ^ !(data->invert & XT_IPVS_VPORTCTL)) {
            match = false; goto out_put_cp;
        }
    if (data->bitmask & XT_IPVS_DIR) {
        enum ip_conntrack_info ctinfo;
        struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
        if (ct == NULL) {
            match = false; goto out_put_cp;
        }
        if ((ctinfo >= IP_CT_IS_REPLY) ^ !!(data->invert & XT_IPVS_DIR)) {
            match = false; goto out_put_cp;
        }
    }
    if (data->bitmask & XT_IPVS_METHOD)
        if (((cp->flags & IP_VS_CONN_F_FWD_MASK) == data->fwd_method) ^ !(data->invert & XT_IPVS_METHOD)) {
            match = false; goto out_put_cp;
        }
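    if (data->bitmask & XT_IPVS_VADDR) {
        if (ipvs_mt_addrcmp(&cp->vaddr, &data->vaddr, &data->vmask, family) ^ !(data->invert & XT_IPVS_VADDR)) {
            match = false; goto out_put_cp;
        }
    }

out_put_cp:
    __ip_vs_conn_put(cp);   /* drop the reference taken by conn_out_get */
out:
    pr_debug("match=%d\n", match);
    return match;
}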
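
The helper ipvs_mt_addrcmp is called above but its body is not shown here. Assuming it performs an ordinary masked comparison for IPv4, the idea can be sketched in user space as follows. Both function names are hypothetical, and the addresses are plain 32-bit integers in host byte order rather than union nf_inet_addr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when addr and rule_addr agree on every bit
 * selected by mask. All three values must use the same byte order. */
static bool ipv4_masked_equal(uint32_t addr, uint32_t rule_addr, uint32_t mask)
{
    return ((addr ^ rule_addr) & mask) == 0;
}

/* Hypothetical helper: build a host-order mask from a prefix length (0..32). */
static uint32_t ipv4_prefix_mask(unsigned int prefix_len)
{
    if (prefix_len == 0)
        return 0;   /* avoid shifting by 32, which is undefined in C */
    return (uint32_t)(0xFFFFFFFFu << (32 - prefix_len));
}
```

With a /32 mask, as in "--vaddr 192.168.100.30/32", only the exact address compares equal. With a shorter prefix such as /24, any address in the same subnet does.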

IPVS matching application

On the kernel mailing list, the author of the match gave an example of full NAT (SNAT+DNAT) forwarding built with the ipvs match. The NAT/Masq forwarding method of IPVS implements the DNAT part; the iptables command below adds the SNAT part.

# ipvsadm -A -t 192.168.100.30:80 -s rr 
# ipvsadm -a -t 192.168.100.30:80 -r 192.168.10.20:80 -m 
# ... 

# Source NAT for VIP 192.168.100.30:80 
# iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 --vport 80 -j SNAT --to-source 192.168.10.10 

or SNAT-ing only a specific real server: 

# iptables -t nat -A POSTROUTING --dst 192.168.10.20 -m ipvs --vaddr 192.168.100.30/32 -j SNAT --to-source 192.168.10.10 

The first command applies SNAT to traffic destined for the virtual service 192.168.100.30:80, rewriting the source address to 192.168.10.10. The second applies SNAT only to the traffic scheduled to the real server 192.168.10.20.

There is one complication. If the virtual service is FTP on port 21 and SNAT is applied to it, the FTP data channel does not use port 21, so a rule with --vport 21 does not cover it. A second rule, using the command option (--vportctl), is needed to apply SNAT to the data connection.

# SNAT FTP control connection 
# iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 --vport 21 -j SNAT --to-source 192.168.10.10 

# SNAT FTP passive data connection 
# iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 --vportctl 21 -j SNAT --to-source 192.168.10.10 
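
Why two rules are needed can be sketched with a reduced model of the connection structure. struct conn and both helpers are hypothetical stand-ins for illustration; the comparisons follow the XT_IPVS_VPORT and XT_IPVS_VPORTCTL tests in ipvs_mt, without the invert handling. Ports are kept in host byte order here for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the kernel's struct ip_vs_conn. */
struct conn {
    uint16_t vport;              /* virtual service port of this connection */
    const struct conn *control;  /* controlling connection, or NULL */
};

/* Rule "--vport port": compares the connection's own virtual port. */
static bool vport_matches(const struct conn *cp, uint16_t port)
{
    return cp->vport == port;
}

/* Rule "--vportctl port": compares the controlling connection's virtual port. */
static bool vportctl_matches(const struct conn *cp, uint16_t port)
{
    return cp->control != NULL && cp->control->vport == port;
}
```

An FTP control connection has vport 21 and no controlling connection, so only the --vport 21 rule matches it. A data connection on some other port whose control pointer refers to that control connection is matched only by the --vportctl 21 rule.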

With the SNAT configuration above, the master and backup nodes behave differently when the IPVS connection synchronization function is enabled.

Kernel version 5.0
