Introduction, compilation, installation and principle of Open vSwitch (OVS)

Posted by _giles_ on Wed, 09 Feb 2022 04:27:47 +0100

Contents

OVS installation

OVS installation

CentOS

OVS common command reference

Flow table management

Flow rule composition

Packet out

OVS compilation

Direct source code compilation and installation

Update kernel module

Compile RPM package

Enable DPDK

Compiling kernel modules

Compile deb package

OVS principle

OVS architecture

Main module responsibilities

Main data structure

Main process

Add Bridge

Flow table matching

Packet receiving processing

upcall message processing

OVS installation

OVS installation

CentOS

yum install centos-release-openstack-newton
yum install openvswitch
systemctl enable openvswitch
systemctl start openvswitch

To install the master version, you can use the COPR build at https://copr.fedorainfracloud.org/coprs/leifmadsen/ovs-master/:

wget -O /etc/yum.repos.d/ovs-master.repo https://copr.fedorainfracloud.org/coprs/leifmadsen/ovs-master/repo/epel-7/leifmadsen-ovs-master-epel-7.repo
yum install openvswitch openvswitch-ovn-*

OVS common command reference

How to add bridge and port

ovs-vsctl add-br br0
ovs-vsctl del-br br0
ovs-vsctl list-br
ovs-vsctl add-port br0 eth0
ovs-vsctl set port eth0 tag=1 #vlan id
ovs-vsctl del-port br0 eth0
ovs-vsctl list-ports br0
ovs-vsctl show

Configure an IP address for an OVS port

ovs-vsctl add-port br-ex port tag=10 -- set Interface port type=internal # default is access
ifconfig port 192.168.100.1

How to configure traffic mirroring

ovs-vsctl -- set Bridge br-int mirrors=@m -- --id=@tap6a094914-cd get Port tap6a094914-cd -- --id=@tap73e945b4-79 get Port tap73e945b4-79 -- --id=@tapa6cd1168-a2 get Port tapa6cd1168-a2 -- --id=@m create Mirror name=mymirror select-dst-port=@tap6a094914-cd,@tap73e945b4-79 select-src-port=@tap6a094914-cd,@tap73e945b4-79 output-port=@tapa6cd1168-a2

# clear
ovs-vsctl remove Bridge br0 mirrors mymirror
ovs-vsctl clear Bridge br-int mirrors

Capturing packets on the OVS port patch-tun by using the mirror feature

ip link add name snooper0 type dummy
ip link set dev snooper0 up
ovs-vsctl add-port br-int snooper0
ovs-vsctl -- set Bridge br-int mirrors=@m  \
-- --id=@snooper0 get Port snooper0  \
-- --id=@patch-tun get Port patch-tun  \
-- --id=@m create Mirror name=mymirror select-dst-port=@patch-tun \
select-src-port=@patch-tun output-port=@snooper0

# capture
tcpdump -i snooper0

# clear
ovs-vsctl clear Bridge br-int mirrors
ip link delete dev snooper0

How to configure QoS, such as queues and rate limits

# egress
$ ovs-vsctl -- \
add-br br0 -- \
add-port br0 eth0 -- \
add-port br0 vif1.0 -- set interface vif1.0 ofport_request=5 -- \
add-port br0 vif2.0 -- set interface vif2.0 ofport_request=6 -- \
set port eth0 qos=@newqos -- \
--id=@newqos create qos type=linux-htb \
  other-config:max-rate=1000000000 \
  queues:123=@vif10queue \
  queues:234=@vif20queue -- \
--id=@vif10queue create queue other-config:max-rate=10000000 -- \
--id=@vif20queue create queue other-config:max-rate=20000000
$ ovs-ofctl add-flow br0 in_port=5,actions=set_queue:123,normal
$ ovs-ofctl add-flow br0 in_port=6,actions=set_queue:234,normal

# ingress
ovs-vsctl set interface vif1.0 ingress_policing_rate=10000
ovs-vsctl set interface vif1.0 ingress_policing_burst=8000

# clear
ovs-vsctl clear Port vif1.0 qos
ovs-vsctl list qos
ovs-vsctl destroy qos _uuid
ovs-vsctl list qos
ovs-vsctl destroy queue _uuid

How to configure sFlow traffic monitoring

ovs-vsctl -- --id=@s create sFlow agent=vif1.0 target=\"10.0.0.1:6343\" header=128 sampling=64 polling=10  -- set Bridge br-int sflow=@s
ovs-vsctl -- clear Bridge br-int sflow

How to configure flow rules

ovs-ofctl add-flow br-int idle_timeout=0,in_port=2,dl_type=0x0800,dl_src=00:88:77:66:55:44,dl_dst=11:22:33:44:55:66,nw_src=1.2.3.4,nw_dst=5.6.7.8,nw_proto=1,tp_src=1,tp_dst=2,actions=drop
ovs-ofctl del-flows br-int in_port=2   # delete all flow rules with in_port=2
ovs-ofctl dump-ports br-int
ovs-ofctl dump-flows br-int
ovs-ofctl show br-int   # view port numbers
  • Other supported match fields include nw_tos, nw_ecn, nw_ttl, dl_vlan, dl_vlan_pcp, ip_frag, arp_sha, arp_tha, ipv6_src, ipv6_dst, and so on.
  • Other supported flow actions include output:port, mod_dl_src/mod_dl_dst, set_field, and so on.

How to view OVS configuration

ovs-vsctl list/set/get/add/remove/clear/destroy table record column [VALUE]

Supported TABLE names include bridge, controller, interface, mirror, netflow, open_vswitch, port, qos, queue, ssl, and sflow.

Configure VXLAN/GRE

ovs-vsctl add-port br-ex port -- set interface port type=vxlan options:remote_ip=192.168.100.3
ovs-vsctl add-port br-ex port -- set Interface port type=gre options:remote_ip=192.168.100.3
ovs-vsctl set interface vxlan type=vxlan options:remote_ip=140.113.215.200 options:key=flow ofport_request=9

Display learned MAC addresses

ovs-appctl fdb/show br-ex

Set controller address

ovs-vsctl set-controller br-ex tcp:192.168.100.1:6633
ovs-vsctl get-controller br0

Flow table management

Flow rule composition

Each flow rule consists of a series of fields, which fall into three groups: basic fields, condition fields, and action fields:

  • The basic fields include how long the rule has been in effect (duration_sec), the table it belongs to (table_id), its priority, the number of packets it has processed (n_packets), and the idle timeout (idle_timeout), given in seconds. A rule that stays idle longer than its timeout is deleted automatically. An idle timeout of 0 means the rule never expires, and in that case idle_timeout does not appear in the output of ovs-ofctl dump-flows brname.
  • The condition fields include the input port number (in_port), the source and destination MAC addresses (dl_src/dl_dst), the source and destination IP addresses (nw_src/nw_dst), the packet type (dl_type), the network-layer protocol type (nw_proto), and so on. They can be combined freely, with one restriction: a higher-layer field may not be given a specific value while the lower-layer field it depends on is left as a wildcard. In other words, a rule may pin a lower-layer field and leave the higher-layer fields wildcarded (an unspecified field matches any value), but not the reverse. Violating this causes all flow rules in ovs-vswitchd to be lost and the network to lose connectivity.
  • The action fields include normal forwarding (normal), output to a switch port (output:port), dropping the packet (drop), changing the source or destination MAC address (mod_dl_src/mod_dl_dst), and so on. A rule can carry several actions, which are executed in the order given.
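
The layering restriction on condition fields can be checked mechanically before a rule is sent to the switch. The following is a minimal Python sketch and not part of OVS: the PREREQS table and the missing_prerequisites() helper are hypothetical names, and the table only covers the fields mentioned in this section.

```python
# Hypothetical pre-flight check, not an OVS API.
# Each higher-layer field names the lower-layer field that must be pinned first.
PREREQS = {
    "nw_src": "dl_type",
    "nw_dst": "dl_type",
    "nw_proto": "dl_type",
    "tp_src": "nw_proto",
    "tp_dst": "nw_proto",
}

WILDCARDS = {"*", "ANY"}


def parse_match(spec):
    """Split 'a=1,b=2' into a dict, dropping wildcarded fields."""
    fields = {}
    for item in spec.split(","):
        name, _, value = item.partition("=")
        if value.strip() not in WILDCARDS:
            fields[name.strip()] = value.strip()
    return fields


def missing_prerequisites(spec):
    """Return sorted (field, required_lower_field) pairs that break the rule."""
    fields = parse_match(spec)
    return sorted(
        (name, PREREQS[name])
        for name in fields
        if name in PREREQS and PREREQS[name] not in fields
    )


ok = missing_prerequisites("in_port=2,dl_type=0x0800,nw_src=1.2.3.4")
bad = missing_prerequisites("in_port=2,nw_src=1.2.3.4")
print(ok)   # []
print(bad)  # [('nw_src', 'dl_type')]
```

A rule whose result list is empty respects the restriction; any pair in the list names a higher-layer field and the lower-layer field that still needs a concrete value.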

Flow rules can contain wildcards and abbreviations. Any field can be set to * or ANY; for example, to drop all received packets:

ovs-ofctl add-flow xenbr0 dl_type=*,nw_src=ANY,actions=drop

An abbreviation shortens a group of fields to a protocol name. The supported abbreviations include ip, arp, icmp, tcp, and udp, along with their IPv6 counterparts. They correspond to the flow rule condition fields as follows:

dl_type=0x0800 <=>ip
dl_type=0x0806 <=>arp
dl_type=0x0800,nw_proto=1 <=> icmp
dl_type=0x0800,nw_proto=6 <=> tcp
dl_type=0x0800,nw_proto=17 <=> udp
dl_type=0x86dd <=> ipv6
dl_type=0x86dd,nw_proto=6 <=> tcp6
dl_type=0x86dd,nw_proto=17 <=> udp6
dl_type=0x86dd,nw_proto=58 <=> icmp6
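
To make the mapping concrete, here is a small Python sketch that expands the shorthand names inside a match string into explicit fields. The expand_match() helper is a hypothetical name used only for illustration; ovs-ofctl performs this expansion itself, so the code is only meant to show what each abbreviation stands for.

```python
# Shorthand protocol names and the match fields they stand for,
# transcribed from the table above.
ABBREVIATIONS = {
    "ip": {"dl_type": "0x0800"},
    "arp": {"dl_type": "0x0806"},
    "icmp": {"dl_type": "0x0800", "nw_proto": "1"},
    "tcp": {"dl_type": "0x0800", "nw_proto": "6"},
    "udp": {"dl_type": "0x0800", "nw_proto": "17"},
    "ipv6": {"dl_type": "0x86dd"},
    "tcp6": {"dl_type": "0x86dd", "nw_proto": "6"},
    "udp6": {"dl_type": "0x86dd", "nw_proto": "17"},
    "icmp6": {"dl_type": "0x86dd", "nw_proto": "58"},
}


def expand_match(spec):
    """Expand shorthand names in a comma-separated match into explicit fields."""
    parts = []
    for item in spec.split(","):
        item = item.strip()
        if item in ABBREVIATIONS:
            parts.extend(f"{k}={v}" for k, v in ABBREVIATIONS[item].items())
        else:
            parts.append(item)
    return ",".join(parts)


print(expand_match("in_port=3,tcp,nw_src=10.0.0.2"))
# in_port=3,dl_type=0x0800,nw_proto=6,nw_src=10.0.0.2
```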

Block an IP address

ovs-ofctl add-flow xenbr0 idle_timeout=0,dl_type=0x0800,nw_src=119.75.213.50,actions=drop

Packet redirection

ovs-ofctl add-flow xenbr0 idle_timeout=0,dl_type=0x0800,nw_proto=1,actions=output:4

Remove VLAN tag

ovs-ofctl add-flow xenbr0 idle_timeout=0,in_port=3,actions=strip_vlan,normal

Forwarding after changing the source IP address of the packet

ovs-ofctl add-flow xenbr0 idle_timeout=0,in_port=3,actions=mod_nw_src:211.68.52.32,normal

Inject a packet

# Format: ovs-ofctl packet-out switch in_port actions packet
# Where, packet is a hex format packet
ovs-ofctl packet-out br2 none output:2 040815162342FFFFFFFFFFFF07C30000

Common fields of flow table

  • in_port=port: the OpenFlow port number of the port on which the packet arrived
  • dl_vlan=vlan: the VLAN tag of the packet, in the range 0-4095; 0xffff matches packets without a VLAN tag
  • dl_src=<MAC> and dl_dst=<MAC>: match the source or destination MAC address. 01:00:00:00:00:00/01:00:00:00:00:00 matches multicast addresses (including broadcast), and 00:00:00:00:00:00/01:00:00:00:00:00 matches unicast addresses
  • dl_type=ethertype: matches the Ethernet protocol type, where dl_type=0x0800 is IPv4, dl_type=0x86dd is IPv6, and dl_type=0x0806 is ARP
  • nw_src=ip[/netmask] and nw_dst=ip[/netmask]: when dl_type=0x0800, match the source or destination IPv4 address, which can be given as an IP address or a domain name
  • nw_proto=proto: used together with dl_type. When dl_type=0x0800, matches the IP protocol number; when dl_type=0x86dd, matches the IPv6 protocol number
  • table=number: the number of the flow table to use, in the range 0-254; defaults to 0 if not specified. Flow table numbers let you create or modify flows in multiple tables
  • reg<idx>=value[/mask]: the value of a register in the switch. All registers are cleared when a packet enters the switch, and actions can modify register values
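
The value/mask form used by dl_src and dl_dst compares only the bits that are set in the mask. The Python sketch below illustrates this for the two masks listed above; mac_to_int() and mac_matches() are hypothetical helper names. The mask 01:00:00:00:00:00 selects the lowest bit of the first octet, which is set for group addresses (multicast, including broadcast) and clear for unicast addresses.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def mac_matches(packet_mac, spec):
    """Check a MAC address against a 'value/mask' (or plain 'value') spec."""
    value, _, mask = spec.partition("/")
    # Without a mask, all 48 bits must be equal.
    mask_bits = mac_to_int(mask) if mask else (1 << 48) - 1
    return (mac_to_int(packet_mac) & mask_bits) == (mac_to_int(value) & mask_bits)


GROUP = "01:00:00:00:00:00/01:00:00:00:00:00"
UNICAST = "00:00:00:00:00:00/01:00:00:00:00:00"

print(mac_matches("ff:ff:ff:ff:ff:ff", GROUP))    # True
print(mac_matches("02:ac:10:ff:00:22", UNICAST))  # True
print(mac_matches("02:ac:10:ff:00:22", GROUP))    # False
```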

Common actions

  • output:port: outputs the packet to the specified port, where port is the OpenFlow port number
  • mod_vlan_vid: modifies the VLAN tag in the packet
  • strip_vlan: removes the VLAN tag from the packet
  • mod_dl_src/mod_dl_dst: modifies the source or destination MAC address
  • mod_nw_src/mod_nw_dst: modifies the source or destination IPv4 address
  • resubmit:port: replaces the in_port field of the flow with port and matches again
  • load:value->dst[start..end]: writes value into bits start through end of the specified field
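
The bit-range form of the load action can be pictured as a masked write into an integer. The sketch below models it in Python under two assumptions: the range is inclusive at both ends, and bit 0 is the least significant bit. The load() function here is a hypothetical stand-in, not an OVS API.

```python
def load(value, dst, start, end):
    """Return dst with bits start..end (inclusive, bit 0 = LSB) set to value."""
    width = end - start + 1
    if value >> width:
        raise ValueError("value does not fit in the target bit range")
    mask = ((1 << width) - 1) << start
    # Clear the target bits, then write the new value into them.
    return (dst & ~mask) | (value << start)


reg0 = 0xFFFF0000
reg0 = load(0x5, reg0, 0, 3)   # write 5 into the low 4 bits
print(hex(reg0))  # 0xffff0005
```

Bits outside the range are left untouched, which is why several independent values can share one register.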

Tracing packet processing

ovs-appctl ofproto/trace br0 in_port=3,tcp,nw_src=10.0.0.2,tcp_dst=22

ovs-appctl ofproto/trace br-int in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet out

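# Python 3; requires scapy
import binascii
from scapy.all import Ether, IP, ICMP
a = Ether(dst="02:ac:10:ff:00:22", src="02:ac:10:ff:00:11") / IP(dst="172.16.255.22", src="172.16.255.11", ttl=10) / ICMP()
# Print the raw frame as hex for use with ovs-ofctl packet-out
print(binascii.hexlify(bytes(a)).decode())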

ovs-ofctl packet-out br-int 5 "normal" 02AC10FF002202AC10FF001108004500001C000100000A015A9DAC10FF0BAC10FF160800F7FF00000000
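
If scapy is not available, the same frame can be assembled with the Python standard library. The following sketch is an illustration rather than a replacement for scapy: build_icmp_echo() and the other helpers are hypothetical names, and the IP identification value of 1 and zeroed flags are chosen to mirror the hex string used in the command above.

```python
import binascii
import socket
import struct


def checksum(data):
    """Internet checksum (RFC 1071) over an even-length byte string."""
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def mac(addr):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return binascii.unhexlify(addr.replace(":", ""))


def build_icmp_echo(dst_mac, src_mac, src_ip, dst_ip, ttl):
    # ICMP echo request: type 8, code 0, checksum, id 0, sequence 0
    icmp = struct.pack("!BBHHH", 8, 0, 0, 0, 0)
    icmp = struct.pack("!BBHHH", 8, 0, checksum(icmp), 0, 0)

    def ip_header(csum):
        # version/IHL, TOS, total length, id, flags/fragment, TTL, protocol,
        # header checksum, source address, destination address
        return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(icmp), 1, 0,
                           ttl, 1, csum,
                           socket.inet_aton(src_ip), socket.inet_aton(dst_ip))

    ip = ip_header(checksum(ip_header(0)))
    ether = mac(dst_mac) + mac(src_mac) + struct.pack("!H", 0x0800)
    return ether + ip + icmp


frame = build_icmp_echo("02:ac:10:ff:00:22", "02:ac:10:ff:00:11",
                        "172.16.255.11", "172.16.255.22", ttl=10)
print(binascii.hexlify(frame).decode().upper())
```

The printed string can be passed as the packet argument of ovs-ofctl packet-out.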

OVS compilation

Direct source code compilation and installation

export OVS_VERSION="2.6.1"
export OVS_DIR="/usr/src/ovs"
export OVS_INSTALL_DIR="/usr"
curl -sSL http://openvswitch.org/releases/openvswitch-${OVS_VERSION}.tar.gz | tar -xz && mv openvswitch-${OVS_VERSION} ${OVS_DIR}

cd ${OVS_DIR}
./boot.sh
# To enable DPDK, also add --with-dpdk=/usr/local/share/dpdk/x86_64-native-linuxapp-gcc
./configure --prefix=${OVS_INSTALL_DIR} --localstatedir=/var --enable-ssl --with-linux=/lib/modules/$(uname -r)/build
make -j `nproc`
make install
make modules_install

Update kernel module

cat > /etc/depmod.d/openvswitch.conf << EOF
override openvswitch * extra
override vport-* * extra
EOF

depmod -a
cp debian/openvswitch-switch.init /etc/init.d/openvswitch-switch
/etc/init.d/openvswitch-switch force-reload-kmod

Compile RPM package

make rpm-fedora RPMBUILD_OPT="--without check"

Enable DPDK

make rpm-fedora RPMBUILD_OPT="--with dpdk --without check"

Compiling kernel modules

make rpm-fedora-kmod

Compile deb package

apt-get install build-essential fakeroot
dpkg-checkbuilddeps
# If the tree has already been built, clean it first
# fakeroot debian/rules clean
DEB_BUILD_OPTIONS='parallel=8 nocheck' fakeroot debian/rules binary

OVS principle

OVS architecture

(Image source: FOSDEM 2015, "OVS Stateful Services")

The OVS architecture is shown in the figure above. It consists mainly of the kernel datapath plus vswitchd and ovsdb in user space.

(Image source: "OpenvSwitch Deep Dive")

Main module responsibilities

  • datapath is the kernel module responsible for data exchange. It reads packets from the network port and quickly matches them against the entries in the flow table. On a match it forwards the packet directly; on a miss it hands the packet to vswitchd for processing. During initialization and port binding it registers a hook function so that packet processing for the port is taken over by the kernel module.
  • vswitchd is a daemon that provides the management and control service of OVS. It saves configuration to ovsdb through a Unix socket and interacts with the kernel module through netlink.
  • ovsdb is the OVS database, which stores the OVS configuration.

Main data structure

(picture from CSDN)

 

Main process

Note: parts of this section are reprinted from "OVS source code analysis and sorting".

Add Bridge

  1. Run the command ovs-vsctl add-br testbr
  2. openvswitch.ko receives the add-bridge request as the OVS_DP_CMD_NEW command on the OVS_DATAPATH_FAMILY channel. The callback bound to this command is ovs_dp_cmd_new
  3. Besides initializing the dp structure, ovs_dp_cmd_new() calls new_vport() to create a new vport
  4. new_vport() calls ovs_vport_add() to try to create the new vport
  5. ovs_vport_add() checks the vport type (through the vport_ops_list[] array) and calls the corresponding create() function to build the vport structure
  6. When dp is a network device (vport_netdev.c), ovs_vport_add() ends up calling netdev_create() [in ovs_netdev_ops of vport_ops_list]
  7. The key step in netdev_create() is registering the callback that runs when a network packet is received
  8. err=netdev_rx_handler_register(netdev_vport->dev,netdev_frame_hook,vport);
  9. After this registration, packets received on netdev_vport->dev are handled by netdev_frame_hook(). The hook only does auxiliary processing and calls each handler in turn: netdev_port_receive() [which copies the packet to avoid damaging it], then ovs_vport_receive() back in vport.c, then ovs_dp_process_received_packet() back in datapath.c for unified processing
  10. Call chain: netdev_frame_hook() -> netdev_port_receive() -> ovs_vport_receive() -> ovs_dp_process_received_packet()
  11. netdev_port_receive() first checks whether the skb is shared. If so, it obtains a copy of the packet
  12. netdev_port_receive() then calls ovs_vport_receive(); the packet's checksum is checked and the packet is handed to the generic vport layer for processing
(picture from Jianshu)

Flow table matching

  1. flow_lookup() finds the matching flow table entry
  2. A for loop uses rcu_dereference_ovs to traverse the mask_list member of the flow table structure, looking for the corresponding mask
  3. flow=masked_flow_lookup() searches the next-level hmap until a match is found
  4. It enters ovs_flow_mask_key(&masked_key, unmasked, mask), which ANDs the originally extracted key with the key of the mask and stores the result in masked_key, which is then used to compute the hash value
  5. hash=flow_hash(&masked_key, key_start, key_end); only part of the key takes part in matching
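
The lookup described in this section can be modeled in a few lines of Python: for each mask, AND the packet key with the mask and look the result up in a hash table. This is a toy model written for illustration, not the kernel code. The FlowTable class and its field layout are assumptions; only the names mask_list and the role of ovs_flow_mask_key() come from the text.

```python
# Toy model: a flow key is a tuple of integer fields, and a mask is a tuple
# of the same shape whose set bits select what takes part in matching.
class FlowTable:
    def __init__(self):
        self.mask_list = []   # masks in insertion order
        self.tables = {}      # mask -> {masked_key: actions}

    @staticmethod
    def mask_key(key, mask):
        """AND each key field with the mask, in the spirit of ovs_flow_mask_key()."""
        return tuple(k & m for k, m in zip(key, mask))

    def insert(self, key, mask, actions):
        if mask not in self.tables:
            self.mask_list.append(mask)
            self.tables[mask] = {}
        self.tables[mask][self.mask_key(key, mask)] = actions

    def lookup(self, key):
        """Try every mask in turn; the dict lookup hashes the masked key."""
        for mask in self.mask_list:
            actions = self.tables[mask].get(self.mask_key(key, mask))
            if actions is not None:
                return actions
        return None


table = FlowTable()
# Fields: (in_port, nw_src). Match in_port exactly and nw_src on a /24.
table.insert((3, 0x0A000000), (0xFFFF, 0xFFFFFF00), "output:4")

hit = table.lookup((3, 0x0A000063))    # 10.0.0.99 arriving on port 3
miss = table.lookup((3, 0x0A000163))   # 10.0.1.99 is outside the /24
print(hit, miss)  # output:4 None
```

Because flows that share a mask share one hash table, the cost of a lookup grows with the number of distinct masks rather than the number of flows.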

Packet receiving processing

  1. ovs_vport_receive_packets() calls ovs_flow_extract() to generate a key from the skb and check for errors, then calls ovs_dp_process_packet() to hand the packet to the datapath for processing
  2. ovs_flow_tbl_lookup_stats() searches the flow table using the key generated earlier and returns the matching flow entry, a sw_flow structure
  3. If there is no match, ovs_dp_upcall() is called to upload the packet to user space for matching (both the packet and the key are uploaded)
  4. If there is a match, ovs_execute_actions() is called directly to execute the corresponding actions, such as adding a VLAN header or forwarding to a port
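
The receive path above amounts to a cache lookup with a slow-path fallback. The Python sketch below models that control flow only. The Datapath class, the two-field key, and the controller() policy are all hypothetical, and the step where the user-space answer is cached as a new flow is an assumption added here to show why later packets avoid the upcall.

```python
class Datapath:
    """Toy fast path: exact-match flow cache with a user-space fallback."""

    def __init__(self, slow_path):
        self.flows = {}              # key -> list of actions
        self.slow_path = slow_path   # stand-in for the upcall to vswitchd
        self.upcalls = 0

    @staticmethod
    def extract_key(packet):
        """Build a lookup key from packet fields (stand-in for ovs_flow_extract)."""
        return (packet["in_port"], packet["dl_dst"])

    def receive(self, packet):
        key = self.extract_key(packet)
        actions = self.flows.get(key)
        if actions is None:
            # Miss: hand both packet and key to user space, then cache the answer.
            self.upcalls += 1
            actions = self.slow_path(packet, key)
            self.flows[key] = actions
        return actions


def controller(packet, key):
    # Hypothetical policy: send everything arriving on port 1 to port 2.
    return ["output:2"] if key[0] == 1 else ["drop"]


dp = Datapath(controller)
pkt = {"in_port": 1, "dl_dst": "00:00:00:00:00:02"}
first = dp.receive(pkt)    # miss, goes through the upcall
second = dp.receive(pkt)   # hit, served from the flow cache
print(first, second, dp.upcalls)  # ['output:2'] ['output:2'] 1
```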

upcall message processing

  1. ovs_dp_upcall() first calls err=queue_userspace_packet() to queue the information for user space
  2. dp_ifindex=get_dpifindex(dp) gets the interface index of the network device
  3. Adjust the MAC header pointer for VLAN
  4. Handle the netlink attributes; this function is called if padding is not required
  5. len=upcall_msg_size() gets the size of the message sent by the upcall
  6. user_skb=genlmsg_new_unicast() creates a new netlink message
  7. upcall=genlmsg_put() adds the new netlink message to the skb
  8. err=genlmsg_unicast() sends the message to user space for processing
