TYPE-2
A type-2 route rarely causes a head-end replication entry to be added. Normally the peer advertises a type-3 route for VTEP discovery.
/*
 * Install remote VTEP into the kernel if the remote VTEP has asked
 * for head-end-replication.
 */
static int zvni_vtep_install(zebra_vni_t *zvni, zebra_vtep_t *zvtep)
{
    if (is_vxlan_flooding_head_end()
        && (zvtep->flood_control == VXLAN_FLOOD_HEAD_END_REPL))
        /* Ask the kernel to add the head-end replication entry */
        return kernel_add_vtep(zvni->vni, zvni->vxlan_if,
                               &zvtep->vtep_ip);
    return 0;
}
Add a MAC table entry (used for forwarding within the same subnet)
/*
 * Install remote MAC into the kernel.
 */
static int zvni_mac_install(zebra_vni_t *zvni, zebra_mac_t *mac)
{
    struct zebra_if *zif;
    struct zebra_l2info_vxlan *vxl;
    bool sticky;

    if (!(mac->flags & ZEBRA_MAC_REMOTE))
        return 0;

    zif = zvni->vxlan_if->info;
    if (!zif)
        return -1;
    vxl = &zif->l2info.vxl;

    sticky = !!CHECK_FLAG(mac->flags,
                          (ZEBRA_MAC_STICKY | ZEBRA_MAC_REMOTE_DEF_GW));

    return kernel_add_mac(zvni->vxlan_if, vxl->access_vlan, &mac->macaddr,
                          mac->fwd_info.r_vtep_ip, sticky);
}
Add a neighbor table entry (it supplies the destination MAC of the inner frame when packets are forwarded across subnets)
/*
 * Install remote neighbor into the kernel.
 */
static int zvni_neigh_install(zebra_vni_t *zvni, zebra_neigh_t *n)
{
    struct zebra_if *zif;
    struct zebra_l2info_vxlan *vxl;
    struct interface *vlan_if;
#ifdef GNU_LINUX
    uint8_t flags;
#endif
    int ret = 0;

    if (!(n->flags & ZEBRA_NEIGH_REMOTE))
        return 0;

    zif = zvni->vxlan_if->info;
    if (!zif)
        return -1;
    vxl = &zif->l2info.vxl;

    vlan_if = zvni_map_to_svi(vxl->access_vlan, zif->brslave_info.br_if);
    if (!vlan_if)
        return -1;
#ifdef GNU_LINUX
    flags = NTF_EXT_LEARNED;
    if (n->flags & ZEBRA_NEIGH_ROUTER_FLAG)
        flags |= NTF_ROUTER;
    ZEBRA_NEIGH_SET_ACTIVE(n);
    ret = kernel_add_neigh(vlan_if, &n->ip, &n->emac, flags);
#endif
    return ret;
}

/* Add a NUD_NOARP neighbor */
int kernel_add_neigh(struct interface *ifp, struct ipaddr *ip,
                     struct ethaddr *mac, uint8_t flags)
{
    return netlink_neigh_update2(ifp, ip, mac, flags, NUD_NOARP,
                                 RTM_NEWNEIGH);
}
No route needs to be added. The bdif created for the L2VNI acts as its gateway; once an IP address is configured on the bdif, the connected route for that subnet is generated automatically. Together with the neighbor entries above, this is enough to forward routed traffic across subnets.
Note: with a centralized routing gateway, if the default-gateway flag is set on an advertised local MAC/IP route, the resulting table entry is installed with the NUD_NOARP state. An entry that carries the sticky flag is installed the same way. All other entries are installed as NTF_EXT_LEARNED entries.
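The note above can be read as a small decision rule. The sketch below is one hedged reading of it, not FRR's code: every SK_* name is hypothetical, the bit values chosen for the two MAC flags are made up for illustration, and the two NUD/NTF values are local copies of the constants in <linux/neighbour.h>. The mapping itself (sticky or default-gateway MAC gives NUD_NOARP, anything else gives NTF_EXT_LEARNED) is taken from the note and is an assumption about what kernel_add_mac() ends up doing with its sticky argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values standing in for ZEBRA_MAC_STICKY and
 * ZEBRA_MAC_REMOTE_DEF_GW; the real values live in FRR's headers. */
#define SK_MAC_STICKY        0x01u
#define SK_MAC_REMOTE_DEF_GW 0x02u

/* Local copies of NUD_NOARP and NTF_EXT_LEARNED from <linux/neighbour.h>. */
#define SK_NUD_NOARP       0x40u
#define SK_NTF_EXT_LEARNED 0x10u

struct sk_fdb_attrs {
    uint16_t state; /* NUD_* bits */
    uint8_t flags;  /* NTF_* bits */
};

/* Same normalisation as in zvni_mac_install(): either bit makes it sticky. */
static bool sk_is_sticky(uint32_t mac_flags)
{
    return !!(mac_flags & (SK_MAC_STICKY | SK_MAC_REMOTE_DEF_GW));
}

/* Assumed mapping: sticky => NUD_NOARP, otherwise NTF_EXT_LEARNED. */
static struct sk_fdb_attrs sk_fdb_attrs_for(uint32_t mac_flags)
{
    struct sk_fdb_attrs a = {0, 0};

    if (sk_is_sticky(mac_flags))
        a.state |= SK_NUD_NOARP;
    else
        a.flags |= SK_NTF_EXT_LEARNED;

    /* Exactly one of the two attributes is set. */
    assert((a.state != 0) != (a.flags != 0));
    return a;
}
```

Calling sk_fdb_attrs_for(SK_MAC_REMOTE_DEF_GW) yields an entry with only the NOARP state set, while sk_fdb_attrs_for(0) yields one with only the extern-learned flag set.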
TYPE-3
Add a head-end replication fdb entry whose MAC is all zeros
/*
 * Install remote VTEP into the kernel if the remote VTEP has asked
 * for head-end-replication.
 */
static int zvni_vtep_install(zebra_vni_t *zvni, zebra_vtep_t *zvtep)
{
    if (is_vxlan_flooding_head_end()
        && (zvtep->flood_control == VXLAN_FLOOD_HEAD_END_REPL))
        /* Ask the kernel to add the head-end replication entry */
        return kernel_add_vtep(zvni->vni, zvni->vxlan_if,
                               &zvtep->vtep_ip);
    return 0;
}
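To make the "all-zero MAC" entry concrete, here is a minimal sketch that builds the bridge(8) command one would typically use to add such a flood-list entry by hand. It is not what zebra does internally (zebra talks netlink directly); the function names are hypothetical, and the device name vxlan100 and VTEP address 10.200.200.1 in the usage note are simply reused from the type-5 example in this article.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a 6-byte MAC as aa:bb:cc:dd:ee:ff; out needs at least 18 bytes. */
static void sk_mac_to_str(const uint8_t mac[6], char *out, size_t len)
{
    snprintf(out, len, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}

/*
 * Build the command for one flood-list entry. The key is the all-zero
 * MAC; each remote VTEP is appended as one more destination for it.
 * Returns the snprintf() result (length that was or would be written).
 */
static int sk_flood_cmd(const char *vxlan_dev, const char *vtep_ip,
                        char *out, size_t len)
{
    static const uint8_t zero_mac[6] = {0};
    char mac[18];

    assert(vxlan_dev != NULL && vtep_ip != NULL && out != NULL);
    sk_mac_to_str(zero_mac, mac, sizeof(mac));
    return snprintf(out, len, "bridge fdb append %s dev %s dst %s",
                    mac, vxlan_dev, vtep_ip);
}
```

For example, sk_flood_cmd("vxlan100", "10.200.200.1", buf, sizeof(buf)) fills buf with a command that appends 10.200.200.1 to the flood list of vxlan100. Using append rather than add matters here because several VTEPs share the same all-zero key.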
TYPE-5
FRR's BGP uses the interface-less model for prefix (type-5) routes, as shown in the following figure:
In the linux kernel, the configuration is as follows:
The VTEP on the right has the IP 10.200.200.1 (its underlay IP) and the router MAC 0200.0ade.de01 (an overlay MAC, normally used as the MAC of the inner frame). When the device on the right advertises the prefix route 192.168.1.0/24, BGP on the left receives the following type-5 route:
You can see that the prefix in the NLRI is 192.168.1.0/24 and the next-hop attribute is 10.200.200.1 (an underlay address). The Router's MAC extended community carries the MAC of the overlay gateway (0200.0ade.de01), and the route also carries the L3VNI. The device on the left receives the route and processes it in the following steps.
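The router MAC appears in dotted form here (0200.0ade.de01) but in colon form in the kernel commands further down (02:00:0a:de:de:01). The following helper is a small sketch of that conversion, added only to make it obvious that both spellings are the same six bytes; the function names are hypothetical and are not part of FRR.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int sk_hex(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Parse "hhhh.hhhh.hhhh" into 6 bytes. Returns 0 on success, -1 on error. */
static int sk_parse_dotted_mac(const char *s, uint8_t mac[6])
{
    if (strlen(s) != 14 || s[4] != '.' || s[9] != '.')
        return -1;
    for (int i = 0; i < 6; i++) {
        /* Byte i starts at 2*i, plus one skipped dot per finished group. */
        const char *p = s + 2 * i + i / 2;
        int hi = sk_hex(p[0]);
        int lo = sk_hex(p[1]);

        if (hi < 0 || lo < 0)
            return -1;
        mac[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}

/* Convert dotted form to colon form; out needs at least 18 bytes. */
static int sk_dotted_to_colon(const char *dotted, char *out, size_t len)
{
    uint8_t mac[6];

    assert(dotted != NULL && out != NULL);
    if (len < 18 || sk_parse_dotted_mac(dotted, mac) != 0)
        return -1;
    snprintf(out, len, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    return 0;
}
```

Passing "0200.0ade.de01" produces the colon-separated spelling that the ip and bridge commands expect; malformed input returns -1 instead of a partial result.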
Install the route in the specified VRF
struct nexthop *route_entry_nexthop_ipv4_ifindex_add(struct route_entry *re,
                                                     struct in_addr *ipv4,
                                                     struct in_addr *src,
                                                     ifindex_t ifindex,
                                                     vrf_id_t nh_vrf_id)
{
    struct nexthop *nexthop;
    struct interface *ifp;

    nexthop = nexthop_new();
    nexthop->vrf_id = nh_vrf_id;
    nexthop->type = NEXTHOP_TYPE_IPV4_IFINDEX;
    nexthop->gate.ipv4 = *ipv4;
    if (src)
        nexthop->src.ipv4 = *src;
    nexthop->ifindex = ifindex;
    ifp = if_lookup_by_index(nexthop->ifindex, nh_vrf_id);
    /* Pending: need to think if null ifp here is ok during bootup?
       There was a crash because ifp here was coming to be NULL */
    if (ifp)
        /* The interface must be unnumbered (no IP configured on it);
           otherwise the route cannot be installed correctly */
        if (connected_is_unnumbered(ifp))
            /* Set the NEXTHOP_FLAG_ONLINK flag */
            SET_FLAG(nexthop->flags, NEXTHOP_FLAG_ONLINK);

    route_entry_nexthop_add(re, nexthop);

    return nexthop;
}
Once the next hop of the route has been built by the function above, the following function adds the route:
/*
 * Update or delete a prefix from the kernel,
 * using info from a dataplane context.
 */
enum zebra_dplane_result kernel_route_update(struct zebra_dplane_ctx *ctx)
{
    int cmd, ret;
    const struct prefix *p = dplane_ctx_get_dest(ctx);
    struct nexthop *nexthop;

    if (dplane_ctx_get_op(ctx) == DPLANE_OP_ROUTE_DELETE) {
        cmd = RTM_DELROUTE;
    } else if (dplane_ctx_get_op(ctx) == DPLANE_OP_ROUTE_INSTALL) {
        cmd = RTM_NEWROUTE;
    } else if (dplane_ctx_get_op(ctx) == DPLANE_OP_ROUTE_UPDATE) {

        if (p->family == AF_INET || v6_rr_semantics) {
            /* Single 'replace' operation */
            cmd = RTM_NEWROUTE;
        } else {
            /*
             * So v6 route replace semantics are not in
             * the kernel at this point as I understand it.
             * so let's do a delete then an add.
             * In the future once v6 route replace semantics
             * are in we can figure out what to do here to
             * allow working with old and new kernels.
             *
             * I'm also intentionally ignoring the failure case
             * of the route delete. If that happens yeah we're
             * screwed.
             */
            if (!RSYSTEM_ROUTE(dplane_ctx_get_old_type(ctx)))
                (void)netlink_route_multipath(RTM_DELROUTE, ctx);
            cmd = RTM_NEWROUTE;
        }

    } else {
        return ZEBRA_DPLANE_REQUEST_FAILURE;
    }

    if (!RSYSTEM_ROUTE(dplane_ctx_get_type(ctx)))
        ret = netlink_route_multipath(cmd, ctx);
    else
        ret = 0;

    if ((cmd == RTM_NEWROUTE) && (ret == 0)) {
        /* Update installed nexthops to signal which have been
         * installed.
         */
        for (ALL_NEXTHOPS_PTR(dplane_ctx_get_ng(ctx), nexthop)) {
            if (CHECK_FLAG(nexthop->flags, NEXTHOP_FLAG_RECURSIVE))
                continue;

            if (CHECK_FLAG(nexthop->flags, NEXTHOP_FLAG_ACTIVE)) {
                SET_FLAG(nexthop->flags, NEXTHOP_FLAG_FIB);
            }
        }
    }

    return (ret == 0 ? ZEBRA_DPLANE_REQUEST_SUCCESS
                     : ZEBRA_DPLANE_REQUEST_FAILURE);
}
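The branching in kernel_route_update() boils down to a small table: which netlink commands are sent for which dataplane operation. The sketch below restates only that table with hypothetical SK_* enums; it deliberately leaves out the RSYSTEM_ROUTE checks and the nexthop flag update, so treat it as a reading aid rather than a replacement for the function above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the DPLANE_OP_* and RTM_* values. */
enum sk_op { SK_OP_ROUTE_INSTALL, SK_OP_ROUTE_UPDATE, SK_OP_ROUTE_DELETE,
             SK_OP_OTHER };
enum sk_cmd { SK_CMD_NONE, SK_CMD_NEWROUTE, SK_CMD_DELROUTE };

struct sk_plan {
    enum sk_cmd cmds[2]; /* commands in the order they are sent */
    int count;           /* number of commands, or -1 if rejected */
};

/*
 * is_ipv4:    the prefix family is AF_INET
 * v6_replace: the kernel supports IPv6 route replace semantics
 */
static struct sk_plan sk_plan_route_op(enum sk_op op, bool is_ipv4,
                                       bool v6_replace)
{
    struct sk_plan p = { { SK_CMD_NONE, SK_CMD_NONE }, 0 };

    switch (op) {
    case SK_OP_ROUTE_DELETE:
        p.cmds[p.count++] = SK_CMD_DELROUTE;
        break;
    case SK_OP_ROUTE_INSTALL:
        p.cmds[p.count++] = SK_CMD_NEWROUTE;
        break;
    case SK_OP_ROUTE_UPDATE:
        /* Without replace semantics, IPv6 needs delete then add. */
        if (!is_ipv4 && !v6_replace)
            p.cmds[p.count++] = SK_CMD_DELROUTE;
        p.cmds[p.count++] = SK_CMD_NEWROUTE;
        break;
    default:
        p.count = -1;
        break;
    }

    assert(p.count >= -1 && p.count <= 2);
    return p;
}
```

For the type-5 example in this article (an IPv4 prefix), both an install and an update resolve to a single NEWROUTE; only an IPv6 update on a kernel without replace semantics produces the two-step delete-then-add sequence.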
You can do the same with the following command:
sudo ip route add 192.168.1.0/24 via 10.200.200.1 dev br100 proto bgp metric 20 onlink
Note that the onlink attribute is required: it tells the kernel to treat the next hop as a directly connected neighbor, which matches the NEXTHOP_FLAG_ONLINK flag set in the code above.
Extract the router MAC and the next-hop IP from the route and build a neighbor from them (this neighbor is special: its MAC is an overlay MAC while its IP is an underlay IP). Add the neighbor entry to the Linux kernel with the noarp attribute set.
/* Add a NUD_NOARP neighbor */
int kernel_add_neigh(struct interface *ifp, struct ipaddr *ip,
                     struct ethaddr *mac, uint8_t flags)
{
    return netlink_neigh_update2(ifp, ip, mac, flags, NUD_NOARP,
                                 RTM_NEWNEIGH);
}
You can use the ip monitor command to monitor this process:
10.200.200.1 dev br100 lladdr 02:00:0a:de:de:01 NOARP
The same result can be achieved with the following command:
sudo ip neigh add 10.200.200.1 dev br100 lladdr 02:00:0a:de:de:01 nud noarp vrf evpn-vrf
Use the same RMAC and next-hop IP to build the fdb entry:
int kernel_add_mac(struct interface *ifp, vlanid_t vid, struct ethaddr *mac,
                   struct in_addr vtep_ip, bool sticky)
{
    return netlink_macfdb_update(ifp, vid, mac, vtep_ip, RTM_NEWNEIGH,
                                 sticky);
}
You can use the following command to get the same effect:
sudo bridge fdb add 02:00:0a:de:de:01 dev vxlan100 dst 10.200.200.1 self extern_learn
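The three entries installed for a type-5 route (route, neighbor, fdb) only make sense together, so here is a toy model of how a lookup chains through them. It is a sketch under stated assumptions, not kernel code: the structs and sk_* names are hypothetical, the tables hold just the example values from this article, and the route lookup takes the first matching prefix instead of doing a real longest-prefix match.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

struct sk_route { uint32_t prefix; int len; uint32_t nexthop; };
struct sk_neigh { uint32_t ip; uint8_t mac[6]; };
struct sk_fdb { uint8_t mac[6]; uint32_t vtep; };
struct sk_encap { uint8_t inner_dmac[6]; uint32_t outer_dip; };

/* 192.168.1.0/24 via 10.200.200.1 onlink */
static const struct sk_route sk_routes[] = {
    { SK_IP(192, 168, 1, 0), 24, SK_IP(10, 200, 200, 1) },
};
/* 10.200.200.1 lladdr 02:00:0a:de:de:01 NOARP */
static const struct sk_neigh sk_neighs[] = {
    { SK_IP(10, 200, 200, 1), { 0x02, 0x00, 0x0a, 0xde, 0xde, 0x01 } },
};
/* 02:00:0a:de:de:01 dev vxlan100 dst 10.200.200.1 */
static const struct sk_fdb sk_fdbs[] = {
    { { 0x02, 0x00, 0x0a, 0xde, 0xde, 0x01 }, SK_IP(10, 200, 200, 1) },
};

#define SK_COUNT(t) (sizeof(t) / sizeof((t)[0]))

/* Resolve dst to the inner destination MAC and outer destination IP.
 * Returns 0 on success, -1 if any of the three lookups misses. */
static int sk_resolve(uint32_t dst, struct sk_encap *out)
{
    assert(out != NULL);
    for (size_t r = 0; r < SK_COUNT(sk_routes); r++) {
        uint32_t mask = sk_routes[r].len
            ? ~(uint32_t)0 << (32 - sk_routes[r].len) : 0;

        if ((dst & mask) != sk_routes[r].prefix)
            continue; /* step 1: route gives the next hop */
        for (size_t n = 0; n < SK_COUNT(sk_neighs); n++) {
            if (sk_neighs[n].ip != sk_routes[r].nexthop)
                continue; /* step 2: neighbor gives the RMAC */
            for (size_t f = 0; f < SK_COUNT(sk_fdbs); f++) {
                if (memcmp(sk_fdbs[f].mac, sk_neighs[n].mac, 6))
                    continue; /* step 3: fdb gives the VTEP */
                memcpy(out->inner_dmac, sk_neighs[n].mac, 6);
                out->outer_dip = sk_fdbs[f].vtep;
                return 0;
            }
        }
    }
    return -1;
}
```

Resolving 192.168.1.10 with this model yields the RMAC 02:00:0a:de:de:01 as the inner destination MAC and 10.200.200.1 as the outer (underlay) destination, which is why the same underlay address shows up in both the neighbor entry and the fdb entry.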
The call stack is:
#0  zebra_vxlan_evpn_vrf_route_add (vrf_id=11, rmac=0x7fff76e7cba0, vtep_ip=0x7fff76e7cacc, host_prefix=0x7fff76e7caf0) at zebra/zebra_vxlan.c:5680
#1  0x0000557f9485a716 in zread_route_add (client=0x557f96929790, hdr=<optimized out>, msg=<optimized out>, zvrf=<optimized out>) at zebra/zapi_msg.c:1488
#2  0x0000557f9485cebb in zserv_handle_commands (client=client@entry=0x557f96929790, msg=msg@entry=0x7ff374001040) at zebra/zapi_msg.c:2532
#3  0x0000557f9485714e in zserv_process_messages (thread=<optimized out>) at zebra/zserv.c:523
#4  0x00007ff37f3ef968 in thread_call (thread=thread@entry=0x7fff76e7e910) at lib/thread.c:1547
#5  0x00007ff37f3cc257 in frr_run (master=0x557f9672baa0) at lib/libfrr.c:1021
#6  0x0000557f9481b1be in main (argc=2, argv=0x7fff76e7ecd8) at zebra/main.c:475
TYPE-4
Type-4 routes are used for MLAG-like multihoming scenarios, and I don't know much about them at present.