VXLAN with MP-BGP EVPN control plane


This is the second part of a series covering VXLAN on Nexus devices, this time using Multi-Protocol BGP (MP-BGP) as the control plane. The first part looked at the unicast and multicast control planes.

Some of the drawbacks of these options were mentioned in the first part.

The advantage of Ethernet VPN (EVPN) is that MAC learning happens in the control plane rather than in the data plane. Data-plane learning means that a MAC address is learned from an Ethernet frame that reaches the device (this is no different from traditional MAC learning).

Control-plane MAC learning offers greater control over the process: by applying policies, you can decide at a granular level who learns what.

The EVPN NLRI is carried in BGP using Address Family Identifier (AFI) of 25 (L2VPN) and a Subsequent Address Family Identifier (SAFI) of 70 (EVPN).
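
To make the AFI/SAFI pair concrete, here is a minimal sketch of the Multiprotocol Extensions capability that a BGP speaker places in its OPEN message to announce support for L2VPN EVPN. The byte layout (capability code 1, length 4, 2-byte AFI, 1 reserved byte, 1-byte SAFI) follows RFC 4760; the function name is made up for this illustration and is not taken from any library.

```python
import struct

AFI_L2VPN = 25   # Address Family Identifier for L2VPN
SAFI_EVPN = 70   # Subsequent Address Family Identifier for EVPN


def mp_capability(afi: int, safi: int) -> bytes:
    """Pack a Multiprotocol Extensions capability (RFC 4760).

    Layout: code (1 byte) = 1, length (1 byte) = 4,
    AFI (2 bytes), reserved (1 byte) = 0, SAFI (1 byte).
    The function name is hypothetical, for illustration only.
    """
    return struct.pack("!BBHBB", 1, 4, afi, 0, safi)


if __name__ == "__main__":
    # Hex dump of the capability for L2VPN EVPN
    print(mp_capability(AFI_L2VPN, SAFI_EVPN).hex())
```

Both peers must advertise this capability for the L2VPN EVPN address family before EVPN routes can be exchanged over the session.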

EVPN involves considerable complexity, most of which is outside the scope of this series. The purpose of this series is to show how you can configure a specific feature and how to confirm that it is working properly.

The following is the network topology, very similar to where we left part 1:

The VXLAN/EVPN configuration is built on top of VXLAN with the multicast control plane; that is, the following configuration is applied to the setup that we left at the end of the first part.

The next step is to enable EVPN and BGP on all devices:

NX_OS_4# show running-config | include "bgp|evpn"
nv overlay evpn
feature bgp
NX_OS_4#

The BGP configuration on the leafs is as follows:

NX_OS_1# show running-config bgp

!Command: show running-config bgp
!Time: Thu Dec 14 17:58:54 2017

version 7.0(3)I6(1)
feature bgp

router bgp 65100
  neighbor 1.1.1.4
    remote-as 65100
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
evpn
  vni 10100 l2
    rd auto
    route-target import auto
    route-target export auto

NX_OS_1#
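
The "rd auto" and "route-target ... auto" statements let the switch derive the values itself. The sketch below shows one way to reproduce them, under two assumptions that are not stated in the configuration: that the automatic RD for a Layer 2 VNI is built as router-id:(32767 + VLAN ID), and that the automatic route target is ASN:VNI for a 2-byte AS number. The first assumption is consistent with the 1.1.1.1:32867 RD that appears in the BGP table later in this article (VLAN 100). The function names are hypothetical.

```python
def auto_rd(router_id: str, vlan_id: int) -> str:
    """Assumed 'rd auto' rule for an L2 VNI: router-id:(32767 + VLAN ID)."""
    return f"{router_id}:{32767 + vlan_id}"


def auto_rt(asn: int, vni: int) -> str:
    """Assumed 'route-target auto' rule for a 2-byte ASN: ASN:VNI."""
    return f"{asn}:{vni}"


if __name__ == "__main__":
    # Values from this lab: router ID 1.1.1.1, VLAN 100, AS 65100, VNI 10100
    print(auto_rd("1.1.1.1", 100))
    print(auto_rt(65100, 10100))
```

Because every leaf is in the same AS and maps the same VNI, the derived route target would be identical on all leafs, which is what allows each one to import the routes exported by the others.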

And this is the configuration on the spine. NX_OS_4 is the route reflector, so BGP updates are reflected to all other BGP speakers:

NX_OS_4# show running-config bgp

!Command: show running-config bgp
!Time: Thu Dec 14 17:59:22 2017

version 7.0(3)I6(1)
feature bgp

router bgp 65100
  neighbor 1.1.1.1
    remote-as 65100
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  neighbor 1.1.1.2
    remote-as 65100
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  neighbor 1.1.1.3
    remote-as 65100
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client

NX_OS_4#

In addition to this, “host-reachability protocol bgp” under the NVE interface configures BGP as the method for host reachability advertisement:

interface nve1
  no shutdown
  source-interface loopback0
  host-reachability protocol bgp
  member vni 10100
    mcast-group 226.0.0.100

Once this is in place, BGP sessions should be up:

NX_OS_4(config-router)# show bgp all summary
BGP summary information for VRF default, address family IPv4 Unicast

BGP summary information for VRF default, address family IPv6 Unicast

BGP summary information for VRF default, address family L2VPN EVPN
BGP router identifier 1.1.1.4, local AS number 65100
BGP table version is 23, L2VPN EVPN config peers 3, capable peers 3
0 network entries and 0 paths using 0 bytes of memory
BGP attribute entries [0/0], BGP AS path entries [0/0]
BGP community entries [0/0], BGP clusterlist entries [0/0]

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
1.1.1.1 4 65100 47 50 0 0 0 00:00:02 0
1.1.1.2 4 65100 48 49 0 0 0 00:00:02 0
1.1.1.3 4 65100 46 50 0 0 0 00:00:07 0
NX_OS_4(config-router)#
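
If you want to check this from a script rather than by eye, the following is a rough sketch that reads the neighbor rows of the summary and reports which sessions are established. It relies on the common convention that the State/PfxRcd column shows a prefix count once the session is up and a state name (Idle, Active and so on) otherwise. The function name is hypothetical and the parsing assumes the whitespace-separated layout shown above.

```python
import ipaddress


def established_neighbors(summary: str) -> dict[str, bool]:
    """Map each neighbor IP in 'show bgp ... summary' text to True/False.

    A neighbor counts as established when its State/PfxRcd column
    (the tenth field) is a number rather than a state name.
    """
    result = {}
    for line in summary.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        try:
            ipaddress.ip_address(fields[0])  # skip header and other text rows
        except ValueError:
            continue
        result[fields[0]] = fields[9].isdigit()
    return result


if __name__ == "__main__":
    sample = "1.1.1.1 4 65100 47 50 0 0 0 00:00:02 0"
    print(established_neighbors(sample))
```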

Now MAC learning for this VXLAN segment happens in the control plane (mode CP in the output below) and not in the data plane, as it did with the unicast and multicast control planes:

NX_OS_1# show nve vni
Codes: CP - Control Plane DP - Data Plane
UC - Unconfigured SA - Suppress ARP

Interface VNI Multicast-group State Mode Type [BD/VRF] Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1 10100 226.0.0.100 Up CP L2 [100]

NX_OS_1#

Now, the MAC address table is empty on all devices:

NX_OS_1# show system internal l2fwder mac
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link,
(T) - True, (F) - False, C - ControlPlane MAC
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
NX_OS_1#

Packet captures were started at various points in the network: on e1/2 and e1/1 on NX_OS_1, and on e1/2 and e1/3 on NX_OS_4. This way, the traffic between the leafs and the spine is captured along with the traffic to and from R1.

Just before any ping test, R1 sent a Gratuitous ARP (independently of our testing):

NX_OS_1 relayed this further into the network, using the multicast group address as the destination IP:

This GARP reached NX_OS_2 and NX_OS_3. This is from NX_OS_2:

But what happens after NX_OS_1 receives the GARP is where EVPN comes into play. NX_OS_1 sends a BGP update to NX_OS_4 (the route reflector) announcing the MAC address learned from R1:

Next, NX_OS_4 sends a BGP update to its route reflector clients. This is the BGP update received by NX_OS_2:

And this is the BGP update received by NX_OS_3:

At this point, NX_OS_2 and NX_OS_3 know R1’s MAC address, and they learned it via the control plane.

Moments later, R2 sends a similar GARP, triggering BGP updates that reach NX_OS_1 and NX_OS_3 (the same process described above repeats).
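
The two rounds of updates can be summarized with a toy model. This is not BGP, only a sketch of the reflection rule: a route advertised by one client is passed by the reflector to every other client, with the originating leaf kept as the next hop. The function name and data structures are invented for the illustration; the addresses and MACs are the ones from this lab.

```python
def reflect(clients, advertisements):
    """Toy route reflector.

    clients: list of leaf loopback addresses.
    advertisements: list of (origin_leaf, mac) pairs sent to the reflector.
    Returns {leaf: {mac: next_hop}} as learned via the reflector.
    """
    learned = {client: {} for client in clients}
    for origin, mac in advertisements:
        for client in clients:
            if client != origin:  # never reflect a route back to its originator
                learned[client][mac] = origin
    return learned


if __name__ == "__main__":
    leafs = ["1.1.1.1", "1.1.1.2", "1.1.1.3"]
    updates = [
        ("1.1.1.1", "fa16.3e6e.a47b"),  # R1's MAC, announced by NX_OS_1
        ("1.1.1.2", "fa16.3ebb.35e8"),  # R2's MAC, announced by NX_OS_2
    ]
    for leaf, table in reflect(leafs, updates).items():
        print(leaf, table)
```

In this model NX_OS_1 and NX_OS_2 each end up with one remote MAC, while NX_OS_3, which advertised nothing, ends up with both.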

A ping started from R1 to R2 should be successful:

R1(config-if)#do ping 100.100.100.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 100.100.100.2, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 19/19/21 ms
R1(config-if)# do sh arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 100.100.100.1 - fa16.3e6e.a47b ARPA GigabitEthernet0/1
Internet 100.100.100.2 6 fa16.3ebb.35e8 ARPA GigabitEthernet0/1
R1(config-if)#

So, let us see what is happening.
R1 sends an ARP Request to resolve R2’s IP address. NX_OS_1 encapsulates the request and sends it to the multicast group:

This ARP Request reaches NX_OS_2 and NX_OS_3, but the reply comes back only from NX_OS_2, because R2 is behind it.

After the ARP resolution, ping starts to work:

Let us see the content of the MAC address table on NX_OS_1. It should show the MAC address of R1 and R2:

NX_OS_1# show system internal l2fwder mac
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link,
(T) - True, (F) - False, C - ControlPlane MAC
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
* 100 fa16.3e6e.a47b dynamic 00:00:19 F F Eth1/2
* 100 fa16.3ebb.35e8 static - F F (0x47000001) nve-peer1
1.1.1.2
NX_OS_1#

Observe that the MAC address of R2 is shown as static. This is how a MAC address appears when it is learned via EVPN.
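
As a rough illustration of that rule, the sketch below splits the entries of the table into locally learned ones and those learned via EVPN, treating "static plus an NVE peer address on the following line" as the EVPN marker. The parsing is an assumption based only on the output layout shown above, and the function name is hypothetical.

```python
SAMPLE = """\
* 100 fa16.3e6e.a47b dynamic 00:00:19 F F Eth1/2
* 100 fa16.3ebb.35e8 static - F F (0x47000001) nve-peer1
1.1.1.2
"""


def parse_l2fwder(output: str) -> list[dict]:
    """Parse entry rows of 'show system internal l2fwder mac' text."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        # Entry rows start with '*' followed by a numeric VLAN
        # (this also skips the legend line, which starts with '* -').
        if len(fields) >= 8 and fields[0] == "*" and fields[1].isdigit():
            entries.append({
                "vlan": int(fields[1]),
                "mac": fields[2],
                "type": fields[3],
                "port": fields[-1],
                "peer_ip": None,
            })
        # A lone dotted-quad line is the wrapped NVE peer address.
        elif len(fields) == 1 and entries and fields[0].count(".") == 3:
            entries[-1]["peer_ip"] = fields[0]
    return entries


if __name__ == "__main__":
    for entry in parse_l2fwder(SAMPLE):
        source = "EVPN" if entry["type"] == "static" and entry["peer_ip"] else "local"
        print(entry["mac"], source, entry["peer_ip"] or entry["port"])
```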

It is similar on NX_OS_2:

NX_OS_2# show system internal l2fwder mac
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link,
(T) - True, (F) - False, C - ControlPlane MAC
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
* 100 fa16.3e6e.a47b static - F F (0x47000001) nve-peer1
1.1.1.1
* 100 fa16.3ebb.35e8 dynamic 00:00:28 F F Eth1/2
NX_OS_2#

On NX_OS_3 it looks different. NX_OS_3 populates its MAC address table from BGP updates, so both entries are static:

NX_OS_3(config-if)# show system internal l2fwder mac
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link,
(T) - True, (F) - False, C - ControlPlane MAC
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
* 100 fa16.3e6e.a47b static - F F (0x47000001) nve-peer1
1.1.1.1
* 100 fa16.3ebb.35e8 static - F F (0x47000002) nve-peer2
1.1.1.2
NX_OS_3(config-if)#

NX_OS_3 has two peers learned via EVPN:

NX_OS_3(config-if)# show nve peers
Interface Peer-IP State LearnType Uptime Router-Mac
--------- --------------- ----- --------- -------- -----------------
nve1 1.1.1.1 Up CP 00:03:34 n/a
nve1 1.1.1.2 Up CP 00:03:32 n/a

NX_OS_3(config-if)#

NX_OS_1 and NX_OS_2, on the other hand, each have only one peer: each other. This is because NX_OS_3 did not send any BGP update. This is from NX_OS_1, and it is similar on NX_OS_2:

NX_OS_1# show nve peers detail
Details of nve Peers:
----------------------------------------
Peer-Ip: 1.1.1.2
NVE Interface : nve1
Peer State : Up
Peer Uptime : 00:03:08
Router-Mac : n/a
Peer First VNI : 10100
Time since Create : 00:03:08
Configured VNIs : 10100
Provision State : add-complete
Route-Update : Yes
Peer Flags : DisableLearn
Learnt CP VNIs : 10100
Peer-ifindex-resp : Yes
----------------------------------------

NX_OS_1#

You can see the content of the EVPN database (it looks similar on NX_OS_2):

NX_OS_1# show bgp l2vpn evpn
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 19, local router ID is 1.1.1.1
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup

Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 1.1.1.1:32867 (L2VNI 10100)
*>l[2]:[0]:[0]:[48]:[fa16.3e6e.a47b]:[0]:[0.0.0.0]/216
1.1.1.1 100 32768 i
*>i[2]:[0]:[0]:[48]:[fa16.3ebb.35e8]:[0]:[0.0.0.0]/216
1.1.1.2 100 0 i

Route Distinguisher: 1.1.1.2:32867
*>i[2]:[0]:[0]:[48]:[fa16.3ebb.35e8]:[0]:[0.0.0.0]/216
1.1.1.2 100 0 i

NX_OS_1#
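
Each entry in this table is an EVPN route type 2 (MAC/IP advertisement). The sketch below breaks such an entry into its fields, assuming the bracketed values are, in order: route type, Ethernet segment identifier, Ethernet tag, MAC address length, MAC address, IP address length and IP address, followed by the prefix length. That field order is my reading of the output and should be treated as an assumption; the class and function names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class MacIpRoute:
    route_type: int
    esi: str
    eth_tag: int
    mac_len: int
    mac: str
    ip_len: int
    ip: str
    prefix_len: int


# Assumed layout: [type]:[ESI]:[eth-tag]:[mac-len]:[mac]:[ip-len]:[ip]/prefix-len
_ROUTE_RE = re.compile(
    r"\[(\d+)\]:\[([^\]]+)\]:\[(\d+)\]:\[(\d+)\]:"
    r"\[([0-9a-f.]+)\]:\[(\d+)\]:\[([^\]]+)\]/(\d+)"
)


def parse_type2(text: str) -> MacIpRoute:
    """Extract the fields of a type-2 route from a 'show bgp l2vpn evpn' line."""
    match = _ROUTE_RE.search(text)
    if match is None or match.group(1) != "2":
        raise ValueError(f"not a type-2 EVPN route: {text!r}")
    rtype, esi, tag, mac_len, mac, ip_len, ip, plen = match.groups()
    return MacIpRoute(int(rtype), esi, int(tag), int(mac_len), mac,
                      int(ip_len), ip, int(plen))


if __name__ == "__main__":
    route = parse_type2("*>i[2]:[0]:[0]:[48]:[fa16.3ebb.35e8]:[0]:[0.0.0.0]/216")
    # An IP length of 0 means the route carries a MAC address only.
    print(route.mac, "MAC-only" if route.ip_len == 0 else route.ip)
```

All routes in this lab have an IP length of 0, which fits a pure Layer 2 setup where only MAC addresses are advertised.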

And on NX_OS_3:

NX_OS_3(config-if)# show bgp l2vpn evpn
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 20, local router ID is 1.1.1.3
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup

Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 1.1.1.1:32867
*>i[2]:[0]:[0]:[48]:[fa16.3e6e.a47b]:[0]:[0.0.0.0]/216
1.1.1.1 100 0 i

Route Distinguisher: 1.1.1.2:32867
*>i[2]:[0]:[0]:[48]:[fa16.3ebb.35e8]:[0]:[0.0.0.0]/216
1.1.1.2 100 0 i

Route Distinguisher: 1.1.1.3:32867 (L2VNI 10100)
*>i[2]:[0]:[0]:[48]:[fa16.3e6e.a47b]:[0]:[0.0.0.0]/216
1.1.1.1 100 0 i
*>i[2]:[0]:[0]:[48]:[fa16.3ebb.35e8]:[0]:[0.0.0.0]/216
1.1.1.2 100 0 i

NX_OS_3(config-if)#

And this is pretty much what you need to configure on Nexus devices to have VXLAN/EVPN, and how to check that it is working as expected.

This part also concludes the VXLAN series, which covered three control-plane mechanisms: unicast, multicast and EVPN.


Thank you to Paris Arau for his contributions to this article.