8  BGP & Routing Configuration

8.1 Overview

This chapter covers BGP routing configuration for the Independent A/B Fabrics architecture. BGP is used for route advertisement and automatic path selection, with ECMP providing load balancing across multiple equal-cost paths.

Note: For definitions of terms like eBGP, ECMP, BGP, and others, see the Glossary.

8.2 BGP Design Principles

8.2.1 eBGP Everywhere

All BGP peering is external BGP (eBGP):

  • Server ↔︎ ToR: eBGP
  • ToR ↔︎ Spine: eBGP
  • Spine ↔︎ Super-Spine: eBGP (when deployed)

No iBGP, no route reflectors - pure eBGP peering everywhere.

8.2.2 Route Advertisement Flow

Host Loopback (/32)
      ↓ BGP advertisement
   ToR-A (learns via eth0)
      ↓ BGP advertisement
   Spine-A
      ↓ BGP advertisement
Super-Spine (when deployed)

   (Same for Network-B path)

8.2.3 ECMP Creation

When a route is advertised via multiple paths with equal BGP attributes:

  • The Linux kernel installs multiple equal-cost paths for the prefix
  • ECMP automatically distributes traffic across all paths
  • 5-tuple hashing ensures per-flow load balancing
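The per-flow behavior can be sketched in Python. This is an illustrative model only, not the Linux kernel's actual hash algorithm (which is seeded and implementation-specific); it shows why all packets of one flow stay on one path while different flows spread across paths.

```python
# Illustrative model of per-flow ECMP: hash the 5-tuple, then index into
# the list of equal-cost next hops. Not the kernel's real algorithm.
import hashlib

def pick_nexthop(src_ip, dst_ip, proto, sport, dport, nexthops):
    """Map a flow's 5-tuple onto one of the equal-cost next hops."""
    key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return nexthops[digest % len(nexthops)]

paths = ["172.16.1.1", "172.16.1.3"]  # next hops via ToR-A and ToR-B
flow = ("10.0.1.11", "10.0.2.22", "tcp", 40000, 443)

# Every packet of the same flow hashes to the same path (no reordering):
assert pick_nexthop(*flow, paths) == pick_nexthop(*flow, paths)
```

Because the hash is deterministic per flow, a single TCP connection never reorders across paths; load balancing emerges statistically across many flows.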

8.3 AS Number Allocation

8.3.1 Scheme

  • Hosts: 66000 + (rack × 100) + host number (e.g., Rack 1, Host 11 → AS 66111)
  • ToRs: two per rack, 65101 and up (Rack 1 → 65101/65102, Rack 2 → 65103/65104)
  • Spines: 65010 and up (Spine 1 → 65010, Spine 2 → 65011)
  • Super-Spines: a separate private-ASN range, allocated when deployed

8.3.2 Example ASN Assignment

Rack 1:

  • Host 11: AS 66111
  • Host 12: AS 66112
  • ToR-A: AS 65101
  • ToR-B: AS 65102

Spines:

  • Spine 1: AS 65010
  • Spine 2: AS 65011
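These sample assignments follow a simple arithmetic pattern; the helpers below reproduce it. The formulas are inferred from the example values (66111, 65101, 65010, ...) and are an assumption for illustration, not a normative part of the design.

```python
# ASN allocation helpers, inferred from the example assignments above.
# The formulas are an assumption derived from those sample values.

def host_asn(rack: int, host: int) -> int:
    """Host ASN: 66000 + rack*100 + host (Rack 1, Host 11 -> 66111)."""
    return 66000 + rack * 100 + host

def tor_asn(rack: int, side: str) -> int:
    """ToR ASN: two per rack starting at 65101 (Rack 1 ToR-A -> 65101)."""
    return 65100 + (rack - 1) * 2 + {"A": 1, "B": 2}[side]

def spine_asn(n: int) -> int:
    """Spine ASN: 65010 upward (Spine 1 -> 65010)."""
    return 65009 + n

assert host_asn(1, 11) == 66111 and host_asn(1, 12) == 66112
assert tor_asn(1, "A") == 65101 and tor_asn(2, "B") == 65104
assert spine_asn(2) == 65011
```

Encoding rack and host numbers into the ASN makes misconfigured peers easy to spot in `show ip bgp summary` output.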

8.4 BGP Advertisement Rules

8.4.1 Hosts (FRR)

Host BGP behavior:

  • Establish eBGP sessions to both ToR-A and ToR-B
  • Advertise only the host loopback /32 (e.g., 10.0.1.11/32)
  • Advertise via BOTH NICs with equal BGP attributes → ECMP automatically
  • Learn fabric routes from the ToRs

Example configuration (Rack 1, Host 11):

router bgp 66111
 bgp router-id 10.0.1.11
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast

 ! Neighbor: ToR-A (Rack-1 ToR-A)
 neighbor 172.16.1.1 remote-as 65101
 neighbor 172.16.1.1 description ToR-A-Rack1
 neighbor 172.16.1.1 ebgp-multihop 2
 neighbor 172.16.1.1 timers 3 10

 ! Neighbor: ToR-B (Rack-1 ToR-B)
 neighbor 172.16.1.3 remote-as 65102
 neighbor 172.16.1.3 description ToR-B-Rack1
 neighbor 172.16.1.3 ebgp-multihop 2
 neighbor 172.16.1.3 timers 3 10

 ! Advertise host loopback (OVN TEP)
 address-family ipv4 unicast
  network 10.0.1.11/32
  neighbor 172.16.1.1 activate
  neighbor 172.16.1.1 route-map ADVERTISE-LOOPBACK out
  neighbor 172.16.1.3 activate
  neighbor 172.16.1.3 route-map ADVERTISE-LOOPBACK out
 exit-address-family

! Route map to advertise only loopback
route-map ADVERTISE-LOOPBACK permit 10
 match ip address prefix-list LOOPBACK-ONLY

! Prefix list for loopback
ip prefix-list LOOPBACK-ONLY seq 5 permit 10.0.1.11/32

For complete configuration, see configs/frr-host-example.conf.
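Per-host configs like the one above are repetitive and lend themselves to generation. A minimal sketch follows; the ASN formula (66000 + rack×100 + host) and the loopback pattern are assumptions inferred from this chapter's examples, and the function name is hypothetical.

```python
# Render the host-side BGP stanza from rack/host parameters.
# ASN and loopback patterns are inferred from this chapter's examples.

def host_bgp_config(rack, host, tors):
    """tors: list of (peer_ip, peer_as, name) tuples for ToR-A and ToR-B."""
    loopback = f"10.0.{rack}.{host}"
    asn = 66000 + rack * 100 + host
    lines = [
        f"router bgp {asn}",
        f" bgp router-id {loopback}",
        " no bgp ebgp-requires-policy",
        " no bgp default ipv4-unicast",
    ]
    for ip, peer_as, name in tors:
        lines += [
            f" neighbor {ip} remote-as {peer_as}",
            f" neighbor {ip} description {name}-Rack{rack}",
            f" neighbor {ip} ebgp-multihop 2",
            f" neighbor {ip} timers 3 10",
        ]
    return "\n".join(lines)

cfg = host_bgp_config(1, 11, [("172.16.1.1", 65101, "ToR-A"),
                              ("172.16.1.3", 65102, "ToR-B")])
assert "router bgp 66111" in cfg
```

Generating the stanza from a small set of parameters keeps all hosts consistent and makes the route-map and prefix-list boilerplate a single-source template.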

8.4.2 ToRs (SONiC/FRR)

All ToRs (both ToR-A and ToR-B in each rack):

  • Learn host /32s from directly connected hosts
  • Advertise those /32s up to all spines (Spine-1, Spine-2, etc.)
  • All ToRs connect to all spines: full mesh between the ToR and spine layers
  • No peer-link between ToR-A and ToR-B: they operate independently (no MLAG)

Example configuration (ToR-A, Rack 1):

router bgp 65101
 bgp router-id 10.254.1.1
 no bgp ebgp-requires-policy

 ! Hosts in Rack 1 (learn /32s)
 neighbor 172.16.1.0 remote-as 66111
 neighbor 172.16.1.4 remote-as 66112
 ! ... (more hosts)

 ! ALL Spines (advertise /32s to all spines)
 ! Spine-1
 neighbor 172.20.1.0 remote-as 65010
 ! Spine-2
 neighbor 172.20.1.2 remote-as 65011
 ! ... (all spines)

 address-family ipv4 unicast
  ! Advertise host /32s to all spines
  neighbor 172.20.1.0 activate
  neighbor 172.20.1.2 activate
  ! ECMP across all spine paths
  maximum-paths 8
 exit-address-family

Key: ToR-A and ToR-B both connect to the same spines. This creates multiple redundant paths through the unified fabric.

For complete configuration, see configs/frr-tor-example.conf.

8.4.3 Spines

All Spines (Spine-1, Spine-2, etc.):

  • Learn host /32s from all ToRs (both ToR-A and ToR-B in every rack)
  • Provide ECMP across all available ToR paths
  • Connect to all ToRs in a full mesh

Example configuration (Spine 1):

router bgp 65010
 bgp router-id 10.255.0.1
 no bgp ebgp-requires-policy

 ! ALL ToRs (both A and B from every rack - learn all host /32s)
 ! Rack1 ToR-A
 neighbor 172.20.1.1 remote-as 65101
 ! Rack1 ToR-B
 neighbor 172.20.1.3 remote-as 65102
 ! Rack2 ToR-A
 neighbor 172.20.2.1 remote-as 65103
 ! Rack2 ToR-B
 neighbor 172.20.2.3 remote-as 65104
 ! ... (all ToRs in datacenter)

 address-family ipv4 unicast
  ! ECMP across all ToR paths
  maximum-paths 16
  ! Activate all neighbors
  neighbor 172.20.1.1 activate
  neighbor 172.20.1.3 activate
  neighbor 172.20.2.1 activate
  neighbor 172.20.2.3 activate
 exit-address-family

Key: Spines learn the same host /32s from both ToR-A and ToR-B, creating multiple equal-cost paths through the fabric.

8.5 ECMP Configuration

8.5.1 Kernel ECMP (Linux)

Enable ECMP in FRR:

! Enable ECMP with up to 2 paths (for hosts with 2 NICs)
router bgp 66111
 address-family ipv4 unicast
  maximum-paths 2
 exit-address-family

For switches, increase maximum-paths:

! Enable ECMP with up to 8 paths (for ToRs/Spines)
router bgp 65010
 address-family ipv4 unicast
  maximum-paths 8
 exit-address-family

8.5.2 BFD (Fast Failure Detection)

Enable BFD for fast failure detection (<1 second):

! Enable BFD on BGP neighbors
router bgp 66111
 neighbor 172.16.1.1 bfd
 neighbor 172.16.1.3 bfd

! BFD timers (100 ms tx/rx interval, detect-multiplier 3 → ~300 ms detection)
bfd
 peer 172.16.1.1
  transmit-interval 100
  receive-interval 100
  detect-multiplier 3
 exit
 ! (same timers for peer 172.16.1.3)

For more on BFD, see Packet Flows & Load Balancing.

8.6 Route Filtering

8.6.1 Accept All Routes from ToRs

Hosts should accept all routes from ToRs (learn fabric topology):

! No input route-map needed - accept all from ToRs
! This allows hosts to learn all remote host /32s via BGP

8.7 BGP Timers

8.7.1 Aggressive Timers for Fast Convergence

! Fast BGP timers (3s keepalive, 10s hold)
neighbor 172.16.1.1 timers 3 10
neighbor 172.16.1.3 timers 3 10

Combined with BFD:

  • BFD detects failures in under 1 second
  • BGP reacts immediately to the BFD down notification
  • Total convergence: under 1 second
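The sub-second figure follows from the BFD timers configured in 8.5.2. Per RFC 5880, a session is declared down after the detect multiplier's worth of consecutive missed intervals; a simplified arithmetic check:

```python
# Simplified BFD detection-time arithmetic (per RFC 5880): the session
# is declared down after `multiplier` consecutive missed intervals.

def bfd_detection_ms(tx_interval_ms, peer_min_rx_ms, multiplier):
    """Detection time = max(our tx interval, peer's min rx) * multiplier."""
    return max(tx_interval_ms, peer_min_rx_ms) * multiplier

# With the 100 ms timers and detect-multiplier 3 from 8.5.2:
assert bfd_detection_ms(100, 100, 3) == 300  # 300 ms, well under 1 second
```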

8.8 Multi-Hop eBGP

Host and ToR peering addresses are not on a shared directly connected subnet (each endpoint peers from its own routed IP), so the default eBGP TTL of 1 would prevent the session from establishing. Enable ebgp-multihop to permit the extra hop:

neighbor 172.16.1.1 ebgp-multihop 2
neighbor 172.16.1.3 ebgp-multihop 2

8.9 Verification Commands

8.9.1 Check BGP Status

# Show BGP summary
vtysh -c "show ip bgp summary"

# View all BGP routes
vtysh -c "show ip bgp"

# Check specific route
vtysh -c "show ip bgp 10.0.1.11/32"

# Show advertised routes to neighbor
vtysh -c "show ip bgp neighbors 172.16.1.1 advertised-routes"

# Show routes received from neighbor
vtysh -c "show ip bgp neighbors 172.16.1.1 routes"

8.9.2 Check ECMP Routes

# View routing table
ip route show

# Check ECMP routes (look for "nexthop")
ip route show | grep "nexthop"

# Check specific route
ip route show 10.0.2.22
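ECMP verification can likewise be scripted: iproute2 emits structured output with `ip -j route show`, and a multipath route carries a `nexthops` array. A sketch of counting installed paths for a prefix (the function name is hypothetical):

```python
# Count installed next hops for a prefix from `ip -j route show` output.
# A route with a "nexthops" array is ECMP; otherwise it has one path.
import json

def ecmp_width(ip_route_json, prefix):
    """Return the number of next hops for `prefix` (0 = no route)."""
    for route in json.loads(ip_route_json):
        if route.get("dst") == prefix:
            return len(route.get("nexthops", [])) or 1
    return 0

sample = ('[{"dst": "10.0.2.22", "protocol": "bgp", "nexthops": ['
          '{"gateway": "172.16.1.1", "dev": "eth0", "weight": 1}, '
          '{"gateway": "172.16.1.3", "dev": "eth1", "weight": 1}]}]')
assert ecmp_width(sample, "10.0.2.22") == 2  # two equal-cost paths
```

A width of 2 on a host confirms both NIC paths are installed; a width of 1 indicates one fabric path has been withdrawn and should be investigated.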

8.9.3 Verify BFD

# Show BFD peers
vtysh -c "show bfd peers"

# Show BFD peer details
vtysh -c "show bfd peer 172.16.1.1"

For more troubleshooting commands, see Operations & Maintenance.

8.10 References