10  Hardware Acceleration and Future Evolution

10.1 Current Hardware Configuration

10.1.1 Server NICs

Each server has 2 × 100G NICs (ConnectX-6 DX):

  • eth0: 100G connection to ToR-A (Network-A)
  • eth1: 100G connection to ToR-B (Network-B)
  • Total aggregate: 200G per server via pure L3 ECMP
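
At the host level this is plain L3 addressing rather than NIC bonding: each NIC is an independent routed link, and ECMP across the two paths is what yields the 200G aggregate. The following is a minimal sketch (the interface names match the list above; the point-to-point /31 subnets are assumptions for illustration, and 10.0.1.11/32 is the loopback example used later in this section):

# Server-side addressing sketch (assumed /31 subnets): each NIC is a separate
# routed link to its ToR, and the server's stable identity is a /32 loopback
# that BGP advertises over both uplinks.
ip addr add 10.0.1.11/32 dev lo          # loopback: the server's TEP identity
ip addr add 172.16.10.1/31 dev eth0      # point-to-point link to ToR-A (Network-A)
ip addr add 172.16.20.1/31 dev eth1      # point-to-point link to ToR-B (Network-B)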

10.1.2 ConnectX-6 DX Hardware Acceleration

Mellanox/NVIDIA ConnectX-6 DX provides hardware acceleration for OVN/OVS:

10.1.2.1 GENEVE Offload

  • Hardware GENEVE encapsulation/decapsulation: Offloads GENEVE processing from CPU
  • Flow steering: Hardware-based packet classification and forwarding
  • OVS hardware offload: Direct integration with OVS for accelerated forwarding

10.1.2.2 Benefits

  • Reduced CPU overhead: GENEVE encapsulation/decapsulation is handled by the NIC instead of the host CPU
  • Higher throughput: Hardware offload sustains near line-rate forwarding
  • Lower latency: Hardware forwarding is faster than the software datapath
  • Better scalability: More CPU cycles remain available for workloads

10.1.2.3 Configuration

# Enable OVS hardware offload on ConnectX-6 DX
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

# Verify offload status
ovs-vsctl get Open_vSwitch . other_config:hw-offload
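
Enabling hw-offload also assumes the ConnectX-6 DX eSwitch is in switchdev mode and that Open vSwitch is restarted to pick up the setting. A hedged sketch of those steps and of verifying the result (the PCI address is an assumption; substitute the address of the NIC in question):

# Put the NIC eSwitch into switchdev mode so OVS can offload flows to it
# (PCI address below is an assumption for illustration).
devlink dev eswitch set pci/0000:03:00.0 mode switchdev

# Restart Open vSwitch so hw-offload=true takes effect
# (service name may be openvswitch or openvswitch-switch depending on distro).
systemctl restart openvswitch

# Confirm that datapath flows are actually being offloaded to the NIC.
ovs-appctl dpctl/dump-flows type=offloaded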

Reference: NVIDIA ConnectX-6 DX Documentation

10.1.3 Switch Hardware

10.1.3.1 ToR Switches

  • Option 1: 100G switches with 64 ports (e.g., Tomahawk-based)
  • Option 2: 200G switches with 32 ports (e.g., Tomahawk-based)
  • Chip: Broadcom Tomahawk ASIC
  • Function: Pure L3 routing with BGP/ECMP

10.1.3.2 Spine Switches

  • 400G switches (e.g., Tomahawk-based)
  • High port density for leaf-spine connectivity
  • Function: Pure L3 transit with ECMP

Key: All switches are L3 routers, not L2 switches. Tomahawk ASICs provide excellent L3 forwarding performance.
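
On the switch side, “pure L3 routing with BGP/ECMP” reduces to an ordinary BGP configuration with multipath enabled, so that routes to each server loopback can use every equal-cost uplink. A minimal sketch in FRR/vtysh form (the ASN and path limit are assumptions; the actual syntax depends on the network OS running on the Tomahawk platform, e.g., SONiC uses FRR):

# ToR-side sketch (assumed ASN 65001): allow ECMP across equal-cost BGP paths.
vtysh -c 'configure terminal' \
      -c 'router bgp 65001' \
      -c ' bgp bestpath as-path multipath-relax' \
      -c ' address-family ipv4 unicast' \
      -c '  maximum-paths 64' \
      -c ' exit-address-family'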

10.2 Future Evolution: DPUs (Data Processing Units)

10.2.1 What are DPUs?

DPUs (Data Processing Units) are specialized processors that offload networking, storage, and security functions from the host CPU. Examples include:

  • NVIDIA BlueField DPU
  • AMD Pensando
  • Intel IPU (Infrastructure Processing Unit)

10.2.2 How DPUs Fit Our Architecture

10.2.2.1 Current Architecture (Host-Based TEPs)

┌─────────────────────────────────────┐
│  Host CPU                           │
│  ┌──────────┐  ┌──────────┐         │
│  │   OVN    │  │   OVS    │         │
│  │ Control  │  │ Dataplane│         │
│  └────┬─────┘  └────┬─────┘         │
│       │             │               │
│  ┌────▼─────────────▼──────┐        │
│  │  ConnectX-6 DX (NIC)    │        │
│  │  Hardware GENEVE Offload│        │
│  └─────────────────────────┘        │
└─────────────────────────────────────┘

10.2.2.2 Future Architecture (DPU-Based TEPs)

┌─────────────────────────────────────┐
│  Host CPU (Workloads Only)          │
│  ┌──────────┐                       │
│  │   VMs    │                       │
│  │  Pods    │                       │
│  └────┬─────┘                       │
│       │                             │
│  ┌────▼──────────────────────────┐  │
│  │  BlueField DPU                │  │
│  │  ┌──────────┐  ┌──────────┐   │  │
│  │  │   OVN    │  │   OVS    │   │  │
│  │  │ Control  │  │ Dataplane│   │  │
│  │  └──────────┘  └──────────┘   │  │
│  │  Hardware GENEVE Offload      │  │
│  │  Hardware BGP/ECMP            │  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘

10.2.3 Benefits of DPU Evolution

  1. Host CPU Offload: OVN/OVS processing moves to DPU, freeing host CPU for workloads
  2. Hardware Acceleration: DPUs provide hardware acceleration for:
    • GENEVE encapsulation/decapsulation
    • BGP routing
    • ECMP load balancing
    • Security policies (ACLs, firewalling)
  3. Consistent Architecture: TEPs still at “host” (now DPU), fabric still pure L3
  4. Better Performance: Dedicated processing for networking functions
  5. Isolation: Network processing isolated from workload CPU

10.2.4 Migration Path

When migrating to DPUs:

  1. TEP moves to DPU: DPU becomes the TEP endpoint
  2. Fabric unchanged: Still pure L3 BGP/ECMP
  3. OVN control plane: Runs on DPU, connects to same OVN databases
  4. BGP on DPU: DPU advertises host loopback via BGP
  5. Zero fabric changes: Underlay architecture remains identical
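
Step 4 above is the only part that needs new configuration, and even that is a relocation of existing state: the same /32 the host advertised before is now originated by the routing daemon on the DPU. A hedged sketch, again in FRR/vtysh form (the ASNs, peer addresses, and the choice of FRR as the DPU routing stack are assumptions; 10.0.1.11/32 is the loopback example used elsewhere in this section):

# DPU-side sketch (assumed ASNs and peer addresses): peer with both ToRs and
# originate the host loopback /32, so the fabric sees the same advertisement
# it saw when the host itself ran BGP.
vtysh -c 'configure terminal' \
      -c 'router bgp 65011' \
      -c ' neighbor 172.16.10.0 remote-as 65001' \
      -c ' neighbor 172.16.20.0 remote-as 65002' \
      -c ' address-family ipv4 unicast' \
      -c '  network 10.0.1.11/32' \
      -c ' exit-address-family'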

Key Insight: DPU evolution is transparent to the fabric. The underlay remains pure L3 BGP/ECMP regardless of where TEPs run.

10.3 Future Evolution: Higher Bandwidth Servers

10.3.1 Current: 2 × 100G (200G aggregate)

10.3.2 Future: 2 × 400G (800G aggregate)

10.3.2.1 Architecture Extension

No changes needed to fabric architecture:

  1. Same topology: Independent A/B Fabrics
  2. Same routing: Pure L3 BGP/ECMP
  3. Same principles: Loopback advertised via both NICs
  4. ECMP scales: Automatically handles higher bandwidth

10.3.2.2 What Changes

  • NIC speeds: 100G → 400G per NIC
  • Switch ports: ToR switches need 400G ports (or aggregate multiple 100G)
  • Link speeds: Point-to-point links become 400G
  • ECMP behavior: Same, just more bandwidth per path

10.3.2.3 Example Evolution

Current (2 × 100G):

  • eth0: 100G → ToR-A
  • eth1: 100G → ToR-B
  • Loopback: 10.0.1.11/32 advertised via both

Future (2 × 400G):

  • eth0: 400G → ToR-A (or 4×100G aggregated)
  • eth1: 400G → ToR-B (or 4×100G aggregated)
  • Loopback: 10.0.1.11/32 advertised via both (same!)

Key: The architecture is bandwidth-agnostic. Same design principles apply at any speed.

10.3.3 Future: 2 × 800G (1.6T aggregate)

Same principles:

  • Independent A/B Fabrics
  • Pure L3 BGP/ECMP
  • Loopback-based identity
  • ECMP automatic load balancing

Scalability: The architecture scales seamlessly from 100G to 800G+ per server.

10.4 Switch Evolution

10.4.1 Current ToR Options

  • 100G × 64 ports: Sufficient for current server density
  • 200G × 32 ports: Higher bandwidth per port

10.4.2 Future ToR Options

  • 400G × 32 ports: For 400G servers
  • 800G × 16 ports: For 800G servers

10.4.3 Spine Evolution

  • Current: 400G spine switches
  • Future: 800G or 1.6T spine switches

Key: Spine capacity must scale with aggregate ToR bandwidth.

10.5 References