OpenStack DC Network
L3 Underlay, GENEVE Overlay, and Kubernetes Integration
0.1 Overview
This book documents the design and implementation of a datacenter network architecture for a medium-scale OpenStack deployment. The design decisions favor a future of fully software-defined, virtualized hardware, driven by exponential trends like Moore’s law. The architecture has two layers:
- Pure L3 Underlay (Physical Network): L3/L4-aware Clos fabric with multi-path routing using BGP and ECMP
- GENEVE Overlay (Virtual Network): OVN/OVS overlay for OpenStack VMs and OVN-Kubernetes
The network design provides a unified, scalable foundation for both OpenStack VMs and Kubernetes pods using a single overlay network.
0.2 Why L3+? The Industry Trend
A major industry trend is the move toward point-to-point links with routing, away from the older shared-medium, broadcast-based transmission:
| Layer | Old Approach | Modern Approach |
|---|---|---|
| On-Chip | Shared data bus | Network-on-Chip (NoC) packet switching |
| Motherboard | PCI shared parallel bus | PCIe serial lanes with switching |
| Network | L2 broadcast Ethernet | L3 Point-to-Point & Routing |
Summary: No broadcast, no sharing of transmission medium. Only individual links & intelligent routing.
Why “L3+” and not just “L3”? Our design uses L3 (IP routing) but is L4-aware for ECMP load balancing (5-tuple hashing: src/dst IP + src/dst port + protocol). This L3/L4 combination provides the intelligence needed for modern datacenters.
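The 5-tuple hashing described above can be sketched in a few lines. This is a minimal illustration, not how any particular switch ASIC does it: the function name `ecmp_next_hop` and the next-hop labels are hypothetical, and SHA-256 stands in for the vendor-specific hardware hash.

```python
import hashlib
import struct
from ipaddress import ip_address


def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops):
    """Pick one of several equal-cost next hops from a flow's 5-tuple.

    SHA-256 is only a stand-in here; real switches use their own
    hardware hash functions.
    """
    # Serialize the 5-tuple: both addresses, then protocol and ports.
    key = (
        ip_address(src_ip).packed
        + ip_address(dst_ip).packed
        + struct.pack("!BHH", proto, src_port, dst_port)
    )
    digest = hashlib.sha256(key).digest()
    # Reduce the hash to an index into the list of equal-cost paths.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


if __name__ == "__main__":
    spines = ["spine1", "spine2", "spine3", "spine4"]  # hypothetical names
    # Same flow, same path: packets of one flow are never reordered.
    print(ecmp_next_hop("10.0.0.1", "10.0.1.1", 17, 50000, 6081, spines))
    # A different source port is a different flow and may take another path.
    print(ecmp_next_hop("10.0.0.1", "10.0.1.1", 17, 50001, 6081, spines))
```

Because the hash covers the L4 ports, two flows between the same pair of hosts can land on different paths, which is what L3-only hashing cannot provide.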
0.2.1 Key Drivers Towards L3+ Routing
As silicon gets denser, the network must evolve. L3 routing provides the scalability, stability, and bandwidth that modern chips, motherboards, and datacenters require. The intuition:
Density - high bandwidth in less space: Exponential miniaturization of silicon packs more virtualized entities into less space, each needing communication pathways. Broadcast worked when entities were few, but high density demands a road network with junctions (routers) between the miniaturized entities. Scale requires structure.
Routing compute is cheap: The same silicon density that creates the need also provides the solution. Routers and their routing intelligence (BGP in the control plane, ECMP in the forwarding path) are now commodity capabilities. We will start seeing mini-routers everywhere.
Inherent scalability & reliability: L3 is loosely coupled, with natural hierarchical segmentation. This enables redundant sections and paths, inherent extensibility, and A/B changes and upgrades, with enough redundancy to reduce the blast radius and avoid single points of failure.
For deeper technical details on Moore’s law trends, hardware offloading at 40+ Gbps, and why software-defined networking requires pure L3+ underlay, see L3 & Routing Trends.
0.3 Key Principles
- Pure L3 Underlay (L3/L4 for ECMP): BGP routing (L3) and ECMP load balancing (L3/L4 5-tuple hashing) - no EVPN/VXLAN at fabric layer
- Host-Based TEPs: GENEVE encapsulation at hypervisors, not at switches
- Dual ToRs per Rack: Redundant uplinks in unified L3 Clos fabric with excellent ECMP path diversity
- Mesh to Leaf-Spine Evolution: Start with mesh topology (5-6 racks), evolve to leaf-spine (7+ racks)
- Operational Simplicity: ~50 config lines per switch vs 300+ for EVPN
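The mesh to leaf-spine evolution in the list above can be motivated with simple link arithmetic. The sketch below is an illustration under stated assumptions, not the book's sizing method: it assumes every ToR is a node in a full mesh, uses the two ToRs per rack from the design, and picks a hypothetical spine count of 4.

```python
def full_mesh_links(switches: int) -> int:
    """Links needed to connect every switch directly to every other switch."""
    return switches * (switches - 1) // 2


def leaf_spine_links(leaves: int, spines: int) -> int:
    """Links needed when every leaf (ToR) connects once to every spine."""
    return leaves * spines


TORS_PER_RACK = 2  # from the design: dual ToRs per rack
SPINES = 4         # hypothetical spine count, not taken from the book

for racks in (4, 6, 8, 12):
    tors = racks * TORS_PER_RACK
    print(
        f"{racks:>2} racks: full mesh = {full_mesh_links(tors):>3} links "
        f"({tors - 1} fabric ports per ToR), "
        f"leaf-spine = {leaf_spine_links(tors, SPINES):>3} links "
        f"({SPINES} fabric ports per ToR)"
    )
```

Under these assumptions, mesh links and per-switch port usage grow with the square and the count of ToRs respectively, while leaf-spine grows linearly in links and keeps ToR port usage constant. That is one common reason to change topology as racks are added.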
0.4 Architecture Summary
The fabric layer provides simple, scalable L3 routing while the overlay layer (OVN/OVS) handles all virtualization complexity at the hosts.
┌───────────────────────────────────────────────────────────┐
│ GENEVE OVERLAY │
│ ------------------ │
│ (Hosts/Hypervisors - TEP Endpoints) │
│ • GENEVE encapsulation/decapsulation │
│ • Random UDP/L4 src port per-flow, enables underlay ECMP │
│ • OVN control plane (TEP registration, VM learning) │
│ • Pure L3 multipath (2 × 100G NICs per server) │
└───────────────────────────────────────────────────────────┘
│
│ IP/UDP packets
│
┌───────────────────────────────────────────────────────────┐
│ L3/L4 UNDERLAY │
│ ------------------ │
│ (ToR/Spine Switches - Pure L3 Routers) │
│ • BGP routing (L3 - route advertisement) │
│ • ECMP load balancing (L3/L4 - 5-tuple hashing) │
│ • Pure L3 forwarding (no overlay awareness) │
└───────────────────────────────────────────────────────────┘
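The link between the two layers in the diagram is the outer UDP source port. RFC 8926 says the source port should stay the same for all packets of one encapsulated flow and should be calculated from a hash of the encapsulated packet headers, so it looks random across flows but is stable within a flow. The sketch below illustrates that idea only: the function name `geneve_src_port` is hypothetical, CRC32 stands in for whatever hash OVS actually uses, and limiting the result to ports 49152-65535 is an arbitrary choice made here (RFC 8926 allows the full 16-bit range).

```python
import zlib

GENEVE_DST_PORT = 6081  # IANA-assigned Geneve UDP destination port

# Arbitrary choice for this sketch; RFC 8926 permits the full 16-bit range.
PORT_MIN, PORT_MAX = 49152, 65535


def geneve_src_port(inner_src_ip, inner_dst_ip, inner_proto,
                    inner_src_port, inner_dst_port):
    """Derive the outer UDP source port from the inner flow's 5-tuple.

    CRC32 is a stand-in hash, not the one a real TEP uses.
    """
    flow = "|".join(
        str(field)
        for field in (inner_src_ip, inner_dst_ip, inner_proto,
                      inner_src_port, inner_dst_port)
    )
    flow_hash = zlib.crc32(flow.encode("ascii"))
    # Map the hash into the chosen port range.
    return PORT_MIN + flow_hash % (PORT_MAX - PORT_MIN + 1)


if __name__ == "__main__":
    # Two VM flows between the same pair of hypervisors (example addresses).
    web = geneve_src_port("192.168.10.5", "192.168.20.7", 6, 43512, 443)
    db = geneve_src_port("192.168.10.5", "192.168.20.7", 6, 43513, 5432)
    print(f"flow 1 -> outer UDP {web} -> {GENEVE_DST_PORT}")
    print(f"flow 2 -> outer UDP {db} -> {GENEVE_DST_PORT}")
```

The underlay never inspects the inner packet. It sees only the outer 5-tuple, and the varying source port is what gives its ECMP hash something to spread across paths even when all tunnel traffic runs between the same two TEP addresses.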
0.6 Target Audience
This documentation is designed for:
0.7 Terminology
Throughout this book, we use standard networking terminology. Key terms are linked to the Glossary on first use in each section. Common terms include:
0.8 References and Standards
This book references several IETF RFCs and industry standards:
- RFC 8926: GENEVE (Generic Network Virtualization Encapsulation)
- RFC 3021: Using 31-Bit Prefixes on IPv4 Point-to-Point Links (/31 links)
- RFC 7432: BGP MPLS-Based Ethernet VPN (EVPN - not used in our design)
For complete references, see the References bibliography.