7 Network Design & IP Addressing
7.1 Overview
This chapter provides the concrete implementation details for IP addressing and network topology. For architectural principles, see Network Architecture Overview.
Note: For definitions of terms like TEP, ECMP, BGP, and others, see the Glossary.
7.2 Scale
- Start: 6 racks
- Up to: ~25 servers/rack
- Total: ≤150 servers
7.3 Hardware Specifications
7.3.1 Server NICs
- 2 × 100G NICs per server (NVIDIA ConnectX-6 DX)
- Hardware GENEVE offload enabled
- Total aggregate: 200G per server via pure L3 ECMP
7.3.2 Switch Hardware
- ToR Switches:
  - Option 1: 100G × 64 ports (Tomahawk-based)
  - Option 2: 200G × 32 ports (Tomahawk-based)
- Spine Switches:
  - 400G switches (Tomahawk-based)
- All switches: Pure L3 routers, no L2 switching
7.4 Hierarchical IP Addressing
The architecture uses hierarchical addressing where IP addresses encode device role, pod number, and rack location.
7.4.1 Loopback IPs (Device Identity)
7.4.1.1 Network Devices
Spines:
- Pattern: 10.254.{pod}.{spine}/32
- Examples:
  - NP1 Spine 1 = 10.254.1.1/32
  - NP2 Spine 1 = 10.254.2.1/32
ToRs:
- Pattern: 10.254.{pod-rack}.{tor}/32
- Examples:
  - NP1 Rack 1 ToR-A = 10.254.11.11/32
  - NP1 Rack 2 ToR-B = 10.254.12.12/32
Super-Spines (when deployed):
- Pattern: 10.254.100.{superspine}/32
- Examples:
  - SuperSpine-1 = 10.254.100.1/32
  - SuperSpine-2 = 10.254.100.2/32
7.4.1.2 Host Encapsulation Loopbacks (TEP IPs)
Hosts:
- Pattern: 10.255.{pod-rack}.{host}/32
- Examples:
  - NP1 Rack 1 Host 1 = 10.255.11.11/32
  - NP2 Rack 1 Host 2 = 10.255.21.12/32
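The loopback patterns above are straightforward to generate programmatically. The following Python sketch is illustrative only and encodes a few assumptions that match the examples rather than an explicit rule: {pod-rack} is taken as pod × 10 + rack, ToR-A/ToR-B map to 11/12, and Host N maps to octet 10 + N.

# Hypothetical helpers illustrating the hierarchical loopback pattern from 7.4.1.
# Assumptions (not stated explicitly in the plan): {pod-rack} = pod * 10 + rack,
# ToR-A/ToR-B map to 11/12, and Host N maps to 10 + N, matching the examples above.
import ipaddress

def spine_loopback(pod: int, spine: int) -> ipaddress.IPv4Interface:
    # 10.254.{pod}.{spine}/32
    return ipaddress.IPv4Interface(f"10.254.{pod}.{spine}/32")

def tor_loopback(pod: int, rack: int, tor: str) -> ipaddress.IPv4Interface:
    # 10.254.{pod-rack}.{tor}/32, with ToR-A -> 11 and ToR-B -> 12 (assumed)
    tor_id = {"A": 11, "B": 12}[tor]
    return ipaddress.IPv4Interface(f"10.254.{pod * 10 + rack}.{tor_id}/32")

def host_tep_loopback(pod: int, rack: int, host: int) -> ipaddress.IPv4Interface:
    # 10.255.{pod-rack}.{host}/32, with Host N -> 10 + N (assumed)
    return ipaddress.IPv4Interface(f"10.255.{pod * 10 + rack}.{10 + host}/32")

# Reproduces the examples above:
assert str(spine_loopback(1, 1)) == "10.254.1.1/32"          # NP1 Spine 1
assert str(tor_loopback(1, 2, "B")) == "10.254.12.12/32"     # NP1 Rack 2 ToR-B
assert str(host_tep_loopback(2, 1, 2)) == "10.255.21.12/32"  # NP2 Rack 1 Host 2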
7.4.2 Point-to-Point Link IPs
7.4.2.1 Host ↔︎ ToR Links
- Pool: 172.16.{pod-rack}.0/24 per rack
- Split: A/B halves
  - ToR-A side: 172.16.{pod-rack}.0/25
  - ToR-B side: 172.16.{pod-rack}.128/25
- Link Type: /31 (point-to-point, RFC 3021)
- Example: NP1 Rack 1 = 172.16.11.0/24
  - A side: 172.16.11.0/25 (up to 64 hosts)
  - B side: 172.16.11.128/25 (up to 64 hosts)
Example for NP1 Rack 1 Host 11:
- Host eth0 ↔︎ ToR-A: 172.16.11.0/31 (host: .0, ToR-A: .1)
- Host eth1 ↔︎ ToR-B: 172.16.11.128/31 (host: .128, ToR-B: .129)
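A minimal sketch of the /31 derivation, assuming hosts take consecutive /31s in each /25 half by a 0-based index (the Host 11 example above is treated as index 0 in its rack):

# Illustrative only: per-host /31 derivation from 7.4.2.1.
# Assumption: a 0-based host index selects consecutive /31s in each /25 half.
import ipaddress

def host_uplinks(rack_pool: str, host_index: int):
    """Return (eth0_net, eth1_net): eth0 /31 from the ToR-A /25, eth1 /31 from the ToR-B /25."""
    pool = ipaddress.IPv4Network(rack_pool)           # e.g. 172.16.11.0/24
    half_a, half_b = pool.subnets(new_prefix=25)      # A half .0/25, B half .128/25
    eth0 = list(half_a.subnets(new_prefix=31))[host_index]
    eth1 = list(half_b.subnets(new_prefix=31))[host_index]
    return eth0, eth1

eth0, eth1 = host_uplinks("172.16.11.0/24", 0)
# Lower address of each /31 on the host, higher address on the ToR (as in the example above)
print("eth0 <-> ToR-A:", eth0, "host:", eth0[0], "ToR-A:", eth0[1])
print("eth1 <-> ToR-B:", eth1, "host:", eth1[0], "ToR-B:", eth1[1])
# eth0 <-> ToR-A: 172.16.11.0/31 host: 172.16.11.0 ToR-A: 172.16.11.1
# eth1 <-> ToR-B: 172.16.11.128/31 host: 172.16.11.128 ToR-B: 172.16.11.129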
7.4.2.2 ToR ↔︎ Spine Links
- Pool: 172.20.{pod}.0/22 per Network Pod
- Link Type: /31 (point-to-point)
- Examples:
  - NP1 = 172.20.1.0/22
  - NP2 = 172.20.2.0/22
7.4.2.3 Spine ↔︎ Super-Spine Links (Future)
- Pool: 172.24.100.0/24
- Link Type: /31 (point-to-point)
- Used when: Super-spine layer is deployed
7.5 Concrete IP Plan (Current Phase)
7.5.1 A) Host Loopbacks (OVN GENEVE TEP IPs)
Reserve: 10.0.0.0/16 for host loopbacks (server identity)
Allocate per rack:
- Rack1: 10.0.1.0/24
- Rack2: 10.0.2.0/24
- Rack3: 10.0.3.0/24
- Rack4: 10.0.4.0/24
- Rack5: 10.0.5.0/24
- Rack6: 10.0.6.0/24
Each host gets one /32, e.g., 10.0.1.11/32.
Key: Loopback is independent of physical links. It’s advertised via BGP through both NICs, creating equal-cost paths automatically.
7.5.2 B) Host ↔︎ ToR Routed Links
Reserve: 172.16.0.0/16 for host uplinks
Per rack allocate one /24, split into A/B halves:
- Rack1: 172.16.1.0/24
  - ToR-A side: 172.16.1.0/25 (up to 64 /31 links = 64 hosts)
  - ToR-B side: 172.16.1.128/25
- Rack2: 172.16.2.0/24
  - ToR-A side: 172.16.2.0/25
  - ToR-B side: 172.16.2.128/25
- (Similar pattern for the other racks)
Each host uses two /31s (one to A, one to B).
Example for Rack1 Host 11:
- host eth0 ↔︎ ToR-A: 172.16.1.0/31 (host: .0, ToR-A: .1)
- host eth1 ↔︎ ToR-B: 172.16.1.128/31 (host: .128, ToR-B: .129)
Key: Each NIC has its own IP on a different network. No bonding - pure L3 routing.
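Putting A) and B) together, the sketch below derives one host's complete underlay addressing for the current phase: the TEP loopback, both /31 uplinks, and the ToR-side peer addresses the host would use as eBGP neighbors. It is illustrative only; it assumes a 0-based host index per rack with the loopback host octet equal to index + 11, and the authoritative values belong in the configuration scripts.

# Illustrative only: one host's underlay addressing for the current-phase plan (A + B).
# Assumes a 0-based host index per rack and loopback host octet = index + 11
# (so the first host in Rack1 is 10.0.1.11).
import ipaddress

def host_plan(rack: int, host_index: int) -> dict:
    loopback = ipaddress.IPv4Interface(f"10.0.{rack}.{host_index + 11}/32")  # OVN GENEVE TEP
    uplink_pool = ipaddress.IPv4Network(f"172.16.{rack}.0/24")
    half_a, half_b = uplink_pool.subnets(new_prefix=25)          # ToR-A half / ToR-B half
    eth0_net = list(half_a.subnets(new_prefix=31))[host_index]   # host <-> ToR-A /31
    eth1_net = list(half_b.subnets(new_prefix=31))[host_index]   # host <-> ToR-B /31
    return {
        "loopback": str(loopback),
        "eth0": f"{eth0_net[0]}/31",        # host side of the A link
        "eth0_bgp_peer": str(eth0_net[1]),  # ToR-A side = eBGP neighbor
        "eth1": f"{eth1_net[0]}/31",        # host side of the B link
        "eth1_bgp_peer": str(eth1_net[1]),  # ToR-B side = eBGP neighbor
    }

print(host_plan(rack=1, host_index=0))
# {'loopback': '10.0.1.11/32', 'eth0': '172.16.1.0/31', 'eth0_bgp_peer': '172.16.1.1',
#  'eth1': '172.16.1.128/31', 'eth1_bgp_peer': '172.16.1.129'}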
7.5.3 C) ToR ↔︎ Spine Links
Reserve: 172.20.0.0/16 for fabric p2p
Every ToR has L3 /31 links to each spine.
Example: ToR-A Rack1 to Spine1: 172.20.1.0/31
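A sketch of how the per-pod /22 could be walked to hand out one /31 per ToR-spine pair; the (ToR, spine) ordering here is an assumed convention for illustration, not the official allocation.

# Sketch: enumerate /31 fabric links from the per-pod 172.20.{pod}.0/22 pool.
# The (tor, spine) ordering below is an assumed convention, not the official one.
import ipaddress
from itertools import product

def fabric_links(pod: int, tors: list[str], spines: list[str]) -> dict:
    pool = ipaddress.IPv4Network(f"172.20.{pod}.0/22")   # 512 /31s available per pod
    p2p = pool.subnets(new_prefix=31)
    # One /31 per (ToR, spine) pair; lower address on the ToR, higher on the spine.
    return {pair: next(p2p) for pair in product(tors, spines)}

links = fabric_links(1, ["ToR-A1", "ToR-B1", "ToR-A2"], ["Spine1", "Spine2"])
print(links[("ToR-A1", "Spine1")])   # 172.20.1.0/31 (matches the example above)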
7.5.4 D) Switch Loopbacks
Reserve:
- 10.255.0.0/16 for spine loopbacks
- 10.254.0.0/16 for ToR loopbacks
Examples:
- Spines: 10.255.0.1/32, 10.255.0.2/32, …
- ToRs: per rack, e.g., 10.254.1.1/32 (ToR-A), 10.254.1.2/32 (ToR-B)
7.6 Network Topology
7.6.1 Complete Network Diagram
The diagram in Section 7.6.2 shows the complete leaf-spine topology with three racks, two spines, and the Independent A/B Fabrics architecture.
Key Features:
- Spine Layer: Two spine switches (Spine 1 and Spine 2) provide redundant paths
- ToR Switches: Each rack has two ToRs (ToR-A and ToR-B) in separate fabrics
- Server Connections: Each server connects to both ToRs via separate NICs (eth0 to ToR-A, eth1 to ToR-B)
- Point-to-Point Links: All links use /31 addressing (RFC 3021)
- BGP Routing: All devices peer via eBGP and advertise loopback IPs
- ECMP: Traffic automatically distributes across multiple equal-cost paths
7.6.2 Unified Fabric Topology
          [Spine-1]                     [Spine-2]
         /  |  |   \                   /  |  |   \
       /    |  |     \               /    |  |     \
     /      |  |       \           /      |  |       \
[ToR-A1]  [ToR-B1]  [ToR-A2]  [ToR-B2]  [ToR-A3]  [ToR-B3]
   |         |         |         |         |         |
   |         |         |         |         |         |
     [Host-1]            [Host-2]            [Host-3]
  eth0      eth1      eth0      eth1      eth0      eth1
All ToRs connect to all Spines - single unified L3 fabric
Each host connects to both ToRs in its rack via separate NICs
ECMP distributes traffic across all available paths
Key Features:
- Single routing domain: All ToRs and spines in the same BGP AS or connected via eBGP
- Full connectivity: Every ToR connects to every spine
- Maximum ECMP: 8+ possible paths between any two hosts (2 NICs × 2 ToRs × 2 Spines)
- No MLAG: ToR-A and ToR-B are independent, no peer-link
7.6.3 Host Multi-NIC Configuration
Each host has two separate routed interfaces connecting to dual ToRs in its rack:
- eth0 → ToR-A (first ToR in rack) at 100G
  - Own IP: 172.16.x.x/31 (point-to-point)
  - Advertises loopback via eBGP
- eth1 → ToR-B (second ToR in rack) at 100G
  - Own IP: 172.16.x.x/31 (point-to-point)
  - Advertises same loopback via eBGP
- Loopback (10.0.x.y/32) = Server identity / OVN TEP
  - Advertised via BOTH NICs with equal BGP attributes
  - Creates equal-cost paths → ECMP across all ToRs and spines
- Result: 200G aggregate bandwidth with 8+ ECMP paths in unified fabric
Path diversity: Traffic from H1 to H2 can use any combination of:
- 2 source NICs (eth0 or eth1, each tied to one local ToR) × 2+ spines × 2 destination ToRs = 8+ paths
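As a sanity check on that count, the sketch below enumerates the distinct forwarding paths between two hosts in different racks with two spines. It reflects the constraint that the source NIC choice fixes the local ToR (eth0 only reaches ToR-A, eth1 only reaches ToR-B).

# Illustration of the ECMP path count: the source NIC pins the local ToR,
# so the combinations are (src NIC/ToR) x spines x (dst ToR/NIC).
from itertools import product

src_nics = [("eth0", "ToR-A1"), ("eth1", "ToR-B1")]   # NIC choice fixes the local ToR
spines = ["Spine-1", "Spine-2"]
dst_tors = [("ToR-A2", "eth0"), ("ToR-B2", "eth1")]   # dst ToR fixes the dst NIC

paths = [
    (nic, tor, spine, dtor, dnic)
    for (nic, tor), spine, (dtor, dnic) in product(src_nics, spines, dst_tors)
]
print(len(paths))   # 8 with two spines; grows linearly as spines are added
for p in paths:
    print(" -> ".join(p))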
For configuration scripts, see Configuration Examples.
7.7 Important Note on Summarization
You can summarize later (e.g., advertise 10.0.1.0/24 per rack instead of all /32s), but only if you keep correctness.
If you summarize /24 from both ToR-A and ToR-B, and a host loses its link to ToR-A, ToR-A may no longer know that host’s /32 — but spines might still send traffic for that host to ToR-A because of the /24 summary → potential blackhole unless:
- you have a ToR-A ↔︎ ToR-B L3 interconnect to forward internally, or
- you avoid summarizing and keep /32s in the core (recommended for now)
Given your size, don’t summarize yet. Keep /32s end-to-end. Revisit summarization when you’re at “many racks / many thousands of hosts” and after confirming FIB scale on the exact ToR/spine models.
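The failure mode is easy to see with a toy longest-prefix-match lookup. The example below is illustrative only (the FIB contents are invented, not taken from real devices): a spine that carries only the /24 summary still resolves the host's address to both ToRs, while ToR-A, having lost the host's /32, has nothing to match and drops the traffic.

# Toy longest-prefix-match to illustrate the summarization blackhole.
# Routes and next-hops are invented for illustration; they are not real FIBs.
import ipaddress

def lpm(fib: dict, dst: str):
    dst_ip = ipaddress.IPv4Address(dst)
    matches = [ipaddress.IPv4Network(p) for p in fib if dst_ip in ipaddress.IPv4Network(p)]
    return fib[str(max(matches, key=lambda n: n.prefixlen))] if matches else None

# Spine only carries the per-rack summary, ECMP'd to both ToRs:
spine_fib = {"10.0.1.0/24": ["ToR-A1", "ToR-B1"]}
# ToR-A1 after the host's A-link fails: that host's /32 is gone, no covering route:
tor_a_fib = {"10.0.1.12/32": ["eth-host12"]}   # other hosts still present

dst = "10.0.1.11"                      # the host that lost its A-link
print(lpm(spine_fib, dst))             # ['ToR-A1', 'ToR-B1'] -> spine may still pick ToR-A1
print(lpm(tor_a_fib, dst))             # None -> ToR-A1 drops the traffic (blackhole)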
7.8 Egress Racks (Border/F5)
If you have 2 racks with dual F5 load balancers:
Treat them as “border racks”:
- Border ToRs connect to F5s and upstreams
- F5 ownership model (VIP1 active on A, VIP2 active on B) works well if the owning F5 advertises the VIP /32 into the fabric, so return traffic stays symmetric
7.9 References
- Network Architecture Overview - Architecture principles
- BGP & Routing Configuration - BGP configuration details
- OpenStack Architecture Guide - L3 underlay design principles
- Canonical OpenStack Design Considerations - Canonical’s OpenStack network design
- RFC 3021 - /31 Point-to-Point Links - Point-to-point link addressing