12  Operations & Maintenance

12.1 Overview

This chapter provides operational procedures, troubleshooting guides, and maintenance tasks for the OpenStack DC Network. For architecture details, see Network Architecture Overview.

Note: For definitions of terms used in this chapter, see the Glossary.

12.2 Day-to-Day Operations

12.2.1 Monitoring BGP Sessions

Check BGP session status:

# Show BGP summary (all neighbors)
vtysh -c "show ip bgp summary"

# Expected output: all neighbors Established
# State/PfxRcd shows the number of prefixes received when a session is
# Established, and the state name (Idle/Active/Connect) when it is not

Check specific neighbor:

# Detailed neighbor information
vtysh -c "show ip bgp neighbors 172.16.1.1"

# Check routes received from neighbor
vtysh -c "show ip bgp neighbors 172.16.1.1 routes"

# Check routes advertised to neighbor
vtysh -c "show ip bgp neighbors 172.16.1.1 advertised-routes"

Healthy BGP session indicators:

  • State: Established
  • Uptime: stable (not flapping)
  • PfxRcd: expected number of routes received
  • PfxSnt: advertised routes
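
These checks can be scripted. The sketch below is one way to flag neighbors that are not Established; it assumes a recent FRR that supports JSON output (show ip bgp summary json) and that jq is installed, and the exact JSON field names may differ between FRR versions:

#!/bin/bash
# Flag BGP neighbors that are not in the Established state (sketch).
# Assumes FRR JSON output and jq; field names may vary by FRR version.

vtysh -c "show ip bgp summary json" | jq -r '
  (.ipv4Unicast.peers // {})
  | to_entries[]
  | select(.value.state != "Established")
  | "\(.key) state=\(.value.state)"' \
  | grep . || echo "All neighbors Established"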

12.2.2 Checking OVN Health

OVN Controller status:

# Check OVN controller service
systemctl status ovn-controller

# Check OVN controller logs
journalctl -u ovn-controller -n 100

OVN configuration:

# Verify OVN configuration
ovs-vsctl get open . external-ids

# Expected: ovn-encap-ip=<host-loopback>, ovn-encap-type=geneve

OVN topology:

# Show OVN logical topology
ovn-nbctl show

# Show OVN southbound database
ovn-sbctl show

# List chassis (hosts)
ovn-sbctl list chassis

12.2.3 Verifying ECMP Paths

Check routing table for ECMP:

# View routing table
ip route show

# Look for ECMP routes (multiple nexthop)
ip route show | grep "nexthop"

# Example ECMP route:
# 10.0.2.22 proto bgp metric 20
#   nexthop via 172.16.1.1 dev eth0 weight 1
#   nexthop via 172.16.1.3 dev eth1 weight 1

Check specific destination:

# Route to specific host
ip route get 10.0.2.22

# Shows which path will be used based on 5-tuple hash

12.2.4 Network Connectivity Tests

Test host-to-host connectivity:

# Ping remote host TEP
ping -I 10.0.1.11 10.0.2.22

# Traceroute to see path
traceroute -s 10.0.1.11 10.0.2.22

# Test GENEVE tunnel
# (Traffic should go through automatically via OVN/OVS)

Verify MTU:

# Check interface MTU
ip link show eth0
ip link show eth1

# Underlay MTU should be 9000 (jumbo frames)
# GENEVE adds ~38-50 bytes of overhead, so the underlay MTU must exceed the
# overlay MTU by at least ~50 bytes (e.g., ≥ 1550 underlay for a 1500 overlay)
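
As a quick sanity check, the sketch below reports the configured MTU on both underlay NICs and sends a maximum-size, don't-fragment ping; the remote TEP address 10.0.2.22 is an example from this chapter:

#!/bin/bash
# MTU sanity check (sketch): report NIC MTUs and test the path with DF set.
REMOTE_TEP=10.0.2.22   # example remote TEP

for ifc in eth0 eth1; do
    echo "$ifc mtu=$(cat /sys/class/net/$ifc/mtu)"
done

# 8972 = 9000 - 20 (IPv4 header) - 8 (ICMP header); must succeed without fragmenting
ping -M do -c 3 -s 8972 "$REMOTE_TEP"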

12.3 Troubleshooting Procedures

12.3.1 BGP Not Establishing

Symptoms: BGP neighbor stuck in “Active”, “Connect”, or “Idle” state

Diagnosis:

  1. Check basic connectivity:

    # Ping neighbor
    ping 172.16.1.1
    
    # Check if neighbor is reachable
    ip route get 172.16.1.1
  2. Verify configuration:

    # Check FRR configuration
    vtysh -c "show run"
    
    # Verify AS numbers
    vtysh -c "show ip bgp summary"
    
    # Check if ebgp-multihop is configured
    vtysh -c "show run | include ebgp-multihop"
  3. Check firewall rules:

    # BGP uses TCP port 179
    sudo iptables -L -n | grep 179
    
    # Allow BGP if needed
    sudo iptables -A INPUT -p tcp --dport 179 -j ACCEPT
  4. Check FRR logs:

    # View FRR logs
    tail -f /var/log/frr/frr.log
    
    # Or journalctl
    journalctl -u frr -n 100

Common fixes:

  • Verify ebgp-multihop is configured
  • Check that AS numbers match the configuration
  • Ensure IP forwarding is enabled: sysctl net.ipv4.ip_forward
  • Verify no firewall is blocking TCP port 179
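
The diagnosis steps above can be rolled into a quick triage script. The sketch below uses the example neighbor address 172.16.1.1 from this chapter; substitute your own peer:

#!/bin/bash
# BGP triage sketch for a single neighbor.
NEIGHBOR=${1:-172.16.1.1}   # example neighbor from this chapter

echo "--- Reachability ---"
ping -c 3 "$NEIGHBOR"
ip route get "$NEIGHBOR"

echo "--- Session state ---"
vtysh -c "show ip bgp neighbors $NEIGHBOR" | grep -E "BGP state|Last reset"

echo "--- Local prerequisites ---"
sysctl net.ipv4.ip_forward
sudo iptables -L INPUT -n | grep -w 179 || echo "No explicit rule for TCP/179"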

12.3.2 OVN Tunnels Not Working

Symptoms: VMs can’t communicate across hosts

Diagnosis:

  1. Verify TEP reachability:

    # Ping remote TEP
    ping 10.0.2.22
    
    # Should succeed - if not, BGP/routing issue
  2. Check OVN encapsulation config:

    # Verify TEP IP
    ovs-vsctl get open . external-ids:ovn-encap-ip
    
    # Verify encapsulation type
    ovs-vsctl get open . external-ids:ovn-encap-type
    
    # Should be: geneve
  3. Verify OVN central connectivity:

    # Check if ovn-controller can reach OVN databases
    ovs-vsctl get open . external-ids:ovn-remote
    
    # Check ovn-controller logs
    journalctl -u ovn-controller -n 100
  4. Check MTU:

    # GENEVE adds overhead
    # Underlay MTU should be ≥ overlay MTU + 50 bytes
    ip link show eth0 | grep mtu
    
    # Should be 9000 (or at least 1550)
  5. Verify GENEVE tunnels:

    # Show OVS tunnels
    ovs-vsctl show | grep genev
    
    # Show tunnel ports
    ovs-ofctl show br-int | grep genev

Common fixes:

  • Ensure the TEP IP is reachable via BGP
  • Set MTU to 9000 on the physical interfaces
  • Verify OVN central connectivity
  • Check that the ovn-controller service is running
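
A compact pass over these checks might look like the sketch below; the remote TEP address is an example, and the ovs-vsctl find syntax assumes a reasonably recent Open vSwitch:

#!/bin/bash
# OVN/GENEVE sanity sketch: encap settings, central connectivity, tunnel ports.
REMOTE_TEP=10.0.2.22   # example remote TEP

echo "encap-ip:   $(ovs-vsctl get open . external-ids:ovn-encap-ip)"
echo "encap-type: $(ovs-vsctl get open . external-ids:ovn-encap-type)"
echo "remote:     $(ovs-vsctl get open . external-ids:ovn-remote)"

echo "--- TEP reachability ---"
ping -c 3 "$REMOTE_TEP"

echo "--- GENEVE tunnel interfaces ---"
ovs-vsctl --columns=name,options find Interface type=geneve

echo "--- ovn-controller ---"
systemctl is-active ovn-controller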

12.3.3 Routes Not Propagating

Symptoms: Host /32 not visible in BGP or routing table

Diagnosis:

  1. Check BGP advertisement:

    # Check what's being advertised
    vtysh -c "show ip bgp neighbors 172.16.1.1 advertised-routes"
    
    # Should see host loopback /32
  2. Verify route-map and prefix-list:

    # Show route-map
    vtysh -c "show route-map"
    
    # Show prefix-list
    vtysh -c "show ip prefix-list"
    
    # Verify loopback is in prefix-list
  3. Check network statement:

    # Verify network is configured
    vtysh -c "show run | include network"
    
    # Should see: network 10.0.1.11/32
  4. Check next-hop reachability:

    # Next-hop must be reachable
    ip route get 172.16.1.1

Common fixes:

  • Add a network <loopback>/32 statement
  • Verify the route-map permits the prefix
  • Check that the prefix-list includes the loopback
  • Ensure the next-hop is reachable
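
If the loopback is missing from the advertisement, a fix along the lines of the sketch below adds the network statement and permits the prefix. The names and values (prefix-list LOOPBACKS, route-map EXPORT, loopback 10.0.1.11/32, AS 66111) are illustrative, and it assumes the route-map is already applied outbound to the neighbor:

# Sketch only: prefix-list and route-map names are illustrative examples
vtysh \
  -c "conf t" \
  -c "ip prefix-list LOOPBACKS seq 10 permit 10.0.1.11/32" \
  -c "route-map EXPORT permit 10" \
  -c "match ip address prefix-list LOOPBACKS" \
  -c "exit" \
  -c "router bgp 66111" \
  -c "address-family ipv4 unicast" \
  -c "network 10.0.1.11/32" \
  -c "exit-address-family" \
  -c "end" \
  -c "write memory"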

12.3.4 Performance Issues

Symptoms: Low bandwidth, high latency, packet loss

Diagnosis:

  1. Check ECMP distribution:

    # Verify ECMP is active
    ip route show | grep "nexthop"
    
    # Check if both paths are being used
    # (Use traffic monitoring tools)
  2. Monitor NIC utilization:

    # Check interface stats
    ip -s link show eth0
    ip -s link show eth1
    
    # Look for errors, drops
  3. Verify GENEVE offload:

    # Check OVS hardware offload
    ovs-vsctl get Open_vSwitch . other_config:hw-offload
    
    # Should be "true" for ConnectX-6 DX
  4. Check for congestion:

    # Monitor queue depth
    tc -s qdisc show dev eth0
    
    # Check for packet drops
    netstat -s | grep -i drop

Common fixes:

  • Enable hardware offload: ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
  • Verify ECMP maximum-paths is configured
  • Check for single-path failures (BGP session down)
  • Monitor for capacity issues
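
To confirm that ECMP is actually spreading flows, one approach is to run several parallel iperf3 streams (each stream uses a different source port and therefore a different 5-tuple hash) and watch the per-NIC transmit counters; the remote TEP address is an example:

# Generate 8 parallel TCP streams to an example remote TEP
iperf3 -c 10.0.2.22 -P 8 -t 30

# In another terminal: roughly balanced TX growth on eth0 and eth1
# indicates flows are hashing across both fabrics
watch -n1 'ip -s link show eth0 | grep -A1 "TX:"; ip -s link show eth1 | grep -A1 "TX:"'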

12.3.5 MTU Problems

Symptoms: Large packets fail, connectivity works for small packets

Diagnosis:

  1. Check MTU end-to-end:

    # Physical interfaces
    ip link show eth0 | grep mtu
    ip link show eth1 | grep mtu
    
    # Should be 9000
    
    # Test with ping
    ping -M do -s 8972 10.0.2.22
    
    # Should succeed (9000 - 28 bytes for IP/ICMP headers)
  2. Verify GENEVE MTU:

    # GENEVE overhead is ~38-50 bytes
    # If overlay MTU is 1500, underlay needs ≥ 1550
    # With a 9000-byte underlay (jumbo frames), keep the overlay MTU ≤ ~8950

Common fixes:

  • Set underlay MTU to 9000: ip link set eth0 mtu 9000
  • Verify all switches support jumbo frames
  • Check the end-to-end MTU path
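
tracepath (from iputils) is a convenient way to check the end-to-end MTU, since it reports the discovered path MTU hop by hop; the destination below is an example remote TEP:

# Discover the path MTU toward a remote TEP (example address)
tracepath -n 10.0.2.22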

12.4 Maintenance Procedures

12.4.1 Upgrading a Single Fabric (Rolling Upgrade)

Procedure to upgrade Fabric-A (zero downtime):

  1. Verify both fabrics healthy:

    # Check BGP on all hosts
    vtysh -c "show ip bgp summary"
    
    # Ensure both Network-A and Network-B paths exist
  2. Upgrade Spine-A switches one at a time:

    # For each Spine-A switch:
    
    # 1. Verify ECMP has alternate spines
    # 2. Upgrade switch OS/config
    # 3. Reboot switch
    # 4. Verify BGP sessions re-establish
    # 5. Wait for routes to stabilize
    # 6. Move to next spine
  3. Upgrade ToR-A switches one rack at a time:

    # For each ToR-A switch:
    
    # 1. Verify hosts have Network-B path
    # 2. Upgrade ToR-A
    # 3. Reboot ToR-A
    # 4. Verify host BGP sessions re-establish
    # 5. Verify host /32s are advertised
    # 6. Move to next rack
  4. Verify traffic distribution:

    # After Fabric-A upgrade:
    # Traffic should resume using both fabrics
    
    ip route show | grep "nexthop"
  5. Repeat for Fabric-B:

    • Same procedure for Fabric-B switches
    • Fabric-A now carries 100% load during upgrade

Key: Each fabric can be upgraded independently with zero impact on the other.
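
A lightweight way to confirm that each upgrade step converged is to snapshot the kernel routing table before taking a switch down and compare after it rejoins; the sketch below is one possible form:

#!/bin/bash
# Route-count snapshot/compare (sketch). Run once before a switch upgrade
# to record a baseline, and again afterwards to compare.
SNAPSHOT=/tmp/routes.before

if [ ! -f "$SNAPSHOT" ]; then
    ip route show > "$SNAPSHOT"
    echo "Baseline saved: $(wc -l < "$SNAPSHOT") routes"
else
    before=$(wc -l < "$SNAPSHOT")
    now=$(ip route show | wc -l)
    echo "Routes before: $before, now: $now"
    [ "$now" -ge "$before" ] || echo "WARNING: route count has dropped"
    rm -f "$SNAPSHOT"
fi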

12.4.2 Adding New Racks

Procedure:

  1. Physical installation:

    • Install ToR-A and ToR-B switches
    • Cable hosts to both ToRs
    • Cable ToRs to spines (or other ToRs in mesh)
  2. Allocate IP addresses:

    # Determine next rack number (e.g., Rack 7)
    # Allocate IP ranges:
    # - Host loopbacks: 10.0.7.0/24
    # - Host↔ToR links: 172.16.7.0/24
    # - ToR loopbacks: 10.254.7.1/32 (ToR-A), 10.254.7.2/32 (ToR-B)
    # (A helper that derives these ranges from the rack number is sketched
    #  after this procedure)
  3. Configure ToR switches:

    • Configure BGP peers (to hosts and spines)
    • Set loopback IPs
    • Configure point-to-point links
  4. Configure hosts:

    • Run network setup scripts
    • Configure FRR/BGP
    • Configure OVN
  5. Verify BGP:

    # On new hosts
    vtysh -c "show ip bgp summary"
    
    # On ToRs
    # Verify host /32s are learned and advertised
  6. Verify connectivity:

    # From new host, ping existing host
    ping -I 10.0.7.11 10.0.1.11
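
As referenced in step 2, the per-rack addressing can be derived mechanically from the rack number. The sketch below assumes the numbering convention used in this chapter's examples:

#!/bin/bash
# Derive per-rack address ranges from the rack number (sketch; the convention
# below is inferred from the examples in this chapter).
RACK=${1:-7}

echo "Host loopbacks   : 10.0.${RACK}.0/24"
echo "Host-ToR links   : 172.16.${RACK}.0/24"
echo "ToR-A loopback   : 10.254.${RACK}.1/32"
echo "ToR-B loopback   : 10.254.${RACK}.2/32"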

12.4.3 Replacing ToR Switches

Procedure to replace ToR-A in Rack 1:

  1. Pre-check:

    # Verify all hosts have Network-B path
    # On each host in Rack 1:
    vtysh -c "show ip route 10.0.0.0/8" | grep eth1
    
    # Traffic will use Network-B during replacement
  2. Shut down ToR-A gracefully:

    # On ToR-A:
    # Shut down BGP to drain traffic gracefully
    vtysh -c "conf t" -c "router bgp 65101" -c "shutdown"
    
    # Wait for BGP to withdraw routes
    # Monitor: Traffic shifts to Network-B
  3. Physical replacement:

    • Power down old ToR-A
    • Install new ToR-A
    • Verify cabling
  4. Configure new ToR-A:

    • Apply configuration (same as old ToR-A)
    • Set loopback IP: 10.254.1.1/32
    • Configure BGP peers
  5. Bring up BGP:

    # On new ToR-A:
    # BGP should auto-establish with hosts and spines
    vtysh -c "show ip bgp summary"
  6. Verify:

    # On hosts in Rack 1:
    # Verify ECMP routes re-appear
    ip route show | grep "nexthop"
    
    # Should see both eth0 and eth1 paths
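
While the new ToR-A re-converges, a simple watch loop on an affected host shows when both fabric paths are back:

# Refresh every 2 seconds until routes via both eth0 and eth1 reappear
watch -n2 'ip route show | grep "nexthop"'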

12.4.4 Replacing Spine Switches

Procedure to replace Spine-1 (in Fabric-A):

  1. Pre-check:

    # Verify alternate spines exist in Fabric-A
    # Verify Fabric-B is healthy (will carry more load)
  2. Graceful shutdown:

    # On Spine-1:
    vtysh -c "conf t" -c "router bgp 65010" -c "shutdown"
    
    # BGP withdraws routes, ECMP redistributes to other spines
  3. Physical replacement:

    • Power down old Spine-1
    • Install new Spine-1
    • Verify cabling to all ToRs
  4. Configure new Spine-1:

    • Apply configuration
    • Set loopback IP: 10.255.0.1/32
    • Configure BGP peers to all ToR-A switches
  5. Verify:

    # On Spine-1:
    vtysh -c "show ip bgp summary"
    
    # Should see all ToR-A switches
    # Should learn all host /32s

12.4.5 Decommissioning Hosts

Procedure:

  1. Drain workloads:

    • Migrate VMs to other hosts (OpenStack live migration)
    • Drain Kubernetes pods (kubectl drain)
  2. Shut down BGP:

    # On host being decommissioned:
    vtysh -c "conf t" -c "router bgp 66111" -c "shutdown"
    
    # Loopback /32 withdrawn from fabric
  3. Verify routes withdrawn:

    # On ToRs:
    vtysh -c "show ip bgp" | grep 10.0.1.11
    
    # Should not appear
  4. Power down:

    • Stop OVN controller
    • Power down host
    • Remove from inventory

12.5 Failure Scenarios & Recovery

12.5.1 Single ToR Failure

Scenario: ToR-A in Rack 1 fails completely

What happens:

  1. BFD detects the link failure within 100-300 ms
  2. BGP sessions to ToR-A drop
  3. BGP withdraws the Network-A paths
  4. ECMP automatically uses only Network-B paths
  5. All traffic routes via eth1 → ToR-B

Impact:

  • Hosts in Rack 1 lose their Network-A path
  • Network-B carries 100% of the load
  • No packet loss (if BGP/BFD are configured correctly)

Recovery:

  1. Immediate: traffic automatically shifts to ToR-B
  2. Replace ToR-A: follow the “Replacing ToR Switches” procedure
  3. Verify: ECMP paths are restored after replacement

Monitoring:

# On affected hosts:
ip route show | grep "nexthop"

# Should see only eth1 path during failure
# Should see both paths after recovery

12.5.2 Single Spine Failure

Scenario: Spine-1 (Fabric-A) fails

What happens:

  1. BGP sessions drop between Spine-1 and all ToR-A switches
  2. ToRs withdraw routes via Spine-1
  3. ECMP redistributes across the remaining Spine-A switches
  4. Fabric-B continues normally

Impact:

  • ECMP fanout in Fabric-A is reduced
  • Remaining Spine-A switches handle more load
  • Fabric-B is unaffected

Recovery:

  • Replace the spine following the “Replacing Spine Switches” procedure
  • ECMP automatically redistributes after replacement

12.5.3 Complete Fabric-A Failure

Scenario: All of Fabric-A fails (ToR-A and Spine-A switches)

What happens:

  1. All Network-A BGP sessions drop
  2. All Network-A routes are withdrawn
  3. ECMP uses only Network-B paths
  4. 100% of traffic flows over Fabric-B

Impact:

  • All hosts lose Network-A connectivity
  • Fabric-B must handle 100% of the load
  • No packet loss if Fabric-B is sized for 100% capacity

Recovery:

  • Diagnose the root cause (power, configuration, hardware)
  • Restore the Fabric-A switches
  • BGP sessions re-establish automatically
  • ECMP restores both paths

Critical: This is why each fabric must be sized for 100% load.

12.5.4 Host NIC Failure

Scenario: eth0 fails on a host

What happens:

  1. BFD detects the link failure
  2. The BGP session to ToR-A drops
  3. BGP withdraws the Network-A path for this host
  4. Traffic to this host uses only the Network-B path

Impact:

  • The host loses Network-A connectivity
  • All traffic flows via eth1
  • Bandwidth is reduced to 100G (from 200G)

Recovery:

# Diagnose NIC issue
ip link show eth0

# Check for hardware errors
ethtool -S eth0 | grep -i error

# Replace NIC if needed
# BGP re-establishes automatically after replacement

12.5.5 Power Domain Failure

Scenario: PDU-A fails, affecting all Fabric-A switches

What happens:

  • Same as “Complete Fabric-A Failure”
  • All Fabric-A switches lose power
  • Fabric-B handles 100% of the load

Recovery:

  • Restore power to PDU-A
  • Switches boot up
  • BGP sessions re-establish
  • ECMP paths are restored

Prevention: This is why power domain separation is critical (see Network Architecture Overview).

12.6 Command Reference

12.6.1 FRR/BGP Commands

Status and information:

# BGP summary
vtysh -c "show ip bgp summary"

# All BGP routes
vtysh -c "show ip bgp"

# Specific route
vtysh -c "show ip bgp 10.0.1.11/32"

# Neighbor details
vtysh -c "show ip bgp neighbors 172.16.1.1"

# Advertised routes
vtysh -c "show ip bgp neighbors 172.16.1.1 advertised-routes"

# Received routes
vtysh -c "show ip bgp neighbors 172.16.1.1 routes"

# Routing table
vtysh -c "show ip route"

# BFD peers
vtysh -c "show bfd peers"

Configuration:

# Enter FRR shell
vtysh

# Enter config mode
conf t

# Show running config
show run

# Save config
write memory

12.6.2 OVN/OVS Commands

OVS status:

# Show OVS configuration
ovs-vsctl show

# Show bridges
ovs-vsctl list-br

# Show ports on bridge
ovs-vsctl list-ports br-int

# Show interfaces
ovs-vsctl list interface

# Show OVS configuration
ovs-vsctl get open . external-ids

OVN status:

# Show OVN logical topology (northbound)
ovn-nbctl show

# Show OVN southbound database
ovn-sbctl show

# List chassis
ovn-sbctl list chassis

# Show chassis details
ovn-sbctl show <chassis-name>

# Check OVN controller
systemctl status ovn-controller
journalctl -u ovn-controller -n 100

OVN configuration:

# Set TEP IP
ovs-vsctl set open . external-ids:ovn-encap-ip=10.0.1.11

# Set encapsulation type
ovs-vsctl set open . external-ids:ovn-encap-type=geneve

# Set OVN remote (central database)
ovs-vsctl set open . external-ids:ovn-remote=tcp:10.254.0.100:6642

12.6.3 Network Verification Commands

Interface status:

# Show all interfaces
ip addr show

# Show specific interface
ip link show eth0

# Show interface statistics
ip -s link show eth0

# Check for errors
ethtool -S eth0 | grep -i error

Routing:

# Show routing table
ip route show

# Show specific route
ip route get 10.0.2.22

# Show ECMP routes
ip route show | grep "nexthop"

# Show routes for specific prefix
ip route show 10.0.0.0/8

Connectivity:

# Ping with specific source
ping -I 10.0.1.11 10.0.2.22

# Traceroute
traceroute -s 10.0.1.11 10.0.2.22

# TCP connectivity test
nc -v -z 10.0.2.22 22

# UDP connectivity (for GENEVE port 6081)
nc -v -u -z 10.0.2.22 6081

Performance:

# Network throughput test
iperf3 -c 10.0.2.22 -P 10

# Monitor traffic
iftop -i eth0

# Check packet counters
watch -n1 'ip -s link show eth0 | grep -A1 "RX:"'

12.6.4 Debugging Tools

Packet capture:

# Capture on physical interface
tcpdump -i eth0 -n

# Capture GENEVE packets
tcpdump -i eth0 -n 'udp port 6081'

# Capture BGP packets
tcpdump -i eth0 -n 'tcp port 179'

# Capture and write to file
tcpdump -i eth0 -w capture.pcap

# Capture with specific filters
tcpdump -i eth0 -n 'host 10.0.2.22'

OVS flow analysis:

# Show flows
ovs-ofctl dump-flows br-int

# Show specific flow
ovs-ofctl dump-flows br-int | grep <pattern>

# Monitor flows
watch -n1 'ovs-ofctl dump-flows br-int | grep <pattern>'

System resources:

# CPU usage
top

# Memory usage
free -h

# Disk I/O
iostat -x 1

# Process monitoring
ps aux | grep ovn
ps aux | grep frr

12.7 Capacity Planning & Scaling

12.7.1 Bandwidth Calculations

Per host:

  • Normal: 40-60% per NIC (80-120G total)
  • Peak: up to 200G aggregate
  • Failover: 100% on a single NIC (must plan for this)

Per fabric:

  • Normal: 40-60% utilization
  • Failover: 100% capacity required

Per ToR:

  • Must handle 100% of rack traffic during failover
  • Rack with 25 hosts: 25 × 100G = 2.5 Tbps per fabric

Per spine:

  • Must handle the sum of all ToR uplinks in the fabric
  • Plan for N-1 spine failures

12.7.2 Sizing for Failover

Critical principle: Size each fabric for 100% load during failover.

Don’t assume a 50/50 split. Plan for complete fabric failure:

  • Fabric-A fails → Fabric-B carries 100%
  • Fabric-B fails → Fabric-A carries 100%

Oversubscription planning:

  • ToR downlinks: 25 hosts × 100G = 2.5 Tbps
  • ToR uplinks: size based on expected traffic patterns
  • Typical: 4:1 to 8:1 oversubscription at the ToR
  • No oversubscription at the spine (non-blocking fabric)
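
The arithmetic above can be made concrete with a small helper. The figures come from this section; the 400G uplink speed is an assumption used to show the rounding:

#!/bin/bash
# Oversubscription sizing sketch (figures from this section; 400G uplinks assumed).
HOSTS_PER_RACK=25
HOST_NIC_GBPS=100
RATIO=4              # target oversubscription N:1
UPLINK_GBPS=400

downlink=$((HOSTS_PER_RACK * HOST_NIC_GBPS))          # host-facing capacity per fabric
needed=$((downlink / RATIO))                          # uplink capacity at RATIO:1
links=$(((needed + UPLINK_GBPS - 1) / UPLINK_GBPS))   # round up to whole links

echo "ToR downlink capacity   : ${downlink} Gbps"
echo "Uplink needed at ${RATIO}:1   : ${needed} Gbps (${links} x ${UPLINK_GBPS}G uplinks)"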

12.7.3 Adding Capacity

When to add capacity:

  • Fabric utilization > 60% during normal operation
  • Spine uplinks approaching saturation
  • Host count approaching ToR port limits

Capacity expansion options:

  1. Add more spines: increase ECMP fanout
  2. Upgrade ToR uplinks: 100G → 400G
  3. Add a new Network Pod: scale horizontally
  4. Upgrade NIC speeds: 100G → 400G per host

12.7.4 Migration to Super-Spine

When to migrate:

  • Scaling beyond 10-15 racks
  • Need for network segmentation
  • Geographic distribution

Procedure:

  1. Deploy super-spine switches
  2. Create Network Pods (group existing racks)
  3. Connect pod spines to super-spines
  4. Configure BGP between spines and super-spines
  5. Adjust IP addressing to a hierarchical scheme
  6. Migrate incrementally, pod by pod

12.8 Health Checks

12.8.1 Daily Health Check Script

#!/bin/bash
# Daily health check for OpenStack DC Network

echo "=== BGP Health Check ==="
vtysh -c "show ip bgp summary" | grep -E "Established|Active|Connect"

echo "=== ECMP Routes Check ==="
ip route show | grep -c "nexthop"

echo "=== OVN Controller Health ==="
systemctl is-active ovn-controller

echo "=== Interface Status ==="
ip link show eth0 | grep -E "state UP|state DOWN"
ip link show eth1 | grep -E "state UP|state DOWN"

echo "=== Recent Errors ==="
journalctl --since "1 hour ago" | grep -i "error\|fail" | tail -10

12.8.2 Automated Monitoring

Metrics to monitor:

  • BGP session state (up/down)
  • Number of routes learned
  • Interface status (up/down)
  • Interface errors and drops
  • OVN controller status
  • Bandwidth utilization per interface
  • ECMP path count
  • BFD session status

Alerting thresholds:

  • BGP session down > 1 minute
  • Routes missing (expected count not met)
  • Interface errors > 100/hour
  • Bandwidth > 80% on a single fabric
  • OVN controller down
  • Both NICs using the same path (ECMP broken)
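
One way to feed these metrics into an existing monitoring stack is a small exporter script. The sketch below writes a few of them in Prometheus textfile format; the output path assumes node_exporter's textfile collector reads /var/lib/node_exporter/textfile, which is an assumption to adapt to your setup:

#!/bin/bash
# Emit a few network health metrics in Prometheus textfile format (sketch).
# Assumes node_exporter's textfile collector reads /var/lib/node_exporter/textfile.
OUT=/var/lib/node_exporter/textfile/dc_network.prom

{
  echo "dc_network_ecmp_nexthops $(ip route show | grep -c nexthop)"
  if systemctl is-active --quiet ovn-controller; then ovn_up=1; else ovn_up=0; fi
  echo "dc_network_ovn_controller_up $ovn_up"
  for ifc in eth0 eth1; do
    if [ "$(cat /sys/class/net/$ifc/operstate)" = "up" ]; then up=1; else up=0; fi
    echo "dc_network_interface_up{interface=\"$ifc\"} $up"
  done
} > "$OUT.tmp" && mv "$OUT.tmp" "$OUT"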

12.9 References