11 Storage Networking Considerations
11.1 Overview
This chapter covers storage networking considerations for OpenStack deployments, including Ceph object/block storage, HCI (Hyper-Converged Infrastructure) decisions, and alternative storage solutions.
Note: For definitions of terms used in this chapter, see the Glossary.
11.2 Storage Traffic Types
11.2.1 Ceph Storage Traffic
Ceph generates multiple types of traffic with different characteristics:
11.2.1.1 Object Store Traffic (Slow/HDD)
- Characteristics:
  - High capacity, lower bandwidth
  - Sequential I/O patterns
  - Can tolerate higher latency
- Traffic Pattern: Bursty, large transfers
- Network Impact: Lower priority, can share bandwidth
11.2.1.2 Block Store Traffic (Fast/SSD)
- Characteristics:
  - Low latency requirements
  - Random I/O patterns
  - High bandwidth needs
- Traffic Pattern: Constant, low-latency sensitive
- Network Impact: Higher priority, needs dedicated bandwidth
11.2.1.3 Ceph Replication Traffic
- Characteristics:
  - Inter-node replication
  - Can be bandwidth-intensive
- Traffic Pattern: Sustained during recovery/backfill
- Network Impact: Can saturate links during recovery
11.2.2 Alternative Storage Solutions
11.2.2.1 OpenEBS Mayastor
- NVMe over Fabrics (NVMe-oF)
- High-performance block storage
- Requires low-latency network
- Bandwidth-intensive for replication
11.2.2.2 Linstor DRBD
- Synchronous replication over network
- Requires low-latency, high-bandwidth links
- Network is critical path for I/O
11.3 Architecture Options
11.3.1 Option 1: Separate Storage Network (Recommended for Large Scale)
Dedicated storage network separate from compute network:
┌────────────────────────────────────────┐
│ Compute Network (GENEVE Overlay)       │
│ - VM traffic                           │
│ - Pod traffic                          │
│ - Management traffic                   │
└────────────────────────────────────────┘
┌────────────────────────────────────────┐
│ Storage Network (Separate)             │
│ - Ceph object store traffic            │
│ - Ceph block store traffic             │
│ - Ceph replication                     │
└────────────────────────────────────────┘
Benefits:
- Bandwidth isolation: Storage traffic doesn’t compete with VM traffic
- QoS control: Can prioritize storage traffic separately
- Failure isolation: Storage network issues don’t affect compute
- Performance: Dedicated bandwidth for storage operations
Drawbacks:
- Additional infrastructure: More switches, more cabling
- Higher cost: Separate network equipment
- Operational complexity: Two networks to manage
11.3.3 Option 3: Hybrid Approach
Separate storage links on same servers:
- eth0/eth1: Compute network (GENEVE overlay)
- eth2/eth3: Storage network (dedicated storage traffic)
Benefits:
- Bandwidth isolation: Storage has dedicated NICs
- Same servers: No separate storage nodes needed
- Flexible: Can adjust per workload
Drawbacks:
- More NICs per server: 4 NICs instead of 2
- More switch ports: Additional ToR ports for storage
11.4 Recommendation: HCI vs Separate Storage
11.4.1 For Small to Medium Scale (≤150 servers)
Recommendation: Shared network with QoS (HCI approach)
Rationale:
- Simpler operations
- Lower cost
- 2 × 100G per server provides sufficient bandwidth
- QoS can prioritize storage when needed
Configuration:
- Use same 2 × 100G NICs for both compute and storage
- Configure QoS to prioritize Ceph block store traffic (see the sketch below)
- Monitor bandwidth utilization
- Scale to separate network if needed
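The exact QoS mechanism depends on the NIC and switch feature set. As a rough illustration only, the sketch below (Python) prints Linux tc HTB commands that give Ceph block store traffic a guaranteed share of one shared 100G NIC while letting idle capacity be borrowed. The interface name, DSCP values, and class IDs are assumptions, not part of this design; the bandwidth shares follow the HCI split in section 11.8.1, and DSCP marking of Ceph traffic plus switch-side QoS are assumed to be handled separately.

```python
# Sketch only: emits Linux tc (HTB) commands for one shared 100G NIC.
# Interface name, DSCP values, and class IDs are illustrative assumptions;
# the bandwidth shares follow the HCI split in section 11.8.1.

IFACE = "eth0"      # hypothetical shared compute+storage NIC
LINK_GBIT = 100

# (class id, name, guaranteed share of the link, DSCP value or None = default class)
CLASSES = [
    ("1:10", "ceph-block",  0.300, 46),    # EF: latency-sensitive block store traffic
    ("1:20", "ceph-object", 0.075, 10),    # AF11: bulk object store traffic
    ("1:30", "vm-pod",      0.600, None),  # default class for VM/Pod traffic
    ("1:40", "management",  0.025, 16),    # CS2: management plane
]

def tc_commands() -> list[str]:
    cmds = [
        f"tc qdisc add dev {IFACE} root handle 1: htb default 30",
        f"tc class add dev {IFACE} parent 1: classid 1:1 htb rate {LINK_GBIT}gbit",
    ]
    for prio, (cid, name, share, dscp) in enumerate(CLASSES):
        rate_mbit = int(LINK_GBIT * 1000 * share)
        # Guarantee 'rate', but let every class borrow up to line rate when idle.
        cmds.append(
            f"tc class add dev {IFACE} parent 1:1 classid {cid} "
            f"htb rate {rate_mbit}mbit ceil {LINK_GBIT}gbit prio {prio}  # {name}"
        )
        if dscp is not None:
            # Classify by the DSCP bits of the IP dsfield (mask 0xfc).
            cmds.append(
                f"tc filter add dev {IFACE} parent 1: protocol ip prio {prio + 1} "
                f"u32 match ip dsfield {dscp << 2:#x} 0xfc flowid {cid}"
            )
    return cmds

if __name__ == "__main__":
    print("\n".join(tc_commands()))
```

The guaranteed rates sum to the full 100G link, so under contention the block store class keeps its 30G floor while the ceil values allow any class to use spare bandwidth.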
11.4.2 For Large Scale (150+ servers)
Recommendation: Separate storage network
Rationale:
- Storage traffic becomes significant
- Need guaranteed bandwidth for storage
- Easier to scale storage independently
- Better failure isolation
Configuration:
- Dedicated storage switches (separate from compute ToRs)
- Separate BGP/ECMP fabric for storage
- Or: Additional NICs per server for storage (eth2/eth3)
11.5 Ceph-Specific Considerations
11.5.1 Ceph Network Requirements
11.5.1.1 Object Store (HDD)
- Bandwidth: Moderate (can share with compute)
- Latency: Less critical
- Network: Can use shared network with lower priority
11.5.1.2 Block Store (SSD)
- Bandwidth: High (needs dedicated or high-priority)
- Latency: Critical (<1ms preferred)
- Network: Should have dedicated bandwidth or high QoS priority
11.5.1.3 Ceph Replication
- Bandwidth: Very high during recovery/backfill
- Latency: Moderate
- Network: Can temporarily saturate links
11.5.2 Ceph Network Design
11.5.2.2 Option B: Separate Storage Fabric
- Dedicated ToRs: Storage-only switches
- Separate BGP fabric: Independent A/B fabrics for storage
- Same principles: Pure L3, BGP/ECMP, loopback-based
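On the Ceph side, the split maps onto Ceph's own notion of a public network (client and monitor traffic) and a cluster network (replication and backfill), set via the public_network and cluster_network options. The sketch below (Python) just renders a minimal ceph.conf fragment for such a layout; the subnets are hypothetical placeholders, and on cephadm-managed clusters the same values would typically be applied through ceph config rather than a hand-edited file.

```python
# Sketch: render a minimal ceph.conf [global] fragment that keeps client
# traffic on the compute-facing network and replication/backfill on the
# dedicated storage fabric. Subnets are hypothetical placeholders.

PUBLIC_SUBNET = "10.10.0.0/16"    # client <-> OSD/MON traffic (assumption)
CLUSTER_SUBNET = "10.20.0.0/16"   # OSD <-> OSD replication traffic (assumption)

CEPH_CONF_FRAGMENT = f"""\
[global]
public_network  = {PUBLIC_SUBNET}
cluster_network = {CLUSTER_SUBNET}
"""

if __name__ == "__main__":
    # In practice these values are applied via the deployment tooling
    # (e.g. 'ceph config set global cluster_network <subnet>'); printing
    # the fragment here is only for illustration.
    print(CEPH_CONF_FRAGMENT)
```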
11.6 Alternative Storage Solutions
11.6.1 OpenEBS Mayastor
Characteristics:
- NVMe-oF: Requires low-latency network
- High bandwidth: Needs dedicated or high-priority bandwidth
- Network critical: Network latency directly impacts I/O performance
Network Requirements:
- Low latency: <100μs preferred
- High bandwidth: 100G+ per node
- Dedicated or high QoS: Should not compete with compute
Recommendation: Separate storage network or dedicated NICs
11.6.2 Linstor DRBD
Characteristics:
- Synchronous replication: Network is critical path
- Low latency required: Network latency = I/O latency
- High bandwidth: Replication traffic can be intensive
Network Requirements:
- Very low latency: <50μs preferred
- High bandwidth: 100G+ per node
- Dedicated links: Should not share with compute
Recommendation: Dedicated storage network or separate NICs
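Because both solutions put network round-trip time directly in the I/O path, it is worth measuring the fabric against the latency targets above before committing to either. The sketch below (Python) is a minimal check, assuming Linux iputils ping is available and using a hypothetical storage peer address; the thresholds are the figures quoted in this section.

```python
# Sketch: compare average RTT to a storage peer against the latency targets
# quoted above (<100 us for Mayastor, <50 us for DRBD). Assumes Linux
# iputils 'ping' output format; the peer address is a hypothetical placeholder.
import re
import subprocess

THRESHOLDS_US = {"OpenEBS Mayastor": 100, "Linstor DRBD": 50}

def avg_rtt_us(host: str, count: int = 20) -> float:
    out = subprocess.run(
        ["ping", "-c", str(count), "-q", host],
        capture_output=True, text=True, check=True,
    ).stdout
    # iputils summary line: "rtt min/avg/max/mdev = 0.045/0.052/0.061/0.005 ms"
    match = re.search(r"= [\d.]+/([\d.]+)/", out)
    if not match:
        raise RuntimeError("could not parse ping output")
    return float(match.group(1)) * 1000.0  # ms -> microseconds

if __name__ == "__main__":
    rtt = avg_rtt_us("10.20.0.12")  # hypothetical storage peer
    for solution, limit in THRESHOLDS_US.items():
        verdict = "OK" if rtt < limit else "too slow"
        print(f"{solution}: avg RTT {rtt:.0f} us vs target <{limit} us -> {verdict}")
```

ICMP round-trip time is only a coarse proxy for NVMe-oF or DRBD latency, but it quickly flags a fabric that cannot meet the targets at all.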
11.7 HCI (Hyper-Converged Infrastructure) Decision
11.7.1 When HCI Makes Sense
Use HCI (shared network) when:
- Small to medium scale: ≤150 servers
- Mixed workloads: Compute and storage on same nodes
- Cost sensitive: Want to minimize infrastructure
- Simple operations: Prefer single network to manage
11.7.2 When to Separate Storage
Use separate storage network when:
- Large scale: 150+ servers
- High storage I/O: Storage traffic significant
- Performance critical: Need guaranteed storage bandwidth
- Independent scaling: Want to scale storage separately
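The two criteria lists above reduce to a simple rule of thumb. The sketch below (Python) just encodes this chapter's thresholds for discussion purposes; it is not a substitute for real capacity planning, and the parameter names are illustrative.

```python
# Sketch: encode the HCI-vs-separate-storage rule of thumb from this chapter.
# The ~150-server threshold and the boolean criteria mirror the lists above.

def separate_storage_network(
    server_count: int,
    storage_io_heavy: bool = False,      # storage traffic is a significant share
    performance_critical: bool = False,  # guaranteed storage bandwidth required
    independent_scaling: bool = False,   # storage must scale separately
) -> bool:
    """Return True if a dedicated storage network is recommended."""
    if server_count > 150:
        return True
    return storage_io_heavy or performance_critical or independent_scaling

if __name__ == "__main__":
    print(separate_storage_network(120))                             # False: HCI with QoS
    print(separate_storage_network(120, performance_critical=True))  # True: dedicate storage links
    print(separate_storage_network(300))                             # True: scale forces separation
```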
11.8 Network Bandwidth Planning
11.8.1 Per-Server Bandwidth Allocation (2 × 100G)
Option 1: Shared Network (HCI)
Total: 200G per server
- VM/Pod traffic: 120G (60%)
- Ceph block store: 60G (30%)
- Ceph object store: 15G (7.5%)
- Management: 5G (2.5%)
Option 2: Separate Storage (4 × 100G)
Compute NICs (eth0/eth1): 200G
- VM/Pod traffic: 180G (90%)
- Management: 20G (10%)
Storage NICs (eth2/eth3): 200G
- Ceph block store: 150G (75%)
- Ceph object store: 40G (20%)
- Ceph replication: 10G (5%)
11.8.2 Failover Planning
Critical: Plan for the full traffic load to land on a single path during failover (a worked example follows this list):
- Shared network: One NIC must handle all traffic (compute + storage)
- Separate storage: Compute and storage failover independently
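To make the failover constraint concrete, the sketch below (Python) takes the per-server allocations from section 11.8.1 and computes how much traffic one surviving 100G link would have to absorb in each option. Values above 100G mean QoS must shed or delay lower-priority traffic during the failure; the numbers are the chapter's planning figures, not measurements.

```python
# Sketch: single-link failover load using the allocations in section 11.8.1.

LINK_GBIT = 100

# Option 1: shared network (HCI), 2 x 100G carrying everything.
SHARED = {"vm_pod": 120, "ceph_block": 60, "ceph_object": 15, "management": 5}

# Option 2: separate storage, 2 x 100G compute plus 2 x 100G storage.
COMPUTE = {"vm_pod": 180, "management": 20}
STORAGE = {"ceph_block": 150, "ceph_object": 40, "ceph_replication": 10}

def failover_load(allocation_gbit: dict) -> int:
    """Traffic (Gbit/s) the surviving link must absorb if one of two NICs fails."""
    return sum(allocation_gbit.values())

def report(name: str, allocation: dict) -> None:
    load = failover_load(allocation)
    ratio = load / LINK_GBIT
    print(f"{name}: {load}G of planned traffic onto one {LINK_GBIT}G link "
          f"-> {ratio:.1f}x oversubscription during failover")

if __name__ == "__main__":
    report("Shared (HCI)", SHARED)        # QoS must protect block store traffic
    report("Separate: compute", COMPUTE)  # compute degrades, storage unaffected
    report("Separate: storage", STORAGE)  # storage degrades, compute unaffected
```

All three allocations plan the full 200G onto a 100G survivor, but only the separate-storage option keeps the compute and storage failure domains independent, which is the point of section 11.8.2.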
11.9 Recommendations Summary
11.9.1 Current Scale (≤150 servers)
- Start with HCI: Shared network with QoS
- Monitor bandwidth: Track compute vs storage utilization
- Adjust QoS: Prioritize storage when needed
- Plan evolution: Be ready to separate if scale increases
11.9.2 Future Scale (150+ servers)
- Evaluate separation: Consider separate storage network
- Or add NICs: Use eth2/eth3 for storage on same servers
- Independent scaling: Scale storage network separately
11.9.3 Storage Solution Specific
- Ceph object store: Can share network (lower priority)
- Ceph block store: Needs high priority or dedicated bandwidth
- OpenEBS Mayastor: Recommend separate network or dedicated NICs
- Linstor DRBD: Strongly recommend separate network