Data Center Solution

GPU Backend Fabric Design Guide

Size the backend network around collective communication, not average utilization.


Overview

GPU backend networks carry the east-west traffic created by distributed AI training. These networks are different from general data center networks because application performance depends on synchronized communication phases such as all-reduce, parameter exchange, checkpointing, and storage access.

An xSONiC backend fabric should be designed around four properties: bandwidth symmetry, low tail latency, predictable congestion behavior, and operational visibility.

Traffic Characteristics

| Workload Pattern | Network Impact | Design Response |
|---|---|---|
| All-reduce | Many GPUs exchange data in synchronized phases. | Keep oversubscription low and ECMP behavior predictable. |
| Parameter exchange | Repeated east-west bursts. | Provide headroom and monitor queue pressure. |
| Checkpointing | Large periodic writes to storage. | Separate or carefully engineer storage paths. |
| Failure recovery | Traffic shifts after link or device failure. | Validate convergence and remaining bandwidth under failure. |
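
To make the all-reduce row concrete, the sketch below estimates how many bytes each GPU puts on the wire per synchronized step, assuming a ring all-reduce (each GPU sends roughly 2(N-1)/N times the payload). The payload size and step-time budget are illustrative assumptions; actual volume depends on the collective algorithm and library.

```python
def ring_allreduce_bytes_per_gpu(payload_bytes: float, num_gpus: int) -> float:
    """Bytes each GPU transmits for one ring all-reduce of `payload_bytes`.

    A ring all-reduce moves 2 * (N - 1) / N times the payload per GPU
    (reduce-scatter plus all-gather). Other algorithms differ.
    """
    return 2 * (num_gpus - 1) / num_gpus * payload_bytes

# Example (assumed values): 10 GB of gradients across 256 GPUs.
payload = 10e9          # bytes per all-reduce step
gpus = 256
tx = ring_allreduce_bytes_per_gpu(payload, gpus)
print(f"{tx / 1e9:.1f} GB on the wire per GPU per step")   # ~19.9 GB

# If the step must finish in 0.5 s, each GPU needs this much bandwidth:
step_time_s = 0.5       # assumed time budget for the collective
print(f"{tx * 8 / step_time_s / 1e9:.0f} Gb/s per GPU")    # ~319 Gb/s
```

Because every GPU hits this rate in the same window, the fabric sees a synchronized burst rather than a smooth average; this is why sizing to average utilization underestimates the backend.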

Fabric Roles

| Role | Function | xSONiC Platform Fit |
|---|---|---|
| Backend leaf | Connects GPU servers or accelerator nodes. | 400G/800G ports for high-density server attachment. |
| Backend spine | Provides non-blocking east-west capacity. | 400G/800G spine platforms with high radix. |
| Storage leaf | Connects high-performance storage targets. | 100G/200G/400G depending on storage tier. |
| Frontend boundary | Connects management, user, or service networks. | 100G/200G platforms for controlled separation. |
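
As a quick check of whether a given platform fits the backend leaf role, the sketch below splits a switch's port budget between server-facing ports and spine uplinks for a target oversubscription ratio. The 64-port radix and the 1:1 and 2:1 targets are assumptions for illustration.

```python
def leaf_port_split(total_ports: int, oversubscription: float) -> tuple[int, int]:
    """Split a leaf's ports into (server_ports, uplink_ports).

    `oversubscription` is the downlink:uplink bandwidth ratio, e.g. 1.0
    for non-blocking, 2.0 for 2:1. Assumes all ports run at one speed.
    """
    uplinks = round(total_ports / (1 + oversubscription))
    return total_ports - uplinks, uplinks

# Example: a 64-port leaf at 1:1 (non-blocking) -> 32 down / 32 up.
print(leaf_port_split(64, 1.0))   # (32, 32)
# The same leaf at 2:1 -> roughly 43 down / 21 up.
print(leaf_port_split(64, 2.0))   # (43, 21)
```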

Reference Topology

GPU servers       GPU servers       GPU servers
    |                 |                 |
    v                 v                 v
Backend leaves  -- Backend spines -- Backend leaves
    |                 |                 |
    +------ storage / checkpoint fabric boundary ------+

For large clusters, separate backend, frontend, and storage networks can reduce operational risk. Smaller clusters may converge some roles, but the traffic classes and failure domains should still be designed explicitly.
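
One way to make the failure-domain point measurable is to compute how much east-west capacity survives a spine loss and compare it against peak collective demand. A minimal sketch, assuming equal-capacity spines and even ECMP spreading:

```python
def surviving_capacity_fraction(num_spines: int, failed_spines: int) -> float:
    """Fraction of leaf-to-leaf capacity left after spine failures,
    assuming equal-capacity spines and even ECMP distribution."""
    if failed_spines >= num_spines:
        return 0.0
    return (num_spines - failed_spines) / num_spines

# Example: a 4-spine fabric losing one spine keeps 75% of capacity.
print(surviving_capacity_fraction(4, 1))   # 0.75
# An 8-spine fabric losing one spine keeps 87.5%; a wider spine layer
# shrinks the blast radius of any single failure.
print(surviving_capacity_fraction(8, 1))   # 0.875
```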

Lossless Ethernet Controls

| Control | Purpose | Validation |
|---|---|---|
| PFC | Protect selected RDMA priorities from loss. | Confirm pause is limited to intended priorities. |
| ECN | Mark congestion before queues overflow. | Verify sender response under incast and all-reduce. |
| ETS | Allocate bandwidth across traffic classes. | Confirm storage or management traffic does not starve backend traffic. |
| DCBX | Exchange DCB parameters with adjacent devices. | Check negotiated state on server-facing links. |
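
ECN marking thresholds are often seeded from the bandwidth-delay product so senders are signaled before a full BDP of data queues up. The sketch below is a starting-point heuristic, not a tuned recommendation; the RTT, link speed, and min/max multipliers are all assumptions to be refined with incast testing.

```python
def ecn_thresholds_bytes(link_gbps: float, rtt_us: float,
                         kmin_mult: float = 0.25, kmax_mult: float = 1.0):
    """DCQCN-style ECN thresholds seeded from the bandwidth-delay product.

    Returns (kmin, kmax) in bytes: begin marking at a fraction of one
    BDP, mark aggressively at roughly one BDP. The multipliers are
    assumed starting points, to be tuned under realistic incast load.
    """
    bdp = link_gbps * 1e9 / 8 * rtt_us * 1e-6   # bytes in flight at line rate
    return kmin_mult * bdp, kmax_mult * bdp

# Example: 400G link, 10 us fabric RTT -> one BDP is 500 KB.
kmin, kmax = ecn_thresholds_bytes(400, 10)
print(f"Kmin ~= {kmin / 1e3:.0f} KB, Kmax ~= {kmax / 1e3:.0f} KB")  # 125 KB, 500 KB
```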

Sizing Considerations

  1. Estimate per-GPU and per-server network demand during peak collective operations.
  2. Decide the acceptable oversubscription ratio for backend traffic; many AI backend fabrics target 1:1 (non-blocking) or close to it.
  3. Reserve headroom for retransmission, failure reroute, storage bursts, and telemetry. The sketch after this list combines steps 1-3 into a per-leaf uplink estimate.
  4. Validate ECMP hashing with realistic flow counts and packet patterns.
  5. Test failure scenarios at the workload level, not only at the routing protocol level.
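
A minimal sketch of steps 1-3, reusing the per-GPU rate from the all-reduce estimate above (rounded to 320 Gb/s) and assuming illustrative values for GPUs per server, servers per leaf, and headroom:

```python
def leaf_uplink_gbps(gbps_per_gpu: float, gpus_per_server: int,
                     servers_per_leaf: int, oversubscription: float,
                     headroom: float) -> float:
    """Required uplink bandwidth per backend leaf, in Gb/s.

    Peak demand is scaled down by the accepted oversubscription ratio
    and up by a headroom factor covering reroute, retransmission,
    storage bursts, and telemetry. All inputs here are assumed examples.
    """
    peak = gbps_per_gpu * gpus_per_server * servers_per_leaf
    return peak / oversubscription * (1 + headroom)

# Example: 320 Gb/s per GPU, 8 GPUs per server, 4 servers per leaf,
# 1:1 oversubscription, 20% headroom.
need = leaf_uplink_gbps(320, 8, 4, oversubscription=1.0, headroom=0.2)
print(f"{need:.0f} Gb/s of uplink per leaf")    # 12288 Gb/s
print(f"= {need / 800:.1f} x 800G uplinks")     # ~15.4 ports
```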

Operational Validation

| Test | What It Proves |
|---|---|
| All-reduce stress | Backend fabric can handle synchronized GPU communication. |
| Incast test | Queue and congestion controls behave under fan-in. |
| Link failure | Remaining paths can absorb traffic without severe job impact. |
| Storage checkpoint | Storage traffic does not destabilize backend communication. |
| Telemetry correlation | Operators can connect application slowdown to network state. |
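
For the telemetry correlation test, the goal is to line up per-step job timings with switch counters from the same time window. The sketch below shows the shape of that comparison; the sample arrays are hypothetical stand-ins for whatever training-log and switch-telemetry sources you actually collect.

```python
from statistics import correlation  # Python 3.10+

def step_vs_queue(step_times_s: list[float], queue_depth_bytes: list[float]) -> float:
    """Pearson correlation between per-step training time and the peak
    egress queue depth observed during the same step window. A strong
    positive value suggests the slowdown is network-related."""
    return correlation(step_times_s, queue_depth_bytes)

# Hypothetical samples: five training steps and the matching peak
# queue depth (bytes) scraped from switch telemetry for each window.
steps = [0.51, 0.50, 0.74, 0.52, 0.81]
queues = [120e3, 110e3, 910e3, 130e3, 1.2e6]
print(f"step-time vs queue-depth correlation: {step_vs_queue(steps, queues):.2f}")
```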

xSONiC Platform Fit

Use 800G xSONiC platforms for high-radix AI backend fabrics, 400G platforms for spine or high-density leaf roles, and 100G/200G systems for frontend, storage, or staged migration layers. The exact mix depends on GPU generation, NIC speed, cluster size, and failure-domain design.

Related Products

The platforms below are commonly paired with this solution. Use them as a starting point for sizing, comparison, and follow-up discussion.


XS-DC-64X800-AI-G1

Data Center AI

64-port 800G AI fabric switch for large-scale GPU clusters, HPC backbones, and ultra-high-throughput data center networks.

51.2 Tbps / 42,000 Mpps

XS-DC-64X200-LS-G1

Data Center AI

64-port 200G leaf/spine switch for high-bandwidth storage, compute, and scale-out data center fabrics.

12.8 Tbps / 19,040 Mpps
Next Step

Move from the GPU Backend Fabric Design Guide into implementation.

Use the related products below to continue comparing platforms, or open a conversation if you need help mapping the solution to your environment.