Data Center Solution

Fast CNP Congestion Notification

Reduce congestion feedback delay before PFC becomes the only safety valve.


Overview

Fast CNP is a congestion feedback optimization for RoCEv2 fabrics. Traditional congestion control relies on a congested switch marking ECN in packets, and on the receiver observing those marks and sending Congestion Notification Packets (CNPs) back to the sender. That feedback loop can be too slow in high-bandwidth AI networks where many flows converge on the same queue.

In an xSONiC AI fabric, Fast CNP shortens the control loop by allowing the congested switching node to notify senders more directly. The goal is to reduce queue buildup before buffer pressure turns into packet loss or widespread PFC pause behavior.

Traditional CNP Path

Sender servers
      |
      v
Congested switch marks ECN
      |
      v
Receiver observes marked traffic
      |
      v
Receiver sends CNP back to sender
      |
      v
Sender reduces rate

This process works, but the feedback path includes the receiver side of the conversation. In bursty many-to-one AI traffic, that delay can allow queue occupancy to keep growing while the fabric is waiting for rate reduction.
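
The cost of that delay can be estimated with simple arithmetic: while senders wait for feedback, the queue grows at the excess arrival rate. The sketch below uses hypothetical numbers (8 senders at 400G into one 400G egress port, with assumed feedback delays of 10 µs for the receiver-based path and 2 µs for switch-originated notification) to show how buffer demand scales with feedback delay.

```python
# Hypothetical incast arithmetic: queue growth while the fabric waits
# one feedback-loop delay for senders to reduce their rate.
def queue_buildup_bytes(n_senders, sender_gbps, egress_gbps, feedback_us):
    """Excess arrival rate multiplied by feedback delay = queue growth."""
    excess_gbps = n_senders * sender_gbps - egress_gbps
    excess_bytes_per_us = excess_gbps * 1e9 / 8 / 1e6  # bytes per microsecond
    return excess_bytes_per_us * feedback_us

# 8 senders at 400G into one 400G port, 10 us receiver-based feedback:
receiver_path = queue_buildup_bytes(8, 400, 400, feedback_us=10)
# Same incast, assuming 2 us switch-originated feedback:
fast_cnp_path = queue_buildup_bytes(8, 400, 400, feedback_us=2)
print(f"receiver CNP path: {receiver_path / 1e6:.2f} MB queued")
print(f"fast CNP path:     {fast_cnp_path / 1e6:.2f} MB queued")
```

With these assumed numbers the receiver-based path accumulates 3.5 MB of queue before feedback arrives, versus 0.7 MB for the faster path; the delay values themselves are illustrative, not measured figures.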

Fast CNP Path

Sender servers
      |
      v
Congested xSONiC switch detects pressure
      |
      v
Switch identifies affected RoCEv2 flows
      |
      v
Switch sends CNP-style notification toward senders
      |
      v
Senders reduce rate earlier

Fast CNP is most useful when the switch can identify the active RoCEv2 flows that are contributing to congestion and build the notification information that endpoint NICs expect.
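
The flow-to-notification step can be pictured as follows. This is a minimal sketch, not the switch's actual data structures: the flow key fields and the notification dictionary layout are illustrative stand-ins for whatever the platform and endpoint NICs actually exchange.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowKey:
    """Illustrative RoCEv2 flow identity (field choice is an assumption)."""
    src_ip: str   # sender NIC address
    dst_ip: str   # receiver NIC address
    qp: int       # queue pair number associated with the flow

def build_notifications(congested_flows):
    """For each flow crossing the congested queue, build a CNP-style
    message addressed back toward the sender NIC."""
    return [
        {"to": f.src_ip, "from": f.dst_ip, "qp": f.qp, "type": "CNP"}
        for f in congested_flows
    ]

flows = [
    FlowKey("10.0.1.11", "10.0.2.5", 0x1A),
    FlowKey("10.0.1.12", "10.0.2.5", 0x1B),
]
for msg in build_notifications(flows):
    print(msg)
```

The essential point the sketch captures is that the switch, not the receiver, originates the notification, which is why accurate flow state is a prerequisite.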

Key Concepts

Term | Meaning | Why It Matters
Flow | A packet stream identified by common attributes such as IP addresses, ports, and queue pair information. | Lets the switch associate congestion with affected senders.
Flow Table | State maintained by the switch for active RoCEv2 sessions. | Provides the metadata needed to build targeted notifications.
CNP | Congestion Notification Packet used by RoCEv2 congestion control. | Tells senders to reduce their sending rate.
Queue Pressure | Buffer occupancy or forwarding delay that indicates congestion risk. | Provides the trigger for faster notification.

Flow Table Lifecycle

Fast CNP depends on accurate flow awareness. The switch learns and maintains flow state as RoCEv2 sessions are created, used, and removed.

Stage | Switch Behavior | Operational Check
Session establishment | Learn sender, receiver, and queue-pair information from control traffic. | Confirm expected flows are discovered during workload bring-up.
Data transfer | Refresh active entries as Send, Write, Read, and ACK traffic passes through. | Verify active flows do not age out during long-running jobs.
Session teardown | Remove entries when disconnect activity is detected. | Confirm stale entries are cleared after workload completion.
Aging control | Expire inactive or least-active entries when table limits are reached. | Size table and timeout values for real workload scale.
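
The lifecycle above can be sketched as a small table with learn, refresh, remove, and aging operations. This is a simplified model for reasoning about sizing and timeout behavior, not the switch's implementation; the capacity and timeout values are placeholders.

```python
import time

class FlowTable:
    """Sketch of the flow table lifecycle: learn on session setup,
    refresh on data traffic, remove on teardown, and age out entries
    when the table fills. Sizes and timeouts are placeholder values."""

    def __init__(self, max_entries=4096, idle_timeout_s=30.0):
        self.max_entries = max_entries
        self.idle_timeout_s = idle_timeout_s
        self.entries = {}  # flow_key -> last_seen timestamp

    def learn(self, flow_key, now=None):
        now = time.monotonic() if now is None else now
        if flow_key not in self.entries and len(self.entries) >= self.max_entries:
            self._evict(now)
        self.entries[flow_key] = now

    def refresh(self, flow_key, now=None):
        if flow_key in self.entries:
            self.entries[flow_key] = time.monotonic() if now is None else now

    def remove(self, flow_key):
        self.entries.pop(flow_key, None)

    def _evict(self, now):
        # Drop idle-expired entries first; otherwise drop the least
        # recently refreshed entry to make room.
        for key in [k for k, t in self.entries.items()
                    if now - t > self.idle_timeout_s]:
            del self.entries[key]
        if len(self.entries) >= self.max_entries:
            del self.entries[min(self.entries, key=self.entries.get)]
```

The operational checks in the table map directly onto this model: if `idle_timeout_s` is shorter than the quiet gaps in a long-running job, active flows age out mid-job; if `max_entries` is below the real session count, eviction churns live state.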

Congestion Detection

Fast CNP should be tied to measurable congestion signals rather than raw link utilization alone. In a lossless Ethernet environment, a link can appear busy while queues remain healthy, or queue depth can spike during microbursts even when average utilization looks safe.

Signal | Interpretation | Response
Forwarding delay crosses threshold | Queueing delay is becoming visible. | Identify affected flow entries and prepare notification.
Queue depth grows rapidly | Burst pressure may exceed available buffer. | Notify senders before PFC pause dominates.
Repeated ECN marking | Congestion is persistent rather than isolated. | Review workload fan-in, routing, and traffic class design.
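
The mapping from signals to responses in the table can be expressed as a simple classifier. All threshold values here are assumptions to be replaced with load-test evidence, as the deployment guidance below recommends.

```python
def classify_congestion(delay_us, queue_depth_frac, ecn_mark_rate,
                        delay_threshold_us=5.0,
                        depth_threshold=0.5,
                        ecn_rate_threshold=0.01):
    """Map measured congestion signals to responses. Thresholds are
    illustrative placeholders, not recommended production values."""
    actions = []
    if delay_us > delay_threshold_us:
        # Queueing delay visible: get flow entries ready for notification.
        actions.append("prepare-notification")
    if queue_depth_frac > depth_threshold:
        # Burst pressure: notify senders before PFC pause dominates.
        actions.append("notify-senders")
    if ecn_mark_rate > ecn_rate_threshold:
        # Persistent congestion: a design review, not just faster feedback.
        actions.append("review-design")
    return actions

print(classify_congestion(delay_us=8.0, queue_depth_frac=0.7, ecn_mark_rate=0.0))
```

Note that the signals are cumulative rather than exclusive: a deep queue with persistent ECN marking should trigger both the immediate notification and the longer-term design review.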

AI Fabric Scenario

POD A servers  --->  Leaf A  --->  Spine  --->  Leaf B  --->  Target server
                         |
                         +-- congestion forms here
                         +-- Fast CNP notifies senders earlier

During all-reduce, checkpointing, or storage-heavy phases, many senders can target a small number of receivers. If congestion appears near a leaf or spine queue, Fast CNP can reduce the time it takes for senders to respond.

Deployment Guidance

  1. Confirm RoCEv2 traffic classes, PFC, and ECN policy are already designed.
  2. Identify where congestion is most likely: leaf uplinks, storage paths, or pod boundaries.
  3. Size flow table and aging behavior for expected session counts.
  4. Set delay or queue thresholds based on load-test evidence.
  5. Validate sender rate reduction during incast and all-reduce tests.
  6. Monitor CNP rate, ECN marks, PFC pause events, and queue depth together.
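
Step 6 matters because the counters are only meaningful together. A hedged sketch of that correlation logic, with illustrative conditions and thresholds:

```python
def fabric_health(cnp_per_s, ecn_marks_per_s, pfc_pauses_per_s, queue_depth_frac):
    """Read congestion counters together, not in isolation.
    Conditions and the 0.8 depth threshold are illustrative."""
    if pfc_pauses_per_s > 0 and cnp_per_s == 0:
        return "PFC pausing without CNP feedback - check notification path"
    if ecn_marks_per_s > 0 and cnp_per_s == 0:
        return "ECN marks seen but no CNPs - check receiver/notification path"
    if queue_depth_frac > 0.8:
        return "sustained deep queues - revisit thresholds or fan-in"
    return "ok"

print(fabric_health(cnp_per_s=0, ecn_marks_per_s=100,
                    pfc_pauses_per_s=5, queue_depth_frac=0.2))
```

The pattern to look for during validation is CNP rate rising before PFC pause counters do; the reverse ordering suggests the notification path is too slow or misconfigured.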

xSONiC Platform Fit

Fast CNP is best aligned with xSONiC 400G and 800G AI fabric switches where high fan-in traffic can overwhelm queues quickly. It also benefits 100G and 200G storage or frontend fabrics when RoCEv2 traffic is sensitive to congestion feedback delay.

Related Products

Products commonly paired with this solution.

Use these related platforms as a starting point for sizing, comparison, and follow-up discussion.


XS-DC-64X800-AI-G1 (Data Center AI)

64-port 800G AI fabric switch for large-scale GPU clusters, HPC backbones, and ultra-high-throughput data center networks.

Switching capacity: 51.2 Tbps
Forwarding rate: 42,000 Mpps

XS-DC-64X200-LS-G1 (Data Center AI)

64-port 200G leaf/spine switch for high-bandwidth storage, compute, and scale-out data center fabrics.

Switching capacity: 12.8 Tbps
Forwarding rate: 19,040 Mpps
Next Step

Move from Fast CNP Congestion Notification into implementation.

Use the related products below to continue comparing platforms, or open a conversation if you need help mapping the solution to your environment.