Data Center Solution

INT Technology for Buffer and Delay Visibility

Capture the hidden queue and delay signals that classic polling misses.


Overview

In-band Network Telemetry (INT) gives the network a way to report path and device conditions with much finer granularity than traditional periodic polling. For AI and HPC fabrics, this matters because packet drops, microbursts, and short delay spikes can harm distributed jobs long before average utilization or SNMP counters make the issue obvious.

xSONiC INT planning focuses on capturing the right event at the right place: buffer drops, high forwarding delay, queue pressure, and path-level quality signals that help operators isolate the device, queue, or link causing trouble.

Why Traditional Monitoring Falls Short

Monitoring Method | Strength | Limitation in AI Fabrics
SNMP polling | Simple and widely understood. | Polling intervals often miss microbursts and short-lived queue events.
Interface counters | Good for loss and utilization summaries. | Counters do not explain which path, queue, or flow caused the event.
Flow logs | Useful for traffic attribution. | May not include queue depth or forwarding delay at each hop.
INT-style telemetry | Captures path and device state closer to the packet event. | Requires planning for sampling, collectors, and data volume.
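
The polling gap is easy to quantify. The back-of-the-envelope Python sketch below uses illustrative numbers (a 2 ms line-rate burst on a 400G port inside a 30-second poll window) to show how completely a counter average hides a queue-filling event:

# Sketch: why interval counters hide microbursts (illustrative numbers only).
LINE_RATE_BPS = 400e9          # 400G port
POLL_INTERVAL_S = 30.0         # typical SNMP polling interval
BURST_DURATION_S = 0.002       # 2 ms microburst at full line rate
BACKGROUND_UTIL = 0.10         # 10% steady-state load the rest of the time

burst_bits = LINE_RATE_BPS * BURST_DURATION_S
background_bits = LINE_RATE_BPS * BACKGROUND_UTIL * (POLL_INTERVAL_S - BURST_DURATION_S)

avg_util = (burst_bits + background_bits) / (LINE_RATE_BPS * POLL_INTERVAL_S)
print(f"average utilization over the poll window: {avg_util:.2%}")
# -> ~10.01%: indistinguishable from the steady state in the counter view,
#    even though the 2 ms burst was enough to overflow a shallow queue.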

INT Solution Types

Solution | Trigger | Captured Information | Best Fit
BDC (Buffer Drop Capture) | Buffer drop or queue overflow condition. | Queue occupancy and drop context. | Packet-loss root cause analysis.
HDC (High Delay Capture) | Forwarding delay reaches a configured threshold. | Delay, queue, and path context. | High-latency diagnosis in lossless networks.
IPT | Selected traffic is sampled or replicated across a telemetry domain. | Path statistics and per-node observations. | End-to-end path quality monitoring.
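
Inside a collector, the three families can be treated as variants of one event record. The Python sketch below is one possible in-memory model; the field names are our own assumptions, not the xSONiC export schema:

# Sketch: one way to model INT events in a collector. Field names are
# assumptions chosen for illustration, not the actual export format.
from dataclasses import dataclass, field
from enum import Enum

class IntEventType(Enum):
    BDC = "buffer_drop_capture"
    HDC = "high_delay_capture"
    IPT = "path_telemetry"  # IPT, per the table above

@dataclass
class IntEvent:
    event_type: IntEventType
    device: str                 # reporting switch
    queue_id: int
    queue_depth_bytes: int      # occupancy at capture time (BDC/HDC)
    forwarding_delay_ns: int    # per-hop delay (HDC/IPT)
    flow_tuple: tuple           # (src_ip, dst_ip, sport, dport, proto)
    path: list = field(default_factory=list)  # per-node observations (IPT)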

Buffer Drop Capture

Buffer Drop Capture (BDC) is useful when packet loss appears but the operator needs to know where and why it happened. Instead of only recording that a port dropped packets, BDC-style telemetry associates the event with queue state and traffic context.

Microburst arrives
      |
Queue exceeds safe depth
      |
Drop or overflow event occurs
      |
Telemetry record captures queue and path context
      |
Collector correlates event with workload and topology
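
The final correlation step is where a raw drop record becomes actionable. A minimal sketch, assuming a decoded event in dict form plus hypothetical operator-maintained flow-to-job and link lookup maps:

# Sketch: attach workload and topology context to a decoded BDC event.
# `event` is a dict with illustrative keys; flow_to_job and links are
# hypothetical operator-maintained lookup maps.
def correlate_drop(event, flow_to_job, links):
    src_dst = (event["flow_tuple"][0], event["flow_tuple"][1])
    return {
        "job": flow_to_job.get(src_dst, "unknown-job"),
        "device": event["device"],
        "queue": event["queue_id"],
        "queue_depth_bytes": event["queue_depth_bytes"],
        "link": links.get((event["device"], event["egress_port"]), "unknown-link"),
    }

# A drop on a leaf queue then maps back to a named job and a specific
# uplink, which is the context a runbook actually needs.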

High Delay Capture

High Delay Capture (HDC) focuses on packets that experience unusual forwarding delay. This is valuable in lossless fabrics because packets may not be dropped, yet long queueing delay can still degrade application performance.

Symptom | Possible Cause | HDC Value
Training step time increases | Queue buildup on a shared path. | Identifies the node and queue where delay appears.
Storage latency spikes | Congestion near the storage leaf or spine. | Shows whether the delay is localized or path-wide.
PFC pause increases | Lossless class is under pressure. | Correlates pause behavior with forwarding delay.
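
A first pass at the localized-versus-path-wide distinction in the last column is simple aggregation. The sketch below (illustrative field names and threshold) counts over-threshold delay reports per device and queue; one dominant entry suggests a hot queue, while many entries along one path suggest path-wide congestion:

# Sketch: group HDC records to separate a single hot queue from path-wide
# congestion. Record keys and the default threshold are illustrative.
from collections import Counter

def hot_spots(hdc_records, delay_threshold_ns=50_000):
    counts = Counter()
    for rec in hdc_records:
        if rec["forwarding_delay_ns"] >= delay_threshold_ns:
            counts[(rec["device"], rec["queue_id"])] += 1
    return counts.most_common()  # worst offenders first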

Collector Workflow

xSONiC switches
      |
      v
Telemetry sampling or event capture
      |
      v
Encapsulation and export
      |
      v
Collector receives structured event data
      |
      v
Dashboard / alerting / root cause workflow
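
As a concrete stand-in for the receive side of this pipeline, the sketch below assumes JSON-over-UDP export on an arbitrary port; the actual xSONiC encapsulation, encoding, and port will differ:

# Minimal collector skeleton. JSON-over-UDP on port 9555 is an assumption
# for illustration, not the real xSONiC export encapsulation.
import json
import socket

def run_collector(host="0.0.0.0", port=9555):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        payload, (src_ip, _) = sock.recvfrom(65535)
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip malformed exports rather than crash the pipeline
        # Hand off to storage, alerting, or the root cause workflow here.
        print(f"{src_ip}: {event.get('event_type')} on {event.get('device')}")

if __name__ == "__main__":
    run_collector()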

Deployment Guidance

  1. Decide which events matter most: drops, delay, queue depth, or path quality.
  2. Start with a narrow telemetry domain before expanding across the full fabric.
  3. Tune thresholds to catch meaningful anomalies without flooding the collector (one way to derive a starting threshold is sketched after this list).
  4. Correlate telemetry with workload phase, routing path, and PFC/ECN behavior.
  5. Build operator runbooks for common findings: hot queue, bad path, incast, or mis-marked traffic.
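
For step 3, one starting point is to derive the HDC trigger from a measured baseline rather than a guess. The percentile and headroom values below are illustrative, not recommendations:

# Sketch: derive an HDC delay threshold from a baseline delay sample.
# Assumes a non-empty list of per-packet forwarding delays in nanoseconds.
def suggest_delay_threshold(baseline_delays_ns, percentile=0.999, headroom=1.5):
    """Place the trigger above the observed tail so normal traffic stays quiet."""
    ordered = sorted(baseline_delays_ns)
    idx = min(int(len(ordered) * percentile), len(ordered) - 1)
    return int(ordered[idx] * headroom)

# Example: a baseline whose 99.9th-percentile delay is 20 us yields a
# ~30 us trigger, leaving headroom over normal queueing jitter.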

xSONiC Platform Fit

INT-style visibility is most useful on xSONiC data center switches used in RoCEv2, storage, and AI backend fabrics. 400G and 800G fabrics benefit because traffic can create high queue pressure quickly, while 100G and 200G networks benefit during staged migration and troubleshooting.

Related Products

Products commonly paired with this solution.

Use these related platforms as a starting point for sizing, comparison, and follow-up discussion.


XS-DC-64X800-AI-G1

Data Center AI

64-port 800G AI fabric switch for large-scale GPU clusters, HPC backbones, and ultra-high-throughput data center networks.

51.2 Tbps switching capacity
42,000 Mpps forwarding rate

Next Step

Move from INT Technology for Buffer and Delay Visibility into implementation.

Use the related products below to continue comparing platforms, or open a conversation if you need help mapping the solution to your environment.