AMD Instinct MI355X and MI300X based 8-GPU inference platform for private LLM, RAG, multimodal and enterprise AI services.
- 8 AMD Instinct OAM GPUs
- Up to 2.3 TB HBM3E with MI355X
- AMD Infinity Fabric scale-up
- PCIe Gen5 host I/O
High-density AMD Instinct inference platform for private AI services.
Specification Overview
Use this table as the fast path for platform sizing, port planning, and software compatibility checks.
| Item | Details |
|---|---|
| Category | AI Infrastructure |
| Rack Units | 8U |
| Ports | Platform-dependent PCIe Gen5 host I/O and high-speed network integration |
| Switching Capacity | Integrated through xSONiC AI fabric and switching design |
| Forwarding Rate | Platform-dependent |
| OS Version | xSONiC validated platform software with ROCm ecosystem support |
| Protocols | PCIe Gen5, AMD Infinity Fabric, Ethernet, RoCE, Kubernetes-ready service integration |
| Management | BMC, CLI/API, telemetry, deployment and lifecycle service options |
Deployment Context
Review positioning, capability notes, and deployment guidance for this xSONiC platform.
The xSONiC AI Inference Server combines AMD Instinct MI355X and MI300X platform options with high-density HBM, PCIe Gen5 host connectivity, a GPU-to-GPU fabric, and deployment services. It is designed for organizations running private assistants, enterprise search, document intelligence, coding support, and multimodal workflows where data locality, predictable throughput, and operational ownership matter.
| Area | MI355X Platform Option | MI300X Platform Option |
|---|---|---|
| GPU configuration | 8 AMD Instinct MI355X OAM GPUs on UBB 2.0 module | 8 AMD Instinct MI300X OAM GPUs on UBB 2.0 module |
| GPU memory | 288 GB HBM3E per GPU, 2.304 TB total | 192 GB HBM3 per GPU, 1.536 TB total (approx. 1.5 TB) |
| Memory bandwidth | Up to 8 TB/s per GPU | Up to 5.3 TB/s per GPU |
| GPU-to-GPU fabric | 7 bidirectional AMD Infinity Fabric links per GPU at 153.6 GB/s each | 7 bidirectional AMD Infinity Fabric links per GPU at 128 GB/s each |
| Host I/O | 8 PCIe Gen5 x16 connections to host CPU | 8 PCIe Gen5 x16 connections, providing 128 GB/s of scale-out network bandwidth per GPU |
| Precision support | FP16, BF16, FP8, MXFP6, MXFP4 | FP32/FP64 for HPC plus FP16/BF16/FP8/INT8 for AI |
| Power and cooling | Direct liquid-cooled option with up to 1400 W module TBP | 750 W maximum TBP per GPU in the platform specification |
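
For quick sizing, the per-GPU figures in the comparison table can be rolled up into platform totals. The Python sketch below is illustrative only: it assumes the fabric bandwidth figures are per-link values, and the `PlatformOption` class, the `weights_fit` helper, and the 0.8 headroom factor are hypothetical names and values chosen for this example, not part of any xSONiC or AMD tool. The fit check covers model weights only; KV cache, activations, and runtime overhead need additional memory.

```python
from dataclasses import dataclass

GPUS_PER_PLATFORM = 8  # both options in the table use 8 OAM GPUs


@dataclass(frozen=True)
class PlatformOption:
    """Per-GPU figures copied from the comparison table (hypothetical helper)."""
    name: str
    hbm_gb_per_gpu: int          # "GPU memory" row
    fabric_links_per_gpu: int    # "GPU-to-GPU fabric" row
    fabric_gbps_per_link: float  # GB/s, assumed to be a per-link figure

    @property
    def total_hbm_gb(self) -> int:
        """Total HBM across all GPUs in the platform, in GB."""
        return GPUS_PER_PLATFORM * self.hbm_gb_per_gpu

    @property
    def fabric_gbps_per_gpu(self) -> float:
        """Aggregate Infinity Fabric bandwidth per GPU, in GB/s."""
        return self.fabric_links_per_gpu * self.fabric_gbps_per_link


MI355X = PlatformOption("MI355X", 288, 7, 153.6)
MI300X = PlatformOption("MI300X", 192, 7, 128.0)


def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def weights_fit(option: PlatformOption, params_billion: float,
                bytes_per_param: float, headroom: float = 0.8) -> bool:
    """True if the weights fit within a fraction of total HBM.

    The default headroom of 0.8 is an assumed planning margin, not a vendor figure.
    """
    return weights_gb(params_billion, bytes_per_param) <= headroom * option.total_hbm_gb


if __name__ == "__main__":
    for option in (MI355X, MI300X):
        print(f"{option.name}: {option.total_hbm_gb} GB HBM total, "
              f"{option.fabric_gbps_per_gpu:.1f} GB/s fabric per GPU")
    # Hypothetical 405-billion-parameter model at 2 bytes per parameter (FP16/BF16)
    print("Fits on MI300X option:", weights_fit(MI300X, 405, 2))
```

Lower-precision formats shrink `bytes_per_param` (for example 1 for FP8), which is why the precision row matters for capacity planning as well as for throughput.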
Share your AI Infrastructure requirements, target topology, and rollout timing. xSONiC will help scope the right fit.