Data Center Switch Migration Checklist: Moving From Proprietary NOS to SONiC Open Networking

A practical migration checklist for Australian data center teams planning a move from proprietary network operating systems to SONiC-based open networking. Covers hardware audit, feature mapping, staging, and validation.

By xSONiC Team · Tags: SONiC, open networking, data center, AI fabric, Ethernet, automation

Why data center teams are migrating to SONiC

Proprietary network operating systems have dominated enterprise data centers for decades. The trade-off is familiar: vendor lock-in, limited visibility into the software stack, and licensing costs that scale with every new switch you deploy.

SONiC (Software for Open Networking in the Cloud) is changing that equation. Built on Linux, SONiC runs on switches from multiple vendors and ASICs. It offers production-hardened networking functionality such as BGP and RDMA support, packaged in a modular, containerized architecture [1][2]. Originally developed and battle-tested in the data centers of the largest cloud service providers, SONiC has matured into a viable NOS for enterprise and colocation environments.

For Australian data center operators, the migration path is now practical, not just theoretical. Major switch silicon vendors support SONiC, the ecosystem is growing, and the tooling has reached a point where teams with Linux and networking skills can plan a structured migration.

This checklist is designed to help you move from evaluation to deployment without discovering feature, hardware, or tooling gaps halfway through the migration.


Phase 1: Audit your current environment

Before touching any hardware or downloading images, you need a complete picture of what you are running today.

Inventory your switch fleet

  • Document every switch model, firmware version, and NOS version in your fabric.
  • Identify which ASIC family each switch uses (e.g., Broadcom Tomahawk or Trident, Marvell Teralynx, NVIDIA Spectrum).
  • Note port speeds, transceiver types, and any breakout configurations.
  • Map which switches run under support contracts and which are end-of-life.
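A spreadsheet works for this inventory, but a small structured record makes it easier to query later. The sketch below is one possible shape, not a prescribed format: the `SwitchRecord` fields, hostnames, model names, and ASIC labels are all hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchRecord:
    """One row of the fleet inventory (field names are illustrative)."""
    hostname: str
    model: str
    nos_version: str
    asic_family: str
    port_speeds_gbps: tuple
    under_support: bool
    end_of_life: bool


def group_by_asic(fleet):
    """Return {asic_family: [hostnames]} so each ASIC can be checked for SONiC support."""
    groups = defaultdict(list)
    for sw in fleet:
        groups[sw.asic_family].append(sw.hostname)
    return dict(groups)


def needs_attention(fleet):
    """Hostnames that are end-of-life or no longer under a support contract."""
    return sorted(sw.hostname for sw in fleet if sw.end_of_life or not sw.under_support)


# Placeholder data: replace with your real inventory export.
fleet = [
    SwitchRecord("leaf01", "model-a", "9.3", "asic-x", (25, 100), True, False),
    SwitchRecord("leaf02", "model-a", "9.3", "asic-x", (25, 100), False, False),
    SwitchRecord("spine01", "model-b", "7.1", "asic-y", (100,), True, True),
]

print(group_by_asic(fleet))
print(needs_attention(fleet))
```

Grouping by ASIC family first is a deliberate choice: hardware compatibility in Phase 2 is decided per ASIC and platform, not per switch.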

Catalogue your feature dependencies

This is the step most teams underestimate. Proprietary NOS platforms often bundle features that SONiC handles differently.

| Feature area | Proprietary NOS typical approach | SONiC approach |
| --- | --- | --- |
| Routing | Vendor-proprietary CLI + BGP/OSPF | FRRouting (FRR) with BGP, OSPF, and standard Linux networking |
| VLANs and L2 | Vendor-specific syntax | Standard Linux bridge, VLAN interfaces |
| Telemetry | Proprietary streaming | gNMI, OpenConfig, Prometheus exporters |
| Automation | Vendor API or screen-scraping | NETCONF/YANG, REST API, Ansible modules |
| Stacking / virtual chassis | Proprietary protocols | MC-LAG, EVPN-VXLAN, or external orchestration |
| ACLs | Vendor-specific TCAM formats | SAI-based ACL with standard Linux iptables integration |

Document every feature your current fabric relies on. If your team uses proprietary stacking or proprietary multicast extensions, flag those as items that need mapping to SONiC equivalents or architectural changes.
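One way to keep this catalogue honest is to classify every feature in use as mapped, needing redesign, or not yet assessed. The following is a minimal sketch under stated assumptions: the feature keys and the `SONIC_EQUIVALENTS` mapping are illustrative labels based on the table above, not an authoritative compatibility matrix.

```python
# Illustrative mapping only. None means "no direct equivalent, needs an
# architectural change"; a missing key means "not assessed yet".
SONIC_EQUIVALENTS = {
    "bgp": "FRR BGP",
    "ospf": "FRR OSPF",
    "vlan": "VLAN interfaces",
    "streaming-telemetry": "gNMI / OpenConfig",
    "stacking": None,
    "proprietary-multicast": None,
}


def classify(features_in_use):
    """Split features into (mapped, needs_redesign, not_assessed)."""
    mapped, redesign, unknown = {}, [], []
    for feature in sorted(features_in_use):
        if feature not in SONIC_EQUIVALENTS:
            unknown.append(feature)
        elif SONIC_EQUIVALENTS[feature] is None:
            redesign.append(feature)
        else:
            mapped[feature] = SONIC_EQUIVALENTS[feature]
    return mapped, redesign, unknown


mapped, redesign, unknown = classify({"bgp", "vlan", "stacking", "pvlan"})
print("mapped:", mapped)
print("needs redesign:", redesign)
print("not assessed:", unknown)
```

Anything that lands in the last two lists is a migration risk that should be resolved before the lab phase.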

Assess your operational tooling

  • What monitoring platforms are in use (Nagios, Zabbix, Prometheus, SolarWinds)?
  • What configuration management tools (Ansible, Puppet, Salt)?
  • What NMS or orchestration layers interact with the switches?
  • Are there proprietary dashboards or vendor-specific management appliances?

SONiC exposes standard Linux interfaces and tools, so your existing Linux-native tooling may work with minimal changes. Vendor-specific dashboards will need replacement or rework [2].


Phase 2: Map your target architecture

With the audit complete, design your target SONiC fabric.

Confirm hardware compatibility

SONiC runs on switches from multiple vendors, but not every switch model is supported. Check the SONiC supported devices and platforms list against your current fleet and any planned purchases [2].

Key considerations:

  • ASIC compatibility: SONiC uses the Switch Abstraction Interface (SAI) to decouple hardware from software. Confirm your switch ASIC has a mature SAI implementation [1].
  • Port speed and density: SONiC supports 1G through 800G depending on the platform. Confirm your target speeds are supported on your hardware.
  • Transceiver compatibility: SONiC works with standard SFP, SFP+, SFP28, QSFP28, QSFP-DD, and OSFP optics. If you are upgrading to 400G or 800G, confirm your transceiver SKU is validated on the target platform. See the xSONIC optical transceiver range for compatible options.
  • Memory and storage: SONiC runs as a set of Docker containers on a Linux base. Ensure your switch has adequate flash and RAM for the image version you plan to deploy.

Define your fabric topology

SONiC is designed for spine-leaf architectures and has been production-hardened in the data centers of the largest cloud service providers [1].

  • Design your leaf-spine topology with BGP as the underlay routing protocol.
  • Plan your overlay strategy: EVPN-VXLAN is the most common overlay for multi-tenant or stretched Layer 2 requirements. See the xSONIC EVPN-VXLAN guide for architecture details.
  • If you are building an AI or GPU backend fabric, plan for RoCE v2 support, DCBX configuration, and congestion notification. See the xSONIC AI fabric solution and RoCE v2 guide.
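To make the underlay design concrete, the sketch below allocates a /31 point-to-point subnet to every leaf-spine link and assigns BGP AS numbers. It assumes an eBGP design with one shared spine ASN and one ASN per leaf, which is a common pattern but only one option. The address pool, ASN values, and device names are hypothetical.

```python
import ipaddress
from itertools import product


def plan_underlay(spines, leaves, p2p_pool="10.0.0.0/24",
                  spine_asn=65000, first_leaf_asn=65101):
    """Return one dict per leaf-spine link with its /31 addresses and ASNs."""
    pool = ipaddress.ip_network(p2p_pool)
    needed = len(spines) * len(leaves)
    if pool.num_addresses // 2 < needed:
        raise ValueError(f"{p2p_pool} cannot hold {needed} /31 links")

    subnets = pool.subnets(new_prefix=31)
    leaf_asn = {leaf: first_leaf_asn + i for i, leaf in enumerate(leaves)}

    links = []
    for leaf, spine in product(leaves, spines):
        net = next(subnets)
        links.append({
            "leaf": leaf,
            "spine": spine,
            "subnet": str(net),
            "spine_ip": str(net[0]),   # lower address on the spine side
            "leaf_ip": str(net[1]),    # upper address on the leaf side
            "leaf_asn": leaf_asn[leaf],
            "spine_asn": spine_asn,
        })
    return links


for link in plan_underlay(["spine01"], ["leaf01", "leaf02"]):
    print(link)
```

Generating the plan from code rather than by hand keeps addressing consistent between the lab and production fabrics, and the same output can feed your configuration templates.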

Select your management and automation stack

SONiC supports:

  • CLI: Familiar Linux shell plus SONiC-specific show and config commands.
  • NETCONF/YANG: For structured, model-driven configuration management [2]. See the xSONIC NETCONF guide.
  • REST API: For integration with custom or third-party orchestration.
  • Ansible: Community-maintained SONiC modules are available for configuration push and state retrieval.

Decide early whether you will manage SONiC switches individually, through an automation platform, or through a fabric controller. This choice affects your Day 2 operations model.


Phase 3: Build your lab and staging environment

Do not migrate production without a lab. This is not optional.

Set up a staging fabric

  • Deploy at least two leaf switches and one spine in a lab environment.
  • Use the same ASIC family and SONiC image version you plan to deploy in production.
  • Connect lab servers or traffic generators to simulate production workloads.

Validate core functionality

Run through this validation checklist in the lab:

  • BGP peering comes up between leaf and spine.
  • VLANs and Layer 2 forwarding work as expected.
  • EVPN-VXLAN overlay tunnels establish and carry traffic.
  • RoCE v2 traffic (if applicable) passes with DCBX and congestion notification configured.
  • ACLs block and permit traffic correctly.
  • Telemetry exports to your monitoring platform (gNMI, Prometheus, or SNMP).
  • Configuration management (Ansible, NETCONF) can push and retrieve state.
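  • Reboot and failover tests complete with traffic reconverging onto redundant paths within your agreed convergence budget (and without data plane interruption where warm reboot is supported and in use).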
  • Transceivers and cables are recognized and operate at expected speeds.
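Several of these checks can be scripted. As an example, the sketch below checks BGP peering from a JSON summary. It assumes the JSON shape produced by FRR for a command such as `vtysh -c "show bgp summary json"` (an address-family key containing a `peers` map with a `state` field per peer). Confirm that shape against the FRR version in your SONiC image before relying on it. The peer addresses are placeholders.

```python
import json


def unestablished_peers(summary_json, expected_peers, afi="ipv4Unicast"):
    """Return {peer: state} for every expected peer that is not Established."""
    peers = json.loads(summary_json).get(afi, {}).get("peers", {})
    problems = {}
    for peer in expected_peers:
        state = peers.get(peer, {}).get("state", "missing")
        if state != "Established":
            problems[peer] = state
    return problems


# Sample input standing in for real command output (assumed shape).
sample = json.dumps({
    "ipv4Unicast": {
        "peers": {
            "10.0.0.0": {"remoteAs": 65000, "state": "Established"},
            "10.0.0.4": {"remoteAs": 65000, "state": "Active"},
        }
    }
})

print(unestablished_peers(sample, ["10.0.0.0", "10.0.0.4", "10.0.0.8"]))
```

Checking against an explicit list of expected peers matters: a summary in which every listed session is Established can still hide a neighbor that was never configured.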

Test your rollback plan

Before touching production, confirm you can:

  • Revert to the previous NOS image.
  • Restore the previous configuration.
  • Complete the rollback within your maintenance window.
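The last point implies a hard deadline: the latest moment at which you can still decide to roll back and finish inside the window. A rough sketch of that arithmetic follows. The rollback duration should come from your own lab measurements, and the 2x safety factor is an assumption rather than a rule.

```python
from datetime import datetime, timedelta


def rollback_deadline(window_start, window_minutes, rollback_minutes,
                      safety_factor=2.0):
    """Latest time a rollback can start and still finish inside the window."""
    budget = timedelta(minutes=rollback_minutes * safety_factor)
    window = timedelta(minutes=window_minutes)
    if budget >= window:
        raise ValueError("rollback does not fit in the maintenance window")
    return window_start + window - budget


# Example: a 4-hour window from 22:00 and a rollback measured at 30 minutes.
deadline = rollback_deadline(datetime(2025, 1, 1, 22, 0), 240, 30)
print("Decide on rollback no later than", deadline)
```

With these example numbers, the window closes at 02:00 and the padded rollback budget is 60 minutes, so the decision point is 01:00.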

Phase 4: Plan your production migration

Choose your migration strategy

| Strategy | Description | Risk level | Best for |
| --- | --- | --- | --- |
| Big bang | Migrate entire fabric in one maintenance window | High | Small fabrics, greenfield builds |
| Rolling leaf-by-leaf | Migrate one leaf switch at a time, keep spines on old NOS until all leaves are migrated | Medium | Medium fabrics with dual-homed servers |
| Parallel fabric | Build new SONiC fabric alongside existing, migrate workloads | Low | Large or critical production environments |

For most Australian enterprise data centers, the rolling leaf-by-leaf or parallel fabric approach is recommended. This reduces blast radius and allows your team to build operational confidence incrementally.
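For the rolling approach, the key constraint is that dual-homed servers never lose both of their leaf switches in the same window. The sketch below turns that constraint into migration waves. The grouping of leaves into redundancy groups and the `max_per_wave` limit are assumptions you would replace with your own topology data.

```python
def plan_waves(redundancy_groups, max_per_wave=2):
    """Build migration waves so no wave contains two leaves from the same group.

    redundancy_groups: list of lists, each holding the leaves that serve the
    same dual-homed servers (for example, an MC-LAG pair).
    """
    waves = []
    depth = max((len(group) for group in redundancy_groups), default=0)
    for position in range(depth):
        # Take the N-th member of every group; members of the same group
        # therefore always land in different waves.
        candidates = [g[position] for g in redundancy_groups if position < len(g)]
        for i in range(0, len(candidates), max_per_wave):
            waves.append(candidates[i:i + max_per_wave])
    return waves


groups = [["leaf01", "leaf02"], ["leaf03", "leaf04"], ["leaf05", "leaf06"]]
for number, wave in enumerate(plan_waves(groups), start=1):
    print(f"wave {number}: {wave}")
```

Keeping waves small limits blast radius at the cost of more maintenance windows, which is usually the right trade for a first migration.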

Schedule maintenance windows

  • Coordinate with server and application teams.
  • Plan for at least 2x the time you think you need.
  • Communicate rollback triggers in advance.

Prepare your team

SONiC is a Linux-based NOS. Your team needs:

  • Comfort with Linux shell, Docker containers, and systemd.
  • Understanding of BGP, EVPN, and VXLAN concepts.
  • Familiarity with SONiC CLI and configuration format (config_db.json).

If your team has deep proprietary NOS experience but limited Linux exposure, invest in training before migration begins. The SONiC community provides documentation, a Wiki, and active Slack and mailing list channels for support [2].


Phase 5: Execute the migration

Pre-cutover checklist

  • Back up current switch configurations.
  • Document current interface status, BGP neighbors, and MAC address tables.
  • Confirm SONiC image version and checksum.
  • Verify transceiver compatibility on the target platform.
  • Notify stakeholders and confirm maintenance window.
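Checksum confirmation is easy to automate. The following sketch computes a SHA-256 digest in chunks (so a large image is not loaded into memory at once) and compares it with the published value. It assumes your image source publishes a SHA-256 checksum, and the file name in the usage example is a placeholder.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches(path, expected_hex):
    """True if the file's digest equals the published checksum (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    import sys
    # Usage: python verify_image.py sonic-image.bin <expected-sha256>
    if len(sys.argv) == 3:
        ok = image_matches(sys.argv[1], sys.argv[2])
        print("checksum OK" if ok else "checksum MISMATCH")
```

Run the check on the copy that will actually be installed, for example after it has been transferred to the provisioning server, not only on the original download.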

During cutover

  • Load SONiC image via ONIE (Open Network Install Environment) or factory provisioning.
  • Apply configuration via config_db.json, Ansible, or NETCONF.
  • Verify BGP adjacencies, interface status, and forwarding.
  • Confirm telemetry data is flowing to monitoring.
  • Run traffic tests to validate data plane performance.
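If you apply configuration through config_db.json, generating the file from a plan avoids hand-editing JSON during a maintenance window. The sketch below builds a minimal leaf fragment. The table names (`DEVICE_METADATA`, `LOOPBACK_INTERFACE`, `INTERFACE`, `BGP_NEIGHBOR`) and field names follow commonly documented SONiC conventions, but treat them as assumptions and validate the output against the schema of your target image. All hostnames, ports, and addresses are placeholders.

```python
import json


def leaf_config(hostname, asn, loopback_ip, uplinks):
    """Build a minimal config_db-style dict for one leaf (illustrative only)."""
    cfg = {
        "DEVICE_METADATA": {
            "localhost": {"hostname": hostname, "bgp_asn": str(asn), "type": "LeafRouter"}
        },
        "LOOPBACK_INTERFACE": {"Loopback0": {}, f"Loopback0|{loopback_ip}/32": {}},
        "INTERFACE": {},
        "BGP_NEIGHBOR": {},
    }
    for up in uplinks:
        # Routed port with a /31 address towards the spine.
        cfg["INTERFACE"][up["port"]] = {}
        cfg["INTERFACE"][f"{up['port']}|{up['local_ip']}/31"] = {}
        cfg["BGP_NEIGHBOR"][up["peer_ip"]] = {
            "asn": str(up["peer_asn"]),
            "name": up["peer_name"],
            "local_addr": up["local_ip"],
        }
    return cfg


config = leaf_config(
    "leaf01", 65101, "10.1.0.1",
    [{"port": "Ethernet0", "local_ip": "10.0.0.1", "peer_ip": "10.0.0.0",
      "peer_asn": 65000, "peer_name": "spine01"}],
)
print(json.dumps(config, indent=2, sort_keys=True))
```

A fragment like this covers only the routed underlay. Port speeds, breakouts, VLANs, and QoS settings would need their own tables, and the result should be loaded in the lab first.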

Post-cutover validation

  • All expected BGP sessions are established.
  • Layer 2 and Layer 3 forwarding is correct.
  • Telemetry and alerting are operational.
  • No unexpected log errors or alarms.
  • Application teams confirm service health.
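The state you documented before cutover becomes most useful when it is compared mechanically with the state afterwards. Here is a minimal sketch of that comparison, assuming you have captured each snapshot as a dictionary of lists. The keys and values shown are hypothetical.

```python
def compare_snapshots(before, after):
    """Report items that disappeared or appeared between two state snapshots."""
    report = {}
    for key in sorted(set(before) | set(after)):
        old, new = set(before.get(key, ())), set(after.get(key, ()))
        if old != new:
            report[key] = {
                "missing": sorted(old - new),      # present before, gone now
                "unexpected": sorted(new - old),   # new since the cutover
            }
    return report


before = {
    "bgp_neighbors": ["10.0.0.0", "10.0.0.4"],
    "interfaces_up": ["Ethernet0", "Ethernet4", "Ethernet8"],
}
after = {
    "bgp_neighbors": ["10.0.0.0", "10.0.0.4"],
    "interfaces_up": ["Ethernet0", "Ethernet4"],
}

print(compare_snapshots(before, after))
```

An empty report is the goal. Interface names often change between a proprietary NOS and SONiC, so normalize names to a common form before comparing.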

Phase 6: Operationalise SONiC in production

Migration is not the finish line. Day 2 operations are where open networking delivers long-term value.

Establish your update and patching process

SONiC releases are containerized and modular, which simplifies upgrades and maintenance [2]. Define:

  • Image version management and rollback procedures.
  • Patch testing and validation workflow.
  • Upgrade cadence (quarterly, semi-annually, or as security patches are released).

Build your monitoring and observability stack

  • Use gNMI or Prometheus exporters for switch telemetry.
  • Integrate with your existing NMS or AIOps platform.
  • Monitor BGP session health, interface errors, buffer utilization, and transceiver optics.
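Whichever collector you use, interface error monitoring usually comes down to comparing two counter samples. The sketch below shows that calculation. The `rx_ok` and `rx_err` field names are assumptions loosely modelled on typical interface counter output, and the alert threshold is an arbitrary example value.

```python
def rx_error_rate(prev, curr):
    """Fraction of received frames that were errors between two samples.

    Returns None if a counter went backwards (for example, after a reboot
    or counter clear), because the delta is then meaningless.
    """
    d_ok = curr["rx_ok"] - prev["rx_ok"]
    d_err = curr["rx_err"] - prev["rx_err"]
    if d_ok < 0 or d_err < 0:
        return None
    total = d_ok + d_err
    return d_err / total if total else 0.0


def should_alert(prev, curr, threshold=0.001):
    """True when the error rate over the interval exceeds the threshold."""
    rate = rx_error_rate(prev, curr)
    return rate is not None and rate > threshold


previous = {"rx_ok": 1000, "rx_err": 0}
current = {"rx_ok": 1990, "rx_err": 10}
print(rx_error_rate(previous, current), should_alert(previous, current))
```

Alerting on a rate over an interval, rather than on the raw counter, avoids permanent alarms from errors that occurred once during cabling or cutover.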

Document your operational runbooks

  • Common troubleshooting procedures.
  • Escalation paths for hardware and software issues.
  • Contact information for xSONIC support and SONiC community channels.

Summary: Migration checklist at a glance

| Phase | Key actions | xSONIC resources |
| --- | --- | --- |
| 1. Audit | Inventory hardware, catalogue features, assess tooling | Data center AI switches, Bare metal switches |
| 2. Architecture | Confirm compatibility, define topology, select management | AI fabric solution, EVPN-VXLAN guide, NETCONF guide |
| 3. Lab | Build staging fabric, validate features, test rollback | RoCE v2 guide |
| 4. Planning | Choose strategy, schedule windows, train team | Optical transceivers |
| 5. Execution | Pre-cutover checks, install SONiC, validate | xSONIC support |
| 6. Operations | Updates, monitoring, runbooks | NETCONF guide |

Next steps for Australian data center teams

Migrating to SONiC is a strategic move that trades vendor lock-in for operational flexibility and long-term cost control. The SONiC community is actively growing, with wide industry support from major network chip vendors and a modular, container-based architecture that accelerates software evolution [1][2].

If your team is evaluating open networking for a new data center build, an AI fabric deployment, or a refresh of aging proprietary switches, start with the audit. The checklist above will guide you from there.

For questions about xSONIC data center switches, bare-metal platforms, optical transceivers, or solution architecture, contact the xSONIC team.

Sources Reviewed